Re: revised and updated new-if-converter patch… [PATCH] fix PR46029: reimplement if conversion of loads and stores

Jeff Law law@redhat.com
Tue Aug 4 05:05:00 GMT 2015


On 07/17/2015 01:57 PM, Abe wrote:
> Dear all,
>
> Relative to the previous submission of this same patch, the below
> corrects some minor spacing and/or indentation issues,
> misc. other formatting fixes, and makes the disabled vectorization tests
> be disabled via "xfail" rather than by adding spaces to
> deliberately cause the relevant scanned-for text to not be found by
> DejaGNU so as to prevent the DejaGNU line from being interpreted.
>
> The below is also based on a Git checkout that was rebased to the latest
> upstream check-in from today,
> so it should merge cleanly with trunk as of today.
>
> Regards,
>
> Abe
>
>
>
>
>
>
>
>
> 2015-06-12  Sebastian Pop  <s.pop@samsung.com>
>              Abe Skolnik  <a.skolnik@samsung.com>
>
>      PR tree-optimization/46029
>      * tree-data-ref.c (struct data_ref_loc_d): Moved...
>      (get_references_in_stmt): Exported.
>      * tree-data-ref.h (struct data_ref_loc_d): ... here.
>      (get_references_in_stmt): Declared.
>
>      * tree-if-conv.c (struct ifc_dr): Removed.
>      (IFC_DR): Removed.
>      (DR_WRITTEN_AT_LEAST_ONCE): Removed.
>      (DR_RW_UNCONDITIONALLY): Removed.
>      (memrefs_read_or_written_unconditionally): Removed.
>      (write_memrefs_written_at_least_once): Removed.
>      (ifcvt_could_trap_p): Does not take refs parameter anymore.
>      (ifcvt_memrefs_wont_trap): Removed.
>      (has_non_addressable_refs): New.
>      (if_convertible_gimple_assign_stmt_p): Call has_non_addressable_refs.
>      Removed use of refs.
>      (if_convertible_stmt_p): Removed use of refs.
>      (if_convertible_gimple_assign_stmt_p): Same.
>      (if_convertible_loop_p_1): Removed use of refs.  Remove initialization
>      of dr->aux, DR_WRITTEN_AT_LEAST_ONCE, and DR_RW_UNCONDITIONALLY.
>      (insert_address_of): New.
>      (create_scratchpad): New.
>      (create_indirect_cond_expr): New.
>      (predicate_mem_writes): Call create_indirect_cond_expr.  Take an extra
>      parameter for scratch_pad.
>      (combine_blocks): Same.
>      (tree_if_conversion): Same.
>
>      testsuite/
>      * g++.dg/tree-ssa/ifc-pr46029.C: New.
>      * gcc.dg/tree-ssa/ifc-5.c: Make it exactly like the FFmpeg kernel.
>      * gcc.dg/tree-ssa/ifc-8.c: New.
>      * gcc.dg/tree-ssa/ifc-9.c: New.
>      * gcc.dg/tree-ssa/ifc-10.c: New.
>      * gcc.dg/tree-ssa/ifc-11.c: New.
>      * gcc.dg/tree-ssa/ifc-12.c: New.
>      * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: Disabled.


> diff --git a/gcc/common.opt b/gcc/common.opt
> index 6b2ccbc..49f6b9f 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1413,7 +1413,7 @@ Common Report Var(flag_tree_loop_if_convert)
> Init(-1) Optimization
>   Convert conditional jumps in innermost loops to branchless equivalents
>
>   ftree-loop-if-convert-stores
> -Common Report Var(flag_tree_loop_if_convert_stores) Optimization
> +Common Report Var(flag_tree_loop_if_convert_stores) Init(-1) Optimization
>   Also if-convert conditional jumps containing memory writes
>
>   ; -finhibit-size-directive inhibits output of .size for ELF.
I don't see this change mentioned anywhere in the ChangeLog.  That seems 
to be a relatively common situation.  I called out some of those issues, 
but didn't try to catch them all.  Please make sure all changes are 
reflected in the ChangeLog.




> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
> b/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
> index f392fbe..775fcd5 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
This change isn't mentioned in the ChangeLog.


  diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
> b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
> index 875d2d3..fc69ca2 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
> @@ -1,5 +1,5 @@
>   /* { dg-do compile } */
> -/* { dg-options "-c -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" {
> target *-*-* } } */
> +/* { dg-options "-c -O2 -ftree-vectorize -ftree-loop-if-convert-stores
> -fdump-tree-ifcvt-stats" { target *-*-* } } */
ISTM this really should be two tests, one with this code as-is, another 
that exactly matches the ffmpeg kernel.




> diff --git a/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
> b/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
> index 11e9533..cbd3378 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
I don't see this mentioned in the ChangeLog.  It also doesn't look like 
you actually disabled the test.  Obviously this will need to be 
addressed before your patch could go in.

> diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> index 180b490..aedc66a 100644
> --- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> +++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
Not mentioned in the ChangeLog.   xfail needs to be fixed.

Similarly for the others were you added xfails.


> diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
> index e6ff4ef..2a1e27d 100644
> --- a/gcc/tree-data-ref.c
> +++ b/gcc/tree-data-ref.c
> @@ -4363,22 +4363,11 @@ compute_all_dependences (vec<data_reference_p>
> datarefs,
>     return true;
>   }
>
> -/* Describes a location of a memory reference.  */
> -
> -typedef struct data_ref_loc_d
> -{
> -  /* The memory reference.  */
> -  tree ref;
> -
> -  /* True if the memory reference is read.  */
> -  bool is_read;
> -} data_ref_loc;
Moving data_ref_loc_d could potentially be split out and merged in 
immediately assuming we've agreed that you're likely going to need 
data_ref_loc from within tree-data-ref.c and tree-if-conv.c.

Similarly for exporting get_references_in_stmt.

Picking out these things that are necessary prerequisites, but not part 
of the core changes you need to make will make the overall review 
process simpler.



> +ifcvt_could_trap_p (gimple stmt)
Presumably you're able to consider something with a vuse which does not 
otherwise trap as non-trapping because of the use of the scratchpad?

Perhaps you should clarify the comment to indicate what the "refined for 
the needs of the if-conversion" actually means.  Remember, someone other 
than you might be reading this code in the future.


> @@ -1459,8 +1309,7 @@ is_cond_scalar_reduction (gimple phi, gimple
> *reduc, tree arg_0, tree arg_1,
>     if (PHI_ARG_DEF_FROM_EDGE (header_phi, latch_e) != PHI_RESULT (phi))
>       return false;
>
> -  if (gimple_code (stmt) != GIMPLE_ASSIGN
> -      || gimple_has_volatile_ops (stmt))
> +  if (gimple_code (stmt) != GIMPLE_ASSIGN || gimple_has_volatile_ops
> (stmt))
>       return false;
This looks like a gratutious change.  It probably doesn't matter either 
way.  If you really want to make this change, please do so independent 
of your patch.


>
>     if (!flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
> @@ -1882,8 +1731,7 @@ insert_gimplified_predicates (loop_p loop, bool
> any_mask_load_store)
>         stmts = bb_predicate_gimplified_stmts (bb);
>         if (stmts)
>       {
> -      if (flag_tree_loop_if_convert_stores
> -          || any_mask_load_store)
> +      if (flag_tree_loop_if_convert_stores || any_mask_load_store)
Similarly.  Pure whitespace/formatting changes should be handled 
independently.

> @@ -1925,6 +1773,80 @@ mask_exists (int size, vec<int> vec)
>     return -1;
>   }
>
> +/* Inserts at position GSI a statement "ADDRESS_OF_AI = &AI;" and
> +   returns the ADDRESS_OF_AI.  */
> +
> +static tree
> +insert_address_of (tree ai, gimple_stmt_iterator *gsi)
Presumably we've verified ai is addressable prior to calling this 
routine :-)


> +
> +/* Insert at the beginning of the first basic block of the current
> +   function the allocation on the stack of N_BYTES of memory and
> +   return a pointer to this scratchpad memory.  */
> +
> +static tree
> +create_scratchpad (unsigned int n_bytes)
> +{
> +  basic_block bb = single_succ ( ENTRY_BLOCK_PTR_FOR_FN(cfun) );
Whitespace is wrong.  Should be
  = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));


> +/* Returns a memory reference to the pointer defined by the
> +   conditional expression: pointer = cond ? &A[i] : scratch_pad; and
> +   inserts this code at GSI.  */
> +
> +static tree
> +create_indirect_cond_expr (tree ai, tree cond, tree *scratch_pad,
> +               gimple_stmt_iterator *gsi, bool swap)
> +{
> +  tree cond_expr;
> +  tree pointer;
> +  gimple pointer_stmt;
> +
> +  /* address_of_ai = &A[i];  */
> +  tree address_of_ai = insert_address_of (ai, gsi);
> +
> +  /* Allocate the scratch pad only once per function.  */
> +  if (!*scratch_pad)
> +    *scratch_pad = create_scratchpad (64);
Obviously the hardcoded 64 needs to change.


> +
> +  /* pointer = cond ? address_of_ai : scratch_pad;  */
> +  pointer = create_tmp_reg (TREE_TYPE (address_of_ai), "_ifc_");
> +
> +  cond_expr = build3 (COND_EXPR, TREE_TYPE (address_of_ai),
> +              unshare_expr (cond),
> +              swap ? *scratch_pad  : address_of_ai,
> +              swap ? address_of_ai : *scratch_pad);
Note that the types of scratch_pad and address_of_ai might be different. 
  I know that's allowed at the generic level, but I'm not sure that's 
allowed at the gimple level.  Do you need to do a pointer conversion to 
stay safe WRT the gimple typesystem here?

One of the things that's unclear to me is how this all interacts with 
the alias oracle.

Richi?  Comments about that?


Overall I like what I see and I think you're heading in the right 
direction.  There's some things we ought to break out and commit now 
which will keep the review work easier in the long term.

JEff



More information about the Gcc-patches mailing list