This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: revised and updated new-if-converter patchâ [PATCH] fix PR46029: reimplement if conversion of loads and stores


On 07/17/2015 01:57 PM, Abe wrote:
Dear all,

Relative to the previous submission of this same patch, the below
corrects some minor spacing and/or indentation issues,
misc. other formatting fixes, and makes the disabled vectorization tests
be disabled via "xfail" rather than by adding spaces to
deliberately cause the relevant scanned-for text to not be found by
DejaGNU so as to prevent the DejaGNU line from being interpreted.

The below is also based on a Git checkout that was rebased to the latest
upstream check-in from today,
so it should merge cleanly with trunk as of today.

Regards,

Abe








2015-06-12  Sebastian Pop  <s.pop@samsung.com>
             Abe Skolnik  <a.skolnik@samsung.com>

     PR tree-optimization/46029
     * tree-data-ref.c (struct data_ref_loc_d): Moved...
     (get_references_in_stmt): Exported.
     * tree-data-ref.h (struct data_ref_loc_d): ... here.
     (get_references_in_stmt): Declared.

     * tree-if-conv.c (struct ifc_dr): Removed.
     (IFC_DR): Removed.
     (DR_WRITTEN_AT_LEAST_ONCE): Removed.
     (DR_RW_UNCONDITIONALLY): Removed.
     (memrefs_read_or_written_unconditionally): Removed.
     (write_memrefs_written_at_least_once): Removed.
     (ifcvt_could_trap_p): Does not take refs parameter anymore.
     (ifcvt_memrefs_wont_trap): Removed.
     (has_non_addressable_refs): New.
     (if_convertible_gimple_assign_stmt_p): Call has_non_addressable_refs.
     Removed use of refs.
     (if_convertible_stmt_p): Removed use of refs.
     (if_convertible_gimple_assign_stmt_p): Same.
     (if_convertible_loop_p_1): Removed use of refs.  Remove initialization
     of dr->aux, DR_WRITTEN_AT_LEAST_ONCE, and DR_RW_UNCONDITIONALLY.
     (insert_address_of): New.
     (create_scratchpad): New.
     (create_indirect_cond_expr): New.
     (predicate_mem_writes): Call create_indirect_cond_expr.  Take an extra
     parameter for scratch_pad.
     (combine_blocks): Same.
     (tree_if_conversion): Same.

     testsuite/
     * g++.dg/tree-ssa/ifc-pr46029.C: New.
     * gcc.dg/tree-ssa/ifc-5.c: Make it exactly like the FFmpeg kernel.
     * gcc.dg/tree-ssa/ifc-8.c: New.
     * gcc.dg/tree-ssa/ifc-9.c: New.
     * gcc.dg/tree-ssa/ifc-10.c: New.
     * gcc.dg/tree-ssa/ifc-11.c: New.
     * gcc.dg/tree-ssa/ifc-12.c: New.
     * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: Disabled.


diff --git a/gcc/common.opt b/gcc/common.opt
index 6b2ccbc..49f6b9f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1413,7 +1413,7 @@ Common Report Var(flag_tree_loop_if_convert)
Init(-1) Optimization
  Convert conditional jumps in innermost loops to branchless equivalents

  ftree-loop-if-convert-stores
-Common Report Var(flag_tree_loop_if_convert_stores) Optimization
+Common Report Var(flag_tree_loop_if_convert_stores) Init(-1) Optimization
  Also if-convert conditional jumps containing memory writes

  ; -finhibit-size-directive inhibits output of .size for ELF.
I don't see this change mentioned anywhere in the ChangeLog. That seems to be a relatively common situation. I called out some of those issues, but didn't try to catch them all. Please make sure all changes are reflected in the ChangeLog.




diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
b/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
index f392fbe..775fcd5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cunroll-10.c
This change isn't mentioned in the ChangeLog.


 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
index 875d2d3..fc69ca2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-5.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-c -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" {
target *-*-* } } */
+/* { dg-options "-c -O2 -ftree-vectorize -ftree-loop-if-convert-stores
-fdump-tree-ifcvt-stats" { target *-*-* } } */
ISTM this really should be two tests, one with this code as-is, another that exactly matches the ffmpeg kernel.




diff --git a/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
b/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
index 11e9533..cbd3378 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-mask-loadstore-1.c
I don't see this mentioned in the ChangeLog. It also doesn't look like you actually disabled the test. Obviously this will need to be addressed before your patch could go in.

diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
index 180b490..aedc66a 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
Not mentioned in the ChangeLog.   xfail needs to be fixed.

Similarly for the others were you added xfails.


diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index e6ff4ef..2a1e27d 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -4363,22 +4363,11 @@ compute_all_dependences (vec<data_reference_p>
datarefs,
    return true;
  }

-/* Describes a location of a memory reference.  */
-
-typedef struct data_ref_loc_d
-{
-  /* The memory reference.  */
-  tree ref;
-
-  /* True if the memory reference is read.  */
-  bool is_read;
-} data_ref_loc;
Moving data_ref_loc_d could potentially be split out and merged in immediately assuming we've agreed that you're likely going to need data_ref_loc from within tree-data-ref.c and tree-if-conv.c.

Similarly for exporting get_references_in_stmt.

Picking out these things that are necessary prerequisites, but not part of the core changes you need to make will make the overall review process simpler.



+ifcvt_could_trap_p (gimple stmt)
Presumably you're able to consider something with a vuse which does not otherwise trap as non-trapping because of the use of the scratchpad?

Perhaps you should clarify the comment to indicate what the "refined for the needs of the if-conversion" actually means. Remember, someone other than you might be reading this code in the future.


@@ -1459,8 +1309,7 @@ is_cond_scalar_reduction (gimple phi, gimple
*reduc, tree arg_0, tree arg_1,
    if (PHI_ARG_DEF_FROM_EDGE (header_phi, latch_e) != PHI_RESULT (phi))
      return false;

-  if (gimple_code (stmt) != GIMPLE_ASSIGN
-      || gimple_has_volatile_ops (stmt))
+  if (gimple_code (stmt) != GIMPLE_ASSIGN || gimple_has_volatile_ops
(stmt))
      return false;
This looks like a gratutious change. It probably doesn't matter either way. If you really want to make this change, please do so independent of your patch.



    if (!flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
@@ -1882,8 +1731,7 @@ insert_gimplified_predicates (loop_p loop, bool
any_mask_load_store)
        stmts = bb_predicate_gimplified_stmts (bb);
        if (stmts)
      {
-      if (flag_tree_loop_if_convert_stores
-          || any_mask_load_store)
+      if (flag_tree_loop_if_convert_stores || any_mask_load_store)
Similarly. Pure whitespace/formatting changes should be handled independently.

@@ -1925,6 +1773,80 @@ mask_exists (int size, vec<int> vec)
    return -1;
  }

+/* Inserts at position GSI a statement "ADDRESS_OF_AI = &AI;" and
+   returns the ADDRESS_OF_AI.  */
+
+static tree
+insert_address_of (tree ai, gimple_stmt_iterator *gsi)
Presumably we've verified ai is addressable prior to calling this routine :-)


+
+/* Insert at the beginning of the first basic block of the current
+   function the allocation on the stack of N_BYTES of memory and
+   return a pointer to this scratchpad memory.  */
+
+static tree
+create_scratchpad (unsigned int n_bytes)
+{
+  basic_block bb = single_succ ( ENTRY_BLOCK_PTR_FOR_FN(cfun) );
Whitespace is wrong.  Should be
 = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));


+/* Returns a memory reference to the pointer defined by the
+   conditional expression: pointer = cond ? &A[i] : scratch_pad; and
+   inserts this code at GSI.  */
+
+static tree
+create_indirect_cond_expr (tree ai, tree cond, tree *scratch_pad,
+               gimple_stmt_iterator *gsi, bool swap)
+{
+  tree cond_expr;
+  tree pointer;
+  gimple pointer_stmt;
+
+  /* address_of_ai = &A[i];  */
+  tree address_of_ai = insert_address_of (ai, gsi);
+
+  /* Allocate the scratch pad only once per function.  */
+  if (!*scratch_pad)
+    *scratch_pad = create_scratchpad (64);
Obviously the hardcoded 64 needs to change.


+
+  /* pointer = cond ? address_of_ai : scratch_pad;  */
+  pointer = create_tmp_reg (TREE_TYPE (address_of_ai), "_ifc_");
+
+  cond_expr = build3 (COND_EXPR, TREE_TYPE (address_of_ai),
+              unshare_expr (cond),
+              swap ? *scratch_pad  : address_of_ai,
+              swap ? address_of_ai : *scratch_pad);
Note that the types of scratch_pad and address_of_ai might be different. I know that's allowed at the generic level, but I'm not sure that's allowed at the gimple level. Do you need to do a pointer conversion to stay safe WRT the gimple typesystem here?

One of the things that's unclear to me is how this all interacts with the alias oracle.

Richi?  Comments about that?


Overall I like what I see and I think you're heading in the right direction. There's some things we ought to break out and commit now which will keep the review work easier in the long term.

JEff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]