[PR63185][RFC] Improve DSE with branches

Richard Biener rguenther@suse.de
Tue May 15 13:11:00 GMT 2018


On Tue, 15 May 2018, Richard Biener wrote:

> On Tue, 15 May 2018, Richard Biener wrote:
> 
> > On Tue, 15 May 2018, Richard Biener wrote:
> > 
> > > On Mon, 14 May 2018, Kugan Vivekanandarajah wrote:
> > > 
> > > > Hi,
> > > > 
> > > > Attached patch handles PR63185 when we reach PHI with temp != NULLL.
> > > > We could see the PHI and if there isn't any uses for PHI that is
> > > > interesting, we could ignore that ?
> > > > 
> > > > Bootstrapped and regression tested on x86_64-linux-gnu.
> > > > Is this OK?
> > > 
> > > No, as Jeff said we can't do it this way.
> > > 
> > > If we end up with multiple VDEFs in the walk of defvar immediate
> > > uses we know we are dealing with a CFG fork.  We can't really
> > > ignore any of the paths but we have to
> > > 
> > >  a) find the merge point (and the associated VDEF)
> > >  b) verify for each each chain of VDEFs with associated VUSEs
> > >     up to that merge VDEF that we have no uses of the to classify
> > >     store and collect (partial) kills
> > >  c) intersect kill info and continue walking from the merge point
> > > 
> > > in b) there's the optional possibility to find sinking opportunities
> > > in case we have kills on some paths but uses on others.  This is why
> > > DSE should be really merged with (store) sinking.
> > > 
> > > So if we want to enhance DSEs handling of branches then we need
> > > to refactor the simple dse_classify_store function.  Let me take
> > > an attempt at this today.
> > 
> > First (baby) step is the following - it arranges to collect the
> > defs we need to continue walking from and implements trivial
> > reduction by stopping at (full) kills.  This allows us to handle
> > the new testcase (which was already handled in the very late DSE
> > pass with the help of sinking the store).
> > 
> > I took the opportunity to kill the use_stmt parameter of
> > dse_classify_store as the only user is only looking for whether
> > the kills were all clobbers which I added a new parameter for.
> > 
> > I didn't adjust the byte-tracking case fully (I'm not fully understanding
> > the code in the case of a use and I'm not sure whether it's worth
> > doing the def reduction with byte-tracking).
> > 
> > Your testcase can be handled by reducing the PHI and the call def
> > by seeing that the only use of a candidate def is another def
> > we have already processed.  Not sure if worth special-casing though,
> > I'd rather have a go at "recursing".  That will be the next
> > patch.
> > 
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> 
> Applied.
> 
> Another intermediate one below, fixing the byte-tracking for
> stmt with uses.  This also re-does the PHI handling by simply
> avoiding recursion by means of a visited bitmap and stopping
> at the DSE classify stmt when re-visiting it instead of failing.
> This covers Pratamesh loop case for which I added ssa-dse-33.c.
> For the do-while loop this still runs into the inability to
> handle two defs to walk from.
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Ok, loop handling doesn't work in general since we run into the
issue that SSA form across the backedge is not representing the
same values.  Consider

 <bb 3>
 # .MEM_22 = PHI <.MEM_12(D)(2), .MEM_13(4)>
 # n_20 = PHI <0(2), n_7(4)>
 # .MEM_13 = VDEF <.MEM_22>
 bytes[n_20] = _4;
 if (n_20 > 7)
   goto <bb 5>;

 <bb 4>
 n_7 = n_20 + 1;
 # .MEM_15 = VDEF <.MEM_13>
 bytes[n_20] = _5;
 goto <bb 3>;

then when classifying the store in bb4, visiting the PHI node
gets us to the store in bb3 which appears to be killing.

       if (gimple_code (temp) == GIMPLE_PHI)
-       defvar = PHI_RESULT (temp);
+       {
+         /* If we visit this PHI by following a backedge then reset
+            any info in ref that may refer to SSA names which we'd need
+            to PHI translate.  */
+         if (gimple_bb (temp) == gimple_bb (stmt)
+             || dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt),
+                                gimple_bb (temp)))
+           /* ???  ref->ref may not refer to SSA names or it may only
+              refer to SSA names that are invariant with respect to the
+              loop represented by this PHI node.  */
+           ref->ref = NULL_TREE;
+         defvar = PHI_RESULT (temp);
+         bitmap_set_bit (visited, SSA_NAME_VERSION (defvar));
+       }

should be a workable solution for that.  I'm checking that, but
eventually you can think of other things that might prevent us from
handling backedges.  Note the current code tries to allow
looking across loops but not handle backedges of loops the
original stmt belongs to.

Richard.


> Richard.
> 
> 2018-05-15  Richard Biener  <rguenther@suse.de>
> 
> 	* tree-ssa-dse.c (dse_classify_store): Track cycles via a visited
> 	bitmap of PHI defs and simplify handling of in-loop and across
> 	loop dead stores.  Handle byte-tracking with multiple defs.
> 
> 	* gcc.dg/tree-ssa/ssa-dse-32.c: New testcase.
> 	* gcc.dg/tree-ssa/ssa-dse-33.c: Likewise.
> 
> commit 7129756be201db1bd90e4191ed32084cbb22130d
> Author: Richard Guenther <rguenther@suse.de>
> Date:   Tue May 15 14:03:32 2018 +0200
> 
>     dse#2
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-32.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-32.c
> new file mode 100644
> index 00000000000..eea0d8d5cf0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-32.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-dse1-details" } */
> +
> +void f(int n)
> +{
> +  char *p = __builtin_malloc (1);
> +  int i;
> +  do
> +    *p = 0;
> +  while (++i < n);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Deleted dead store" 1 "dse1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-33.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-33.c
> new file mode 100644
> index 00000000000..02e3e6a582c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-33.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-dse1-details" } */
> +
> +void f(char *p, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    *p = 0;  /* Removed by DSE.  */
> +  *p = 1;
> +}
> +
> +void g(char *p, int n)
> +{
> +  int i = 0;
> +  do
> +    *p = 0;  /* DSE cannot yet handle this case.  */
> +  while (++i < n);
> +  *p = 1;
> +}
> +
> +
> +/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1" { xfail *-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "Deleted dead store" 1 "dse1" } } */
> diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
> index 126592a1d13..8836e84dd63 100644
> --- a/gcc/tree-ssa-dse.c
> +++ b/gcc/tree-ssa-dse.c
> @@ -528,6 +528,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
>  {
>    gimple *temp;
>    unsigned cnt = 0;
> +  auto_bitmap visited;
>  
>    if (by_clobber_p)
>      *by_clobber_p = true;
> @@ -539,7 +540,7 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
>    temp = stmt;
>    do
>      {
> -      gimple *use_stmt, *defvar_def;
> +      gimple *use_stmt;
>        imm_use_iterator ui;
>        bool fail = false;
>        tree defvar;
> @@ -550,47 +551,31 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
>  	return DSE_STORE_LIVE;
>  
>        if (gimple_code (temp) == GIMPLE_PHI)
> -	defvar = PHI_RESULT (temp);
> +	{
> +	  defvar = PHI_RESULT (temp);
> +	  bitmap_set_bit (visited, SSA_NAME_VERSION (defvar));
> +	}
>        else
>  	defvar = gimple_vdef (temp);
> -      defvar_def = temp;
>        auto_vec<gimple *, 10> defs;
>        FOR_EACH_IMM_USE_STMT (use_stmt, ui, defvar)
>  	{
>  	  cnt++;
>  
> -	  /* If we ever reach our DSE candidate stmt again fail.  We
> -	     cannot handle dead stores in loops.  */
> +	  /* We have visited ourselves already so ignore STMT for the
> +	     purpose of chaining.  */
>  	  if (use_stmt == stmt)
> -	    {
> -	      fail = true;
> -	      BREAK_FROM_IMM_USE_STMT (ui);
> -	    }
> +	    ;
>  	  /* In simple cases we can look through PHI nodes, but we
>  	     have to be careful with loops and with memory references
>  	     containing operands that are also operands of PHI nodes.
>  	     See gcc.c-torture/execute/20051110-*.c.  */
>  	  else if (gimple_code (use_stmt) == GIMPLE_PHI)
>  	    {
> -	      /* Make sure we are not in a loop latch block.  */
> -	      if (gimple_bb (stmt) == gimple_bb (use_stmt)
> -		  || dominated_by_p (CDI_DOMINATORS,
> -				     gimple_bb (stmt), gimple_bb (use_stmt))
> -		  /* We can look through PHIs to regions post-dominating
> -		     the DSE candidate stmt.  */
> -		  || !dominated_by_p (CDI_POST_DOMINATORS,
> -				      gimple_bb (stmt), gimple_bb (use_stmt)))
> -		{
> -		  fail = true;
> -		  BREAK_FROM_IMM_USE_STMT (ui);
> -		}
> -	      /* Do not consider the PHI as use if it dominates the
> -	         stmt defining the virtual operand we are processing,
> -		 we have processed it already in this case.  */
> -	      if (gimple_bb (defvar_def) != gimple_bb (use_stmt)
> -		  && !dominated_by_p (CDI_DOMINATORS,
> -				      gimple_bb (defvar_def),
> -				      gimple_bb (use_stmt)))
> +	      /* If we already visited this PHI ignore it for further
> +		 processing.  */
> +	      if (!bitmap_bit_p (visited,
> +				 SSA_NAME_VERSION (PHI_RESULT (use_stmt))))
>  		defs.safe_push (use_stmt);
>  	    }
>  	  /* If the statement is a use the store is not dead.  */
> @@ -600,25 +585,20 @@ dse_classify_store (ao_ref *ref, gimple *stmt,
>  		 structure for USE_STMT and in doing so we find that the
>  		 references hit non-live bytes and thus can be ignored.  */
>  	      if (byte_tracking_enabled
> -		  && (!gimple_vdef (use_stmt) || defs.is_empty ()))
> +		  && is_gimple_assign (use_stmt))
>  		{
> -		  if (is_gimple_assign (use_stmt))
> +		  ao_ref use_ref;
> +		  ao_ref_init (&use_ref, gimple_assign_rhs1 (use_stmt));
> +		  if (valid_ao_ref_for_dse (&use_ref)
> +		      && use_ref.base == ref->base
> +		      && known_eq (use_ref.size, use_ref.max_size)
> +		      && !live_bytes_read (use_ref, ref, live_bytes))
>  		    {
> -		      /* Other cases were noted as non-aliasing by
> -			 the call to ref_maybe_used_by_stmt_p.  */
> -		      ao_ref use_ref;
> -		      ao_ref_init (&use_ref, gimple_assign_rhs1 (use_stmt));
> -		      if (valid_ao_ref_for_dse (&use_ref)
> -			  && use_ref.base == ref->base
> -			  && known_eq (use_ref.size, use_ref.max_size)
> -			  && !live_bytes_read (use_ref, ref, live_bytes))
> -			{
> -			  /* If this statement has a VDEF, then it is the
> -			     first store we have seen, so walk through it.  */
> -			  if (gimple_vdef (use_stmt))
> -			    defs.safe_push (use_stmt);
> -			  continue;
> -			}
> +		      /* If this is a store, remember it as we possibly
> +			 need to walk the defs uses.  */
> +		      if (gimple_vdef (use_stmt))
> +			defs.safe_push (use_stmt);
> +		      continue;
>  		    }
>  		}
>  
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)



More information about the Gcc-patches mailing list