[PATCH] Fix PR56756

Richard Biener <rguenther@suse.de>
Tue Apr 16 16:39:00 GMT 2013


On Thu, 28 Mar 2013, Richard Biener wrote:

> 
> The domwalker fix to order dom children after inverted postorder
> exposed the latent issue that LIM relies on domwalk to walk
> all blocks defining SSA names before all blocks using them ...
> which is what the following patch tries to fix using the
> dependency information it already tracks (but incompletely so,
> thus the fixes).
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> I'm not entirely happy with this - it should use a worklist
> of stmts instead of recursing and handling basic-blocks.
> But well ...
> 
> Leaving for comments.  (inverted_post_order_compute visits
> loop nodes in "weird" order because it visits loop exit
> nodes last)

Ok, I wasn't really happy with the above.  The following simpler
patch avoids creating intermediate IL with SSA uses whose definitions
do not dominate them, simply by placing the store-motion load before
an arbitrary previous load/store of the reference, where all SSA
names used to compute the access address obviously have dominating
definitions.  That is not guaranteed when inserting on the loop
latch edge, as the testcase shows.
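
To make the dominance issue concrete, here is a hand-written sketch
(made-up names; the PR testcase below triggers the same situation in
a more convoluted way, via a goto into the loop):

  int *p;

  void
  g (int flag, int n)
  {
    int i;
    for (i = 0; i < n; ++i)
      if (flag)
        {
          int *q = p + 1;  /* Invariant address, but computed only
                              on this conditional path.  */
          *q = i;          /* Store-motion candidate.  */
        }
  }

A load of *q emitted on the latch edge would use the SSA name of q,
whose definition in the conditional block does not dominate the
latch.  Emitting the load directly before the original access
"*q = i" is always safe, because the address computation necessarily
dominates the access it feeds.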

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2013-04-16  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/56756
	* tree-ssa-loop-im.c (struct first_mem_ref_loc_1): New functor.
	(first_mem_ref_loc): New.
	(execute_sm): Place the load temporarily before a previous
	access instead of on the latch edge to ensure its SSA dependencies
	are defined at points dominating the load.

	* gcc.dg/torture/pr56756.c: New testcase.

Index: gcc/tree-ssa-loop-im.c
===================================================================
*** gcc/tree-ssa-loop-im.c	(revision 197997)
--- gcc/tree-ssa-loop-im.c	(working copy)
*************** rewrite_mem_refs (struct loop *loop, mem
*** 1718,1723 ****
--- 1718,1749 ----
    for_all_locs_in_loop (loop, ref, rewrite_mem_ref_loc (tmp_var));
  }
  
+ /* Stores the first reference location in LOCP.  */
+ 
+ struct first_mem_ref_loc_1
+ {
+   first_mem_ref_loc_1 (mem_ref_loc_p *locp_) : locp (locp_) {}
+   bool operator()(mem_ref_loc_p loc);
+   mem_ref_loc_p *locp;
+ };
+ 
+ bool
+ first_mem_ref_loc_1::operator()(mem_ref_loc_p loc)
+ {
+   *locp = loc;
+   return true;  /* Stop the walk at the first location.  */
+ }
+ 
+ /* Returns the first reference location to REF in LOOP.  */
+ 
+ static mem_ref_loc_p
+ first_mem_ref_loc (struct loop *loop, mem_ref_p ref)
+ {
+   mem_ref_loc_p locp = NULL;
+   for_all_locs_in_loop (loop, ref, first_mem_ref_loc_1 (&locp));
+   return locp;
+ }
+ 
  /* The name and the length of the currently generated variable
     for lsm.  */
  #define MAX_LSM_NAME_LENGTH 40
*************** execute_sm (struct loop *loop, vec<edge>
*** 2022,2030 ****
    unsigned i;
    gimple load;
    struct fmt_data fmt_data;
!   edge ex, latch_edge;
    struct lim_aux_data *lim_data;
    bool multi_threaded_model_p = false;
  
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
--- 2048,2057 ----
    unsigned i;
    gimple load;
    struct fmt_data fmt_data;
!   edge ex;
    struct lim_aux_data *lim_data;
    bool multi_threaded_model_p = false;
+   gimple_stmt_iterator gsi;
  
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
*************** execute_sm (struct loop *loop, vec<edge>
*** 2049,2057 ****
  
    rewrite_mem_refs (loop, ref, tmp_var);
  
!   /* Emit the load code into the latch, so that we are sure it will
!      be processed after all dependencies.  */
!   latch_edge = loop_latch_edge (loop);
  
    /* FIXME/TODO: For the multi-threaded variant, we could avoid this
       load altogether, since the store is predicated by a flag.  We
--- 2076,2085 ----
  
    rewrite_mem_refs (loop, ref, tmp_var);
  
!   /* Emit the load code temporarily before the first access to REF in
!      the loop, where all SSA names feeding the address computation have
!      dominating definitions; move_computations later moves it out.  */
!   gsi = gsi_for_stmt (first_mem_ref_loc (loop, ref)->stmt);
  
    /* FIXME/TODO: For the multi-threaded variant, we could avoid this
       load altogether, since the store is predicated by a flag.  We
*************** execute_sm (struct loop *loop, vec<edge>
*** 2060,2066 ****
    lim_data = init_lim_data (load);
    lim_data->max_loop = loop;
    lim_data->tgt_loop = loop;
!   gsi_insert_on_edge (latch_edge, load);
  
    if (multi_threaded_model_p)
      {
--- 2088,2094 ----
    lim_data = init_lim_data (load);
    lim_data->max_loop = loop;
    lim_data->tgt_loop = loop;
!   gsi_insert_before (&gsi, load, GSI_SAME_STMT);
  
    if (multi_threaded_model_p)
      {
*************** execute_sm (struct loop *loop, vec<edge>
*** 2068,2074 ****
        lim_data = init_lim_data (load);
        lim_data->max_loop = loop;
        lim_data->tgt_loop = loop;
!       gsi_insert_on_edge (latch_edge, load);
      }
  
    /* Sink the store to every exit from the loop.  */
--- 2096,2102 ----
        lim_data = init_lim_data (load);
        lim_data->max_loop = loop;
        lim_data->tgt_loop = loop;
!       gsi_insert_before (&gsi, load, GSI_SAME_STMT);
      }
  
    /* Sink the store to every exit from the loop.  */
Index: gcc/testsuite/gcc.dg/torture/pr56756.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr56756.c	(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr56756.c	(working copy)
***************
*** 0 ****
--- 1,28 ----
+ /* { dg-do compile } */
+ 
+ int a, *b;
+ 
+ void f(void)
+ {
+   if(a)
+     {
+       int k;
+ 
+       for(a = 0; a < 1; a++)
+ 	{
+ 	  int **q;
+ 	  f();
+ 
+ 	  for(; **q; ++**q)
+ 	    lbl:
+ 		if(a)
+ 		  {
+ 		    a = 0;
+ 		    goto lbl;
+ 		  }
+ 
+ 	  b = &k;
+ 	}
+     }
+   goto lbl;
+ }
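
For reference, a standalone C++ sketch (made-up types, not GCC code)
of the walker contract the new first_mem_ref_loc helper relies on:
judging by how the patch uses it, for_all_locs_in_loop applies the
functor to each recorded location and stops the walk as soon as the
functor returns true, so a functor that records its argument and
returns true captures exactly the first location.

  #include <cassert>
  #include <vector>

  struct loc { int id; };

  /* Simplified stand-in for for_all_locs_in_loop: call FN on each
     location, stopping as soon as FN returns true.  */
  template <typename FN>
  static bool
  for_all_locs (const std::vector<loc *> &locs, FN fn)
  {
    for (size_t i = 0; i < locs.size (); ++i)
      if (fn (locs[i]))
        return true;
    return false;
  }

  /* Analogue of first_mem_ref_loc_1: record the location and return
     true to terminate the walk immediately.  */
  struct first_loc_1
  {
    first_loc_1 (loc **locp_) : locp (locp_) {}
    bool operator() (loc *l) { *locp = l; return true; }
    loc **locp;
  };

  int
  main ()
  {
    loc a = { 1 }, b = { 2 };
    std::vector<loc *> locs;
    locs.push_back (&a);
    locs.push_back (&b);
    loc *first = 0;
    for_all_locs (locs, first_loc_1 (&first));
    assert (first == &a);  /* The walk stopped at the first location.  */
    return 0;
  }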