This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix PR56756
- From: Richard Biener <rguenther@suse.de>
- To: gcc-patches@gcc.gnu.org
- Date: Tue, 16 Apr 2013 14:02:04 +0200 (CEST)
- Subject: Re: [PATCH] Fix PR56756
- References: <alpine.LNX.2.00.1303281611520.21094@zhemvz.fhfr.qr>
On Thu, 28 Mar 2013, Richard Biener wrote:
>
> The domwalker fix to order dom children after inverted postorder
> exposed the latent issue that LIM relies on domwalk to walk
> all blocks defining SSA names before all blocks using them ...
> which is what the following patch tries to fix using the
> dependency information it already tracks (but incompletely so,
> thus the fixes).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> I'm not entirely happy with this - it should use a worklist
> of stmts instead of recursing and handling basic-blocks.
> But well ...
>
> Leaving for comments. (inverted_post_order_compute visits
> loop nodes in "weird" order because it visits loop exit
> nodes last)
Ok, I wasn't really happy with the above. The following simpler
patch avoids creating intermediate IL with SSA uses whose
definitions do not dominate them, simply by placing the
store-motion load before an arbitrary previous load/store, where
all SSA names used to compute the load/store address obviously
have dominating definitions. That isn't guaranteed when
inserting into the loop latch block, as the testcase shows.
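For context, the store-motion transform that execute_sm performs can be
sketched by hand on a trivial loop. Everything below is illustrative
(the function names and the array are made up for the sketch, not taken
from the patch): the repeated load/store of *p inside the loop is
replaced by a register temporary, with one load hoisted out of the loop
and one store sunk to the loop exit.

```cpp
#include <cassert>

static int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};

/* Before store motion: *p is loaded and stored on every iteration.  */
static int sum_original (int *p, int n)
{
  for (int i = 0; i < n; i++)
    *p += a[i];
  return *p;
}

/* After store motion, applied by hand.  */
static int sum_store_motion (int *p, int n)
{
  int tmp = *p;		/* the hoisted store-motion load */
  for (int i = 0; i < n; i++)
    tmp += a[i];	/* the loop body now only touches the temporary */
  *p = tmp;		/* the single store sunk to the loop exit */
  return *p;
}
```

The bug is about where the hoisted load may be inserted: placing it on
the latch edge can put it at a point that the definitions of the SSA
names feeding its address do not dominate.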
Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
Richard.
2013-04-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/56756
* tree-ssa-loop-im.c (struct first_mem_ref_loc_1): New functor.
(first_mem_ref_loc): New.
(execute_sm): Place the load temporarily before a previous
access instead of in the latch edge to ensure its SSA dependencies
are defined at points dominating the load.
* gcc.dg/torture/pr56756.c: New testcase.
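The new first_mem_ref_loc helper relies on the for_all_locs_in_loop
callback protocol, in which the walk stops as soon as the functor
returns true; a functor that returns true immediately therefore yields
the first location. A self-contained sketch of that protocol follows,
with stand-in types and a simplified walker (loc, for_all_locs and the
other names here are illustrative, not GCC's):

```cpp
#include <vector>
#include <cassert>

struct loc { int stmt_id; };

/* Simplified stand-in for for_all_locs_in_loop: calls FN on each
   location and stops as soon as FN returns true.  */
template <typename Fn>
static void for_all_locs (std::vector<loc> &locs, Fn fn)
{
  for (auto &l : locs)
    if (fn (&l))
      break;
}

/* Counterpart of first_mem_ref_loc_1: record the location and
   return true to stop the walk at the first one.  */
struct first_loc_1
{
  first_loc_1 (loc **locp_) : locp (locp_) {}
  bool operator() (loc *l) { *locp = l; return true; }
  loc **locp;
};

/* Counterpart of first_mem_ref_loc.  */
static loc *first_loc (std::vector<loc> &locs)
{
  loc *l = 0;
  for_all_locs (locs, first_loc_1 (&l));
  return l;
}
```

In the patch the location found this way provides a statement whose
address operands are known to have dominating definitions, and the
store-motion load is inserted right before it.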
Index: gcc/tree-ssa-loop-im.c
===================================================================
*** gcc/tree-ssa-loop-im.c (revision 197997)
--- gcc/tree-ssa-loop-im.c (working copy)
*************** rewrite_mem_refs (struct loop *loop, mem
*** 1718,1723 ****
--- 1718,1749 ----
for_all_locs_in_loop (loop, ref, rewrite_mem_ref_loc (tmp_var));
}
+ /* Stores the first reference location in LOCP. */
+
+ struct first_mem_ref_loc_1
+ {
+ first_mem_ref_loc_1 (mem_ref_loc_p *locp_) : locp (locp_) {}
+ bool operator()(mem_ref_loc_p loc);
+ mem_ref_loc_p *locp;
+ };
+
+ bool
+ first_mem_ref_loc_1::operator()(mem_ref_loc_p loc)
+ {
+ *locp = loc;
+ return true;
+ }
+
+ /* Returns the first reference location to REF in LOOP. */
+
+ static mem_ref_loc_p
+ first_mem_ref_loc (struct loop *loop, mem_ref_p ref)
+ {
+ mem_ref_loc_p locp = NULL;
+ for_all_locs_in_loop (loop, ref, first_mem_ref_loc_1 (&locp));
+ return locp;
+ }
+
/* The name and the length of the currently generated variable
for lsm. */
#define MAX_LSM_NAME_LENGTH 40
*************** execute_sm (struct loop *loop, vec<edge>
*** 2022,2030 ****
unsigned i;
gimple load;
struct fmt_data fmt_data;
! edge ex, latch_edge;
struct lim_aux_data *lim_data;
bool multi_threaded_model_p = false;
if (dump_file && (dump_flags & TDF_DETAILS))
{
--- 2048,2057 ----
unsigned i;
gimple load;
struct fmt_data fmt_data;
! edge ex;
struct lim_aux_data *lim_data;
bool multi_threaded_model_p = false;
+ gimple_stmt_iterator gsi;
if (dump_file && (dump_flags & TDF_DETAILS))
{
*************** execute_sm (struct loop *loop, vec<edge>
*** 2049,2057 ****
rewrite_mem_refs (loop, ref, tmp_var);
! /* Emit the load code into the latch, so that we are sure it will
! be processed after all dependencies. */
! latch_edge = loop_latch_edge (loop);
/* FIXME/TODO: For the multi-threaded variant, we could avoid this
load altogether, since the store is predicated by a flag. We
--- 2076,2085 ----
rewrite_mem_refs (loop, ref, tmp_var);
! /* Emit the load code before a previous access to the ref, so that all
! SSA names used to compute its address have dominating definitions
! and move_computations processes it after all dependencies. */
! gsi = gsi_for_stmt (first_mem_ref_loc (loop, ref)->stmt);
/* FIXME/TODO: For the multi-threaded variant, we could avoid this
load altogether, since the store is predicated by a flag. We
*************** execute_sm (struct loop *loop, vec<edge>
*** 2060,2066 ****
lim_data = init_lim_data (load);
lim_data->max_loop = loop;
lim_data->tgt_loop = loop;
! gsi_insert_on_edge (latch_edge, load);
if (multi_threaded_model_p)
{
--- 2088,2094 ----
lim_data = init_lim_data (load);
lim_data->max_loop = loop;
lim_data->tgt_loop = loop;
! gsi_insert_before (&gsi, load, GSI_SAME_STMT);
if (multi_threaded_model_p)
{
*************** execute_sm (struct loop *loop, vec<edge>
*** 2068,2074 ****
lim_data = init_lim_data (load);
lim_data->max_loop = loop;
lim_data->tgt_loop = loop;
! gsi_insert_on_edge (latch_edge, load);
}
/* Sink the store to every exit from the loop. */
--- 2096,2102 ----
lim_data = init_lim_data (load);
lim_data->max_loop = loop;
lim_data->tgt_loop = loop;
! gsi_insert_before (&gsi, load, GSI_SAME_STMT);
}
/* Sink the store to every exit from the loop. */
Index: gcc/testsuite/gcc.dg/torture/pr56756.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr56756.c (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr56756.c (working copy)
***************
*** 0 ****
--- 1,28 ----
+ /* { dg-do compile } */
+
+ int a, *b;
+
+ void f(void)
+ {
+ if(a)
+ {
+ int k;
+
+ for(a = 0; a < 1; a++)
+ {
+ int **q;
+ f();
+
+ for(; **q; ++**q)
+ lbl:
+ if(a)
+ {
+ a = 0;
+ goto lbl;
+ }
+
+ b = &k;
+ }
+ }
+ goto lbl;
+ }