Re: [PR tree-optimization/52558]: RFC: questions on store data race
On Wed, 25 Apr 2012, Aldy Hernandez wrote:
> On 04/25/12 06:45, Richard Guenther wrote:
> > On Tue, Apr 24, 2012 at 7:43 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> > > On 04/13/12 03:46, Richard Guenther wrote:
> > > >
> > > On Fri, Apr 13, 2012 at 12:11 AM, Aldy Hernandez <aldyh@redhat.com>
> > > > wrote:
>
> > +  /* ?? Perhaps we should cache this somewhere in the BB, or are
> > +     multiple levels of empty BB's common. ?? */
> > +  FOR_EACH_EDGE (e, ei, bb->preds)
> > +    {
> > +      int res = bb_in_transaction_1 (e->src);
> > +      if (res != -1)
> > +        return (bool) res;
> > +    }
> > +  return false;
> >
> > that's weird code - if the predecessors are not all in (or all not in)
> > a transaction you return a random value.  So this seems somewhat fragile.
> >
> > If bb status can be derived from looking at a single stmt, why is the
> > transaction-ness not recorded as a basic-block flag? That is, why do
> > we have a bit in gimple stmts at all?
>
> How about we get rid of the GIMPLE bit altogether and keep it in the BB flags
> like you mention? See patch.
Yeah, that looks good.
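For reference, a minimal sketch of the flag-based query, assuming a
BB_IN_TRANSACTION bit in bb->flags (the flag name is illustrative, not
quoted from the patch):

  /* With transaction-ness recorded on the basic block itself, the
     query is a single bit test -- no walking of predecessors.  */
  static bool
  bb_in_transaction (basic_block bb)
  {
    return (bb->flags & BB_IN_TRANSACTION) != 0;
  }

The bit would be set once on every block belonging to a transaction
region, so queries never need to look at individual stmts or edges.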
> > + bool single_exit_p = single_pred_p (ex->dest);
> >
> > that's a strange variable name ;)
>
> How about loop_has_only_one_exit? :)
Fine ;)
> >
> > +/* Helper function for execute_sm. On every path that sets REF, set
> > + an appropriate flag indicating the store. */
> > +
> > +static tree
> > +execute_sm_if_changed_flag_set (struct loop *loop, mem_ref_p ref)
> > +{
> > ...
> > +  FOR_EACH_VEC_ELT (mem_ref_loc_p, locs, i, loc)
> > +    {
> > +      gimple_stmt_iterator gsi;
> > +      gimple stmt;
> > +
> > +      gsi = gsi_start_bb (gimple_bb (loc->stmt));
> > +      for (; !gsi_end_p (gsi); gsi_next (&gsi))
> > +        if (gsi_stmt (gsi) == loc->stmt)
> >
> > does not seem to do it on every path but on every REF setter. And instead
> > of the innermost loop just use gsi_for_stmt (loc->stmt).
>
> Updated comment. Fixed.
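With gsi_for_stmt the body of that loop reduces to a direct lookup,
roughly (a sketch against the quoted context, not the committed patch):

  FOR_EACH_VEC_ELT (mem_ref_loc_p, locs, i, loc)
    {
      /* Find the iterator for LOC->STMT directly instead of
         scanning its basic block from the start.  */
      gimple_stmt_iterator gsi = gsi_for_stmt (loc->stmt);
      ...
    }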
>
> >
> > +  /* Emit the load code into the latch, so that we are sure it will
> > +     be processed after all dependencies.  */
> > +  latch_edge = loop_latch_edge (loop);
> > +  load = gimple_build_assign (mem_ssa, unshare_expr (ref->mem));
> >      lim_data = init_lim_data (load);
> >      lim_data->max_loop = loop;
> >      lim_data->tgt_loop = loop;
> > +  gsi_insert_on_edge (latch_edge, load);
> > +  load = gimple_build_assign (tmp_var, mem_ssa);
> > +  lim_data = init_lim_data (load);
> > +  lim_data->max_loop = loop;
> > +  lim_data->tgt_loop = loop;
> > +  gsi_insert_on_edge (latch_edge, load);
> >
> > the 2nd assign is no longer a "load", so I suppose you don't need
> > any lim_data for it?
>
> Re-did all this. See patch.
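Presumably the rework boils down to emitting a single real load into
TMP_VAR, so only that statement carries lim_data; a sketch under that
assumption (not the committed patch):

  /* Emit one load of the reference into TMP_VAR on the latch edge;
     without the intermediate MEM_SSA copy there is only one
     statement to register lim_data for.  */
  latch_edge = loop_latch_edge (loop);
  load = gimple_build_assign (tmp_var, unshare_expr (ref->mem));
  lim_data = init_lim_data (load);
  lim_data->max_loop = loop;
  lim_data->tgt_loop = loop;
  gsi_insert_on_edge (latch_edge, load);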
>
> >
> > +  else if (store == STORE_IF_CHANGED_FLAG)
> > +    execute_sm_if_changed (ex, ref->mem, mem_ssa, tmp_var,
> > +                           store_flag);
> >
> > and this can pass NULL instead of mem_ssa?
>
> Got rid of mem_ssa altogether, as well as the if-changed approach, which
> proved inferior to the flag-based approach.
Good.
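In source terms the two schemes differ only in the guard on the
exit-edge store; a sketch (variable names illustrative):

  /* if-changed: compare against the originally loaded value, which
     is why the up-front load (mem_ssa) could not be dropped.  */
  if (lsm != lsm_orig)
    *p = lsm;

  /* flag-based: test a flag set on the paths that actually store;
     no original value is needed.  */
  if (lsm_flag)
    *p = lsm;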
> > Overall this looks reasonable.  Regarding also handling trapping things,
> > I meant that we should be able to apply store-motion to
> >
> > int a[256];
> > void foo (int *p)
> > {
> >   int i;
> >   for (i = 0; i < 256; i++)
> >     if (flag)
> >       *p = a[i];
> > }
> >
> > but we cannot, because the transform
> >
> > lsm = *p;
> > for (i = 0; i < 256; ++i)
> >   if (flag)
> >     lsm = a[i];
> > *p = lsm;
> >
> > makes the load unconditional even when the store is properly conditional
> > (and you do not change that).  The unconditional load may trap here.
> > What I meant to say is that, with the flag-based approach, we do not
> > need the load at all unless there is also a load involved inside the
> > loop.  If there is also a load inside the loop we'd need to perform the
> > load there (or not handle this case).
>
> Ahh! Yes, this sounds very profitable. Would you mind me doing this in a
> separate patch? I am already handling loads in this incarnation, so this
> should be straightforward.
No, it's fine to do it separately.
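In source terms, the flag-based form of the example needs no load of
*p at all (a sketch of the transform described above):

  int a[256];
  void foo (int *p)
  {
    int i, lsm;
    _Bool lsm_flag = 0;    /* no unconditional 'lsm = *p' needed */
    for (i = 0; i < 256; i++)
      if (flag)
        {
          lsm = a[i];
          lsm_flag = 1;
        }
    if (lsm_flag)
      *p = lsm;            /* *p only touched if the loop stored */
  }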
> Speaking of loads, I am keeping the information as an additional bitmap in
> `memory_accesses', as ->refs_in_loop was set for stores as well, so I
> couldn't depend on it.  Let me know if you have another idea.
Hmm, refs_in_loop & ~all_refs_stored_in_loop, so instead of

+      bitmap reads = VEC_index (bitmap, memory_accesses.reads_in_loop,
+                                loop->num);
+      ref_read_in_loop_p = bitmap_bit_p (reads, ref->id);

use

  ref_read_in_loop_p = bitmap_bit_p (refs, ref->id)
                       && !bitmap_bit_p (stores, ref->id);

?  But maybe that doesn't work if a ref is both read and stored to.
Btw, rather than adding a bitmap to memory_accesses I'd rather add
a mark_ref_loaded corresponding to mark_ref_stored (or rather merge
both into one) and a bitmap to struct mem_ref.
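Modeled on mark_ref_stored, that would look roughly like this (the
'loaded' bitmap and the exact walk are assumptions):

  /* Record that REF is loaded in LOOP and all containing loops,
     mirroring mark_ref_stored, with a 'loaded' bitmap added to
     struct mem_ref.  */
  static void
  mark_ref_loaded (mem_ref_p ref, struct loop *loop)
  {
    for (;
         loop != current_loops->tree_root
         && !bitmap_bit_p (ref->loaded, loop->num);
         loop = loop_outer (loop))
      bitmap_set_bit (ref->loaded, loop->num);
  }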
> Mildly tested. Wadaya think?
Looks good overall, minus the detail of remembering refs loaded.
Richard.
> Thanks again.
> Aldy
>
--
Richard Guenther <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer