This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

PING: Re: [PR tree-optimization/52558]: RFC: questions on store data race


PING.

Hi. Sorry for the delay. There were various tricky hiccups along the way
to bootstrappability and regression cleanliness...

On 04/26/12 04:51, Richard Guenther wrote:
On Wed, 25 Apr 2012, Aldy Hernandez wrote:

On 04/25/12 06:45, Richard Guenther wrote:
On Tue, Apr 24, 2012 at 7:43 PM, Aldy Hernandez<aldyh@redhat.com>
wrote:
On 04/13/12 03:46, Richard Guenther wrote:

On Fri, Apr 13, 2012 at 12:11 AM, Aldy Hernandez<aldyh@redhat.com> wrote:

Speak of loads, I am keeping the information as an additional bitmap in
`memory_accesses', as ->refs_in_loop was set for stores as well, so I
couldn't
depend on it. Let me know if you have another idea.

Hmm, refs_in_loop& ~all_refs_stored_in_loop, so instead of


+ bitmap reads = VEC_index (bitmap, memory_accesses.reads_in_loop,
+ loop->num);
+ ref_read_in_loop_p = bitmap_bit_p (reads, ref->id);

ref_read_in_loop_p = bitmap_bit_p (refs, ref->id)&& !bitmap_bit_p
(stores, ref->id);

? But maybe that doesn't work if a ref is both read and stored to.
Btw, rather than adding a bitmap to memory_accesses I'd rather add
a mark_ref_loaded corresponding to mark_ref_stored (or rather merge
both into one) and a bitmap to struct mem_ref.

I have done so as mark_ref_loaded(). I first tried to merge both into a generic mark_ref(), to avoid iterating twice, but I was concerned with the early loop exit of !bitmap_bit_p(ref->stored, loop->num), not knowing if I should exit the loop altogether in the merged case or what. In either case, the code was somewhat convoluted, so I opted for a separate clean function.

In scanning the gimple stream for the loads I noticed that instances of
two component refs (say foo.bar) were not pointer comparable, so I am
asking the alias oracle with refs_may_alias_p. It also seemed to make
more sense wrt memory chunks that may alias. What do you think?

This patch is far more robust and fully tested. Two wrenches to
complicate things though...

First, while inserting the code writing the _LSM_ temporary writes into
the original variables, I found a handful of tests where the order of
the writes mattered (aliased locations, inlined code, etc, IIRC). [This
is in loops where multiple stores are moved.] So iterating and inserting
on the exit edges caused the stores to be saved in reverse. I am now
saving the edges and threading the code, so they happen in the correct
order:

if (foo_flag_lsm)
foo = foo_lsm;
if (foo2_flag_lsm)
foo2 = foo2_lsm;
actual_exit:

This required some edge redirection so we didn't jump to the original
actual exit prematurely when handling multiple moved stores. The
dominator tree needed to be updated, and there is one instance where I
couldn't find a clever way of saving the order without jumping through
an inordinate amount of hoops. It seemed easier to call
recompute_dominator.

Second, avoiding the load, unless it happens in the loop like you have
suggested, brought about some bogus unused warnings with the LSM
temporaries. What happens is that compute_ssa() creates uses of the LSM
temporaries that are not yet defined, so we end up with PHI nodes with a
definition entry (D). Since we are sure the eventual uses will be gated
by their corresponding flag, I created phi nodes with an initial value
of 0 in the preheader of the loops in which they are used. This solved
the problem. I also played with initializing to 0 at the entry block,
but that seemed a bit too invasive.

Andrew suggested the correct fix was to add a new pass that was able to
do some ?? flow sensitive data flow analysis ?? that could discover
these unreachable paths and insert the 0 phis at the start of the blocks
automatically. But that seemed like far too much work, considering how
long it's taken me to get this far ;-).

Bootstraped and fully tested with multi_threaded_model_p=true
hard-coded. This is the revision I'd like to contribute, sans the
hardcoded value.

What do you think?


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]