This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] [P2] [PR tree-optimization/33562] Lowering more complex assignments.


On Wed, Feb 17, 2016 at 5:10 PM, Jeff Law <law@redhat.com> wrote:
> On 02/17/2016 07:13 AM, Richard Biener wrote:
>>>
>>> -  /* Continue walking until we reach a kill.  */
>>> -  while (!stmt_kills_ref_p (temp, ref));
>>> +  /* Continue walking until we reach a full kill as a single statement
>>> +     or there are no more live bytes.  */
>>> +  while (!stmt_kills_ref_p (temp, ref)
>>> +        && !(live_bytes && bitmap_empty_p (live_bytes)));
>>
>>
>> Just a short quick comment - the above means you only handle partial
>> stores
>> with no interveaning uses.  You don't handle, say
>>
>> struct S { struct R { int x; int y; } r; int z; } s;
>>
>>   s = { {1, 2}, 3 };
>>   s.r.x = 1;
>>   s.r.y = 2;
>>   struct R r = s.r;
>>   s.z = 3;
>>
>> where s = { {1, 2}, 3} is still dead.
>
> Right.  But handling that has never been part of DSE's design goals. Once
> there's a use, DSE has always given up.

Yeah, which is why I in the end said we need a "better" DSE ...

> Having said that...
>
>>
>> That is, you don't make use of the live_bytes in the
>> ref_maybe_used_by_stmt_p
>> check (you can skip uses of only dead bytes).
>>
>> Not sure if it makes a difference in practice (compared to the cost it
>> would take).
>
> Not sure either.  It doesn't appear that it would be hard to experiment with
> that to see if it's worth the effort.  My gut feeling is we're not going to
> see this often, if at all, in practice.

Yeah, I think the case we're after and that happens most is sth like

 a = { aggregate init };
 a.a = ...;
 a.b = ...;
 ...

and what you add is the ability to remove the aggregate init completely.

What would be nice to have is to remove it partly as well, as for

struct { int i; int j; int k; } a = {};
a.i = 1;
a.k = 3;

we'd like to remove the whole-a zeroing but we need to keep zeroing
of a.j.

I believe that SRA already has most of the analysis part, what it is
lacking is that SRA works not flow-sensitive (it just gathers
function-wide data) and that it doesn't consider base objects that
have their address taken or are pointer-based.

>>
>> Rather than building ao_refs in clear_bytes_written_by just use
>> get_ref_base_and_extent
>> directly.
>
> Easy enough to do, but ISTM if we use get_ref_base_and_extent in
> clear_bytes_written-by, then the other blob of similar code in tree-ssa-dse
> should be handled in the same way.  ie, the code you see in
> clear_bytes_written_by is almost a direct copy of code already existing in
> tree-ssa-dse.c (hence my feeling that there's some refactoring of that code
> that we want to do).
>
>
>
>>
>> You don't handle stuff like
>>
>>   s[i] = { 1, 2 };
>>   s[i].x = 1;
>>   s[i].y = 1;
>>
>> either btw.
>
> Correct I believe.
>
> IIRC (I think I looked at this during debugging at some point), the
> ao_ref->max_size field will cover the entire array for this kind of
> situation because we don't know which element in the array we're hitting (or
> -1 if we don't know the array's size).  I don't see a reasonable way to
> handle it with an ao_ref style interface unless the variable parts of the
> address computation are all rolled into the ao_ref->base field.
>
> I did look for cases where the initial store was to a varying location and
> thus max_size covered the entire array with killing stores that eventually
> covered the entire array (but with each individual killing store having size
> == max_size) -- the situation never came up in the codes I looked at (gcc &
> its runtime libraries of course).

I think the only way to handle this case is to "strip" a common base with
a varying address and replace it for the sake of get_ref_base_and_extent
(that is, tell get_ref_base_and_extent to stop at s[i] in this case).  So you
split the piece-alias-test into a same-base comparison plus offset/size
ontop of that.

That said, your patch addresses a very specific case nothing else in
the compiler
handles right now, but it's also quite limited.  So evaluating the compile-time
impact is important here as well as trying to cover more cases of "partial inits
after full inits" optimization.

I don't believe we need to rush this into GCC 6.

Richard.

> Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]