This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] [P2] [PR tree-optimization/33562] Lowering more complex assignments.
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 18 Feb 2016 10:56:45 +0100
- Subject: Re: [RFC] [P2] [PR tree-optimization/33562] Lowering more complex assignments.
- Authentication-results: sourceware.org; auth=none
- References: <56BD1EFB dot 90008 at redhat dot com> <CAFiYyc3iYgzrprV2V=C02nbAmZntPy7QH_=Dxo_UXps7sneh_g at mail dot gmail dot com> <56BE14B0 dot 9040801 at redhat dot com> <56C0ACC1 dot 60905 at redhat dot com> <0D3F08EF-9DC3-4848-AEC7-1AE639464B3D at gmail dot com> <56C4217F dot 2040809 at redhat dot com> <CAFiYyc0D4-Fum24QD9TYsC6DvPd71yuWnhKZeuEk-BpjcONQkQ at mail dot gmail dot com> <56C47D5C dot 7010804 at redhat dot com> <CAFiYyc0Jx1xgRsr0NK5OS+iYZ9dzCm4ea3UmLQAAHA61PyaeuA at mail dot gmail dot com> <56C49B69 dot 2090209 at redhat dot com>
On Wed, Feb 17, 2016 at 5:10 PM, Jeff Law <law@redhat.com> wrote:
> On 02/17/2016 07:13 AM, Richard Biener wrote:
>>>
>>> - /* Continue walking until we reach a kill. */
>>> - while (!stmt_kills_ref_p (temp, ref));
>>> + /* Continue walking until we reach a full kill as a single statement
>>> + or there are no more live bytes. */
>>> + while (!stmt_kills_ref_p (temp, ref)
>>> + && !(live_bytes && bitmap_empty_p (live_bytes)));
>>
>>
>> Just a short quick comment - the above means you only handle partial
>> stores
>> with no interveaning uses. You don't handle, say
>>
>> struct S { struct R { int x; int y; } r; int z; } s;
>>
>> s = { {1, 2}, 3 };
>> s.r.x = 1;
>> s.r.y = 2;
>> struct R r = s.r;
>> s.z = 3;
>>
>> where s = { {1, 2}, 3} is still dead.
>
> Right. But handling that has never been part of DSE's design goals. Once
> there's a use, DSE has always given up.
Yeah, which is why I in the end said we need a "better" DSE ...
> Having said that...
>
>>
>> That is, you don't make use of the live_bytes in the
>> ref_maybe_used_by_stmt_p
>> check (you can skip uses of only dead bytes).
>>
>> Not sure if it makes a difference in practice (compared to the cost it
>> would take).
>
> Not sure either. It doesn't appear that it would be hard to experiment with
> that to see if it's worth the effort. My gut feeling is we're not going to
> see this often, if at all, in practice.
Yeah, I think the case we're after and that happens most is sth like
a = { aggregate init };
a.a = ...;
a.b = ...;
...
and what you add is the ability to remove the aggregate init completely.
What would be nice to have is to remove it partly as well, as for
struct { int i; int j; int k; } a = {};
a.i = 1;
a.k = 3;
we'd like to remove the whole-a zeroing but we need to keep zeroing
of a.j.
I believe that SRA already has most of the analysis part, what it is
lacking is that SRA works not flow-sensitive (it just gathers
function-wide data) and that it doesn't consider base objects that
have their address taken or are pointer-based.
>>
>> Rather than building ao_refs in clear_bytes_written_by just use
>> get_ref_base_and_extent
>> directly.
>
> Easy enough to do, but ISTM if we use get_ref_base_and_extent in
> clear_bytes_written-by, then the other blob of similar code in tree-ssa-dse
> should be handled in the same way. ie, the code you see in
> clear_bytes_written_by is almost a direct copy of code already existing in
> tree-ssa-dse.c (hence my feeling that there's some refactoring of that code
> that we want to do).
>
>
>
>>
>> You don't handle stuff like
>>
>> s[i] = { 1, 2 };
>> s[i].x = 1;
>> s[i].y = 1;
>>
>> either btw.
>
> Correct I believe.
>
> IIRC (I think I looked at this during debugging at some point), the
> ao_ref->max_size field will cover the entire array for this kind of
> situation because we don't know which element in the array we're hitting (or
> -1 if we don't know the array's size). I don't see a reasonable way to
> handle it with an ao_ref style interface unless the variable parts of the
> address computation are all rolled into the ao_ref->base field.
>
> I did look for cases where the initial store was to a varying location and
> thus max_size covered the entire array with killing stores that eventually
> covered the entire array (but with each individual killing store having size
> == max_size) -- the situation never came up in the codes I looked at (gcc &
> its runtime libraries of course).
I think the only way to handle this case is to "strip" a common base with
a varying address and replace it for the sake of get_ref_base_and_extent
(that is, tell get_ref_base_and_extent to stop at s[i] in this case). So you
split the piece-alias-test into a same-base comparison plus offset/size
ontop of that.
That said, your patch addresses a very specific case nothing else in
the compiler
handles right now, but it's also quite limited. So evaluating the compile-time
impact is important here as well as trying to cover more cases of "partial inits
after full inits" optimization.
I don't believe we need to rush this into GCC 6.
Richard.
> Jeff