This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation
- From: Richard Earnshaw <Richard dot Earnshaw at foss dot arm dot com>
- To: Richard Earnshaw <Richard dot Earnshaw at foss dot arm dot com>, Richard Henderson <rth at redhat dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: David Edelsohn <dje dot gcc at gmail dot com>, Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>
- Date: Wed, 12 Aug 2015 09:43:29 +0100
- Subject: Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation
- References: <1439341904-9345-1-git-send-email-rth at redhat dot com> <55CB0487 dot 1020505 at foss dot arm dot com>
On 12/08/15 09:32, Richard Earnshaw wrote:
> On 12/08/15 02:11, Richard Henderson wrote:
>> Something last week had me looking at ppc64 code generation,
>> and some of what I saw was fairly bad. Fixing it wasn't going
>> to be easy, due to the fact that the logic for generating
>> constants wasn't contained within a single function.
>>
>> Better is the way that aarch64 and alpha have done it in the
>> past, sharing a single function with all of the logic that
>> can be used for both cost calculation and the actual emission
>> of the constants.
>>
>> However, the way that aarch64 and alpha have done it hasn't
>> been ideal, in that there's a fairly costly search that must
>> be done every time. I've thought before about changing this
>> so that we would be able to cache results, akin to how we do
>> it in expmed.c for multiplication.
>>
>> I've implemented such a caching scheme for three targets, as
>> a test of how much code could be shared. The answer appears
>> to be about 100 lines of boiler-plate. Minimal, true, but it
>> may still be worth it as a way of encouraging backends to do
>> similar things in a similar way.
>>
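The caching idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch series: `CACHE_SLOTS`, `struct const_recipe`, `slow_search` and `const_cost` are hypothetical names, and the "search" is a trivial stand-in that charges one instruction per non-zero 16-bit chunk.

```c
#include <stdint.h>

#define CACHE_SLOTS 64            /* hypothetical size; 2^6 to match the hash below */

struct const_recipe {
    uint64_t value;               /* constant this entry describes */
    int n_insns;                  /* cost found by the search */
    int valid;                    /* 0 until the slot is filled */
};

static struct const_recipe cache[CACHE_SLOTS];
static int n_searches;            /* how many times the slow path ran */

/* Stand-in for the expensive search: one instruction per non-zero
   16-bit chunk, minimum one.  A real backend would try many more
   strategies here. */
static int slow_search(uint64_t v)
{
    int n = 0;
    n_searches++;
    for (int i = 0; i < 4; i++)
        if ((v >> (16 * i)) & 0xffff)
            n++;
    return n ? n : 1;
}

/* Return the cost of V, running the search only on a cache miss.
   The cache is direct-mapped: a colliding constant evicts the old one. */
static int const_cost(uint64_t v)
{
    /* Multiplicative hash; the top 6 bits select one of 64 slots. */
    unsigned slot = (unsigned)((v * 0x9E3779B97F4A7C15ull) >> 58);
    struct const_recipe *e = &cache[slot];

    if (!e->valid || e->value != v) {
        e->value = v;
        e->n_insns = slow_search(v);
        e->valid = 1;
    }
    return e->n_insns;
}
```

Asking for the cost of the same constant twice runs `slow_search` once; the same cached entry could in principle serve both the cost query and the later emission, which is the sharing the message argues for.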
>
> I've got a short week this week, so won't have time to look at this in
> detail for a while. So a bunch of questions... but not necessarily
> objections :-)
>
> How do we clear the cache, and when? For example, on ARM, switching
> between ARM and Thumb state means we need to generate potentially
> radically different sequences. We can do such splitting at function
> boundaries now.
>
> Can we generate different sequences for hot/cold code within a single
> function?
>
> Can we cache sequences with the context (e.g. use with AND, OR, ADD, etc.)?
>
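One way to picture what these three questions ask for is a cache whose key carries more than the constant itself, plus an explicit flush. The sketch below is an assumption about how that could look, not a description of the posted patches: `enum isa_state`, `enum use_context`, `struct cache_key` and the `cache_*` functions are all hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum isa_state   { ISA_ARM, ISA_THUMB };              /* hypothetical */
enum use_context { USE_SET, USE_AND, USE_IOR, USE_ADD };

struct cache_key {
    uint64_t value;          /* the constant */
    enum isa_state isa;      /* instruction set in effect */
    enum use_context use;    /* operation consuming the constant */
    int optimize_size;       /* e.g. cold code: prefer fewer bytes */
};

struct cache_entry { struct cache_key key; int cost; int valid; };

#define N_ENTRIES 128
static struct cache_entry table[N_ENTRIES];

static unsigned hash_key(const struct cache_key *k)
{
    uint64_t h = k->value;
    h ^= (uint64_t)k->isa << 60;
    h ^= (uint64_t)k->use << 56;
    h ^= (uint64_t)(k->optimize_size != 0) << 55;
    h *= 0x9E3779B97F4A7C15ull;
    return (unsigned)(h >> 32) % N_ENTRIES;
}

static int key_equal(const struct cache_key *a, const struct cache_key *b)
{
    return a->value == b->value && a->isa == b->isa
        && a->use == b->use
        && (a->optimize_size != 0) == (b->optimize_size != 0);
}

/* Return a pointer to the cached cost, or NULL on a miss. */
static const int *cache_lookup(const struct cache_key *k)
{
    struct cache_entry *e = &table[hash_key(k)];
    return (e->valid && key_equal(&e->key, k)) ? &e->cost : NULL;
}

/* Direct-mapped store: overwrites whatever occupied the slot. */
static void cache_store(const struct cache_key *k, int cost)
{
    struct cache_entry *e = &table[hash_key(k)];
    e->key = *k;
    e->cost = cost;
    e->valid = 1;
}

/* Drop everything, e.g. if target options change in a way the key
   does not capture. */
static void cache_flush(void)
{
    memset(table, 0, sizeof table);
}
```

With the ISA state, the hot/cold preference and the consuming operation in the key, an entry stored for ARM state is never returned for Thumb state, so a flush is only needed for changes the key does not describe. The cost is more entries per constant and a lower hit rate.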
>
>> Some notes about ppc64 in particular:
>>
>> * Constants aren't split until quite late, preventing all hope of
>> CSE'ing portions of the generated code. My gut feeling is that
>> this is in general a mistake, but...
>>
>> I did attempt to fix it, and got nothing for my troubles except
>> poorer code generation for AND/IOR/XOR with non-trivial constants.
>>
> On AArch64 in particular, building complex constants is generally
> destructive on the source register (if you want to preserve intermediate
> values you have to make intermediate copies); that's clearly never going
> to be a win if you don't need at least 3 instructions to form the
> constant.
>
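The "destructive" point can be seen in a small model of the MOVZ/MOVK idiom, where MOVZ writes one 16-bit field and zeroes the rest, and each MOVK rewrites one field of the same register in place. This is a simplified sketch: `movz`, `movk` and `build_const` are hypothetical helper names, and it ignores MOVN, bitmask immediates and the other strategies a real backend would consider.

```c
#include <stdint.h>

/* MOVZ Xd, #imm16, LSL #shift: write imm16 at SHIFT, zero everything else. */
static uint64_t movz(uint16_t imm, unsigned shift)
{
    return (uint64_t)imm << shift;
}

/* MOVK Xd, #imm16, LSL #shift: replace one 16-bit field, keep the rest.
   The old register value is an input and is overwritten by the result. */
static uint64_t movk(uint64_t reg, uint16_t imm, unsigned shift)
{
    uint64_t field = (uint64_t)0xffff << shift;
    return (reg & ~field) | ((uint64_t)imm << shift);
}

/* Build TARGET in *REG with one MOVZ followed by MOVKs, skipping zero
   chunks.  Returns the number of instructions the sequence would need. */
static int build_const(uint64_t target, uint64_t *reg)
{
    int n_insns = 0;

    for (unsigned shift = 0; shift < 64; shift += 16) {
        uint16_t chunk = (uint16_t)(target >> shift);
        if (chunk == 0)
            continue;
        if (n_insns == 0)
            *reg = movz(chunk, shift);          /* first insn defines the register */
        else
            *reg = movk(*reg, chunk, shift);    /* later insns modify it in place */
        n_insns++;
    }
    if (n_insns == 0) {                         /* target == 0: a single MOVZ #0 */
        *reg = movz(0, 0);
        n_insns = 1;
    }
    return n_insns;
}
```

Because every MOVK overwrites its own input, keeping a partially built value around for reuse by another constant needs an extra register copy, which is why sharing only starts to look plausible for sequences of about three instructions or more.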
> There might be some cases where you could form a second constant as a
> difference from an earlier one, but that then creates data-flow
> dependencies and on OoO machines that might not be worthwhile. Even
> for in-order machines it can restrict scheduling and result in worse code.
>
>
>> I'm somewhat surprised that the operands to the logicals aren't
>> visible at rtl generation time, given all the work done in gimple.
>> And failing that, combine has enough REG_EQUAL notes that it ought
>> to be able to put things back together and see the simpler pattern.
>>
>
> We've tried it in the past. Exposing the individual steps prevents the
> higher-level rtl-based optimizations since they can no longer deal with
> the complete sub-expression.
E.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63724
R.