This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation


On 12/08/15 09:32, Richard Earnshaw wrote:
> On 12/08/15 02:11, Richard Henderson wrote:
>> Something last week had me looking at ppc64 code generation,
>> and some of what I saw was fairly bad.  Fixing it wasn't going
>> to be easy, due to the fact that the logic for generating
>> constants wasn't contained within a single function.
>>
>> Better is the way that aarch64 and alpha have done it in the
>> past, sharing a single function with all of the logic that
>> can be used for both cost calculation and the actual emission
>> of the constants.
>>
>> However, the way that aarch64 and alpha have done it hasn't
>> been ideal, in that there's a fairly costly search that must
>> be done every time.  I've thought before about changing this
>> so that we would be able to cache results, akin to how we do
>> it in expmed.c for multiplication.
>>
>> I've implemented such a caching scheme for three targets, as
>> a test of how much code could be shared.  The answer appears
>> to be about 100 lines of boiler-plate.  Minimal, true, but it
>> may still be worth it as a way of encouraging backends to do
>> similar things in a similar way.
>>
> 
> I've got a short week this week, so won't have time to look at this in
> detail for a while.  So a bunch of questions... but not necessarily
> objections :-)
> 
> How do we clear the cache, and when?  For example, on ARM, switching
> between ARM and Thumb state means we need to generate potentially
> radically different sequences.  We can do such splitting at function
> boundaries now.
> 
> Can we generate different sequences for hot/cold code within a single
> function?
> 
> Can we cache sequences with the context (e.g. use with AND, OR, ADD, etc)?
> 
> 
>> Some notes about ppc64 in particular:
>>
>>   * Constants aren't split until quite late, preventing all hope of
>>     CSE'ing portions of the generated code.  My gut feeling is that
>>     this is in general a mistake, but...
>>
>>     I did attempt to fix it, and got nothing for my troubles except
>>     poorer code generation for AND/IOR/XOR with non-trivial constants.
>>
> On AArch64 in particular, building complex constants is generally
> destructive on the source register (if you want to preserve intermediate
> values you have to make intermediate copies); that's clearly never going
> to be a win if you don't need at least 3 instructions to form the
> constant.
> 
> There might be some cases where you could form a second constant as a
> difference from an earlier one, but that then creates data-flow
> dependencies and in OoO machines that might not be worthwhile.  Even
> for in-order machines it can restrict scheduling and result in worse code.
> 
> 
>>     I'm somewhat surprised that the operands to the logicals aren't
>>     visible at rtl generation time, given all the work done in gimple.
>>     And failing that, combine has enough REG_EQUAL notes that it ought
>>     to be able to put things back together and see the simpler pattern.
>>
> 
> We've tried it in the past.  Exposing the individual steps prevents the
> higher-level rtl-based optimizations since they can no longer deal with
> the complete sub-expression.

E.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63724

R.
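
For readers following the caching discussion quoted above, here is a rough
sketch of the idea. It is not code from the patch series. It assumes a
hypothetical cost function (movz_movk_cost) that only models AArch64-style
MOVZ/MOVN + MOVK sequences and ignores logical immediates, and a
hypothetical "state" key standing in for whatever target state (such as
ARM vs Thumb) would have to be part of the lookup so that a cached answer
is never reused in the wrong mode. All names and the table size are
illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cost model: number of instructions needed to build V from
   16-bit chunks, starting from all-zeros (MOVZ) or all-ones (MOVN) and
   patching each remaining chunk with MOVK.  Logical immediates and other
   shortcuts are deliberately ignored.  */
static int movz_movk_cost(uint64_t v)
{
    int nonzero = 0, nonones = 0;
    for (int shift = 0; shift < 64; shift += 16) {
        uint16_t chunk = (uint16_t)(v >> shift);
        if (chunk != 0x0000)
            nonzero++;          /* chunk needs an insn on the MOVZ path */
        if (chunk != 0xffff)
            nonones++;          /* chunk needs an insn on the MOVN path */
    }
    int n = nonzero < nonones ? nonzero : nonones;
    return n == 0 ? 1 : n;      /* 0 and ~0 still take one instruction */
}

/* Small direct-mapped cache.  The key is the pair (value, state), so a
   result computed under one target state is never returned for another.  */
#define CACHE_SIZE 64

struct cost_entry {
    uint64_t value;
    unsigned state;
    int cost;
    int valid;
};

static struct cost_entry cost_cache[CACHE_SIZE];
static unsigned long cache_misses;

static int cached_cost(uint64_t v, unsigned state)
{
    unsigned idx = (unsigned)((v ^ (v >> 32) ^ state) % CACHE_SIZE);
    struct cost_entry *e = &cost_cache[idx];

    if (e->valid && e->value == v && e->state == state)
        return e->cost;         /* hit: skip the search entirely */

    cache_misses++;
    e->value = v;
    e->state = state;
    e->cost = movz_movk_cost(v);
    e->valid = 1;
    assert(e->cost >= 1 && e->cost <= 4);
    return e->cost;
}
```

Keying on the state is only one possible answer to the "how do we clear
the cache" question. The alternative is to flush the table whenever the
state changes, which keeps entries smaller but discards results that
would still be valid after switching back.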

