[PATCH][combine] PR rtl-optimization/68651 Try changing rtx from (r + r) to (r << 1) to aid recognition

Wed Dec 16 17:00:00 GMT 2015

On 16/12/15 12:18, Bernd Schmidt wrote:
> On 12/15/2015 05:21 PM, Kyrill Tkachov wrote:
>> Then for the shift pattern in the MD file we'd have to dynamically
>> select the scheduling type depending on whether or not the shift
>> amount is 1 and the costs line up?
>
> Yes. This isn't unusual, take a look at i386.md where you have a lot of switches on attr type to decide which string to print.
>

I'm just worried that if we take this idea to its logical conclusion, we have to add a new canonicalisation rule:
"all (plus x x) expressions shall be expressed as (ashift x 1)".
Such a rule seems too specific to me and all targets would have to special-case it in their MD patterns and costs
if they ever wanted to treat an add and a shift differently.
In this particular case we'd have
to conditionalise the scheduling string selection on a particular CPU tuning and the shift amount, which will make
the pattern much harder to read.
To implement this properly we'd also have to

>> The price we pay when trying these substitutions is an iteration over
>>  the rtx with FOR_EACH_SUBRTX_PTR. recog gets called only if that
>> iteration actually performed a substitution of x + x into x << 1. Is
>> that too high a price to pay? (I'm not familiar with the performance
>>  characteristics of the FOR_EACH_SUBRTX machinery)
>
> It depends on how many of these transforms we are going to try; it also feels very hackish, trying to work around the core design of the combiner. IMO it would be better for machine descriptions to work with the pass rather than against it.
>

Perhaps I'm lacking the historical context, but what is the core design of the combiner?
Why should the backend have to jump through these hoops if it already communicates to the midend
(through correct rtx costs) that a shift is more expensive than a plus?
I'd be more inclined to agree that this is perhaps a limitation in recog rather than combine,
but still not a backend problem.

> If you can somehow arrange for the (plus x x) to be turned into a shift while substituting that might be yet another approach to try.
>

I did investigate where else we could make this transformation.
For the zero_extend+shift case (the ubfiz instruction from the testcase in my original submission)
we could fix this by modifying make_extraction to convert its argument to a shift from (plus x x)
as, in that context, shifts are undoubtedly more likely to simplify with the various extraction
operations that it's trying to perform.

That leaves the other case (orr + shift), where converting to a shift isn't a
simplification in any way, but the backend happens to have an instruction that matches the
combined orr+shift form. There we want to perform the transformation purely to aid
recognition, not out of any simplification considerations. That's what I'm trying to figure out
how to do now.

However, we want to avoid doing it unconditionally because if we have just a simple set
of y = x + x we want to leave it as a plus rather than a shift because it's cheaper
on that target.

Thanks,
Kyrill
>
> Bernd