This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][combine][RFC][2/2] PR rtl-optimization/68796: Perfer zero_extract comparison against zero rather than unsupported shorter modes
- From: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>
- To: Bernd Schmidt <bschmidt at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: Segher Boessenkool <segher at kernel dot crashing dot org>
- Date: Thu, 17 Dec 2015 16:10:28 +0000
- Subject: Re: [PATCH][combine][RFC][2/2] PR rtl-optimization/68796: Perfer zero_extract comparison against zero rather than unsupported shorter modes
- References: <5672D68F dot 3030408 at foss dot arm dot com> <5672DB97 dot 7090800 at redhat dot com>
On 17/12/15 15:58, Bernd Schmidt wrote:
> On 12/17/2015 04:36 PM, Kyrill Tkachov wrote:
>> The documentation on RTL canonical forms in md.texi says:
>> "Equality comparisons of a group of bits (usually a single bit) with zero
>> will be written using @code{zero_extract} rather than the equivalent
>> @code{and} or @code{sign_extract} operations."
>>
>> However, this is not always followed in combine. If it's trying to optimise
>> a comparison against zero of a bitmask that is the mode mask of some mode
>> (255 for QImode and 65535 for HImode in the testcases of this patch)
>> it will instead create a subreg to that shorter mode.
>
> I suspect that this is an oversight in the documentation, and if given two
> choices the simpler form is intended to be the canonical one.
>
>> it ends up trying to make a QImode comparison against zero, for which
>> targets like aarch64 have no pattern.
>
> So, can you define a pattern for it...
>
>> To get the benefit on aarch64 this needs patch 1/2 that adds an aarch64
>> pattern for comparing a zero_extract with zero.
>
> ... instead of this one?

Yes, I had investigated that approach and it has the same effect (on aarch64).
My motivation for this approach was to try to avoid defining multiple patterns for what should
be equivalent expressions. But if the short subreg form is intended to be the canonical form...

>> What do people think of this approach?
>> I hope this just enforces the already documented canonicalisation rules
>> with minimal (none?) negative fallout.
>
> I'm not so sure about this. Other ports have QImode comparisons and I would
> want to see some evidence that there are no code quality regressions. This is
> not stage 3 material in any case.

Well, this patch still produces the QImode comparison if the target has a QImode comparison
(the have_insn_for check in the simplify_comparison hunk).
As I said, the effects on arm and aarch64 were strictly beneficial.
On x86_64 I saw no codegen difference on SPEC2006.
If this is considered too risky at this stage I can propose a QImode pattern for
aarch64 instead to isolate this fix to that backend.
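
For illustration only, here is a minimal C sketch of the equivalence under discussion. It is not the testcase from the patch, and the function names are made up for this example. It shows the same "low 8 bits are zero" test written as an AND with the QImode mode mask, as a truncation to 8 bits (roughly what the subreg form expresses), and as a bit-field extract (roughly what zero_extract expresses).

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names come
   from the patch or from GCC itself.  */

/* Source-level form: AND with the QImode mode mask (255), compare with 0.  */
int low_byte_is_zero_and(uint32_t x)
{
    return (x & 0xff) == 0;
}

/* Roughly what a QImode subreg comparison expresses:
   truncate to 8 bits, then compare with 0.  */
int low_byte_is_zero_trunc(uint32_t x)
{
    return (uint8_t)x == 0;
}

/* Roughly what zero_extract expresses: take LEN bits starting at bit POS,
   zero-extended.  Assumes POS < 32 and LEN >= 1.  */
uint32_t extract_bits(uint32_t x, unsigned pos, unsigned len)
{
    uint32_t mask = (len >= 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
    return (x >> pos) & mask;
}

/* zero_extract-style form: 8 bits at position 0, compare with 0.  */
int low_byte_is_zero_extract(uint32_t x)
{
    return extract_bits(x, 0, 8) == 0;
}
```

All three functions return the same result for every input; the HImode case is the same with a width of 16 (mask 65535). Compiling something like this with -O2 -S is one way to inspect which form a given target ends up using.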
Thanks,
Kyrill
> Bernd