This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][combine][1/2] Try to simplify before substituting
- From: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- To: Segher Boessenkool <segher at kernel dot crashing dot org>, Jeff Law <law at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Andrew Pinski <apinski at cavium dot com>
- Date: Mon, 20 Jul 2015 13:53:29 +0100
- Subject: Re: [PATCH][combine][1/2] Try to simplify before substituting
- References: <55A7CCDA dot 8050203 at arm dot com> <20150716181306 dot GA8497 at gate dot crashing dot org> <55A7F552 dot 9030003 at arm dot com> <20150716182801 dot GB8497 at gate dot crashing dot org> <55A8E8B2 dot 9070002 at arm dot com> <55A969E6 dot 3080305 at redhat dot com> <20150718160255 dot GA7532 at gate dot crashing dot org>
On 18/07/15 17:02, Segher Boessenkool wrote:
On Fri, Jul 17, 2015 at 02:47:34PM -0600, Jeff Law wrote:
I mean move the whole "if (BINARY_P ..." block to after the existing
simplify calls, to just before the "First see if we can apply" comment,
and not do a new simplify_rtx call at all. Does that work?
Yes, and here's the patch.
It just moves the simplification block.
The effect on codegen in SPEC2006 on aarch64 looks sane in the same
way as with the original patch I posted (i.e. many redundant zero_extends
are eliminated), and together with patch 2/2 this helps in the -abs testcase.
I'm bootstrapping this on aarch64, arm and x86.
Any other testing would be appreciated.
Is this version ok if testing comes clean?
Thanks,
Kyrill
2015-07-17  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	* combine.c (combine_simplify_rtx): Move simplification step
	before various transformations/substitutions.
OK.
jeff
The patch improves generated code on most archs (or at least code size,
which strongly correlates for combine), or is neutral. xtensa regresses
a tiny bit; powerpc64 and hppa64 regress more. I analysed the powerpc64
differences, and it seems to be all down to code that is now expressed as
    (set (reg:DI) (lt:DI (reg:SI) (const_int 0)))
where before it was a bit extract (of a subreg). The newly generated
pattern is simpler alright, but the backend didn't recognise it. With a
simple patch, it does, and the generated code is nicely better than
before.
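
For readers less familiar with RTL, the two forms discussed above can be
sketched at the C level. This is only an illustration and not code from the
patch or from any backend: the function names are made up, and it assumes a
true comparison yields 1 (as on targets where STORE_FLAG_VALUE is 1). Both
functions compute "is the 32-bit value negative" as a 64-bit result, once as
a comparison and once as an extract of bit 31.

```c
#include <assert.h>
#include <stdint.h>

/* Comparison form, roughly what
   (set (reg:DI) (lt:DI (reg:SI) (const_int 0)))
   expresses: 1 if the 32-bit input is negative, else 0,
   produced directly as a 64-bit value.  */
static uint64_t sign_via_compare(int32_t x)
{
    return (uint64_t)(x < 0);
}

/* Bit-extract form: reinterpret the input as unsigned, take
   bit 31 (the sign bit) and widen the single bit to 64 bits.  */
static uint64_t sign_via_extract(int32_t x)
{
    return (uint64_t)(((uint32_t)x >> 31) & 1u);
}

/* Hypothetical helper: asserts that both forms agree for x.  */
static void check_sign_forms(int32_t x)
{
    assert(sign_via_compare(x) == sign_via_extract(x));
}
```

Calling check_sign_forms on values such as -1, 0, INT32_MIN and INT32_MAX
should pass, which is why a backend can match either form with the same
instruction sequence once it has a pattern for it.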
Thanks for analyzing. So will you submit a powerpc patch
for this? I'm not familiar with the patterns there :)
The hppa64 problem looks similar. Maybe other targets could use such
an improvement as well.
So yes, the patch is fine. Thank you for working on it, Kyrill :-)
x86_64, aarch64 and arm bootstrap passed successfully on my end
and testing looked clean too.
I've committed this patch with r225996 and 2/2 with r225997.
Thanks for helping me work through this!
Kyrill
Segher