This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: tune ARM's rtx_costs function


Richard Earnshaw <rearnsha@arm.com> writes:

Ben is on his honeymoon and may not reply immediately.

> > 	* config/arm/arm.md (addsi3): Remove check of
> > 	preserve_subexpressions_p() so that a new pseudo is used.  This
> > 	allows better pattern matching, improving code at -O.
> > 
> 
> I need evidence that this is the right thing to do for -O1.  -O1 is a 
> compromise between generated code and compilation time, the test is a 
> heuristic to determine when generating additional pseudos might be 
> worthwhile.  To make a case for this change you need to show either that:
> 
> 1) Compilation times do not increase noticeably for -O1 (or at all for 
> -O0) and
> 2) The resulting performance of executables warrants that increase in 
> compilation time.
> 
> Or:
> 
> 3) Compilation time unconditionally decreases by not doing the check at 
> both -O0 and -O1.
> 
> Either way these need to be for realistic functions, not trivial test 
> cases.

I think the patch to remove preserve_subexpressions_p() was originally
my idea.

It's a bit of a philosophical thing, perhaps, but I don't think that
preserve_subexpressions_p() is the right check here.  The point of
preserve_subexpressions_p() is to make expensive constants available
for CSE when using -fexpensive-optimizations.  In this case the ARM
backend has to put the constant in a pseudo regardless.  That
parameter to arm_split_constant() controls whether it uses a new
pseudo or reuses the existing one.  It seems to me
that there is no real advantage to forcing the use of the same pseudo
if you have to use some pseudo anyhow.  Using a new pseudo is a
little more expensive, but not so much that I think
-fexpensive-optimizations should be required.
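
To make the new-pseudo versus same-pseudo distinction concrete, here is a
toy model in C.  It is only a sketch, not the code in arm.c: the names
take_chunk(), split_add() and next_pseudo are hypothetical, and it assumes
the usual ARM rule that a data-processing immediate is an 8-bit value
rotated by an even amount (the toy ignores rotations that wrap around bit
31, so it does not always find the shortest sequence).  The point is that
both modes emit the same number of add instructions; the flag only decides
whether intermediate results get fresh pseudos or overwrite the target.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model only; NOT GCC's arm_split_constant().  All names here are
   hypothetical and exist purely for illustration.  */

static int next_pseudo = 100;   /* next unused pseudo register number */

/* Remove and return the lowest chunk of *remaining that fits an 8-bit
   field starting at an even bit position.  */
static uint32_t take_chunk(uint32_t *remaining)
{
    uint32_t v = *remaining;
    int shift = 0;

    assert(v != 0);             /* caller must not pass zero */
    while (((v >> shift) & 3u) == 0)
        shift += 2;             /* skip zero bit pairs; stops by bit 30 */

    uint32_t chunk = v & (0xFFu << shift);
    *remaining = v & ~chunk;
    return chunk;
}

/* Emit "target = source + value" as a chain of adds and return the
   instruction count.  If subtargets is nonzero, every intermediate
   result goes into a fresh pseudo; otherwise every step writes the
   target itself.  The final step always writes the target.  */
static int split_add(int target, int source, uint32_t value, int subtargets)
{
    int insns = 0;
    int src = source;

    while (value != 0) {
        uint32_t chunk = take_chunk(&value);
        int is_last = (value == 0);
        int dst = (subtargets && !is_last) ? next_pseudo++ : target;

        printf("  add r%d, r%d, #0x%x\n", dst, src, (unsigned) chunk);
        src = dst;
        insns++;
    }
    return insns;
}
```

Calling split_add(0, 1, 0x12345, 0) and then split_add(0, 1, 0x12345, 1)
prints two chains of equal length.  Only the second one allocates new
pseudos for the intermediate sums, which is the extra cost under
discussion.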

The resulting code is clearly better, since it doesn't require
additional instructions to shuffle registers around after constants
wind up in the wrong place to satisfy the ARM reload restrictions.  I
guess it's simply a question of whether creating the additional
pseudos here really requires -fexpensive-optimizations.

I admit that I haven't done any timings of compilations.

Ian

