This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [RFC] Tweak reload to const propagate into matching constraint output
- From: Richard Henderson <rth at redhat dot com>
- To: Richard Guenther <richard dot guenther at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Vladimir Makarov <vmakarov at redhat dot com>, bernds at codesourcery dot com, uweigand at de dot ibm dot com
- Date: Wed, 27 Jun 2012 08:36:41 -0700
- References: <4FEA77E0.4020501@redhat.com> <CAFiYyc0h2wUJeoRrj1j25TJNUd7Pdr2ZP--A-wMTaFDuhS=Bsg@mail.gmail.com>
On 06/27/2012 01:45 AM, Richard Guenther wrote:
>> > Of course, this fires for normal integer code as well.
>> > Some cases it's a clear win:
>> >
>> > -: 41 be 1f 00 00 00 mov $0x1f,%r14d
>> > ...
>> > -: 4c 89 f1 mov %r14,%rcx
>> > +: b9 1f 00 00 00 mov $0x1f,%ecx
>> >
>> > sometimes not (increased code size):
>> >
>> > -: 41 bd 01 00 00 00 mov $0x1,%r13d
>> > -: 4d 89 ec mov %r13,%r12
>> > +: 41 bc 01 00 00 00 mov $0x1,%r12d
>> > +: 41 bd 01 00 00 00 mov $0x1,%r13d
> I suppose that might be fixed if instead of
>
> + /* Only use the constant when it's just as cheap as a reg move. */
> + if (set_src_cost (c, optimize_function_for_speed_p (cfun)) == 0)
> + return c;
>
> you'd unconditionally use size costs?
>
For one, without x86 cost changes that wouldn't affect anything.
For another, unconditionally using size costs, locally, would just
move the missed optimization from the second case to the first.
> We have an inverse issue elsewhere in that we don't CSE a propagated constant
> but get
>
> mov $0, (%eax)
> mov $0, 4(%eax)
> ...
>
> instead of doing one register clearing and then re-using that as zero. But I
> suppose reload is not exactly the place to fix that ;)
That would be exactly because x86 doesn't model immediate costs properly.
My patch is trying to un-CSE in exactly the spot where the value is about
to be clobbered. While we could have a go at this in a pre-reload pass,
it would be just a guess until register allocation does or does not
assign a hard reg to the constant, and does or does not choose an
alternative that requires the constant to match an output.
Having reviewed more of the cc1 asm diff, the vast majority of cases are:
* the cx input to string insns,
* (1 << n).
These results are certainly skewed by the kind of stuff we do in gcc, but
it makes a fair amount of sense.
r~