This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: PR target/50603: [x32] Unnecessary lea


On Thu, Oct 6, 2011 at 11:33 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>>>>> OTOH, x86_64 and i686 targets can also benefit from this change. If
>>>>>>> combine can't create a more complex address (covered by lea), then it
>>>>>>> will simply propagate the memory operand back into the add insn. It
>>>>>>> looks to me that we can't lose here, so:
>>>>>>>
>>>>>>>   /* Improve address combine.  */
>>>>>>>   if (code == PLUS && MEM_P (src2))
>>>>>>>     src2 = force_reg (mode, src2);
>>>>>>>
>>>>>>> Any opinions?
>>>>>>>
>>>>>>
>>>>>> It doesn't work with 64bit libstdc++:
>>>>>
>>>>> Yeah, yeah. ix86_output_mi_thunk has some ... issues.
>>>>>
>>>>> Please try the attached patch that introduces ix86_emit_binop and uses
>>>>> it in a bunch of places.
>>>
>>>> I tried it on GCC.  There are no regressions.  The bugs are fixed for x32.
>>>> Here is the size comparison for the GCC runtime libraries on ia32, x32
>>>> and x86-64:
>>>
>>>>    text    data     bss     dec     hex filename
>>>>  884093   18600   27064  929757   e2fdd old libstdc++.so
>>>>  884189   18600   27064  929853   e303d new libs/libstdc++.so
>>>>
>>>> The new code is
>>>>
>>>> mov    0xc(%edi),%eax
>>>> mov    %eax,0x8(%esi)
>>>> mov    -0xc(%eax),%eax
>>>> mov    0x10(%edi),%edx
>>>> lea    0x8(%esi,%eax,1),%eax
>>>>
>>>> The old one is
>>>>
>>>> mov    0xc(%edi),%edx
>>>> lea    0x8(%esi),%eax
>>>> mov    %edx,0x8(%esi)
>>>> add    -0xc(%edx),%eax
>>>> mov    0x10(%edi),%edx
>>>
>>> The new code merged lea+add into one lea, so it looks quite OK to me.
>>>
>>> Do you have some performance numbers?
>>>
>>
>> I will report performance numbers in a few days.
>
> The differences in SPEC CPU 2006 on ia32, x86-64 and
> x32 are within noise range.

Great.
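
To make the lea+add merge in the quoted disassembly concrete, here is a
rough C model of the two instruction sequences. It is only a sketch under
stated assumptions: toy_mem, load32, store32, struct regs and
sequences_agree are hypothetical helpers invented for this illustration,
the memory image is arbitrary, and nothing here comes from the patch
itself. Each statement mirrors one instruction from the listings.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy byte-addressed memory standing in for the process address space
   (an assumption for illustration; addresses must stay below 256).  */
static uint8_t toy_mem[256];

static uint32_t
load32 (uint32_t addr)
{
  uint32_t v;
  assert (addr <= sizeof toy_mem - 4);   /* stay inside the toy memory */
  memcpy (&v, &toy_mem[addr], sizeof v);
  return v;
}

static void
store32 (uint32_t addr, uint32_t v)
{
  assert (addr <= sizeof toy_mem - 4);   /* stay inside the toy memory */
  memcpy (&toy_mem[addr], &v, sizeof v);
}

struct regs { uint32_t eax, edx; };

/* Old sequence: separate lea and add with a memory operand.  */
static struct regs
old_sequence (uint32_t edi, uint32_t esi)
{
  struct regs r;
  r.edx = load32 (edi + 0xc);      /* mov 0xc(%edi),%edx   */
  r.eax = esi + 0x8;               /* lea 0x8(%esi),%eax   */
  store32 (esi + 0x8, r.edx);      /* mov %edx,0x8(%esi)   */
  r.eax += load32 (r.edx - 0xc);   /* add -0xc(%edx),%eax  */
  r.edx = load32 (edi + 0x10);     /* mov 0x10(%edi),%edx  */
  return r;
}

/* New sequence: the loaded value feeds a single three-operand lea.  */
static struct regs
new_sequence (uint32_t edi, uint32_t esi)
{
  struct regs r;
  r.eax = load32 (edi + 0xc);      /* mov 0xc(%edi),%eax        */
  store32 (esi + 0x8, r.eax);      /* mov %eax,0x8(%esi)        */
  r.eax = load32 (r.eax - 0xc);    /* mov -0xc(%eax),%eax       */
  r.edx = load32 (edi + 0x10);     /* mov 0x10(%edi),%edx       */
  r.eax = 0x8 + esi + r.eax * 1;   /* lea 0x8(%esi,%eax,1),%eax */
  return r;
}

/* Build one arbitrary memory image and check that both sequences agree.
   Returns 1 when eax and edx match, 0 otherwise.  */
int
sequences_agree (void)
{
  const uint32_t edi = 16, esi = 64;
  memset (toy_mem, 0, sizeof toy_mem);
  store32 (edi + 0xc, 100);        /* pointer loaded from 0xc(%edi)    */
  store32 (100 - 0xc, 40);         /* value stored 12 bytes before it  */
  store32 (edi + 0x10, 7);         /* value later loaded into %edx     */

  struct regs a = old_sequence (edi, esi);
  struct regs b = new_sequence (edi, esi);
  return a.eax == b.eax && a.edx == b.edx && a.eax == esi + 0x8 + 40;
}
```

Both sequences perform the same loads and the same store in the same
order; the new one replaces the lea/add pair with a plain load followed
by one lea that does the whole address computation.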

Attached is a slightly updated patch, where we consider only
integer-mode PLUS RTXes.
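
As a rough illustration of the rule described above (not the text of
p.diff.txt), the decision can be modelled outside GCC as a small
predicate. The names toy_code, toy_mode_class, toy_operand and
should_force_src2_to_reg are hypothetical stand-ins, not GCC's rtx codes,
machine mode classes, or MEM_P; the sketch assumes the updated check is
the earlier "PLUS with a memory src2" condition plus an integer-mode
restriction.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not GCC's real types.  */
enum toy_code { TOY_PLUS, TOY_MINUS, TOY_AND };
enum toy_mode_class { TOY_MODE_INT, TOY_MODE_FLOAT, TOY_MODE_VECTOR_INT };

struct toy_operand
{
  bool is_mem;   /* true when the operand is a memory reference */
};

/* Return true when the second source operand should first be loaded into
   a register: only for an integer-mode PLUS whose src2 is in memory.  */
bool
should_force_src2_to_reg (enum toy_code code, enum toy_mode_class cls,
                          const struct toy_operand *src2)
{
  return code == TOY_PLUS && cls == TOY_MODE_INT && src2->is_mem;
}
```

Under this model a floating-point or vector PLUS, or any non-PLUS
operation, keeps its memory operand untouched, so only the integer
additions that can later be folded into an address are affected.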

2011-10-07  Uros Bizjak  <ubizjak@gmail.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

	PR target/50603
	* config/i386/i386.c (ix86_fixup_binary_operands): Force src2 of
	integer PLUS RTX to a register to improve address combine.

testsuite/ChangeLog:

2011-10-07  Uros Bizjak  <ubizjak@gmail.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

	PR target/50603
	* gcc.target/i386/pr50603.c: New test.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Attachment: p.diff.txt
Description: Text document

