This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Request for code review - (ZEE patch : Redundant Zero extension elimination)


Hi Richard,

     I finally got around to getting the data you wanted. Thanks for
the response. Please
find my comments below.


On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam<tmsriram@google.com> wrote:
>> Hi,
>>
>>    Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>
> The patch misses testcases.

Added.

Why does zee run after register allocation?
> Your examples suggest that it will free hard registers so doing it before
> regalloc looks odd.

Originally, I had written this patch to have ZEE run before IRA.
However, I noticed
that IRA generates poorer code when my patch is turned on.

Here is to give an example of how badly RA can hurt . I show a piece
of code around a
zero-extend that got eliminated. The code on the right is after
eliminating zero-extends.
The code is pretty much the same except the extra move highlighted in
yellow. IRA is not
able to coalesce %esi and %r15d.

Base line :

48b760: imul   $0x9e406cb5,%r15d,%esi
48b767: mov    %rax,%rcx
48b76a: shr    $0x12,%esi
48b76d: and    %r12d,%esi
48b770: mov    %edi,%eax
48b772: add    $0x1,%edi
48b775: shr    $0x5,%eax
48b778: mov    %eax,%eax    # redundant zero extend
48b77a: lea    (%rcx,%rax,1),%rax
48b77e: cmp    %rax,%r9


-fzee :

48b7d0: imul   $0x9e406cb5,%r15d,%r15d # The destination should have
just been esi.
48b7d7: mov    %rax,%rcx
48b7da: shr    $0x12,%r15d
48b7de: mov    %r15d,%esi   # This move is useless if r15d and esi can
be coalesced into esi.
48b7e1: and    %r12d,%esi
48b7e4: mov    %edi,%eax
48b7e6: add    $0x1,%edi
48b7e9: shr    $0x5,%eax
Ok, zero-extend eliminated.
48b7ec: lea    (%rcx,%rax,1),%rax
48b7f0: cmp    %rax,%r9

Going after IRA preserves code quality and the useless extension gets removed.

>
> What is the compile-time impact of your patch on say, gcc bootstrap?
> How many percent of instructions are removed as useless zero-extensions
> during gcc bootstrap?  How much do CSiBE numbers improve?

CSiBE numbers :

Total number of zero-extension instructions before : 667.
Total number of zero-extension instructions after   : 122.
Performance : no measurable impact.

GCC bootstrap :

Total number of zero-extension instructions before  : 1456
Total number of zero-extension instructions after    :  5814
No impact on boot-strap time.


I have attached the latest patch :


On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam<tmsriram@google.com> wrote:
>> Hi,
>>
>> ? ?Here is a patch to eliminate redundant zero-extension instructions
>> on x86_64.
>>
>> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified
>> that the results are the same with/without this patch.
>
> The patch misses testcases. ?Why does zee run after register allocation?
> Your examples suggest that it will free hard registers so doing it before
> regalloc looks odd.
>
> What is the compile-time impact of your patch on say, gcc bootstrap?
> How many percent of instructions are removed as useless zero-extensions
> during gcc bootstrap? ?How much do CSiBE numbers improve?
>
> Thanks,
> Richard.
>
>>
>> Problem Description :
>> ---------------------------------
>>
>> This pass is intended to be applicable only to targets that implicitly
>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>> For instance, x86_64 zero-extends the upper bits of a register
>> implicitly whenever an instruction writes to its lower 32-bit half.
>> For example, the instruction *add edi,eax* also zero-extends the upper
>> 32-bits of rax after doing the addition. ?These zero extensions come
>> for free and GCC does not always exploit this well. ?That is, it has
>> been observed that there are plenty of cases where GCC explicitly
>> zero-extends registers for x86_64 that are actually useless because
>> these registers were already implicitly zero-extended in a prior
>> instruction. ?This pass tries to eliminate such useless zero extension
>> instructions.
>>
>> Motivating Example I :
>> ----------------------------------
>> For this program :
>> **********************************************
>> bad_code.c
>>
>> int mask[1000];
>>
>> int foo(unsigned x)
>> {
>> ?if (x < 10)
>> ? ?x = x * 45;
>> ?else
>> ? ?x = x * 78;
>> ?return mask[x];
>> }
>> **********************************************
>>
>> $ gcc -O2 bad_code.c
>> ?........
>> ?400315: ? ? ? b8 4e 00 00 00 ? ? ? ? ? ?mov ? ?$0x4e,%eax
>> ?40031a: ? ? ? 0f af f8 ? ? ? ? ? ? ? ? ? ? ? ?imul ? %eax,%edi
>> ?40031d: ? ? ? 89 ff ? ? ? ? ? ? ? ? ? ? ? ? ? ? mov ? ?%edi,%edi
>> ---> Useless zero extend.
>> ?40031f: ? ? ? 8b 04 bd 60 19 40 00 ? ?mov ? ?0x401960(,%rdi,4),%eax
>> ?400326: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? retq
>> ?......
>> ?400330: ? ? ? ba 2d 00 00 00 ? ? ? ? ?mov ? ?$0x2d,%edx
>> ?400335: ? ? ? 0f af fa ? ? ? ? ? ? ? ? ? ? ?imul ? %edx,%edi
>> ?400338: ? ? ? 89 ff ? ? ? ? ? ? ? ? ? ? ? ? ? mov ? ?%edi,%edi ?--->
>> Useless zero extend.
>> ?40033a: ? ? ? 8b 04 bd 60 19 40 00 ? ?mov ? ?0x401960(,%rdi,4),%eax
>> ?400341: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ?retq
>>
>> $ gcc -O2 -fzee bad_code.c
>> ?......
>> ?400315: ? ? ? 6b ff 4e ? ? ? ? ? ? ? ?imul ? $0x4e,%edi,%edi
>> ?400318: ? ? ? 8b 04 bd 40 19 40 00 ? ?mov ? ?0x401940(,%rdi,4),%eax
>> ?40031f: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ?retq
>> ?400320: ? ? ? 6b ff 2d ? ? ? ? ? ? ? ?imul ? $0x2d,%edi,%edi
>> ?400323: ? ? ? 8b 04 bd 40 19 40 00 ? ?mov ? ?0x401940(,%rdi,4),%eax
>> ?40032a: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ?retq
>>
>>
>>
>> Thanks,
>>
>> Sriraman M Tallam.
>> Google, Inc.
>> tmsriram@google.com
>>
>

Attachment: zee20090923.txt
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]