This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
Hi Richard, I finally got around to getting the data you wanted. Thanks for the response. Please find my comments below. On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther <richard.guenther@gmail.com> wrote: > On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam<tmsriram@google.com> wrote: >> Hi, >> >> Here is a patch to eliminate redundant zero-extension instructions >> on x86_64. >> >> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified >> that the results are the same with/without this patch. > > The patch misses testcases. Added. Why does zee run after register allocation? > Your examples suggest that it will free hard registers so doing it before > regalloc looks odd. Originally, I had written this patch to have ZEE run before IRA. However, I noticed that IRA generates poorer code when my patch is turned on. Here is to give an example of how badly RA can hurt . I show a piece of code around a zero-extend that got eliminated. The code on the right is after eliminating zero-extends. The code is pretty much the same except the extra move highlighted in yellow. IRA is not able to coalesce %esi and %r15d. Base line : 48b760: imul $0x9e406cb5,%r15d,%esi 48b767: mov %rax,%rcx 48b76a: shr $0x12,%esi 48b76d: and %r12d,%esi 48b770: mov %edi,%eax 48b772: add $0x1,%edi 48b775: shr $0x5,%eax 48b778: mov %eax,%eax # redundant zero extend 48b77a: lea (%rcx,%rax,1),%rax 48b77e: cmp %rax,%r9 -fzee : 48b7d0: imul $0x9e406cb5,%r15d,%r15d # The destination should have just been esi. 48b7d7: mov %rax,%rcx 48b7da: shr $0x12,%r15d 48b7de: mov %r15d,%esi # This move is useless if r15d and esi can be coalesced into esi. 48b7e1: and %r12d,%esi 48b7e4: mov %edi,%eax 48b7e6: add $0x1,%edi 48b7e9: shr $0x5,%eax Ok, zero-extend eliminated. 48b7ec: lea (%rcx,%rax,1),%rax 48b7f0: cmp %rax,%r9 Going after IRA preserves code quality and the useless extension gets removed. > > What is the compile-time impact of your patch on say, gcc bootstrap? > How many percent of instructions are removed as useless zero-extensions > during gcc bootstrap? How much do CSiBE numbers improve? CSiBE numbers : Total number of zero-extension instructions before : 667. Total number of zero-extension instructions after : 122. Performance : no measurable impact. GCC bootstrap : Total number of zero-extension instructions before : 1456 Total number of zero-extension instructions after : 5814 No impact on boot-strap time. I have attached the latest patch : On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther <richard.guenther@gmail.com> wrote: > On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam<tmsriram@google.com> wrote: >> Hi, >> >> ? ?Here is a patch to eliminate redundant zero-extension instructions >> on x86_64. >> >> Tested: Ran the gcc regresssion testsuite on x86_64-linux and verified >> that the results are the same with/without this patch. > > The patch misses testcases. ?Why does zee run after register allocation? > Your examples suggest that it will free hard registers so doing it before > regalloc looks odd. > > What is the compile-time impact of your patch on say, gcc bootstrap? > How many percent of instructions are removed as useless zero-extensions > during gcc bootstrap? ?How much do CSiBE numbers improve? > > Thanks, > Richard. > >> >> Problem Description : >> --------------------------------- >> >> This pass is intended to be applicable only to targets that implicitly >> zero-extend 64-bit registers after writing to their lower 32-bit half. >> For instance, x86_64 zero-extends the upper bits of a register >> implicitly whenever an instruction writes to its lower 32-bit half. >> For example, the instruction *add edi,eax* also zero-extends the upper >> 32-bits of rax after doing the addition. ?These zero extensions come >> for free and GCC does not always exploit this well. ?That is, it has >> been observed that there are plenty of cases where GCC explicitly >> zero-extends registers for x86_64 that are actually useless because >> these registers were already implicitly zero-extended in a prior >> instruction. ?This pass tries to eliminate such useless zero extension >> instructions. >> >> Motivating Example I : >> ---------------------------------- >> For this program : >> ********************************************** >> bad_code.c >> >> int mask[1000]; >> >> int foo(unsigned x) >> { >> ?if (x < 10) >> ? ?x = x * 45; >> ?else >> ? ?x = x * 78; >> ?return mask[x]; >> } >> ********************************************** >> >> $ gcc -O2 bad_code.c >> ?........ >> ?400315: ? ? ? b8 4e 00 00 00 ? ? ? ? ? ?mov ? ?$0x4e,%eax >> ?40031a: ? ? ? 0f af f8 ? ? ? ? ? ? ? ? ? ? ? ?imul ? %eax,%edi >> ?40031d: ? ? ? 89 ff ? ? ? ? ? ? ? ? ? ? ? ? ? ? mov ? ?%edi,%edi >> ---> Useless zero extend. >> ?40031f: ? ? ? 8b 04 bd 60 19 40 00 ? ?mov ? ?0x401960(,%rdi,4),%eax >> ?400326: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? retq >> ?...... >> ?400330: ? ? ? ba 2d 00 00 00 ? ? ? ? ?mov ? ?$0x2d,%edx >> ?400335: ? ? ? 0f af fa ? ? ? ? ? ? ? ? ? ? ?imul ? %edx,%edi >> ?400338: ? ? ? 89 ff ? ? ? ? ? ? ? ? ? ? ? ? ? mov ? ?%edi,%edi ?---> >> Useless zero extend. >> ?40033a: ? ? ? 8b 04 bd 60 19 40 00 ? ?mov ? ?0x401960(,%rdi,4),%eax >> ?400341: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ?retq >> >> $ gcc -O2 -fzee bad_code.c >> ?...... >> ?400315: ? ? ? 6b ff 4e ? ? ? ? ? ? ? ?imul ? $0x4e,%edi,%edi >> ?400318: ? ? ? 8b 04 bd 40 19 40 00 ? ?mov ? ?0x401940(,%rdi,4),%eax >> ?40031f: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ?retq >> ?400320: ? ? ? 6b ff 2d ? ? ? ? ? ? ? ?imul ? $0x2d,%edi,%edi >> ?400323: ? ? ? 8b 04 bd 40 19 40 00 ? ?mov ? ?0x401940(,%rdi,4),%eax >> ?40032a: ? ? ? c3 ? ? ? ? ? ? ? ? ? ? ?retq >> >> >> >> Thanks, >> >> Sriraman M Tallam. >> Google, Inc. >> tmsriram@google.com >> >
Attachment:
zee20090923.txt
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |