[PATCH] Fold zero extensions into bit-wise ANDs
Roger Sayle
roger@eyesopen.com
Tue Apr 16 21:00:00 GMT 2002
> >As an excellent counter example, consider
> >
> >long long foo(unsigned char bar)
> >{
> > return bar & 0x24;
> >}
> >
> >with the current mainline CVS we'd generate:
> >
> >foo:    movzbl  4(%esp), %ecx
> >        xorl    %edx, %edx
> >        andb    $36, %cl
> >        movzbl  %cl, %eax
> >        ret
> >
> >with my patch applied it now generates
> >
> >foo:    movzbl  4(%esp), %eax
> >        andl    $36, %eax
> >        cltd
> >        ret
> >
> >which by my reckoning is shorter, faster and uses fewer registers.
>
> Hmm, the sequence I'd _expect_ to be optimal would seem to be
>
> foo:    movl    4(%esp),%eax
>         xorl    %edx,%edx
>         andl    $36, %eax
>         ret

The strange thing is that when the transformation performed by my
patch is applied by hand, i.e. rewriting the function above as

long long foo(unsigned char bar)
{
  return (long long)bar & 0x24;
}

then GCC (both mainline and with my patch) generates

foo:    movzbl  4(%esp), %eax
        xorl    %edx, %edx
        andl    $36, %eax
        ret

which uses the preferred "xorl %edx, %edx" instead of the "cltd".
I'll need to investigate further why I'm getting a slightly different
instruction sequence.  The "movzbl" should also be fixed: once the AND
has masked off the upper bits, a plain "movl" as in your sequence
would do.

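
As a sanity check on the transformation itself, here is a minimal sketch
in C that compares the original form with the hand-rewritten one over
every possible unsigned char input, and also models what "cltd" leaves
in %edx after the mask. The names foo_narrow, foo_wide, cltd_high_word
and count_mismatches are made up for illustration; none of them appear
in the patch or in GCC, and the cltd model is only an approximation of
the instruction's effect on the high word.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical name: the function as originally written.  The AND is
   done on the promoted int value, and the result is then widened. */
long long foo_narrow(unsigned char bar)
{
    return bar & 0x24;
}

/* Hypothetical name: the transformation applied by hand.  The value
   is widened first and the AND is done in the wide type. */
long long foo_wide(unsigned char bar)
{
    return (long long)bar & 0x24;
}

/* Hypothetical model of "cltd": the high word becomes all ones when
   the low word is negative, and zero otherwise. */
uint32_t cltd_high_word(int32_t eax)
{
    return eax < 0 ? UINT32_C(0xFFFFFFFF) : UINT32_C(0);
}

/* Counts the inputs for which the two forms disagree, or for which
   the modelled cltd would leave a non-zero high word. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int i = 0; i <= UCHAR_MAX; i++) {
        unsigned char bar = (unsigned char)i;
        if (foo_narrow(bar) != foo_wide(bar))
            mismatches++;
        if (cltd_high_word((int32_t)(bar & 0x24)) != 0)
            mismatches++;
    }
    return mismatches;
}
```

Since the mask 0x24 leaves the sign bit clear, the modelled "cltd" and
"xorl %edx, %edx" should produce the same high word here, so the choice
between them is a matter of code quality rather than correctness.
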
Ahh, the mysteries of gcc.
Roger
--
Roger Sayle, E-mail: roger@eyesopen.com
OpenEye Scientific Software, WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road, Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507. Fax: (+1) 505-473-0833