[PATCH] Fold zero extensions into bit-wise ANDs

Roger Sayle roger@eyesopen.com
Tue Apr 16 21:00:00 GMT 2002


> >As an excellent counter example, consider
> >
> >long long foo(unsigned char bar)
> >{
> >  return bar & 0x24;
> >}
> >
> >with the current mainline CVS we'd generate:
> >
> >foo:	movzbl	4(%esp), %ecx
> >	xorl	%edx, %edx
> >	andb	$36, %cl
> >	movzbl	%cl, %eax
> >	ret
> >
> >with my patch applied it now generates
> >
> >foo:	movzbl	4(%esp), %eax
> >	andl	$36, %eax
> >	cltd
> >	ret
> >
> >which by my reckoning is shorter, faster and uses less registers.
>
> Hmm, the sequence I'd _expect_ to be optimal would seem to be
>
> foo:	movl	4(%esp),%eax
> 	xorl	%edx,%edx
> 	andl	$36, %eax
> 	ret

The strange thing is that when the transformation performed by my
patch is applied by hand, i.e. rewriting the function above as

long long foo(unsigned char bar)
{
  return (long long)bar & 0x24;
}

then GCC (both mainline and with my patch) generates

foo:	movzbl	4(%esp), %eax
	xorl	%edx, %edx
	andl	$36, %eax
	ret

which uses the preferred "xorl %edx, %edx" instead of the "cltd".
I'll need to investigate further why I'm getting a slightly different
instruction sequence.  The "movzbl" should also be fixed: since the mask
0x24 fits in the low byte, a plain "movl" as in your sequence would do.
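
For what it's worth, the equivalences relied on above can be checked
exhaustively.  The following is only a sketch in C and not part of the
patch.  All helper names (foo_original, foo_widened, high_word_cltd,
high_word_xorl, count_mismatches, check_all) are made up for illustration,
and the two high_word_* functions merely model what "cltd" and
"xorl %edx, %edx" leave in %edx.  The upper24 parameter is an assumption
standing in for whatever occupies the rest of the argument's stack slot.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: the AND is done in int, then widened to long long. */
long long foo_original(unsigned char bar)
{
  return bar & 0x24;
}

/* Hand-transformed form: widen first, then AND in long long. */
long long foo_widened(unsigned char bar)
{
  return (long long)bar & 0x24;
}

/* Model of "cltd": %edx receives 32 copies of the sign bit of %eax. */
uint32_t high_word_cltd(uint32_t eax)
{
  return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

/* Model of "xorl %edx, %edx": %edx is always zero. */
uint32_t high_word_xorl(uint32_t eax)
{
  (void)eax;
  return 0u;
}

/* Hypothetical helper: counts the checks that fail over all 256 byte
   values.  upper24 fills the upper 24 bits of the 32-bit stack slot. */
int count_mismatches(uint32_t upper24)
{
  int bad = 0;
  for (unsigned b = 0; b < 256; b++) {
    uint32_t slot = (upper24 << 8) | b;
    uint32_t via_movzbl = (slot & 0xFFu) & 0x24u;  /* movzbl; andl $36 */
    uint32_t via_movl = slot & 0x24u;              /* movl; andl $36 */

    if (foo_original((unsigned char)b) != foo_widened((unsigned char)b))
      bad++;
    if (via_movzbl != via_movl)
      bad++;
    if (high_word_cltd(via_movzbl) != high_word_xorl(via_movzbl))
      bad++;
  }
  return bad;
}

/* Aborts if any variant disagrees for either slot filling. */
void check_all(void)
{
  assert(count_mismatches(0u) == 0);
  assert(count_mismatches(0xFFFFFFu) == 0);
}
```

Calling check_all() from a test driver exercises every byte value.  The
point is only that the result of the AND never has its sign bit set, so
"cltd" and "xorl %edx, %edx" agree on the value, and that the mask makes
the zero extension redundant.  It says nothing about which sequence is
faster.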

Ahh, the mysteries of gcc.

Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833


