This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fold zero extensions into bit-wise ANDs
- From: Linus Torvalds <torvalds at transmeta dot com>
- To: Roger Sayle <roger at eyesopen dot com>
- Cc: <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 16 Apr 2002 21:20:32 -0700 (PDT)
On Tue, 16 Apr 2002, Roger Sayle wrote:
>
> The strange thing is that when the transformation performed by my
> patch is applied by hand, i.e. rewriting the function above as
>
> long long foo(unsigned char bar)
> {
>     return (long long)bar & 0x24;
> }
The above is _not_ the same. That cast changes what the code means.
The above parses as
    return ((long long)bar) & 0x24;
which in turn (because of the widening of the arguments to '&') is the
same as
    return ((long long)bar) & (long long)0x24;
while the original "return bar & 0x24;" with the types done explicitly is
    return (long long) ((unsigned char)(bar) & (int)0x24);
which becomes
    return (long long) ((int)(unsigned char)(bar) & (int) 0x24);
ie in the new case you created by adding the cast, you have a 64-bit
logical and operation, while in the original case you have a 32-bit
logical and that gets sign-extended to 64 bits.
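
For illustration, here is a minimal sketch of the two forms side by side. The names foo_orig, foo_cast and check_forms_agree are made up for this example and are not from the patch. It only checks that both forms return the same value for every unsigned char input; it says nothing about the code gcc generates for either one.

```c
#include <assert.h>

/* Original form: 'bar' is promoted to int, the AND is done in int,
   and the int result is then converted (sign-extended) to long long. */
long long foo_orig(unsigned char bar)
{
    return bar & 0x24;
}

/* Hand-rewritten form: 'bar' is converted to long long first, so the
   AND itself is done at long long width. */
long long foo_cast(unsigned char bar)
{
    return (long long)bar & 0x24;
}

/* Returns 1 if both forms agree for every unsigned char value, else 0. */
int forms_agree(void)
{
    unsigned int v;

    for (v = 0; v <= 0xFF; v++) {
        if (foo_orig((unsigned char)v) != foo_cast((unsigned char)v))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: aborts if the two forms ever disagree. */
void check_forms_agree(void)
{
    assert(forms_agree());
}
```

Calling check_forms_agree() from a test program should not abort: the two expressions have different types during the AND, but for an unsigned char operand and the constant 0x24 the values cannot differ.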
> Then GCC (both mainline and with my patch) generates
>
> foo:    movzbl  4(%esp), %eax
>         xorl    %edx, %edx
>         andl    $36, %eax
>         ret
>
> which uses the preferred "xorl %edx, %edx" instead of the "cltd".
> I'll need to investigate further why I'm getting a slightly different
> instruction sequence. The "movzbl" should also be fixed.
>
> Ahh, the mysteries of gcc.
Not mysterious at all - by adding the cast, you really changed the code,
and now it became an "and" with the 64-bit constant 0x00000000 00000024,
where the upper bits of the result are clearly zero.
In contrast, in the original code, you had a 32-bit and, and the upper
bits are sign-extended from the 32-bit result (and are "clearly zero" only
if you notice that the 32-bit result always has the sign bit clear).
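
The sign-extension point can be sketched as follows. This is an illustration only: the helper name upper_half_after_and is hypothetical, and it uses the fixed-width types from <stdint.h> rather than int and long long so that the 32-bit and 64-bit widths are explicit.

```c
#include <stdint.h>

/* Do the AND at 32 bits, widen the signed result to 64 bits (which
   sign-extends), and return the upper 32 bits of the widened value. */
uint32_t upper_half_after_and(int32_t a, int32_t b)
{
    int32_t narrow = a & b;   /* 32-bit logical and */
    int64_t wide = narrow;    /* sign-extended to 64 bits */

    return (uint32_t)((uint64_t)wide >> 32);
}
```

With a = 0xFF and b = 0x24 the 32-bit result has its sign bit clear, so the upper half is zero. With a = -1 and b = INT32_MIN the sign bit is set, and the upper half comes out as all ones.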
Of course, the fact that gcc generates different code is an indication
that it missed an optimization, since the two (different) expressions
should clearly always get the same end result in the case of a positive
integer that fits in all types involved.
Linus