This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fold zero extensions into bit-wise ANDs
- From: Linus Torvalds <torvalds at transmeta dot com>
- To: roger at eyesopen dot com, gcc-patches at gcc dot gnu dot org
- Date: Tue, 16 Apr 2002 10:10:09 -0700
- Subject: Re: [PATCH] Fold zero extensions into bit-wise ANDs
- Newsgroups: linux.egcs.patches
- References: <20020415141004.E19759@redhat.com>
In article <Pine.LNX.4.33.0204151729250.26035-100000@www.eyesopen.com> you write:
>
>As an excellent counter example, consider
>
>long long foo(unsigned char bar)
>{
> return bar & 0x24;
>}
>
>with the current mainline CVS we'd generate:
>
>foo: movzbl 4(%esp), %ecx
> xorl %edx, %edx
> andb $36, %cl
> movzbl %cl, %eax
> ret
>
>with my patch applied it now generates
>
>foo: movzbl 4(%esp), %eax
> andl $36, %eax
> cltd
> ret
>
>which by my reckoning is shorter, faster and uses less registers.
Hmm, the sequence I'd _expect_ to be optimal would seem to be:

foo:
	movl 4(%esp), %eax
	xorl %edx, %edx
	andl $36, %eax
	ret

since:

 - there's no point in doing the movzbl when the "andl" will clear the
   upper bits by hand anyway, and a plain "movl" is smaller (and faster,
   at least on some machines). This, of course, only works when you know
   the alignment of the data to be ok (which we know in this example
   because it is an argument to the function, but maybe that's not the
   common case).

 - "xorl reg,reg" is certainly recommended by Intel over cltd, which is
   rather slow and also has an (unnecessary) data dependency on %eax.

The xorl is definitely preferred; the movl/movzbl thing is just a detail
that only works in some cases.

Linus
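
The two claims in the reply above can be checked with a small C model.
This is only a sketch: the function names are hypothetical, and each
one imitates what an instruction pair leaves in %eax or %edx instead of
executing real assembly. It assumes the argument occupies a full 32-bit
stack slot whose upper 24 bits may hold anything.

```c
#include <assert.h>
#include <stdint.h>

/* Models "movzbl 4(%esp),%eax; andl $36,%eax":
   zero-extend the low byte of the slot, then mask. */
uint32_t low_zext_then_and(uint32_t slot)
{
    uint32_t eax = (uint8_t)slot;      /* movzbl */
    return eax & 0x24u;                /* andl $36 */
}

/* Models "movl 4(%esp),%eax; andl $36,%eax":
   load the whole slot, then mask. */
uint32_t low_movl_then_and(uint32_t slot)
{
    return slot & 0x24u;
}

/* Models cltd: %edx becomes 32 copies of the sign bit of %eax,
   so it has to wait for %eax to be computed. */
uint32_t high_cltd(uint32_t eax)
{
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

/* Models "xorl %edx,%edx": always zero, independent of %eax. */
uint32_t high_xorl(void)
{
    return 0u;
}

/* Returns 1 if both sequences produce the same 64-bit result for every
   byte value combined with several arbitrary upper-bit patterns. */
int sequences_agree(void)
{
    static const uint32_t junk[] = {
        0x00000000u, 0xFFFFFF00u, 0xDEADBE00u, 0x80000000u
    };
    unsigned int i, b;

    for (i = 0; i < sizeof junk / sizeof junk[0]; i++) {
        for (b = 0; b < 256; b++) {
            uint32_t slot = junk[i] | b;
            uint32_t lo = low_zext_then_and(slot);

            if (lo != low_movl_then_and(slot))
                return 0;              /* low halves differ */
            if (high_cltd(lo) != high_xorl())
                return 0;              /* high halves differ */
        }
    }
    return 1;
}

/* Convenience wrapper that aborts if the two sequences ever disagree. */
void check_sequences(void)
{
    assert(sequences_agree());
}
```

Because the mask 0x24 fits in the low byte and leaves bit 31 clear, the
masked value is the same with or without the zero extension, and its
sign extension is always zero. That is why the high half can be cleared
without looking at %eax at all.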