[PATCH][ARM] CLZ optimization for clzsi2 and clzdi2

Doug Kwan (關振德) dougkwan@google.com
Thu Jun 12 17:59:00 GMT 2008


Yes, I did. The code generated for arm5 was poor, which is why I did
the optimization in the first place.  However, I don't know all the
variants of the ARM architecture, so it seemed prudent to leave the
ones I don't know unoptimized.

-Doug

2008/6/12 Paul Brook <paul@codesourcery.com>:
>> > Having separate C and asm implementations seemed like more trouble than
>> > it was worth. The code generated for the C implementation wasn't great
>> > either, so I just implemented the whole thing in assembly. This means we
>> > can use __builtin_clz in longlong.h, and get ffs, ctz et al. for free.
>>
>> Originally I also added a C implementation because I think the C
>> compiler may generate better code for a specific pre-armv5te
>> architecture, if someone configures their tree for that arch, than
>> a fixed assembly implementation used for all architectures.
>> Without clz, __builtin_clz is non-trivial, so I think it may be a good
>> idea to leave it to the C compiler to do architecture-specific
>> scheduling and optimization.
>
> Have you actually looked at the code it generates? It's pretty poor. clz is
> hard to write efficiently in C.
>
> Paul
>
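
For readers following the thread: the point that __builtin_clz is
non-trivial without a clz instruction can be illustrated with a generic
C fallback. This is only a sketch of the usual binary-search approach,
not the code from the patch or from libgcc. The function names
clz32_portable and clz64_portable are hypothetical, and returning 32
(or 64) for a zero input is an assumption made here; __builtin_clz
itself leaves that case undefined.

```c
#include <stdint.h>

/* Hypothetical helper: count leading zeros of a 32-bit value without
   a hardware clz instruction, narrowing the search by halves. */
static int clz32_portable(uint32_t x)
{
    int n = 0;

    if (x == 0)
        return 32;  /* assumption: defined result for zero input */

    /* If the top k bits are all zero, count them and shift them out. */
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }

    return n;
}

/* Hypothetical 64-bit variant built from the 32-bit helper. */
static int clz64_portable(uint64_t x)
{
    uint32_t hi = (uint32_t)(x >> 32);

    return hi ? clz32_portable(hi)
              : 32 + clz32_portable((uint32_t)x);
}
```

Each step costs a mask, a compare and a conditional shift, so how well
this compiles depends on how the compiler turns those conditions into
instructions for the target, which is the trade-off being discussed
above.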


