[PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

Stefan Kanthak stefan.kanthak@nexgo.de
Tue Nov 24 13:57:52 GMT 2020


Andreas Schwab wrote 2020-11-11:

> On Nov 10 2020, Stefan Kanthak wrote:
> 
>> Eric Botcazou <botcazou@adacore.com> wrote:
>>
>>>> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
>>>> is rather bad, it yields bad machine code at least on i386 and AMD64. Since
>>>> GCC knows how to shift integers twice the register size these functions can
>>>> be written as one-liners.
>>> 
>>> These functions are precisely meant to be used when GCC cannot do that.
>>
>> On which processor(s) is GCC unable to generate code for DWtype shifts?
> 
> On most 32-bit targets with -Os.

--- counter-FUD-example.c ---
long long __ashldi3 (long long value, int count) {
    return value << count;
}

long long __ashrdi3 (long long value, int count) {
    return value >> count;
}

unsigned long long __lshrdi3 (unsigned long long value, int count) {
    return value >> count;
}

// just for completeness' sake:

unsigned long long __lshldi3 (unsigned long long value, int count) {
    return value << count;
}

extern   signed long long  left,  right;
extern unsigned long long uleft, uright;

int main(int argc, char **argv) {
   left <<= argc;
   right >>= argc;
   uleft <<= argc;
   uright >>= argc;
}
--- EOF ---
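
For comparison, here is a minimal sketch of the word-by-word shift that such a helper has to perform when the compiler cannot emit a double-word shift inline. It is NOT libgcc's actual source; the names shl64_by_halves, shl64_direct and check_shl64 are hypothetical and exist only for this illustration. It assumes 32-bit words and a shift count in the range 0 to 63.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not libgcc's code): 64-bit left shift
   assembled from two 32-bit halves. Assumes 0 <= count <= 63. */
static uint64_t shl64_by_halves(uint64_t value, unsigned count)
{
    uint32_t low  = (uint32_t) value;
    uint32_t high = (uint32_t) (value >> 32);

    if (count == 0)
        return value;            /* avoids an undefined 32-bit shift by 32 below */
    if (count >= 32) {           /* the low word moves entirely into the high word */
        high = low << (count - 32);
        low  = 0;
    } else {                     /* bits shifted out of the low word enter the high word */
        high = (high << count) | (low >> (32 - count));
        low  = low << count;
    }
    return ((uint64_t) high << 32) | low;
}

/* The one-liner form: let the compiler do the double-word shift. */
static uint64_t shl64_direct(uint64_t value, unsigned count)
{
    return value << count;
}

/* Asserts that both forms agree for every valid shift count. */
static void check_shl64(uint64_t value)
{
    for (unsigned count = 0; count < 64; count++)
        assert(shl64_by_halves(value, count) == shl64_direct(value, count));
}
```

Calling check_shl64() with a few bit patterns is one way to confirm that the one-liner and the split form compute the same result; which of the two yields shorter machine code is a matter for the compiler output, not for this sketch.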

lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
...
Model name:            AMD EPYC 7262 8-Core Processor
...
gcc -m32          -o- -Os -S counter-FUD-example.c | fgrep 'call'
gcc -m32 -mno-sse -o- -Os -S counter-FUD-example.c | fgrep 'call'

'nuff said
Stefan

JTFR: without -mno-sse, GCC (at least version 8.4) generates rather
      awful and DEFINITELY LONGER code (42 vs. 38 bytes) than with
      -mno-sse, i.e. -Os is buggy too!

