Will GCC eventually support SSE2 or SSE4.1?

Jonathan Wakely jwakely.gcc@gmail.com
Fri May 26 08:28:37 GMT 2023


On Fri, 26 May 2023 at 09:00, Stefan Kanthak <stefan.kanthak@nexgo.de> wrote:
>
> "Jonathan Wakely" <jwakely.gcc@gmail.com> wrote:
>
> > On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, <gcc@gcc.gnu.org> wrote:
> >
> >> On Thu, May 25, 2023 at 11:56 PM Stefan Kanthak <stefan.kanthak@nexgo.de>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> compile the following function on a system with Core2 processor
> >>> (released January 2008) for the 32-bit execution environment:
> >>>
> >>> --- demo.c ---
> >>> int ispowerof2(unsigned long long argument)
> >>> {
> >>>     return (argument & argument - 1) == 0;
> >>> }
> >>> --- EOF ---
> >>>
> >>> GCC 13.3: gcc -m32 -O3 demo.c
> >>>
> >>> NOTE: -mtune=native is the default!
> >>
> >> You need to use -march=native and not -mtune=native .... to turn on
> >> the architecture features.
>
> (Un)fortunately this changes nothing!
>
> STOP: that's wrong, it makes it even WORSE!
>
> # Compilation provided by Compiler Explorer at https://godbolt.org/
> ispowerof2(unsigned long long):
>         vmovq   xmm1, QWORD PTR [esp+4]
>         vpcmpeqd        xmm0, xmm0, xmm0
>         xor     eax, eax
>         vpaddq  xmm0, xmm1, xmm0
>         vpand   xmm0, xmm0, xmm1
>         vpunpcklqdq     xmm0, xmm0, xmm0
>         vptest  xmm0, xmm0
>         sete    al
>         ret
>
> That's what I call a REALLY EPIC FAILURE!
>
> Compare this inefficient BLOAT to the SSE4.1 code from my original post!
>
> > Yes this is just user error. You didn't use the right options to say you
> > want SSE2.
>
> ARGH: please read CAREFULLY what I wrote!

You wrote "Now add the -mtune=core2 option to EXPLICITLY enable the
NATIVE SSE4.1 alias "Penryn New Instruction Set" of the Core2
processor" which is wrong, that's not what -mtune does.

Read the docs CAREFULLY: https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

>
> 1) I didn't tell GCC to use SSE at all (I DON'T want any compiler to use
>    SSE per default, especially when the generated code is SLOWER and BIGGER
>    than conventional code using the general purpose registers)!
>
> 2) GCC uses SSE2 on its own, but doesn't support it well: it FAILS to use
>    PMOVMSKB here, despite -O3!

So report a bug in Bugzilla, rather than emailing the wrong list.

>
> 3) -march=core2 doesn't help either, GCC fails to use SSE4.1 at all!

core2 doesn't enable SSE4.1, as clearly shown in the docs:
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

If you send emails full of confused mistakes, don't be surprised if
the replies aren't what you want.

If you think GCC is generating bad code, file a bug. But make sure
you're actually using the right options to enable the right
instruction sets before complaining about the instructions used.
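
One way to check which instruction sets a given set of options really
enables is to test GCC's predefined macros (__SSE2__, __SSE4_1__, and
so on) in the code being compiled. The following is only a minimal
sketch: the helper names have_sse2, have_sse4_1 and report_isa are made
up for illustration, and ispowerof2 is the function from the original
report with explicit parentheses added.

```c
#include <stdio.h>

/* Function from the original report. Note that it also returns 1
 * for an argument of 0, because 0 & (0 - 1) is 0. */
int ispowerof2(unsigned long long argument)
{
    return (argument & (argument - 1)) == 0;
}

/* Hypothetical helper: 1 if this translation unit was compiled
 * with SSE2 enabled, 0 otherwise. */
int have_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if this translation unit was compiled
 * with SSE4.1 enabled, 0 otherwise. */
int have_sse4_1(void)
{
#ifdef __SSE4_1__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: print the compile-time status of both
 * instruction sets to the given stream. */
void report_isa(FILE *out)
{
    fprintf(out, "SSE2:   %s\n", have_sse2() ? "enabled" : "not enabled");
    fprintf(out, "SSE4.1: %s\n", have_sse4_1() ? "enabled" : "not enabled");
}
```

Calling report_isa(stdout) from main shows what the compiler was told
to allow. With -m32 -march=core2 you should expect SSE2 to be reported
as enabled and SSE4.1 as not enabled, because core2 only goes up to
SSSE3; adding -msse4.1, or using -march=penryn, should enable SSE4.1.
Changing only -mtune should not change either result.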

