[Bug target/103973] x86: 4-way comparison of floats/doubles with spaceship operator possibly suboptimal

cvs-commit at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Jan 17 12:41:31 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103973

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:463d9108766dcbb6a1051985e6c840a46897fe10

commit r12-6637-g463d9108766dcbb6a1051985e6c840a46897fe10
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jan 17 13:39:05 2022 +0100

    widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

    C++20:
     #include <compare>
     auto cmp4way(double a, double b)
     {
       return a <=> b;
     }
    expands to:
            ucomisd %xmm1, %xmm0
            jp      .L8
            movl    $0, %eax
            jne     .L8
    .L2:
            ret
            .p2align 4,,10
            .p2align 3
    .L8:
            comisd  %xmm0, %xmm1
            movl    $-1, %eax
            ja      .L2
            ucomisd %xmm1, %xmm0
            setbe   %al
            addl    $1, %eax
            ret
    That is 3 comparisons of the same operands.
    The following patch improves it to just one comparison:
            comisd  %xmm1, %xmm0
            jp      .L4
            seta    %al
            movl    $0, %edx
            leal    -1(%rax,%rax), %eax
            cmove   %edx, %eax
            ret
    .L4:
            movl    $2, %eax
            ret
    a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2.
    The first comparison is an equality test, which shouldn't raise
    exceptions on qNaN operands.  But if the operands aren't equal (which
    includes the unordered cases), the expansion immediately performs a
    < or > comparison, and that raises exceptions even on qNaNs.  So we
    can just perform a single comparison that raises exceptions on qNaN.
    As the 4 different cases are encoded as
    ZF CF PF
    1  1  1  a unordered b
    0  0  0  a > b
    0  1  0  a < b
    1  0  0  a == b
    we can emit an optimal sequence of comparisons, first jp
    for the unordered case, then je for the == case and finally jb
    for the < case.

    The patch pattern recognizes spaceship-like comparisons during
    widening_mul if the spaceship optab is implemented, and replaces
    those comparisons with comparisons of .SPACESHIP ifn which returns
    -1/0/1/2 based on the comparison.  This seems to work well both for the
    case of just returning the -1/0/1/2 (when we have just a common
    successor with a PHI) or when the different cases are handled with
    various other basic blocks.  The testcases cover both of those cases,
    the latter with different function calls in those.

    2022-01-17  Jakub Jelinek  <jakub@redhat.com>

            PR target/103973
            * tree-cfg.h (cond_only_block_p): Declare.
            * tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
            * tree-cfg.c (cond_only_block_p): ... here.  No longer static.
            * optabs.def (spaceship_optab): New optab.
            * internal-fn.def (SPACESHIP): New internal function.
            * internal-fn.h (expand_SPACESHIP): Declare.
            * internal-fn.c (expand_PHI): Formatting fix.
            (expand_SPACESHIP): New function.
            * tree-ssa-math-opts.c (optimize_spaceship): New function.
            (math_opts_dom_walker::after_dom_children): Use it.
            * config/i386/i386.md (spaceship<mode>3): New define_expand.
            * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
            * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New
            function.
            * doc/md.texi (spaceship@var{m}3): Document.

            * gcc.target/i386/pr103973-1.c: New test.
            * gcc.target/i386/pr103973-2.c: New test.
            * gcc.target/i386/pr103973-3.c: New test.
            * gcc.target/i386/pr103973-4.c: New test.
            * gcc.target/i386/pr103973-5.c: New test.
            * gcc.target/i386/pr103973-6.c: New test.
            * gcc.target/i386/pr103973-7.c: New test.
            * gcc.target/i386/pr103973-8.c: New test.
            * gcc.target/i386/pr103973-9.c: New test.
            * gcc.target/i386/pr103973-10.c: New test.
            * gcc.target/i386/pr103973-11.c: New test.
            * gcc.target/i386/pr103973-12.c: New test.
            * gcc.target/i386/pr103973-13.c: New test.
            * gcc.target/i386/pr103973-14.c: New test.
            * gcc.target/i386/pr103973-15.c: New test.
            * gcc.target/i386/pr103973-16.c: New test.
            * gcc.target/i386/pr103973-17.c: New test.
            * gcc.target/i386/pr103973-18.c: New test.
            * gcc.target/i386/pr103973-19.c: New test.
            * gcc.target/i386/pr103973-20.c: New test.
            * g++.target/i386/pr103973-1.C: New test.
            * g++.target/i386/pr103973-2.C: New test.
            * g++.target/i386/pr103973-3.C: New test.
            * g++.target/i386/pr103973-4.C: New test.
            * g++.target/i386/pr103973-5.C: New test.
            * g++.target/i386/pr103973-6.C: New test.
            * g++.target/i386/pr103973-7.C: New test.
            * g++.target/i386/pr103973-8.C: New test.
            * g++.target/i386/pr103973-9.C: New test.
            * g++.target/i386/pr103973-10.C: New test.
            * g++.target/i386/pr103973-11.C: New test.
            * g++.target/i386/pr103973-12.C: New test.
            * g++.target/i386/pr103973-13.C: New test.
            * g++.target/i386/pr103973-14.C: New test.
            * g++.target/i386/pr103973-15.C: New test.
            * g++.target/i386/pr103973-16.C: New test.
            * g++.target/i386/pr103973-17.C: New test.
            * g++.target/i386/pr103973-18.C: New test.
            * g++.target/i386/pr103973-19.C: New test.
            * g++.target/i386/pr103973-20.C: New test.

