[Bug c/105521] New: missed optimization in modulo arithmetic

goswin-v-b at web dot de gcc-bugzilla@gcc.gnu.org
Sun May 8 04:08:22 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105521

            Bug ID: 105521
           Summary: missed optimization in modulo arithmetic
           Product: gcc
           Version: 11.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: goswin-v-b at web dot de
  Target Milestone: ---

I'm trying to compute (a*a)%n for uint64_t types on x86_64 using "gcc -O2 -W
-Wall" like this:

  #include <stdint.h>
  #include <assert.h>

  uint64_t sqrmod(uint64_t a, uint64_t n) {
    assert(a < n);
    unsigned __int128 x = a;
    x *= a;
    return x % n;
  }

I expected to get the following code:

  sqrmod:
        cmpq    %rsi, %rdi
        jnb     .L13         // assert(a < n) failure
        movq    %rdi, %rax
        mul     %rdi
        div     %rsi
        movq    %rdx, %rax
        ret

The compiler does get the "mul" right but instead of the "div" it throws in a
call to "__umodti3". The "__umodti3" function is horribly long code that will
be worlds slower than a simple div.

Note: The "asset(a < n);" should tell the compiler that the "div" instruction
can not overflow and will not cause a #DivisionError. Without the assert the
compiler could (conditionally) add "a %= n;" for the same effect.

https://godbolt.org/z/cd57Wd4oo


More information about the Gcc-bugs mailing list