[Bug c++/90606] New: Replace mfence with faster xchg for std::memory_order_seq_cst.

Thu May 23 18:49:00 GMT 2019

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90606

            Bug ID: 90606
           Summary: Replace mfence with faster xchg for
                    std::memory_order_seq_cst.
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: maxim.yegorushkin at gmail dot com
  Target Milestone: ---

The following example:

    #include <atomic>
    std::atomic<int> a;
    void foo_seq_cst(int b) { a = b; }

compiles with `gcc-9.1 -O3 -std=c++17 -pthread` into:

    foo_seq_cst(int):
        mov     DWORD PTR a[rip], edi
        mfence
        ret

Whereas `clang++-9 -O3 -std=c++17 -pthread` compiles it into:

    foo_seq_cst(int):                       # @foo_seq_cst(int)
        xchg    dword ptr [rip + a], edi
        ret

xchg was benchmarked to be 2-3x faster than mfence, and the Linux kernel
switched to xchg where possible.

gcc should also switch to using xchg for std::memory_order_seq_cst stores.
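For comparison, here is a minimal sketch of the same store written three ways at the source level. The names `foo_exchange` and `foo_release` are hypothetical and only for illustration. The comments describe the x86-64 code I would expect at -O3. I have not checked them against every compiler version, so treat them as assumptions.

```cpp
#include <atomic>

std::atomic<int> a;

// Same as `a = b;`: operator= on std::atomic performs a seq_cst store.
// gcc-9.1 emits mov + mfence here; clang++-9 emits a single xchg.
void foo_seq_cst(int b) { a.store(b, std::memory_order_seq_cst); }

// Hypothetical variant: an exchange whose result is discarded.
// Expected (assumption) to compile to a single xchg on x86-64, because
// xchg with a memory operand is implicitly locked.
void foo_exchange(int b) { (void)a.exchange(b, std::memory_order_seq_cst); }

// Hypothetical variant: a release store for contrast.
// Expected (assumption) to compile to a plain mov with no fence on x86-64.
void foo_release(int b) { a.store(b, std::memory_order_release); }
```

All three leave the same value in `a`; they differ only in the ordering guarantees and in the instructions generated.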

See:

https://lore.kernel.org/lkml/20160112150032-mutt-send-email-mst@redhat.com/

https://stackoverflow.com/questions/56205324/why-do-gcc-inserts-mfence-where-clang-dont-use-it/