[Bug tree-optimization/104480] [12 Regression] Combining stores across memory locations might violate [intro.memory]/3

Thu Feb 10 08:40:43 GMT 2022

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104480

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |12.0

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I don't think [intro.memory]/3 (wherever that is supposed to point to?) is
realized this way on CPUs with weaker memory ordering guarantees than x86.  And
we definitely do not ensure atomicity or commit order unless you use atomic
access primitives.

So I think this is invalid.  As Andrew says, we're happily combining

void foo (double * __restrict a, double *b)
{
  a[0] = b[0];
  a[1] = b[1];
}

into

        movupd  (%rsi), %xmm0
        movups  %xmm0, (%rdi)

since forever using vectorization, which has the exact same issue
when the store crosses a cacheline boundary.
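
For illustration only, here is a minimal sketch of what "use atomic access
primitives" could look like for the example above, assuming C11 <stdatomic.h>
is available.  The function name copy_pair_atomic and the choice of relaxed
ordering are assumptions made for this sketch, not something taken from the
report.  Each store is individually indivisible; nothing here makes the pair
of stores visible as a unit.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical variant of foo () from above: the destination elements are
   declared _Atomic, so each element is written with its own atomic store.  */
void copy_pair_atomic (_Atomic double *restrict a, const double *b)
{
  assert (a && b);
  /* Relaxed ordering is an assumption: it requests per-object atomicity
     only and imposes no ordering between the two stores.  */
  atomic_store_explicit (&a[0], b[0], memory_order_relaxed);
  atomic_store_explicit (&a[1], b[1], memory_order_relaxed);
}
```

If the two stores also have to become visible in a particular order, a
stronger memory order (for example memory_order_release on the second store)
would be needed; whether that matches the reporter's use case is not clear
from the report.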