This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug c/81389] _mm_cmpestri segfault on -O0


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81389

--- Comment #13 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to rockeet from comment #7)
> @Marc @Jakub @Martin
> Intel CPU document says: operand of _mm_cmpestri can be memory or mm
> register, when the operand is memory, it does not require alignment.

That's the doc for the CPU instruction. The intrinsic, as a C function, always
takes an object of type __m128i, not a register or memory. The only question is
what the alignment of the type __m128i is. In gcc, it is 16 bytes. What does
alignof (or _Alignof or whatever variant you can get working) return with
Intel's compiler?
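
A minimal sketch of how one might check this, assuming a C11 compiler on x86 with `<emmintrin.h>` available. The helper name `m128i_alignment` is hypothetical, made up for illustration:

```c
#include <emmintrin.h>  /* defines __m128i (SSE2) */
#include <stddef.h>     /* size_t */

/* Hypothetical helper: reports the alignment the compiler assigns to
   the type __m128i. With gcc this is expected to be 16. */
size_t m128i_alignment(void)
{
    return _Alignof(__m128i);
}
```

Printing the result of `m128i_alignment()` with each compiler shows whether they agree on the alignment of the type.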

> The issue is: GCC does not know this (a memory operand need not be
> aligned), and there is no way to force gcc to generate a _mm_cmpestri
> that always uses a memory operand, not an mm register.

Use inline asm? Intrinsics are not quite as low level as you seem to expect.

> If I manually load the unaligned memory into an aligned `__m128i`, it has
> a performance penalty in optimized builds.

Uh? With -O1, the compiler merges the unaligned load with pcmpestri (it knows
that the insn can read unaligned memory). Did you mean to talk about the
performance of code generated with -O0? We explicitly do not care about that.
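
For illustration, a sketch of the explicit unaligned-load pattern, assuming gcc (or a compatible compiler) on an SSE4.2-capable x86 CPU. The function name `first_match16`, its parameters, and the choice of comparison mode are assumptions made up for this example, not taken from the bug report. The `target` attribute is a gcc extension, used here only so the sketch builds without `-msse4.2`:

```c
#include <nmmintrin.h>  /* SSE4.2: _mm_cmpestri and the _SIDD_* flags */

/* Hypothetical helper: returns the index of the first byte in
   buf[0..len) that equals any byte in set[0..setlen), or 16 if there
   is no such byte. Requires setlen <= 16 and len <= 16.
   Neither pointer needs to be 16-byte aligned, but both must point to
   at least 16 readable bytes, because a full 16 bytes are loaded. */
__attribute__((target("sse4.2")))
int first_match16(const char *set, int setlen, const char *buf, int len)
{
    /* _mm_loadu_si128 is the unaligned load; the results are ordinary
       (properly aligned) __m128i objects. */
    __m128i a = _mm_loadu_si128((const __m128i *)set);
    __m128i b = _mm_loadu_si128((const __m128i *)buf);

    return _mm_cmpestri(a, setlen, b, len,
                        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
                        _SIDD_LEAST_SIGNIFICANT);
}
```

Comparing the assembly generated at -O0 and -O1 (for example with `gcc -S`) should show whether the load is merged into the pcmpestri memory operand as described above.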
