This is a friendly reminder that there's still no way to enjoy pextrw without
undue zero/sign extension unless inline asm is used; there's even a gradient of
ignominy from intrinsic to builtins, as exemplified by:

$ cat pextrw.cc
#include <smmintrin.h>
long unsigned int foo1(__m128i x) { return _mm_extract_epi16(x, 3); }
long unsigned int foo2(__v8hi x) { return __builtin_ia32_vec_ext_v8hi((__v8hi) x, 3); }
int main() { return 0; }
$ /usr/local/gcc-4.6-20100811/bin/g++ -O3 -march=native pextrw.cc

00000000004004a0 <_Z4foo1Dv2_x>:
  4004a0:  66 0f c5 c0 03    pextrw $0x3,%xmm0,%eax
  4004a5:  98                cwtl
  4004a6:  48 98             cltq
  4004a8:  c3                retq

00000000004004b0 <_Z4foo2Dv8_s>:
  4004b0:  66 0f c5 c0 03    pextrw $0x3,%xmm0,%eax
  4004b5:  48 0f bf c0       movswq %ax,%rax
  4004b9:  c3                retq

That's on x86-64, on an Intel i7 which, incidentally, is much faster at that
whole pextrw business than previous generations.

This report may or may not be construed as a duplicate of the long-forgotten
PR 41323.

(In reply to comment #0)
> This is a friendly reminder there's still no way to enjoy pextrw without undue
> zero/sign extension unless inline asm is used; there's even a gradient of
> ignominy from intrinsic to builtins, as exemplified by:

GCC does not simplify the following instruction:

Trying 8 -> 9:
Failed to match this instruction:
(set (reg:DI 65 [ D.6814 ])
    (sign_extend:DI (sign_extend:SI (reg:HI 64))))

IMO, this RTX should simplify to:

(set (reg:DI 65 [ D.6814 ])
    (sign_extend:DI (reg:HI 64)))

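
The proposed simplification relies on a 16 -> 32 -> 64 bit sign extension producing the same value as a direct 16 -> 64 bit one. As a rough source-level illustration of that equivalence (not of the combine pass itself), here is a small C++ sketch; the helper names are hypothetical and do not come from GCC:

```cpp
#include <cstdint>

// Two-step widening, mirroring (sign_extend:DI (sign_extend:SI (reg:HI))).
// Hypothetical helper name, for illustration only.
inline std::int64_t widen_two_step(std::int16_t v) {
    return static_cast<std::int64_t>(static_cast<std::int32_t>(v));
}

// One-step widening, mirroring (sign_extend:DI (reg:HI)).
inline std::int64_t widen_one_step(std::int16_t v) {
    return static_cast<std::int64_t>(v);
}

// Exhaustively compare both forms over every 16-bit value.
inline bool widenings_agree() {
    for (int i = -32768; i <= 32767; ++i) {
        const std::int16_t v = static_cast<std::int16_t>(i);
        if (widen_two_step(v) != widen_one_step(v)) {
            return false;
        }
    }
    return true;
}
```

Since both forms agree on all inputs, collapsing the nested sign_extend loses no information.
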
The sign extension is because the builtin returns a signed quantity (unlike the machine instruction, which zero-extends), so the conversion is inserted by the language frontend.
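
To make the signed/unsigned distinction concrete, here is a minimal C++ sketch, assuming an x86 target with SSE2. The helper names are hypothetical. Casting the extracted lane to unsigned short before widening requests zero extension at the source level; whether a given GCC version then emits a bare pextrw is not guaranteed.

```cpp
#include <emmintrin.h>  // SSE2: _mm_extract_epi16, _mm_set_epi16

// Hypothetical helper: lane 3 widened with zero extension.
// The cast to unsigned short makes the intended extension explicit,
// regardless of how the intrinsic's int result was extended.
inline unsigned long lane3_zero_extended(__m128i x) {
    return static_cast<unsigned short>(_mm_extract_epi16(x, 3));
}

// Hypothetical helper: lane 3 widened with sign extension, matching
// the behaviour shown in the disassembly of the report.
inline unsigned long lane3_sign_extended(__m128i x) {
    return static_cast<unsigned long>(
        static_cast<short>(_mm_extract_epi16(x, 3)));
}

// Builds a vector whose lane 3 has only its top bit set (0x8000).
// _mm_set_epi16 takes lanes from highest (7) to lowest (0).
inline __m128i lane3_top_bit_vector() {
    return _mm_set_epi16(0, 0, 0, 0, -32768, 0, 0, 0);
}
```

With lane 3 set to 0x8000, the first helper yields 0x8000, while the second yields -32768 converted to unsigned long.
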
*** This bug has been marked as a duplicate of 41323 ***