[PATCH]: Fix pr target/21149
Uros Bizjak
uros.bizjak@kss-loka.si
Wed Jul 20 14:34:00 GMT 2005
Hello!
This patch fixes target/21149: "invalid code generation for _mm_movehl_ps SSE
intrinsic".
According to the Intel docs, "movhlps A, B" performs this operation:
A = { a3 a2 a1 a0 }
B = { b3 b2 b1 b0 }
a3 a2 a1 a0 = a3 a2 b3 b2
where A = target and B = source.
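For readers who prefer code, the operation above can be sketched as a scalar C model. This is only an illustration, not code from GCC or from the Intel docs; the names movhlps_model and movhlps_model_check are made up for this example, and index 0 is assumed to be the lowest element.

```c
#include <assert.h>

/* Illustrative scalar model of "movhlps A, B" (hypothetical helper,
   not GCC or Intel code).  Index 0 is the lowest element, so a[3]
   corresponds to a3 in the notation above.  */
static void movhlps_model(float a[4], const float b[4])
{
    a[0] = b[2];   /* a0 <- b2 */
    a[1] = b[3];   /* a1 <- b3 */
    /* a[2] and a[3], the high half of the target, are left unchanged.  */
}

/* Small self-check: the result must be { a3 a2 b3 b2 }.  */
static void movhlps_model_check(void)
{
    float a[4] = { 10.0f, 11.0f, 12.0f, 13.0f };        /* a0..a3 */
    const float b[4] = { 20.0f, 21.0f, 22.0f, 23.0f };  /* b0..b3 */

    movhlps_model(a, b);

    assert(a[3] == 13.0f && a[2] == 12.0f);  /* a3 a2 preserved */
    assert(a[1] == 23.0f && a[0] == 22.0f);  /* b3 b2 moved to the low half */
}
```

Calling movhlps_model_check() from a test program should pass silently if the model matches the description.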
The sse_movhlps pattern selects its elements from a vec_concat of both
operands, so as an intermediate vector, we have:
   7  6  5  4  3  2  1  0
  b3 b2 b1 b0 a3 a2 a1 a0
  ----op2---- ----op1----
The correct selector value should be
   3  2  7  6
that is:
  el (0)   (parallel [(const_int 6)
  el (1)              (const_int 7)
  el (2)              (const_int 2)
  el (3)              (const_int 3)])))]
to get the resulting vector
   3  2  1  0
  a3 a2 b3 b2
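The selector arithmetic can be checked with a small C sketch of the vec_concat plus vec_select combination. This is an assumption-laden model, not the RTL semantics as implemented in GCC: concat_select and check_movhlps_selector are hypothetical names, op1 is assumed to occupy elements 0..3 and op2 elements 4..7 as in the layout above, and selector element i picks the element that lands in result position i.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of (vec_select (vec_concat op1 op2) (parallel [...])).
   Hypothetical helper, not GCC code.  Index 0 is the lowest element;
   op1 fills elements 0..3 of the concatenation, op2 fills 4..7.  */
static void concat_select(const float op1[4], const float op2[4],
                          const int sel[4], float out[4])
{
    float cat[8];

    for (size_t i = 0; i < 4; i++) {
        cat[i] = op1[i];       /* elements 0..3 come from op1 */
        cat[i + 4] = op2[i];   /* elements 4..7 come from op2 */
    }
    for (size_t i = 0; i < 4; i++)
        out[i] = cat[sel[i]];  /* result element i is cat[sel[i]] */
}

/* Self-check: selector { 6, 7, 2, 3 } must give { a3 a2 b3 b2 }.  */
static void check_movhlps_selector(void)
{
    const float a[4] = { 10.0f, 11.0f, 12.0f, 13.0f };  /* a0..a3 */
    const float b[4] = { 20.0f, 21.0f, 22.0f, 23.0f };  /* b0..b3 */
    const int sel[4] = { 6, 7, 2, 3 };                  /* el (0)..el (3) */
    float r[4];

    concat_select(a, b, sel, r);

    assert(r[0] == b[2] && r[1] == b[3]);  /* low half: b2, b3 */
    assert(r[2] == a[2] && r[3] == a[3]);  /* high half: a2, a3 */
}
```

Read from the highest element down, the result is a3 a2 b3 b2, which agrees with the movhlps description from the Intel docs quoted earlier.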
With the patch applied, the testcase from the PR compiled with -O2 produces:
gcc -O2 -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]
The result is now equal to the result without optimization:
gcc -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]
Patch is bootstrapped on i686-pc-linux-gnu, regtesting in progress.
BTW: Could someone commit this patch to mainline if OK?
2005-07-20 Uros Bizjak <uros@kss-loka.si>
PR target/21149
* config/i386/i386.md (sse_movhlps): Fix vec_select values.
Uros.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr21149.diff
Type: application/octet-stream
Size: 477 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20050720/0e3973d9/attachment.obj>