[PATCH]: Fix pr target/21149

Uros Bizjak uros.bizjak@kss-loka.si
Wed Jul 20 14:34:00 GMT 2005


Hello!

This patch fixes target/21149: "invalid code generation for _mm_movehl_ps SSE
intrinsic".

According to the Intel docs, "movhlps A, B" performs this operation:

A = { a3 a2 a1 a0 }
B = { b3 b2 b1 b0 }

a3 a2 a1 a0 = a3 a2 b3 b2

where A = target and B = source.

The sse_movhlps pattern selects its elements from a vec_concat of both
operands, so the intermediate vector is:

7  6  5  4  3  2  1  0
b3 b2 b1 b0 a3 a2 a1 a0
----op2---- ----op1----

The correct selector value (highest element first) should be

3  2  7  6

or, written as RTL with element 0 listed first:

el (0)	  (parallel [(const_int 6)
el (1)		     (const_int 7)
el (2)		     (const_int 2)
el (3)		     (const_int 3)])))]

to get the resulting vector

3  2  1  0
a3 a2 b3 b2
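
The selection above can be checked with a small stand-alone model. This is
only an illustrative sketch, not code from GCC or from the PR: the names
concat_select and movhlps_ref are hypothetical, and plain float arrays stand
in for the vector operands, with index 0 as the lowest element.

```c
#include <assert.h>
#include <stddef.h>

/* Model of (vec_select (vec_concat op1 op2) (parallel [sel ...])).
   op1 fills elements 0-3 and op2 fills elements 4-7 of the
   intermediate vector; sel[i] names the element copied to out[i].  */
static void
concat_select (const float op1[4], const float op2[4],
               const int sel[4], float out[4])
{
  float tmp[8];
  size_t i;

  for (i = 0; i < 4; i++)
    {
      tmp[i] = op1[i];
      tmp[i + 4] = op2[i];
    }

  for (i = 0; i < 4; i++)
    {
      assert (sel[i] >= 0 && sel[i] < 8);
      out[i] = tmp[sel[i]];
    }
}

/* Reference semantics of "movhlps A, B" as described above: the low
   half of A receives the high half of B, the high half of A is kept.  */
static void
movhlps_ref (float a[4], const float b[4])
{
  a[0] = b[2];
  a[1] = b[3];
}
```

Calling concat_select with the selector { 6, 7, 2, 3 } should give the same
four elements as movhlps_ref applied to the same inputs, i.e. a3 a2 b3 b2
when read from the highest element down.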

With the patch, the testcase from the PR produces:
gcc -O2 -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]

The result now matches the result without optimization:
gcc -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]

The patch was bootstrapped on i686-pc-linux-gnu; regression testing is in progress.

BTW: Could someone commit this patch to mainline if OK?

2005-07-20  Uros Bizjak  <uros@kss-loka.si>

	PR target/21149
	* config/i386/i386.md (sse_movhlps): Fix vec_select values.

Uros.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr21149.diff
Type: application/octet-stream
Size: 477 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20050720/0e3973d9/attachment.obj>

