This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH]: Fix pr target/21149


Hello!

This patch fixes target/21149: "invalid code generation for _mm_movehl_ps SSE
intrinsic".

According to the Intel docs, "movhlps A, B" performs this operation:

A = { a3 a2 a1 a0 }
B = { b3 b2 b1 b0 }

a3 a2 a1 a0 = a3 a2 b3 b2

where A = target and B = source.
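
As a rough illustration of the operation above, here is a plain-C model of
the instruction semantics. The function name movhlps_model is hypothetical
(it is not part of GCC or of the Intel headers), and array index 0 is assumed
to be the lowest element (a0 / b0):

```c
/* Plain-C model of "movhlps A, B" as described above.
   Hypothetical helper for illustration only; index 0 is the lowest lane. */
void movhlps_model(float a[4], const float b[4])
{
    a[0] = b[2];  /* low half of A receives the high half of B */
    a[1] = b[3];
    /* a[2] and a[3] (the high half of A) are left unchanged */
}
```

Calling it with a = {a0, a1, a2, a3} and b = {b0, b1, b2, b3} leaves
a = {b2, b3, a2, a3}, i.e. "a3 a2 b3 b2" when written from high to low.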

The sse_movhlps pattern selects its elements from a vec_concat of its two
operands, so as an intermediate vector, we have:

7  6  5  4  3  2  1  0
b3 b2 b1 b0 a3 a2 a1 a0
----op2---- ----op1----

The correct selector value (listed from the highest result element to the
lowest) should be

3  2  7  6

that is, in ascending element order:

el (0)	  (parallel [(const_int 6)
el (1)		     (const_int 7)
el (2)		     (const_int 2)
el (3)		     (const_int 3)])))]

to get the resulting vector

3  2  1  0
a3 a2 b3 b2
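
The selection described above can be sketched in plain C. This is only a
model of the vec_concat / vec_select semantics, not GCC code; the names
concat_select and movhlps_sel are hypothetical, and op1 is assumed to occupy
elements 0..3 of the concatenated vector, as in the diagram:

```c
/* Selector from the corrected pattern, in ascending element order. */
static const int movhlps_sel[4] = { 6, 7, 2, 3 };

/* Model of (vec_select (vec_concat op1 op2) (parallel [sel...])).
   Hypothetical helper for illustration only. */
void concat_select(const float op1[4], const float op2[4],
                   const int sel[4], float out[4])
{
    float cat[8];
    int i;

    /* vec_concat: op1 fills elements 0..3, op2 fills elements 4..7 */
    for (i = 0; i < 4; i++) {
        cat[i] = op1[i];
        cat[i + 4] = op2[i];
    }

    /* vec_select: result element i is concatenated element sel[i] */
    for (i = 0; i < 4; i++)
        out[i] = cat[sel[i]];
}
```

With op1 = {a0, a1, a2, a3} and op2 = {b0, b1, b2, b3}, the selector
{6, 7, 2, 3} yields out = {b2, b3, a2, a3}, which matches the movhlps
result shown above.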

With the patch, the testcase from the PR compiles and runs as follows:
gcc -O2 -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]

The result is now equal to the result without optimization:
gcc -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]

The patch has been bootstrapped on i686-pc-linux-gnu; regtesting is in progress.

BTW: Could someone commit this patch to mainline if OK?

2005-07-20  Uros Bizjak  <uros@kss-loka.si>

	PR target/21149
	* config/i386/i386.md (sse_movhlps): Fix vec_select values.

Uros.


Attachment: pr21149.diff
Description: Binary data

