This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[PATCH]: Fix pr target/21149

From: Uros Bizjak <uros dot bizjak at kss-loka dot si>
To: gcc-patches at gcc dot gnu dot org
Date: Wed, 20 Jul 2005 16:33:32 +0200
Subject: [PATCH]: Fix pr target/21149
Reply-to: ubizjak at gmail dot com

Hello!

This patch fixes target/21149: "invalid code generation for _mm_movehl_ps SSE
intrinsic".

According to the Intel docs, "movhlps A, B" performs this operation:

A = { a3 a2 a1 a0 }
B = { b3 b2 b1 b0 }

a3 a2 a1 a0 = a3 a2 b3 b2

where A = target and B = source.
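
As a rough illustration of the operation above, here is a scalar C model. It
is only a sketch: the function name model_movhlps is made up for this
example, and array index 0 is assumed to hold the lowest element (a0 / b0).

```c
#include <assert.h>

/* Scalar model of "movhlps A, B" (hypothetical helper, not GCC code).
   The two low elements of the target A are replaced by the two high
   elements of the source B; the two high elements of A are unchanged.
   Index 0 is assumed to be the lowest element. */
static void model_movhlps(float a[4], const float b[4])
{
    assert(a != 0 && b != 0);   /* defensive check on the arguments */
    a[0] = b[2];
    a[1] = b[3];
    /* a[2] and a[3] keep their old values */
}
```

With A = { a3 a2 a1 a0 } and B = { b3 b2 b1 b0 }, the model leaves A holding
{ a3 a2 b3 b2 }.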

The sse_movhlps pattern selects its elements from a vec_concat of op1 and
op2, so the intermediate vector is:

7  6  5  4  3  2  1  0
b3 b2 b1 b0 a3 a2 a1 a0
----op2---- ----op1----

The correct selector value should be

3  2  7  6

that is:

el (0)	  (parallel [(const_int 6)
el (1)		     (const_int 7)
el (2)		     (const_int 2)
el (3)		     (const_int 3)])))]

to get the resulting vector

3  2  1  0
a3 a2 b3 b2
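
The selection described above can be sketched in plain C. This is a model of
the concatenate-then-select idea only, not the code in i386.md; the helper
names concat8, select4 and movhlps_via_select are made up for this example.

```c
#include <assert.h>

/* Model of the vec_concat step: elements 0-3 come from op1,
   elements 4-7 come from op2. */
static void concat8(const float op1[4], const float op2[4], float out[8])
{
    for (int i = 0; i < 4; i++) {
        out[i] = op1[i];
        out[i + 4] = op2[i];
    }
}

/* Model of the vec_select step: result element i is v[sel[i]]. */
static void select4(const float v[8], const int sel[4], float out[4])
{
    for (int i = 0; i < 4; i++) {
        assert(sel[i] >= 0 && sel[i] < 8);  /* selector must index the 8-element vector */
        out[i] = v[sel[i]];
    }
}

/* Applies the selector from the patch: el(0)=6, el(1)=7, el(2)=2, el(3)=3. */
static void movhlps_via_select(const float a[4], const float b[4], float out[4])
{
    static const int sel[4] = { 6, 7, 2, 3 };
    float tmp[8];

    concat8(a, b, tmp);
    select4(tmp, sel, out);
}
```

Elements 6 and 7 of the intermediate vector are b2 and b3, and elements 2 and
3 are a2 and a3, so the result read from high to low is a3 a2 b3 b2.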

With the patch, the testcase from the PR, compiled with optimization, prints:
gcc -O2 -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]

The result is now the same as the result without optimization:
gcc -msse -fno-strict-aliasing pr21149.c
./a.out
*Y=[9 1 2 -3]
foo=[2 -3 0 0]
bar=[0 0 2 -3]

The patch is bootstrapped on i686-pc-linux-gnu; regtesting is in progress.

BTW: Could someone commit this patch to mainline if OK?

2005-07-20  Uros Bizjak  <uros@kss-loka.si>

	PR target/21149
	* config/i386/i386.md (sse_movhlps): Fix vec_select values.

Uros.

Attachment: pr21149.diff
Description: Binary data

Follow-Ups:
- Re: [PATCH]: Fix pr target/21149
  - From: Richard Henderson
