This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Fix PR 18503
- From: Uros Bizjak <uros at kss-loka dot si>
- To: gcc-patches at gcc dot gnu dot org
- Date: Wed, 17 Nov 2004 10:54:38 +0100
- Subject: [PATCH] Fix PR 18503
Hello!
According to
http://www.intel.com/software/products/compilers/clin/docs/ug_cpp/comm1030.htm,
the "sse_movss" and "sse2_movsd" patterns have the wrong vec_merge selector
bitmask.
2004-11-17 Uros Bizjak <uros@kss-loka.si>
PR target/18503
* config/i386/i386.md (sse_movss, sse2_movsd):
Fix wrong vec_merge selector bitmask.
Testcase from PR18503 produces wrong results with current mainline when
optimization is enabled:
(sse float version):
=============
#include <xmmintrin.h>
#include <stdio.h>
__m128 bug(__m128 a, __m128 b) {
  __m128 c = _mm_sub_ps(a, b);
  return _mm_move_ss(c, a);
}
int main(void) {
  float val1 = 1.3f, val2 = 2.1f, result[4];
  __m128 error = bug(_mm_load1_ps(&val1), _mm_load1_ps(&val2));
  _mm_storeu_ps(result, error);
  printf("%f %f %f %f\n", result[0], result[1], result[2], result[3]);
  return 0;
}
(sse2 double version):
===============
#include <emmintrin.h>
#include <stdio.h>
__m128d bug(__m128d a, __m128d b) {
  __m128d c = _mm_sub_pd(a, b);
  return _mm_move_sd(c, a);
}
int main(void) {
  double val1 = 1.3, val2 = 2.1, result[2];
  __m128d error = bug(_mm_load1_pd(&val1), _mm_load1_pd(&val2));
  _mm_storeu_pd(result, error);
  printf("%f %f\n", result[0], result[1]);
  return 0;
}
The problem is that the vec_merge selector bitmask is wrong, and when
optimizing, the whole function is wrongly combined into a single
"vmsubv4sf3" pattern.
Bootstrapped on i686-pc-linux-gnu, regtest for c, c++ in progress.
Uros.
Index: i386.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.md,v
retrieving revision 1.562
diff -u -p -r1.562 i386.md
--- i386.md 18 Oct 2004 13:01:31 -0000 1.562
+++ i386.md 17 Nov 2004 09:37:27 -0000
@@ -20817,7 +20817,7 @@
(vec_merge:V4SF
(match_operand:V4SF 1 "register_operand" "0")
(match_operand:V4SF 2 "register_operand" "x")
- (const_int 1)))]
+ (const_int 14)))]
"TARGET_SSE"
"movss\t{%2, %0|%0, %2}"
[(set_attr "type" "ssemov")
@@ -24279,7 +24279,7 @@
(vec_merge:V2DF
(match_operand:V2DF 1 "nonimmediate_operand" "0,0,0")
(match_operand:V2DF 2 "nonimmediate_operand" "x,m,x")
- (const_int 1)))]
+ (const_int 2)))]
"TARGET_SSE2 && ix86_binary_operator_ok (UNKNOWN, V2DFmode, operands)"
"@movsd\t{%2, %0|%0, %2}
movlpd\t{%2, %0|%0, %2}