This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Better instruction choice for cvtt*
- From: Jan Hubicka <jh at suse dot cz>
- To: gcc-patches at gcc dot gnu dot org, rth at cygnus dot com
- Date: Thu, 6 Mar 2003 16:28:28 +0100
- Subject: Better instruction choice for cvtt*
Hi,
On K8, the mem->reg forms of the SSE->int conversions are vector decoded,
so it is better to split them into a load followed by a reg->reg
conversion. This patch does so only for the scalar patterns, not for the
builtins, as I think the builtin patterns are wrong. For instance:
(define_insn "fix_truncdfdi_sse"
  [(set (match_operand:DI 0 "register_operand" "=r,r")
        (fix:DI (match_operand:DF 1 "nonimmediate_operand" "Y,Ym")))]
  "TARGET_64BIT && TARGET_SSE2"
  "cvttsd2si{q}\t{%1, %0|%0, %1}"
  [(set_attr "type" "sseicvt,sseicvt")
   (set_attr "athlon_decode" "double,vector")])

(define_insn "cvttsd2si"
  [(set (match_operand:SI 0 "register_operand" "=r,r")
        (unspec:SI [(vec_select:DF (match_operand:V2DF 1 "register_operand" "x,xm")
                                   (parallel [(const_int 0)]))] UNSPEC_FIX))]
  "TARGET_SSE2"
  "cvttsd2si\t{%1, %0|%0, %1}"
  [(set_attr "type" "sseicvt")
   (set_attr "mode" "SI")
   (set_attr "athlon_decode" "double,vector")])

(define_insn "cvtsd2si"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (fix:SI (vec_select:DF (match_operand:V2DF 1 "register_operand" "xm")
                               (parallel [(const_int 0)]))))]
  "TARGET_SSE2"
  "cvtsd2si\t{%1, %0|%0, %1}"
  [(set_attr "type" "sseicvt")
   (set_attr "mode" "SI")])
I guess it should be reversed: cvtsd2si should use the unspec and cvttsd2si
should not. Or am I missing something?
Anyway, this is for a followup patch. Is the current patch OK?
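As background for the fix vs. unspec question, here is a minimal C sketch that is not part of the patch; the function names are hypothetical. A C cast from floating point to int truncates toward zero, which is the operation cvttsd2si and cvttss2si perform, whereas cvtsd2si rounds according to the current MXCSR rounding mode.

```c
#include <assert.h>

/* A C cast from double to int truncates toward zero.  On SSE2 targets
   this is the operation that cvttsd2si implements.  */
int trunc_double_to_int(double d)
{
    return (int) d;
}

/* Same for float; the matching SSE instruction is cvttss2si.  */
int trunc_float_to_int(float f)
{
    return (int) f;
}

/* Hypothetical self-check: truncation drops the fraction and does not
   round to nearest, for positive and negative inputs alike.  */
void check_truncation(void)
{
    assert(trunc_double_to_int(2.7) == 2);
    assert(trunc_double_to_int(-2.7) == -2);
    assert(trunc_float_to_int(-0.5f) == 0);
}
```

Whether a given compiler actually emits the cvtt* instructions for these functions depends on the target and options; the sketch only shows the arithmetic.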
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O2 -march=k8" } */
/* { dg-final { scan-assembler "cvttsd2si.*xmm" } } */
/* { dg-final { scan-assembler "cvttss2si.*xmm" } } */
int a,a1;
double b;
float b1;
t()
{
  a=b;
  a1=b1;
}
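What the peephole does to conversions like the ones in this testcase can be pictured at the C level. This is only a sketch under stated assumptions: the names are hypothetical, and a compiler is free to generate identical code for both functions. The first function corresponds to the mem->reg form of the conversion, the second to the split form, where the value is first loaded into a scratch and then converted reg->reg.

```c
#include <assert.h>

/* Hypothetical global, standing in for a value that lives in memory.  */
double value_in_memory = 3.9;

/* Direct form: conceptually a single conversion with a memory operand
   (the vector decoded form on K8).  */
int convert_from_memory(const double *p)
{
    return (int) *p;
}

/* Split form: load into a temporary first (the match_scratch in the
   peephole), then convert from the temporary.  */
int convert_via_scratch(const double *p)
{
    double scratch = *p;      /* (set (match_dup 2) (match_dup 1)) */
    return (int) scratch;     /* (set (match_dup 0) (fix (match_dup 2))) */
}

/* Hypothetical self-check: the split must not change the result.  */
void check_split(void)
{
    assert(convert_from_memory(&value_in_memory)
           == convert_via_scratch(&value_in_memory));
}
```

The two forms compute the same value; the split only changes which instruction forms get used.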
Thu Mar 6 16:22:38 CET 2003 Jan Hubicka <jh at suse dot cz>
* i386.md (cvtts?2si peep2): New.
Index: i386.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.md,v
retrieving revision 1.445
diff -c -3 -p -r1.445 i386.md
*** i386.md 25 Feb 2003 11:39:20 -0000 1.445
--- i386.md 6 Mar 2003 15:22:26 -0000
***************
*** 4519,4524 ****
--- 4519,4534 ----
[(set_attr "type" "sseicvt")
(set_attr "athlon_decode" "double,vector")])
+ ;; Avoid vector decoded form of the instruction.
+ (define_peephole2
+ [(match_scratch:SF 2 "x")
+ (set (match_operand:DI 0 "register_operand" "")
+ (fix:DI (match_operand:SF 1 "nonimmediate_operand" "")))]
+ "TARGET_K8 && !optimize_size"
+ [(set (match_dup 2) (match_dup 1))
+ (set (match_dup 0) (fix:DI (match_dup 2)))]
+ "")
+
(define_insn "fix_truncdfdi_sse"
[(set (match_operand:DI 0 "register_operand" "=r,r")
(fix:DI (match_operand:DF 1 "nonimmediate_operand" "Y,Ym")))]
***************
*** 4527,4532 ****
--- 4537,4552 ----
[(set_attr "type" "sseicvt,sseicvt")
(set_attr "athlon_decode" "double,vector")])
+ ;; Avoid vector decoded form of the instruction.
+ (define_peephole2
+ [(match_scratch:DF 2 "Y")
+ (set (match_operand:DI 0 "register_operand" "")
+ (fix:DI (match_operand:DF 1 "nonimmediate_operand" "")))]
+ "TARGET_K8 && !optimize_size"
+ [(set (match_dup 2) (match_dup 1))
+ (set (match_dup 0) (fix:DI (match_dup 2)))]
+ "")
+
;; Signed conversion to SImode.
(define_expand "fix_truncxfsi2"
***************
*** 4630,4635 ****
--- 4650,4665 ----
[(set_attr "type" "sseicvt")
(set_attr "athlon_decode" "double,vector")])
+ ;; Avoid vector decoded form of the instruction.
+ (define_peephole2
+ [(match_scratch:SF 2 "x")
+ (set (match_operand:SI 0 "register_operand" "")
+ (fix:SI (match_operand:SF 1 "nonimmediate_operand" "")))]
+ "TARGET_K8 && !optimize_size"
+ [(set (match_dup 2) (match_dup 1))
+ (set (match_dup 0) (fix:SI (match_dup 2)))]
+ "")
+
(define_insn "fix_truncdfsi_sse"
[(set (match_operand:SI 0 "register_operand" "=r,r")
(fix:SI (match_operand:DF 1 "nonimmediate_operand" "Y,Ym")))]
***************
*** 4637,4642 ****
--- 4667,4682 ----
"cvttsd2si\t{%1, %0|%0, %1}"
[(set_attr "type" "sseicvt")
(set_attr "athlon_decode" "double,vector")])
+
+ ;; Avoid vector decoded form of the instruction.
+ (define_peephole2
+ [(match_scratch:DF 2 "Y")
+ (set (match_operand:SI 0 "register_operand" "")
+ (fix:SI (match_operand:DF 1 "nonimmediate_operand" "")))]
+ "TARGET_K8 && !optimize_size"
+ [(set (match_dup 2) (match_dup 1))
+ (set (match_dup 0) (fix:SI (match_dup 2)))]
+ "")
(define_split
[(set (match_operand:SI 0 "register_operand" "")