This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH][AArch64] Improve float to int moves
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: nd <nd at arm dot com>, James Greenhalgh <James dot Greenhalgh at arm dot com>, "Richard Earnshaw" <Richard dot Earnshaw at arm dot com>
- Date: Wed, 26 Apr 2017 12:39:51 +0000
- Subject: [PATCH][AArch64] Improve float to int moves
- Authentication-results: sourceware.org; auth=none
- Authentication-results: arm.com; dkim=none (message not signed) header.d=none;arm.com; dmarc=none action=none header.from=arm.com;
- Nodisclaimer: True
- Spamdiagnosticmetadata: NSPM
- Spamdiagnosticoutput: 1:99
Float to int moves currently generate inefficient code due to
hacks used in the movsi and movdi patterns. The 'r = w' variant
uses '*' which explicitly tells the register allocator to ignore it.
As a result float to int moves typically spill to the stack, which is
extremely inefficient. For example:
static inline unsigned asuint (float f)
{
union { float f; unsigned i; } u = {f};
return u.i;
}
float foo (float x)
{
unsigned i = asuint (x);
if (__builtin_expect (i > 42, 0))
return x*x;
return i;
}
generates:
sub sp, sp, #16
str s0, [sp, 12]
ldr w0, [sp, 12]
cmp w0, 42
bhi .L7
scvtf s0, w0
add sp, sp, 16
ret
.L7:
fmul s0, s0, s0
add sp, sp, 16
ret
Removing '*' from the variant generates:
fmov w0, s0
cmp w0, 42
bhi .L6
scvtf s0, w0
ret
.L6:
fmul s0, s0, s0
ret
Passes regress & bootstrap, OK for commit?
ChangeLog:
2017-04-26 Wilco Dijkstra <wdijkstr@arm.com>
* config/aarch64/aarch64.md (movsi_aarch64): Remove '*' from r=w.
(movdi_aarch64): Likewise.
--
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 51368e29f2d1fd12f48a972bd81a08589a720e07..d656e92e1ff02bdc90c824227ec3b2e1ccfe665a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1026,8 +1026,8 @@ (define_expand "mov<mode>"
)
(define_insn_and_split "*movsi_aarch64"
- [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w, r,*w")
- (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,m, m,rZ,*w,S,Ush,rZ,*w,*w"))]
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w,r,*w")
+ (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,m, m,rZ,*w,S,Ush,rZ,w,*w"))]
"(register_operand (operands[0], SImode)
|| aarch64_reg_or_zero (operands[1], SImode))"
"@
@@ -1058,8 +1058,8 @@ (define_insn_and_split "*movsi_aarch64"
)
(define_insn_and_split "*movdi_aarch64"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r, *w, r,*w,w")
- (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,m, m,rZ,*w,S,Ush,rZ,*w,*w,Dd"))]
+ [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r, *w,r,*w,w")
+ (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,m, m,rZ,*w,S,Ush,rZ,w,*w,Dd"))]
"(register_operand (operands[0], DImode)
|| aarch64_reg_or_zero (operands[1], DImode))"
"@