[Bug middle-end/85980] New: suboptimal code for strncmp for powerpc64
msebor at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue May 29 18:17:00 GMT 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85980
Bug ID: 85980
Summary: suboptimal code for strncmp for powerpc64
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: msebor at gcc dot gnu.org
Target Milestone: ---
As discussed in https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01514.html, for
the following test case and the powerpc64le target, GCC emits the code below:
int f (__SIZE_TYPE__ i)
{
  return __builtin_strncmp ("1234", "123", i < 3 ? i : 3);
}
f:
.LFB0:
.cfi_startproc
.LCF0:
0: addis 2,12,.TOC.-.LCF0@ha
addi 2,2,.TOC.-.LCF0@l
.localentry f,.-f
mflr 0
cmpldi 7,3,3
li 5,3
std 0,16(1)
stdu 1,-32(1)
.cfi_def_cfa_offset 32
.cfi_offset 65, 16
bgt 7,.L5
cmpdi 7,3,4 << unnecessary
mr 5,3 <<
ble 7,.L5 <<
li 5,4 <<
.L5:
addis 4,2,.LC1@toc@ha
addis 3,2,.LC0@toc@ha
addi 4,4,.LC1@toc@l
addi 3,3,.LC0@toc@l
bl strncmp
nop
addi 1,1,32
.cfi_def_cfa_offset 0
ld 0,16(1)
mtlr 0
.cfi_restore 65
blr
The comparison and the subsequent branch are helpful when strncmp is expanded
inline, but they do not benefit the call to the library version of strncmp;
there they only bloat and slow down the caller. (The origin of the code is
tracked down in https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01406.html.)
The following simple patch is enough to improve the generated code:
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 841c1ef..5b9085b 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -4708,12 +4708,7 @@ expand_builtin_strncmp (tree exp, ATTRIBUTE_UNUSED rtx target,
return target;
}
- /* Expand the library call ourselves using a stabilized argument
- list to avoid re-evaluating the function's arguments twice. */
- tree fn = build_call_nofold_loc (loc, fndecl, 3, arg1, arg2, len);
- gcc_assert (TREE_CODE (fn) == CALL_EXPR);
- CALL_EXPR_TAILCALL (fn) = CALL_EXPR_TAILCALL (exp);
- return expand_call (fn, target, target == const0_rtx);
+ return expand_call (exp, target, target == const0_rtx);
}
/* Expand a call to __builtin_saveregs, generating the result in TARGET,
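
To illustrate why the four instructions marked "<<" in the listing are dead
weight for the library call, here is a rough C model of the length computation.
It is only a sketch based on one reading of the assembly above: the names
len_source, len_emitted, compare and check_clamp_is_redundant are hypothetical
and do not appear in GCC. The extra bound of 4 can never take effect because
the value being clamped is already at most 3.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length exactly as written in the test case: MIN (i, 3).  */
static size_t len_source(size_t i)
{
    return i < 3 ? i : 3;
}

/* Hypothetical model of the length computed by the emitted code,
   following the instructions in the listing one by one.  */
static size_t len_emitted(size_t i)
{
    size_t n = 3;   /* li 5,3 */
    if (i > 3)      /* cmpldi 7,3,3 ; bgt 7,.L5 */
        return n;
    n = i;          /* mr 5,3 */
    if (i <= 4)     /* cmpdi 7,3,4 ; ble 7,.L5 */
        return n;
    return 4;       /* li 5,4 (never reached, since i <= 3 here) */
}

/* Same call as the test case, using the plain MIN (i, 3) length.  */
static int compare(size_t i)
{
    return strncmp("1234", "123", len_source(i));
}

/* Check that both computations agree for every i below LIMIT.  */
static void check_clamp_is_redundant(size_t limit)
{
    for (size_t i = 0; i < limit; i++)
        assert(len_emitted(i) == len_source(i));
}
```

Under this model the second comparison always takes the branch to .L5, so
dropping it does not change the length passed to strncmp.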