[Bug middle-end/85980] New: suboptimal code for strncmp for powerpc64

msebor at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue May 29 18:17:00 GMT 2018


            Bug ID: 85980
           Summary: suboptimal code for strncmp for powerpc64
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

As discussed in https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01514.html, for
the following test case and the powerpc64le target, GCC emits the code below:

int f (__SIZE_TYPE__ i)
{
  return __builtin_strncmp ("1234", "123", i < 3 ? i : 3);
}

0:      addis 2,12,.TOC.-.LCF0@ha
        addi 2,2,.TOC.-.LCF0@l
        .localentry     f,.-f
        mflr 0
        cmpldi 7,3,3
        li 5,3
        std 0,16(1)
        stdu 1,-32(1)
        .cfi_def_cfa_offset 32
        .cfi_offset 65, 16
        bgt 7,.L5
        cmpdi 7,3,4                  << unnecessary
        mr 5,3                       << 
        ble 7,.L5                    <<
        li 5,4                       <<
        addis 4,2,.LC1@toc@ha
        addis 3,2,.LC0@toc@ha
        addi 4,4,.LC1@toc@l
        addi 3,3,.LC0@toc@l
        bl strncmp
        addi 1,1,32
        .cfi_def_cfa_offset 0
        ld 0,16(1)
        mtlr 0
        .cfi_restore 65

The second comparison and the branch that follows it (marked above) are
helpful when strncmp is expanded inline, but they do not benefit the library
version of strncmp and only bloat and slow down the caller.  (The origins of
the code are tracked down in
https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01406.html.)  The following
simple patch is enough to improve the generated code:

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 841c1ef..5b9085b 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -4708,12 +4708,7 @@ expand_builtin_strncmp (tree exp, ATTRIBUTE_UNUSED rtx
       return target;

-  /* Expand the library call ourselves using a stabilized argument
-     list to avoid re-evaluating the function's arguments twice.  */
-  tree fn = build_call_nofold_loc (loc, fndecl, 3, arg1, arg2, len);
-  gcc_assert (TREE_CODE (fn) == CALL_EXPR);
-  return expand_call (fn, target, target == const0_rtx);
+  return expand_call (exp, target, target == const0_rtx);

 /* Expand a call to __builtin_saveregs, generating the result in TARGET,
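
As a rough cross-check of the reasoning above, the sketch below models the
length computation in plain C and tests that the extra clamp changes nothing
for the library call.  The function names are made up for illustration, and
the constant 4 is assumed to be the size of the shorter string including its
terminating NUL; neither comes from GCC itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length as written in the test case: min (i, 3).  */
static size_t
len_from_source (size_t i)
{
  return i < 3 ? i : 3;
}

/* Length as the listing appears to compute it: the same minimum,
   followed by a second clamp to 4 (assumed to be sizeof "123").  */
static size_t
len_from_listing (size_t i)
{
  if (i > 3)
    return 3;
  /* Mirrors the marked instructions; i <= 3 here, so the clamp to 4
     can never take effect.  */
  return i <= 4 ? i : 4;
}

/* Reduce a strncmp result to -1, 0 or 1, since only the sign of the
   result is specified.  */
static int
sign (int x)
{
  return (x > 0) - (x < 0);
}

/* Return 1 if clamping N to 4 leaves the sign of the library strncmp
   result for "1234" vs "123" unchanged.  */
static int
clamp_is_redundant (size_t n)
{
  const char *a = "1234", *b = "123";
  size_t clamped = n < 4 ? n : 4;
  return sign (strncmp (a, b, n)) == sign (strncmp (a, b, clamped));
}

/* Check both properties for a range of small lengths.  */
static void
check_lengths (void)
{
  for (size_t i = 0; i < 64; i++)
    {
      assert (len_from_source (i) == len_from_listing (i));
      assert (clamp_is_redundant (i));
    }
}
```

Calling check_lengths () from a small main is expected to pass silently:
the library strncmp stops at the first NUL on its own, so a caller gains
nothing by clamping the length beforehand.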
