This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug middle-end/78809] Inline strcmp with small constant strings


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809

--- Comment #23 from Qing Zhao <qing.zhao at oracle dot com> ---
I have an implementation of part C of this task in my private space:

 part C: for strcmp (s1, s2), strncmp (s1, s2, n):

      if the result is NOT used in a simple equality test against zero, one of
"s1" or "s2" is a small constant string, n is a constant, and the minimum of
the length of the constant string and "n" is smaller than a predefined
threshold T,
      inline the call as a byte-by-byte comparison sequence to avoid the call
overhead.
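
As an illustration of the kind of sequence described above, here is a
hand-written sketch of what a byte-by-byte expansion of strcmp (p, "hi") could
look like. It is not actual GCC output, and the function name inline_cmp_hi is
hypothetical. It only assumes what the C standard guarantees: the sign of the
result is that of the difference between the first pair of differing bytes,
compared as unsigned char.

```c
#include <string.h>

/* Hypothetical hand-written equivalent of strcmp (p, "hi") after a
   byte-by-byte expansion.  This is a sketch, not actual GCC output.  */
static int
inline_cmp_hi (const char *p)
{
  const unsigned char *s = (const unsigned char *) p;
  int diff;

  /* Compare byte 0.  If p is the empty string this already differs,
     so byte 1 is never read past the terminator.  */
  diff = s[0] - 'h';
  if (diff != 0)
    return diff;

  /* Compare byte 1.  Reaching here means s[0] == 'h', so s[1] is a
     valid byte of the string (possibly its terminating NUL).  */
  diff = s[1] - 'i';
  if (diff != 0)
    return diff;

  /* Compare against the terminating NUL of the constant string.  */
  return s[2] - '\0';
}
```

The early returns matter for correctness as well as speed: each byte of p is
read only after all earlier bytes have matched a non-NUL byte of the constant,
so the sequence never reads beyond the end of a shorter p.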


With this implementation, I was able to measure the performance impact of the
inlining transformation for different values of "n", where n is the length of
the string that needs to be compared, in order to settle the following two
questions:
    A. what the default value of n should be.
    B. on a platform that supports string-compare hardware insns (for example,
X86), which should be tried first for a call to strcmp/strncmp: the inline
expansion or the hardware insns?


On both aarch64 and X86, I used the following small test case for the
performance experiments:

qinzhao@gcc116:~/Bugs/78809/const_cmp$ cat t_p.c
#include <string.h>

char array[]= "fishiiiiiiiiiiiiiiiiiiiiiiiiiiiii";

#define NUM 1000000000
int __attribute__ ((noinline))
cmp2 (const char *p)
{
  int result = 0;
  int i;
  for (i=0; i< NUM; i++) {
    result |=  strcmp (p, "fishiiiii"); 
  }
  return result;
}

int result = 0;

int main()
{
  for (int i = 0; i < 25; i++)
     result += cmp2 (array);
  return 0;
}

The options I used were -O -fno-tree-loop-im, plus the corresponding option
to enable or disable the added inlining. The performance results are as
follows:

aarch64  strcmp

n=              3       4       5       6       10      20

inline          31      41      62      72      114     242
no-inline       229     229     229     229     272     333

aarch64  strncmp

n=              3       4       5       6       10      20

inline          41      62      62      76      125     250
no-inline       291     291     291     291     364     427


X86  strcmp

n=                      4       5       6       10      20

inline                  21      25      31      42      163
no-inline               445     461     488     529     672


X86  strncmp

n=                      4       5       6       10      20

inline                  21      25      28      43      77
no-inline               412     435     442     495     638

From the above, we can see:
