This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Optimize sprintf(buffer,"foo")


On Sun, Jun 22, 2003 at 04:23:47PM -0600, Roger Sayle wrote:
> The following patch optimizes both sprintf(buffer,"foo") and
> sprintf(dst,"%s",src) into calls to strcpy, and if the result
> of the sprintf is required, potentially an additional call to
> strlen.  The performance of strcpy+strlen is better than that
> of the equivalent sprintf on most platforms, especially as both
> strcpy/memcpy and strlen may get inlined, and the strlen often
> isn't required or can be determined to be a constant at compile
> time.

sprintf certainly has higher initial overhead, but for longer strings
the cost on most platforms is at least:
sprintf: initial_overhead + cost of going through the string once
strcpy+strlen: cost of going through the string twice

#define _GNU_SOURCE	/* for stpcpy */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (void)
{
  char buf1[8192], buf2[8192];
  unsigned long long a, b, c;
  int i, j;
  memset (buf2, 'A', 8191);
  buf2[8191] = 0;
  c = -1;
  for (i = 0; i < 100; ++i)
    {
      __asm__ __volatile__ ("rdtsc" : "=A" (a));
      j = sprintf (buf1, "%s", buf2);
      __asm__ __volatile__ ("rdtsc" : "=A" (b) : "r" (j));
      if (b - a < c)
        c = b - a;
    }
  printf ("sprintf %lld\n", c);
  c = -1;
  for (i = 0; i < 100; ++i)
    {
      __asm__ __volatile__ ("rdtsc" : "=A" (a));
      strcpy (buf1, buf2);
      j = strlen (buf2);
      __asm__ __volatile__ ("rdtsc" : "=A" (b) : "r" (j));
      if (b - a < c)
        c = b - a;
    }
  printf ("strcpy+strlen(buf2) %lld\n", c);
  c = -1;
  for (i = 0; i < 100; ++i)
    {
      __asm__ __volatile__ ("rdtsc" : "=A" (a));
      strcpy (buf1, buf2);
      j = strlen (buf1);
      __asm__ __volatile__ ("rdtsc" : "=A" (b) : "r" (j));
      if (b - a < c)
        c = b - a;
    }
  printf ("strcpy+strlen(buf1) %lld\n", c);
  c = -1;
  for (i = 0; i < 100; ++i)
    {
      __asm__ __volatile__ ("rdtsc" : "=A" (a));
      j = stpcpy (buf1, buf2) - buf1;
      __asm__ __volatile__ ("rdtsc" : "=A" (b) : "r" (j));
      if (b - a < c)
        c = b - a;
    }
  printf ("stpcpy+strlen %lld\n", c);
  exit (0);
}

gives on my PIII:
-O2
sprintf 24559
strcpy+strlen(buf2) 48952
strcpy+strlen(buf1) 48957
stpcpy+strlen 18503

-O2 -march=i686
sprintf 24512
strcpy+strlen(buf2) 24672
strcpy+strlen(buf1) 24689
stpcpy+strlen 18503

I think sprintf should only be optimized to strcpy+strlen if it is known
that the strlen will be a constant (of course, if the result is not used,
it can always be optimized to strcpy).
Then, if we have some way to find out at compile time that the target
platform provides stpcpy in libc (I don't know whether an explicit stpcpy
prototype would be enough or whether we need some #pragma or whatever),
i = sprintf (buf, "%s", string_where_length_is_not_known);
could be optimized into:
i = stpcpy (save_expr (buf), string_where_length_is_not_known) - save_expr (buf);

	Jakub

