Patch to projects.html: glibc's string function macros

Joseph S. Myers
Sat Oct 28 16:23:00 GMT 2000

This patch to projects.html documents the architecture-independent
optimizations in glibc's <bits/string2.h> macros as possible projects for
implementing in GCC.  In particular, it would be desirable if the memset,
strcmp and strcpy optimizations could be implemented in GCC, as those
macros in glibc have huge expansions, and could be disabled if GCC did all
that glibc tries to do.  OK to commit?

--- projects.html	Fri Oct 20 16:07:07 2000
+++ projectsnew.html	Sat Oct 28 22:56:10 2000
@@ -79,6 +79,126 @@
 <p>Contact <a href=" "></a> if
 you're interested in working on lazy code motion.</p>

+<h2>Better builtin string functions</h2>
+<p>GNU libc includes some macros to optimize calls to some string
+functions with constant arguments.  These macros tend to cause huge
+blowup in the size of preprocessed source if nested; for example, each
+nested call to <code>strcpy</code> expands the source 20-fold, with
+four nested calls having an expansion ten megabytes in size.  GCC then
+consumes a huge amount of memory compiling such expressions.  Many of
+the optimizations to ISO C string functions could be implemented in
+GCC and then disabled in glibc, with benefits to other systems as
+well, and the potential to use information GCC has about
+<p>All the string functions act as if they access individual
+characters, so care may need to be taken that no
+<code>-fstrict-aliasing</code> problems occur when internal uses of
+other types are generated.</p>
+<p>Care must be taken that any optimizations in GCC are
+standards-conforming in terms of not possibly accessing beyond the
+arrays involved (possibly within a function call created in the
+optimization); whereas the glibc macros know the glibc implementation
+and how much memory it might access, GCC optimizations can't.</p>
+<p>There are some further optimizations in glibc not covered here,
+that either optimize calls to non-ISO C functions or call glibc
+internal functions in the expansion of the macro.</p>
+<p>glibc also has inline assembler versions of various string
+functions; GCC has some, but not necessarily the same ones on the same
+<p>Many of these optimizations should not be applied if
+<code>-Os</code> is specified.</p>
+<li>GCC optimizes <code>memset</code> only when the value and length
+involved have no side effects, and the value is a constant zero.  On
+architectures allowing unaligned accesses, glibc also optimizes if the
+length is constant and does not exceed 16; the value to which memory
+is set need not be constant.  In such a case, GCC should convert the
+value (constant or not) to <code>unsigned char</code>, maybe using
+<code>save_expr</code> to protect against multiple evaluation, and
+compute and store the values <code>c * 0x01</code>, <code>c *
+0x0101</code>, and so on (whichever are needed for the given length),
+to store as single words then as part of a word.  Where the
+architecture does not support unaligned accesses, GCC may still be
+able to optimize if the destination is known to be suitably aligned.
+For example, no alignment is needed for a set of one byte; a set of
+zero bytes should be eliminated in all cases; if two bytes are to be
+set, two byte alignment may suffice; and so on.</li>
+<li>GCC converts <code>strcpy</code> from a string constant into a
+<code>memcpy</code> with the known length of that string constant.  In
+turn, <code>emit_block_move</code> may expand that <code>memcpy</code>
+inline.  glibc includes optimizations to load the constant contents of
+the string directly rather than via a memory copy, when no more than 8
+bytes need copying.  <code>emit_block_move</code> could be taught to
+find the contents of the string constant and include the appropriate
+integers directly in the assembler output for the copy.</li>
+<li>Similarly, glibc optimizes <code>strncpy</code> from a string
+constant.  Where the maximum length to be copied is not more than the
+length of the string constant including the terminating null
+character, GCC could (but does not) optimize as a
+<code>memcpy</code>.  Where the maximum length to be copied is
+greater, trailing null characters must be added to make up the length;
+here glibc can use a micro-optimization not available to GCC, its
+<code>__mempcpy</code> function to copy and return the address after
+the data copied, to pass into <code>memset</code>.  This is only used
+when as inline assembler <code>__mempcpy</code> is available.</li>
+<li>glibc optimizes <code>strncat</code> with constant source and
+maximum length.  If the maximum length exceeds the length of the
+source string, it is optimized to <code>strcat</code>; GCC could do
+this (including when the maximum length equals the length of the
+source string).  Otherwise, on architectures where glibc has an
+inline assembler <code>strchr</code> and it is used, glibc converts
+the <code>strncat</code> to a <code>memcpy</code>, but fails to add
+the terminating null character.</li>
+<li>glibc has various complicated optimizations for
+<code>strcmp</code>.  The following in GCC would completely cover
+them: comparisons of constant strings should be done at compile time;
+comparisons where one string is constant and of length less than 4
+should be inlined to compare successive bytes to the known constant
+ones (further optimization to compare more than one byte at once is in
+general unsafe as it might access too much memory).  Certain of these
+cases could also be inlined for <code>memcmp</code>.</li>
+<li>glibc optimizes <code>strncmp</code>, where one string is constant
+and strictly shorter than the maximum number of characters to be
+compared, by replacing it with <code>strcmp</code> (which may then be
+further optimized).  This may not be safe for GCC except in the cases
+where it handles <code>strcmp</code> internally as above, since
+<code>strcmp</code> may require both strings to be null-terminated
+whereas <code>strncmp</code> doesn't.</li>
+<li>glibc optimizes <code>strcspn</code> where the string of excluded
+characters is constant and of length not greater than three.  GCC
+could readily optimize the specific case where the length is zero,
+converting it to <code>strlen</code>.</li>
+<li>Similarly, glibc optimizes <code>strspn</code> where not more than
+three characters are involved.  Where there are zero characters
+acceptable in the initial segment, GCC could optimize to the integer
+constant 0.</li>
+<li>glibc also has similar optimizations for <code>strpbrk</code>.
+The cases of finding the first occurrence of zero characters (return a
+null pointer) or one character (convert to <code>strchr</code>) could
+readily be optimized by GCC.</li>
+<li>glibc optimizes calls to <code>strstr</code> searching for a
+string of length zero (return the string in which the search is made)
+or one (use <code>strchr</code>).</li>
 <h2>Format (<code>printf</code>, <code>scanf</code> and
 <code>strftime</code>) checking:</h2>

Joseph S. Myers

More information about the Gcc-patches mailing list