This is the mail archive of the
mailing list for the GCC project.
Re: strlen optimizations based on whether stpcpy is declared?
On 10/02/2017 09:06 AM, Jakub Jelinek wrote:
> On Mon, Oct 02, 2017 at 09:00:41AM -0600, Martin Sebor wrote:
>> Thanks. That makes sense to me. The wrinkle with this approach
>> is that the same code (same function) has different effects on
>> the compiler (as in, is subject to different optimization
>> decisions, or can cause false positives/negatives) depending
>> whether some unrelated code (in another function) calls
>> __builtin_stpcpy or calls (and declares) stpcpy, or does neither.
>> This is probably not very common in application programs but it
>> does happen often in the GCC test suite (this is the second time
>> I've been bitten by it in just a few months).
>
> Why is that a problem? In most user code, people just
> #include <string.h> or #include <cstring> and depending on feature
> test macros, either stpcpy is available, or not.
> For GCC testsuite the tests that specially test for these transformations
> have or intentionally don't have the stpcpy prototype available.

It's a gotcha for those writing GCC tests who are unaware of this
subtlety. Some tests that exercise both built-in functions define
macros to call them:
#define stpcpy __builtin_stpcpy
#define strcpy __builtin_strcpy
Other tests declare them:
extern char* stpcpy (char*, const char*);
extern char* strcpy (char*, const char*);
Other tests still exercise one function at a time. As I said,
it's surprising when the tests have different effects even though
the calls to these functions are otherwise identical, because for
other standard functions they behave the same. I spent close to
an hour the other day debugging two of my tests side by side
trying to understand why they were behaving differently before
it dawned on me that the cause was in the strlen pass.
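
To make the gotcha concrete, here is a minimal sketch of a test file in which the function under test stays identical while an unrelated declaration is toggled. The DECLARE_STPCPY switch and the file layout are hypothetical, made up for this illustration; they are not taken from any existing GCC test.

```c
#include <stddef.h>

extern char *strcpy (char *, const char *);
extern size_t strlen (const char *);

#ifdef DECLARE_STPCPY
/* Hypothetical switch: with this declaration visible, the strlen pass
   may treat stpcpy as available for the whole translation unit.  */
extern char *stpcpy (char *, const char *);
#endif

/* Identical in both configurations, yet it may be optimized
   differently depending on the declaration above.  */
size_t f (char *d, const char *s)
{
  strcpy (d, s);
  return strlen (d);
}
```

Building it twice, for example with gcc -O2 -S and again with -DDECLARE_STPCPY added, and comparing the assembly for f should show whether the declaration alone changes the result on a given GCC version.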

>> IIUC, ideally, the decision whether or not to make
>> the transformation would be based on whether stpcpy is called
>> by the function on the result of a prior strcpy/strcat. A less
>
> I don't understand this suggestion. Usually there is no stpcpy call
> anywhere, we still want to make the transformation if we can assume
> the library provides it. So you'd penalize a lot of code for no benefit.

Ah, okay I get it now. After re-reading some of the comments
in the file and some more testing I see the pass transforms into
stpcpy every call to strcpy whose source length is unknown and
whose destination length is later needed. It does that because
the latter length can be computed more efficiently by subtracting
the first argument from the stpcpy return value.
And the decision whether or not to make use of stpcpy is based
on the presence of its declaration.
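
As a rough illustration of that transformation, here is a hand-written C equivalent, not the pass's actual output. The function names are made up for the example, and stpcpy is declared explicitly on the assumption that the C library provides it (it is a POSIX function, not an ISO C one).

```c
#include <stddef.h>

/* Declared by hand, the way some GCC tests do it.  */
extern char *strcpy (char *, const char *);
extern char *stpcpy (char *, const char *);
extern size_t strlen (const char *);

/* What the source says: copy, then measure the destination.  */
size_t copy_then_strlen (char *d, const char *s)
{
  strcpy (d, s);
  return strlen (d);
}

/* Roughly what it can become: stpcpy returns a pointer to the
   terminating NUL it wrote into d, so the length of d is that
   pointer minus d and no second scan of the string is needed.  */
size_t copy_with_stpcpy (char *d, const char *s)
{
  return (size_t) (stpcpy (d, s) - d);
}
```

Both functions return the same value for the same arguments; the second simply avoids walking the destination a second time.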
I also take back what I said about application programs being
unaffected by this. Using the declaration to make these decisions
results in less optimal code when compiling in strict conformance
mode (e.g., -std=c11 or -std=c++14) than in "relaxed mode" (-std=gnu11
or -std=gnu++14). This can be seen using the following test:
  void f (char *d, const char *s)
  {
    strcpy (d, s);
    if (__builtin_strlen (d) != __builtin_strlen (s))
      __builtin_abort ();   /* branch body lost in the archive; abort assumed */
  }
I understand this is because, as you said, in strict mode, stpcpy
could be declared to be a different symbol. After our discussion
I will (hopefully) remember this and avoid getting surprised by
it in the future. But it still feels like a subtlety that should
be more prominently advertised somehow/somewhere to help others
avoid falling into the same trap.
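
For anyone who wants the stpcpy declaration to be visible even under -std=c11, one possible approach is a feature test macro. This is a sketch under the assumption of a glibc-like C library that declares stpcpy in <string.h> when _POSIX_C_SOURCE is at least 200809L; the function name is hypothetical, and whether the strlen pass then makes the transformation should be checked against the GCC version in use.

```c
/* Assumption: a glibc-like libc. The macro has to be defined before
   the first #include for it to take effect.  */
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* No stpcpy call appears here. The point is only that its prototype
   is now in scope, which is what the decision described above is
   based on.  */
size_t copy_and_measure (char *d, const char *s)
{
  strcpy (d, s);
  return strlen (d);
}
```

Comparing the output of gcc -std=c11 -O2 -S with and without the #define line is one way to see whether the declaration makes a difference.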