Summary: | Not replacing __builtin___memcpy_chk() as documented | ||
---|---|---|---|
Product: | gcc | Reporter: | Jeff Davis <pgsql> |
Component: | middle-end | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | jakub |
Priority: | P3 | ||
Version: | 9.2.1 | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Known to work: | ||
Known to fail: | Last reconfirmed: | ||
Attachments: | Example 1, Example 2, Example 3 |
Description
Jeff Davis
2020-06-06 01:22:38 UTC
Created attachment 48687
Example 1
Created attachment 48688
Example 3
Another example that works (i.e. the builtin is properly replaced by memcpy, as described in the documentation).
The only difference between this working example and the failing example2.c is that I replaced the sizeof() with a constant.
Original larger case was discovered in PostgreSQL: https://www.postgresql.org/message-id/99b2eab335c1592c925d8143979c8e9e81e1575f.camel@j-davis.com

It is unclear what you are complaining about.

for i in gcc-7 gcc-8 gcc-9 gcc-10 gcc; do echo $i; for j in 1 2 3; do /usr/src/$i/obj/gcc/cc1 -quiet -O2 pr95556-$j.c; done; grep 'memcpy\|rep.movs' pr95556-*.s; done
gcc-7
pr95556-1.s: rep movsq
pr95556-2.s: call memcpy
pr95556-3.s: call memcpy
gcc-8
pr95556-1.s: rep movsq
pr95556-2.s: call memcpy
pr95556-3.s: call memcpy
gcc-9
pr95556-1.s: rep movsq
pr95556-2.s: call memcpy
pr95556-3.s: call memcpy
gcc-10
pr95556-1.s: rep movsq
pr95556-2.s: rep movsq
pr95556-3.s: call memcpy
gcc
pr95556-1.s: rep movsq
pr95556-2.s: rep movsq
pr95556-3.s: call memcpy

There are no __memcpy_chk calls, which means GCC did in all cases what is documented: it replaced the __builtin___memcpy_chk calls with the corresponding __builtin_memcpy calls and handled those as usual. That isn't always a library call; there are many different ways a builtin memcpy can be expanded, and one can fine-tune that through various command-line options. It depends on what CPU the code is tuned for, whether it is considered hot or cold code, whether the size is constant (and which constant) or variable, and what alignment guarantees the destination and source have.

And note that
- if (lt->pos >= (8192-sizeof(S)))
+ if (lt->pos >= (8192-16))
is not an insignificant change: the first one is an unsigned comparison, the second one signed.

See the -mno-align-stringops, -minline-all-stringops, -minline-stringops-dynamically, -mstringop-strategy= and -mmemcpy-strategy= options and their documentation in the GCC manual.

"...built-in functions are optimized into the normal string functions like memcpy if the last argument is (size_t) -1..."

My reading of the document led me to believe that a last argument of -1 *would* be a normal library call. And certainly should be with -fno-builtin-memcpy, right?
If that's not what's happening, should the document be clarified?

(In reply to Jeff Davis from comment #7)
> "...built-in functions are optimized into the normal string functions like
> memcpy if the last argument is (size_t) -1..."
>
> My reading of the document led me to believe that a last argument of -1
> *would* be a normal library call. And certainly should be with
> -fno-builtin-memcpy, right?

No. -fno-builtin-memcpy only disables the special behavior when one uses memcpy; when one uses __builtin_memcpy, it always behaves as a builtin. And you are using __builtin___memcpy_chk, which is also a builtin and thus not affected by -fno-builtin*. You can use -fno-builtin-__memcpy_chk, but then you'll get __memcpy_chk calls if you call it that way.

As I wrote, if for whatever reason you want to use the library call (e.g. always), you can just use -mmemcpy-strategy=libcall:-1:1 or so, but then even very small copies will not be done inline, which is not really beneficial.

I still feel like the documentation is misleading on this point. Regardless, it doesn't seem like you think there is any bug here, so go ahead and close.