Created attachment 47643 [details] reduced testcase Output: $ x86_64-pc-linux-gnu-gcc -O testcase.c $ ./a.out Aborted # testcase.c:6: char d[2] = { 0x00, 0x11 }; mov WORD PTR [rsp+14], 4352 # d, # testcase.c:7: __builtin_strncpy (&b[2], d, 2); mov edx, 1 #, lea rsi, [rsp+15] # tmp102, mov edi, OFFSET FLAT:b+3 #, call strncpy # The code copies b[3] = d[1] // == 0x11 It should have done: b[2] = d[0] // later overwritten by the second strncpy b[3] = 0 // strncpy zeroes all bytes in dest up to 'n' $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-280156-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-280156-checking-yes-rtl-df-extra-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 10.0.0 20200111 (experimental) (GCC) The PR93213 patch doesn't seem to fix this issue.
Confirmed. The DSE pass transforms the first strncpy call as shown in the dump below. Since the second element of the source string d is non-zero it writes its value into the destination at b[3], rather that storing two nuls there as the original code does. I can reproduce it at all optimization levels and with -fno-optimize-strlen. $ gcc -O -S -Wall -fdump-tree-dse1-details=/dev/stdout pr93249.c ;; Function main (main, funcdef_no=0, decl_uid=1932, cgraph_uid=1, symbol_order=2) Trimming statement (head = 1, tail = 0): __builtin_strncpy (&b[2], &d, 2); main () { char d[2]; char _1; char _2; char _3; char _4; <bb 2> : d = "\x00\x11"; __builtin_strncpy (&MEM <char> [(void *)&b + 3B], &MEM <char> [(void *)&d + 1B], 1); __builtin_strncpy (&b[1], &a, 2);
Bisection points to Jeff's r273606: r273606 | law | 2019-07-19 13:04:51 -0400 (Fri, 19 Jul 2019) | 6 lines * tree-ssa-dse.c (initialize_ao_ref_for_dse): Handle strncpy. Drop some trivial dead code. (maybe_trim_memstar_call): Handle strncpy.
Indeed, which means that either we only do tail trimming and not head trimming for strncpy*, or allow head trimming for strncpy* as well, but only if the source argument can be proven not to contain any zeros. The former change would be trivial, the latter might need slightly more work, will look at it.
Created attachment 47650 [details] gcc10-pr93249.patch Untested fix.
"Proven not to contain any zeros in the first N bytes" where N is the source offset in strncpy, would suggest the strlen pass might be more suitable for this transformation than DSE.
(In reply to Martin Sebor from comment #5) > "Proven not to contain any zeros in the first N bytes" where N is the source > offset in strncpy, would suggest the strlen pass might be more suitable for > this transformation than DSE. That is not really possible, because the change it is doing heavily relies on the DSE pass infrastructure as well as being called from within it. With get_range_strlen, it can handle some cases, and will just avoid head trimming of strncpy otherwise, no big deal. Tail trimming and full trimming is still possible.
I think the expedient thing to do here is fix DSE in isolation. Note that I think the code in question is new, so if we have to xfail some tests, that wouldn't IMHO represent a regression. Integrating DSE & strlen would likely be painful and I don't think capturing the case envisioned in c#5 would likely be worth the effort involved. I wouldn't object to opening a missed-optimization bug, but I don't think we'd likely be prioritizing it anytime soon.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:623c6fddd605f8f225142d714440320e4ef54d84 commit r10-5961-g623c6fddd605f8f225142d714440320e4ef54d84 Author: Jakub Jelinek <jakub@redhat.com> Date: Wed Jan 15 01:28:43 2020 +0100 tree-optimization: Fix tree dse of strncpy PR93249 As the testcase shows, tail trimming of strncpy in tree-ssa-dse.c is fine, we just copy or clear fewer bytes in the destination, but unlike memcpy/memset etc., head trimming is problematic in certain cases. If we can prove that there are no zero bytes among initial head_trim bytes, it is ok to trim it, if we can prove there is at least one zero byte among initial head_trim bytes, we could (not implemented in the patch) turn the strncpy into memset 0, but otherwise we need to avoid the head trimming, because the presence or absence of NUL byte there changes the behavior for subsequent bytes, whether further bytes from src are copied or if further bytes are cleared. 2020-01-15 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/93249 * tree-ssa-dse.c: Include builtins.h and gimple-fold.h. (maybe_trim_memstar_call): Move head_trim and tail_trim vars to function body scope, reindent. For BUILTIN_IN_STRNCPY*, don't perform head trim unless we can prove there are no '\0' chars from the source among the first head_trim chars. * gcc.c-torture/execute/pr93249.c: New test.
Fixed.