Bug 93249 - [10 Regression] wrong code with __builtin_strncpy() copying empty string
Summary: [10 Regression] wrong code with __builtin_strncpy() copying empty string
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 10.0
: P1 normal
Target Milestone: 10.0
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2020-01-13 10:38 UTC by Zdenek Sojka
Modified: 2020-01-15 00:35 UTC (History)
3 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target:
Build:
Known to work: 9.2.1
Known to fail: 10.0
Last reconfirmed: 2020-01-13 00:00:00


Attachments
reduced testcase (145 bytes, text/plain)
2020-01-13 10:38 UTC, Zdenek Sojka
Details
gcc10-pr93249.patch (1.71 KB, patch)
2020-01-14 11:31 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Zdenek Sojka 2020-01-13 10:38:23 UTC
Created attachment 47643 [details]
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -O testcase.c
$ ./a.out 
Aborted

# testcase.c:6:   char d[2] = { 0x00, 0x11 };
	mov	WORD PTR [rsp+14], 4352	# d,
# testcase.c:7:   __builtin_strncpy (&b[2], d, 2);
	mov	edx, 1	#,
	lea	rsi, [rsp+15]	# tmp102,
	mov	edi, OFFSET FLAT:b+3	#,
	call	strncpy	#

The code copies b[3] = d[1] // == 0x11

It should have done:
b[2] = d[0] // later overwritten by the second strncpy
b[3] = 0 // strncpy zeroes all bytes in dest up to 'n'


$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-280156-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-280156-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.0 20200111 (experimental) (GCC) 

The PR93213 patch doesn't seem to fix this issue.
Comment 1 Martin Sebor 2020-01-13 11:32:57 UTC
Confirmed.  The DSE pass transforms the first strncpy call as shown in the dump below.  Since the second element of the source string d is non-zero it writes its value into the destination at b[3], rather that storing two nuls there as the original code does.  I can reproduce it at all optimization levels and with -fno-optimize-strlen. 
 
$ gcc -O -S -Wall -fdump-tree-dse1-details=/dev/stdout pr93249.c

;; Function main (main, funcdef_no=0, decl_uid=1932, cgraph_uid=1, symbol_order=2)

  Trimming statement (head = 1, tail = 0): __builtin_strncpy (&b[2], &d, 2);

main ()
{
  char d[2];
  char _1;
  char _2;
  char _3;
  char _4;

  <bb 2> :
  d = "\x00\x11";
  __builtin_strncpy (&MEM <char> [(void *)&b + 3B], &MEM <char> [(void *)&d + 1B], 1);
  __builtin_strncpy (&b[1], &a, 2);
Comment 2 Martin Sebor 2020-01-13 11:40:05 UTC
Bisection points to Jeff's r273606:

r273606 | law | 2019-07-19 13:04:51 -0400 (Fri, 19 Jul 2019) | 6 lines

	* tree-ssa-dse.c (initialize_ao_ref_for_dse): Handle
	strncpy.  Drop some trivial dead code.
	(maybe_trim_memstar_call): Handle strncpy.
Comment 3 Jakub Jelinek 2020-01-13 21:36:08 UTC
Indeed, which means that either we only do tail trimming and not head trimming for strncpy*, or allow head trimming for strncpy* as well, but only if the source argument can be proven not to contain any zeros.  The former change would be trivial, the latter might need slightly more work, will look at it.
Comment 4 Jakub Jelinek 2020-01-14 11:31:04 UTC
Created attachment 47650 [details]
gcc10-pr93249.patch

Untested fix.
Comment 5 Martin Sebor 2020-01-14 11:38:28 UTC
"Proven not to contain any zeros in the first N bytes" where N is the source offset in strncpy, would suggest the strlen pass might be more suitable for this transformation than DSE.
Comment 6 Jakub Jelinek 2020-01-14 11:41:57 UTC
(In reply to Martin Sebor from comment #5)
> "Proven not to contain any zeros in the first N bytes" where N is the source
> offset in strncpy, would suggest the strlen pass might be more suitable for
> this transformation than DSE.

That is not really possible, because the change it is doing heavily relies on the DSE pass infrastructure as well as being called from within it.
With get_range_strlen, it can handle some cases, and will just avoid head trimming of strncpy otherwise, no big deal.  Tail trimming and full trimming is still possible.
Comment 7 Jeffrey A. Law 2020-01-14 21:46:46 UTC
I think the expedient thing to do here is fix DSE in isolation.  Note that I think the code in question is new, so if we have to xfail some tests, that wouldn't IMHO represent a regression.

Integrating DSE & strlen would likely be painful and I don't think capturing the case envisioned in c#5 would likely be worth the effort involved.  I wouldn't object to opening a missed-optimization bug, but I don't think we'd likely be prioritizing it anytime soon.
Comment 8 GCC Commits 2020-01-15 00:30:02 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:623c6fddd605f8f225142d714440320e4ef54d84

commit r10-5961-g623c6fddd605f8f225142d714440320e4ef54d84
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jan 15 01:28:43 2020 +0100

    tree-optimization: Fix tree dse of strncpy PR93249
    
    As the testcase shows, tail trimming of strncpy in tree-ssa-dse.c is fine,
    we just copy or clear fewer bytes in the destination, but unlike
    memcpy/memset etc., head trimming is problematic in certain cases.
    If we can prove that there are no zero bytes among initial head_trim bytes,
    it is ok to trim it, if we can prove there is at least one zero byte among
    initial head_trim bytes, we could (not implemented in the patch) turn
    the strncpy into memset 0, but otherwise we need to avoid the head trimming,
    because the presence or absence of NUL byte there changes the behavior for
    subsequent bytes, whether further bytes from src are copied or if further
    bytes are cleared.
    
    2020-01-15  Jakub Jelinek  <jakub@redhat.com>
    
    	PR tree-optimization/93249
    	* tree-ssa-dse.c: Include builtins.h and gimple-fold.h.
    	(maybe_trim_memstar_call): Move head_trim and tail_trim vars to
    	function body scope, reindent.  For BUILTIN_IN_STRNCPY*, don't
    	perform head trim unless we can prove there are no '\0' chars
    	from the source among the first head_trim chars.
    
    	* gcc.c-torture/execute/pr93249.c: New test.
Comment 9 Jakub Jelinek 2020-01-15 00:35:23 UTC
Fixed.