78408 – C loop initial declarations generate wrong code

Bug 78408 - C loop initial declarations generate wrong code

Summary: C loop initial declarations generate wrong code

Status:	RESOLVED DUPLICATE of bug 72747

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	c (show other bugs)
Version:	6.2.1

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	wrong-code

Depends on:
Blocks:

Reported:	2016-11-17 18:52 UTC by Nathaniel McCallum
Modified:	2016-12-16 15:58 UTC (History)
CC List:	2 users (show)

See Also:
Host:
Target:
Build:
Known to work:	7.0
Known to fail:	5.4.0, 6.2.1
Last reconfirmed:	2016-11-17 00:00:00

Attachments
simple test case (262 bytes, text/x-csrc) 2016-11-17 20:22 UTC, Nathaniel McCallum	Details
output assembly from the test case (491 bytes, text/plain) 2016-11-17 20:24 UTC, Nathaniel McCallum	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Nathaniel McCallum 2016-11-17 18:52:39 UTC

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/6.2.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.2.1 20160916 (Red Hat 6.2.1-2) (GCC)

Compiling the file uses this command line:

gcc -DPACKAGE_NAME=\"clevis\" -DPACKAGE_TARNAME=\"clevis\" -DPACKAGE_VERSION=\"1\" -DPACKAGE_STRING=\"clevis\ 1\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"clevis\" -DVERSION=\"1\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -I.    -Wall -Wextra -Werror -Wstrict-aliasing -Wchar-subscripts -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wpointer-arith -Wshadow -Wsign-compare -Wstrict-prototypes -Wtype-limits -Wno-missing-field-initializers -Wno-unused-parameter  -pthread -I/usr/include/udisks2 -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -DCLEVIS_CMD_DIR=/usr/libexec/clevis -g -O2 -MT clevis-luks-udisks2.o -MD -MP -MF $depbase.Tpo -c -o clevis-luks-udisks2.o clevis-luks-udisks2.c

The relevant code is here[1].

GCC sees that "pkt_t key" might go out of scope and therefore optimizes out the zeroing of the struct that occurs at the end of the for loop iteration. However, key doesn't go out of scope during loop iteration. Thus, on subsequent loops the struct is used presuming that it contains a zero value but it doesn't. This means that (what I take to be) perfectly valid ISO C behaves in strange ways.

In particular, the bug that arises is as follows. When load_jwe() fails[2], the (incorrectly) presumed zeroed key struct is sent[3] to the client. The end result is that an invalid key (the previous successful one) is sent when an empty UDP packet should be sent.

GCC should detect that "pkt_t key" is not going out of scope and should zero it as requested rather than invalidly optimizing out this statement. Clang does not have this bug, so there is at least one example of a compiler doing the right thing.

[1] - https://github.com/latchset/clevis/blob/6834b02b390771a97775dcfa26f85243f3a0195d/udisks2/clevis-luks-udisks2.c#L446
[2] - https://github.com/latchset/clevis/blob/6834b02b390771a97775dcfa26f85243f3a0195d/udisks2/clevis-luks-udisks2.c#L455
[3] - https://github.com/latchset/clevis/blob/6834b02b390771a97775dcfa26f85243f3a0195d/udisks2/clevis-luks-udisks2.c#L469

Comment 1 Marc Glisse 2016-11-17 19:43:49 UTC

Do you think you could produce a smaller, stand-alone testcase that reproduces the issue? Or at least attach the preprocessed sources to the report?

Comment 2 Nathaniel McCallum 2016-11-17 20:22:09 UTC

Created attachment 40076 [details]
simple test case

Compile with: gcc -o test test.c

This is a simple echo server. It *should* echo whatever the client types.  However, if you type "foo" the first time and then type ctrl-d the second time, the second reply is the same as the first reply. This is because the buffer was not properly zeroed.

Comment 3 Nathaniel McCallum 2016-11-17 20:24:19 UTC

Created attachment 40077 [details]
output assembly from the test case

This assembly was produced with: gcc -S test.c.

Comment 4 Markus Trippelsdorf 2016-11-17 20:46:48 UTC

Confirmed. Trunk is fine.

--- good        2016-11-17 21:45:22.671247332 +0100
+++ bad 2016-11-17 21:45:18.851324283 +0100
@@ -58,11 +58,6 @@
        call    write
        testq   %rax, %rax
        jle     .L10
-       leaq    -65536(%rbp), %rax
-       movl    $65520, %edx
-       movl    $0, %esi
-       movq    %rax, %rdi
-       call    memset
        leaq    -131056(%rbp), %rax
        leaq    -65536(%rbp), %rcx
        movl    $65520, %edx
@@ -86,5 +81,5 @@
        .cfi_endproc
 .LFE0:
        .size   main, .-main
-       .ident  "GCC: (GNU) 7.0.0 20161117 (experimental)"
+       .ident  "GCC: (GNU) 6.2.1 20161017"
        .section        .note.GNU-stack,"",@progbits

Comment 5 Markus Trippelsdorf 2016-11-17 21:07:01 UTC

This has nothing to do with "aggressive optimization" it is a bug in C front-end, I guess in the "loop initial declarations" implementation.

Comment 6 Marc Glisse 2016-11-17 21:15:07 UTC

I don't think this is related to the loop, a = b = (struct buf) {} loses the assignment to b even if you put it alone on its line.

Comment 7 jsm-csl@polyomino.org.uk 2016-11-17 21:28:51 UTC

I can't reproduce this with recent GCC 6 branch.  Possibly a duplicate of 
bug 72747?

Comment 8 Markus Trippelsdorf 2016-11-18 06:09:18 UTC

(In reply to joseph@codesourcery.com from comment #7)
> I can't reproduce this with recent GCC 6 branch.  Possibly a duplicate of 
> bug 72747?

Yes.

*** This bug has been marked as a duplicate of bug 72747 ***

Comment 9 Jakub Jelinek 2016-11-18 16:58:22 UTC

Note in the testcase a = (struct buf) {}, b = (struct buf) {} generates
significantly more efficient code than a = b = (struct buf) {} - the former
is 2x memset, the latter 1x memset + 1x memcpy.
So, shall we for large aggregates gimplify those differently as an optimization?

Comment 10 Richard Biener 2016-11-24 13:55:15 UTC

(In reply to Jakub Jelinek from comment #9)
> Note in the testcase a = (struct buf) {}, b = (struct buf) {} generates
> significantly more efficient code than a = b = (struct buf) {} - the former
> is 2x memset, the latter 1x memset + 1x memcpy.
> So, shall we for large aggregates gimplify those differently as an
> optimization?

I believe the gimplifier is not a good place to do optimization.  You can
pattern-match

  memset (&a, ..., N);
  memcpy (&b, &a, N);

and transform it to two times memset.

Comment 11 Jakub Jelinek 2016-11-24 13:59:54 UTC

(In reply to Richard Biener from comment #10)
> (In reply to Jakub Jelinek from comment #9)
> > Note in the testcase a = (struct buf) {}, b = (struct buf) {} generates
> > significantly more efficient code than a = b = (struct buf) {} - the former
> > is 2x memset, the latter 1x memset + 1x memcpy.
> > So, shall we for large aggregates gimplify those differently as an
> > optimization?
> 
> I believe the gimplifier is not a good place to do optimization.  You can
> pattern-match
> 
>   memset (&a, ..., N);
>   memcpy (&b, &a, N);
> 
> and transform it to two times memset.

Even the *.optimized dump contains:
  b = {};
  a = b;
  d = {};
  c = d;
Can match.pd handle these, or would it need to be special folding code (where?  gimple-fold.c, tree dse, what other passes have good framework for that)?

Comment 12 Jakub Jelinek 2016-12-16 15:58:15 UTC

Author: jakub
Date: Fri Dec 16 15:57:43 2016
New Revision: 243753

URL: https://gcc.gnu.org/viewcvs?rev=243753&root=gcc&view=rev
Log:
	PR c/78408
	* tree-ssa-ccp.c: Include tree-dfa.h.
	(optimize_memcpy): New function.
	(pass_fold_builtins::execute): Use it.  Remove useless conditional
	break after BUILT_IN_VA_*.

	* gcc.dg/pr78408-1.c: New test.
	* gcc.dg/pr78408-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr78408-1.c
    trunk/gcc/testsuite/gcc.dg/pr78408-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/sanopt.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-ccp.c