Bug 77326 - [avr] Invalid optimization omits comparison
Summary: [avr] Invalid optimization omits comparison
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 5.1.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2016-08-22 18:50 UTC by Matthijs Kooijman
Modified: 2016-09-21 16:18 UTC (History)
1 user (show)

See Also:
Host:
Target: avr
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Preprocessed source generated by avr-gcc foo.c -Dissue -save-temps (330 bytes, text/plain)
2016-08-22 18:50 UTC, Matthijs Kooijman
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Matthijs Kooijman 2016-08-22 18:50:39 UTC
Created attachment 39483 [details]
Preprocessed source generated by avr-gcc foo.c -Dissue -save-temps

This bug was originally reported to the Arduino bug tracker[1], but seems to be a avr-specific gcc bug.

A minimal program showing the problem:

        #include <stddef.h>
        #include <stdarg.h>

        void test(void) __attribute__((weak));

        void va_pseudo(int flag,...){
                va_list ap;
                va_start (ap, flag);
                va_end (ap);
        }

        int main(void) {
                #if defined(issue)
                        va_pseudo(1, 2, 3, 4);
                #else
                        va_pseudo(1, 2, 3);
                #endif

                if(test!=NULL) {
                        test();
                }
                return 0;
        }

When compiled with -O but without -Dissue, this produces the following assembler:

        $ avr-gcc foo.c -O; avr-objdump -d a.out

        a.out:     file format elf32-avr


        Disassembly of section .text:

        00000000 <va_pseudo>:
           0:   cf 93           push    r28
           2:   df 93           push    r29
           4:   cd b7           in      r28, 0x3d       ; 61
           6:   de b7           in      r29, 0x3e       ; 62
           8:   df 91           pop     r29
           a:   cf 91           pop     r28
           c:   08 95           ret

        0000000e <main>:
           e:   1f 92           push    r1
          10:   83 e0           ldi     r24, 0x03       ; 3
          12:   8f 93           push    r24
          14:   1f 92           push    r1
          16:   82 e0           ldi     r24, 0x02       ; 2
          18:   8f 93           push    r24
          1a:   1f 92           push    r1
          1c:   81 e0           ldi     r24, 0x01       ; 1
          1e:   8f 93           push    r24
          20:   ef df           rcall   .-34            ; 0x0 <va_pseudo>
          22:   0f 90           pop     r0
          24:   0f 90           pop     r0
          26:   0f 90           pop     r0
          28:   0f 90           pop     r0
          2a:   0f 90           pop     r0
          2c:   0f 90           pop     r0
          2e:   80 e0           ldi     r24, 0x00       ; 0
          30:   90 e0           ldi     r25, 0x00       ; 0
          32:   89 2b           or      r24, r25
          34:   09 f0           breq    .+2             ; 0x38 <main+0x2a>
          36:   e4 df           rcall   .-56            ; 0x0 <va_pseudo>
          38:   80 e0           ldi     r24, 0x00       ; 0
          3a:   90 e0           ldi     r25, 0x00       ; 0
          3c:   08 95           ret

Note the lines from 0x2e to 0x34, which implement the `if(test!=NULL)`, which should of course always fail and skip the next `rcall`. Now, when compiling this with -Dissue, the `or r24, r25` line gets dropped, making the generated code invalid:

        $ avr-gcc foo.c -O -Dissue; avr-objdump -d a.out | grep -B 2 breq
          38:   80 e0           ldi     r24, 0x00       ; 0
          3a:   90 e0           ldi     r25, 0x00       ; 0
          3c:   09 f0           breq    .+2             ; 0x40 <__SREG__+0x1>

The diff between without and with -Dissue looks like this (jump addresses have been stripped to minimize the diff):

        @@ -15,6 +15,9 @@ <va_pseudo>:

         <main>:
                 1f 92           push    r1
        +        84 e0           ldi     r24, 0x04       ; 4
        +        8f 93           push    r24
        +        1f 92           push    r1
                 83 e0           ldi     r24, 0x03       ; 3
                 8f 93           push    r24
                 1f 92           push    r1
        @@ -24,16 +27,17 @@ <main>:
                 81 e0           ldi     r24, 0x01       ; 1
                 8f 93           push    r24
                 xx xx           rcall                   ; <va_pseudo>
        -        0f 90           pop     r0
        -        0f 90           pop     r0
        -        0f 90           pop     r0
        -        0f 90           pop     r0
        -        0f 90           pop     r0
        -        0f 90           pop     r0
        +        8d b7           in      r24, 0x3d       ; 61
        +        9e b7           in      r25, 0x3e       ; 62
        +        08 96           adiw    r24, 0x08       ; 8
        +        0f b6           in      r0, 0x3f        ; 63
        +        f8 94           cli
        +        9e bf           out     0x3e, r25       ; 62
        +        0f be           out     0x3f, r0        ; 63
        +        8d bf           out     0x3d, r24       ; 61
                 80 e0           ldi     r24, 0x00       ; 0
                 90 e0           ldi     r25, 0x00       ; 0
        -        89 2b           or      r24, r25
                 xx xx           breq                    ; <main+0x....>
                 xx xx           rcall                   ; <va_pseudo>
                 80 e0           ldi     r24, 0x00       ; 0
                 90 e0           ldi     r25, 0x00       ; 0

As you can see, the extra vararg changes the stack cleanup from a number
of pops to direct manipulation of the stack pointer, which involves the
same registers (r24 and r25) as the `test` check.

When running without -O, this bug does not occur. Then, the check looks
like this:

        $ avr-gcc foo.c -Dissue; avr-objdump -d a.out | grep -B 3 breq
          50:   80 e0           ldi     r24, 0x00       ; 0
          52:   90 e0           ldi     r25, 0x00       ; 0
          54:   00 97           sbiw    r24, 0x00       ; 0
          56:   09 f0           breq    .+2             ; 0x5a <__SREG__+0x1b>


Here, the check uses the (slightly slower) sbiw instruction, where the
-O version uses or. I suspect that the optimization that makes this
change is responsible for, or at least involved in the bug. I couldn't
pinpoint the exact optimization responsible, running without -O, but
with all the options that -O is documented to turn on did not produce
the bug.

The above was tested using:

        Using built-in specs.
        COLLECT_GCC=avr-gcc
        COLLECT_LTO_WRAPPER=/usr/lib/gcc/avr/4.8.1/lto-wrapper
        Target: avr
        Configured with: ../src/configure -v --enable-languages=c,c++ --prefix=/usr/lib --infodir=/usr/share/info --mandir=/usr/share/man --bindir=/usr/bin --libexecdir=/usr/lib --libdir=/usr/lib --enable-shared --with-system-zlib --enable-long-long --enable-nls --without-included-gettext --disable-libssp --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=avr
        Thread model: single
        gcc version 4.8.1 (GCC)

But the bug also occurs using:

        Using built-in specs.
        Reading specs from
        /home/matthijs/pkg-x86_64-unknown-linux-gnu/bin/../lib/gcc/avr/5.1.0/device-specs/specs-avr2
        COLLECT_GCC=/home/matthijs/pkg-x86_64-unknown-linux-gnu/bin/avr-gcc
        COLLECT_LTO_WRAPPER=/home/matthijs/pkg-x86_64-unknown-linux-gnu/bin/../libexec/gcc/avr/5.1.0/lto-wrapper
        Target: avr
        Configured with: /home/admin/avr-gcc-5.1.0/gcc-5.1.0/configure
        --disable-install-libiberty --disable-libssp --disable-libstdcxx-pch
        --disable-libunwind-exceptions --disable-nls --enable-fixed-point
        --enable-long-long --disable-werror --disable-__cxa_atexit
        --enable-checking=release --enable-clocale=gnu
        --enable-cloog-backend=isl --enable-gnu-unique-object --with-avrlibc=yes
        --with-dwarf2 --enable-languages=c,c++ --disable-libada --disable-doc
        --enable-lto --enable-gold --disable-plugin
        --prefix=/home/admin/avr-gcc-5.1.0/pkg-x86_64-unknown-linux-gnu/
        --disable-shared --with-gnu-ld --host=x86_64-unknown-linux-gnu
        --build=x86_64-unknown-linux-gnu --target=avr
        Thread model: single
        gcc version 5.1.0 (GCC)
Comment 1 Matthijs Kooijman 2016-08-22 18:55:27 UTC
The original reporter just added a comment that this does not occur anymore in gcc 6.1.0, though I haven't got anything newer than 5.1 available here to check.
Comment 2 Georg-Johann Lay 2016-09-21 09:18:05 UTC
Author: gjl
Date: Wed Sep 21 09:17:32 2016
New Revision: 240306

URL: https://gcc.gnu.org/viewcvs?rev=240306&root=gcc&view=rev
Log:
gcc/
	PR target/77326
	* config/avr/avr.c (avr_notice_update_cc) [CC_NONE]: If insn
	touches some regs mentioned in cc_status, do CC_STATUS_INIT.
gcc/testsuite/
	PR target/77326
	* gcc.target/avr/torture/pr77326.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/avr/torture/pr77326.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.c
    trunk/gcc/testsuite/ChangeLog
Comment 3 Georg-Johann Lay 2016-09-21 09:32:10 UTC
Author: gjl
Date: Wed Sep 21 09:31:38 2016
New Revision: 240308

URL: https://gcc.gnu.org/viewcvs?rev=240308&root=gcc&view=rev
Log:
	Backport from 2016-09-21 trunk r240306.
	PR target/77326
	* config/avr/avr.c (avr_notice_update_cc) [CC_NONE]: If insn
	touches some regs mentioned in cc_status, do CC_STATUS_INIT.


Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/avr/torture/pr77326.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/avr/avr.c
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Comment 4 Georg-Johann Lay 2016-09-21 14:12:31 UTC
Author: gjl
Date: Wed Sep 21 14:11:59 2016
New Revision: 240315

URL: https://gcc.gnu.org/viewcvs?rev=240315&root=gcc&view=rev
Log:
gcc/
	Backport from 2016-09-21 trunk r240306.
	PR target/77326
	* config/avr/avr.c (hard-reg-set.h): Move #include up in front
	of rtl.h to that HARD_CONST is defined in rtl.h.
	(avr_notice_update_cc) [CC_NONE]: If insn touches some regs
	mentioned in cc_status, do CC_STATUS_INIT.

gcc/testsuite/
	Backport from 2016-09-21 trunk r240306.
	PR target/77326
	* gcc.target/avr/torture/pr77326.c: New test.


Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.target/avr/torture/pr77326.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/config/avr/avr.c
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
Comment 5 Georg-Johann Lay 2016-09-21 14:34:23 UTC
Fixed in 5.5 and 6.3+.

Also changed the bug title because the issue has nothing to do with varargs or weak; these features just appened to be present in a test case demonstrating the problem.
Comment 6 Matthijs Kooijman 2016-09-21 16:18:41 UTC
Thanks!