Bug 57046

Summary: [4.8 Regression] wrong code generated by gcc 4.8.0 on i686
Product: gcc Reporter: Mattias Engdegård <mattiase>
Component: rtl-optimization Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: eraman, jakub, kha, vmakarov, vmakarov
Priority: P1 Keywords: wrong-code
Version: 4.8.0   
Target Milestone: 4.8.1   
Host: Target: i?86-*-*
Build: Known to work: 4.7.2, 4.9.0
Known to fail: 4.8.0 Last reconfirmed: 2013-04-23 00:00:00
Attachments: Test case including driver demonstrating the bug
Single-file test case.

Description Mattias Engdegård 2013-04-23 10:42:02 UTC
Created attachment 29917
Test case including driver demonstrating the bug

GCC 4.8.0 silently miscompiles the attached short test case (emac.c) on 32-bit x86 with -O2. It appears that the value returned from get_value is thrown away.

gcc -v output:
Using built-in specs.
COLLECT_GCC=/usr/local/gcc/4.8.0/bin/gcc
COLLECT_LTO_WRAPPER=/home/local/linux/gcc/4.8.0/bin/../libexec/gcc/i686-pc-linux-gnu/4.8.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.8.0/configure --prefix=/usr/local/gcc/4.8.0 --with-mpc=/tmp/extra --with-gmp=/tmp/extra --with-mpfr=/tmp/extra --with-isl=/tmp/extra --with-cloog=/tmp/extra --with-as=/usr/local/binutils/2.23.2/bin/as --with-ld=/usr/local/binutils/2.23.2/bin/ld.gold --enable-languages=c,c++,objc,go
Thread model: posix
gcc version 4.8.0 (GCC)
Comment 1 Mikael Pettersson 2013-04-23 11:31:06 UTC
Created attachment 29918
Single-file test case.

I can reproduce the wrong-code on x86_64-linux with gcc 4.9-20130421 and 4.8-20130418, using -m32 -O2 -Wall.  gcc 4.7 and 4.6 generate correct code.
Comment 2 Richard Biener 2013-04-23 12:04:18 UTC
Confirmed.
Comment 3 Jakub Jelinek 2013-04-23 12:09:58 UTC
Started with http://gcc.gnu.org/r192719, aka the LRA merge; the problematic function is emac_operation.
Comment 4 Jakub Jelinek 2013-04-23 12:26:35 UTC
We have after the get_value call:
(insn 73 30 32 6 (set (reg:SI 76 [ D.1441 ])
        (reg:SI 0 ax)) pr57046.c:42 85 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 0 ax)
        (nil)))
(insn 32 73 33 6 (parallel [
            (set (reg:SI 73 [ D.1443 ])
                (ashift:SI (subreg:SI (reg:DI 60 [ D.1441 ]) 0)
                    (const_int 2 [0x2])))
            (clobber (reg:CC 17 flags))
        ]) 502 {*ashlsi3_1}
     (expr_list:REG_DEAD (reg:DI 60 [ D.1441 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

and IRA decides to put pseudo 76 into %ebx and pseudo 60 into %ecx.  Either this is an IRA bug, or IRA takes into account that only the low 32 bits of pseudo 60 are really needed at that point.  In any case, reload loads SImode %ecx from the stack and uses it in the shift, while LRA loads the full DImode value (i.e. %ecx and %ebx) from the stack and then uses just the low bits of it (i.e. %ecx) in the shift.  So the LRA-generated code clobbers the value in %ebx, and the get_value call is later even removed by DCE because of it.
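
The clobber described above can be pictured with a small toy model in C.  This is only a sketch under stated assumptions, not GCC code and not the attached test case: the two-entry register array and the helper names (restore_low_part, restore_full_pair, ebx_after_shift) are hypothetical and exist only for illustration.  It assumes, as stated above, that a DImode value placed in %ecx also occupies %ebx.

```c
#include <stdint.h>

/* Toy model, not GCC code: two 32-bit hard registers.  On i686 a DImode
   value placed in %ecx also occupies the next hard register, %ebx. */
enum { ECX, EBX, NREGS };

/* reload-style restore: only the SImode low part is read back. */
static void restore_low_part(uint32_t regs[NREGS], uint64_t slot)
{
    regs[ECX] = (uint32_t)slot;
}

/* LRA-style restore as described for 4.8.0: the full DImode value is
   read back, which also overwrites %ebx. */
static void restore_full_pair(uint32_t regs[NREGS], uint64_t slot)
{
    regs[ECX] = (uint32_t)slot;
    regs[EBX] = (uint32_t)(slot >> 32);
}

/* Mimics insns 73 and 32: keep the call result in %ebx (pseudo 76),
   restore pseudo 60 from its stack slot, shift its low word left by 2,
   and return whatever %ebx holds afterwards. */
static uint32_t ebx_after_shift(int full_pair, uint32_t call_result,
                                uint64_t saved_p60, uint32_t *shifted)
{
    uint32_t regs[NREGS] = { 0, 0 };

    regs[EBX] = call_result;
    if (full_pair)
        restore_full_pair(regs, saved_p60);
    else
        restore_low_part(regs, saved_p60);
    *shifted = regs[ECX] << 2;
    return regs[EBX];
}
```

In this model both restore styles produce the same shift result, but only the low-part restore leaves the call result intact in %ebx; the full-pair restore replaces it with the high word of pseudo 60.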
Comment 5 Vladimir Makarov 2013-04-23 15:34:40 UTC
(In reply to comment #4)

This looks like a discrepancy between IRA, which allocates in terms of subregisters, and LRA, which does splitting (including call save/restore, as in this case) in terms of whole pseudos.  I guess fixing this might take about a week.
Comment 6 Jakub Jelinek 2013-04-26 18:03:02 UTC
Author: vmakarov
Date: Wed Apr 24 20:27:33 2013
New Revision: 198263

URL: http://gcc.gnu.org/viewcvs?rev=198263&root=gcc&view=rev
Log:
2013-04-24  Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimizations/57046
	* lra-constraints (split_reg): Set up lra_risky_transformations_p
	for multi-reg splits.

2013-04-24  Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimizations/57046
	* gcc.target/i386/pr57046.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr57046.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-constraints.c
    trunk/gcc/testsuite/ChangeLog
Comment 7 Jakub Jelinek 2013-05-02 19:29:52 UTC
Author: vmakarov
Date: Thu May  2 17:52:45 2013
New Revision: 198556

URL: http://gcc.gnu.org/viewcvs?rev=198556&root=gcc&view=rev
Log:
2013-05-02  Vladimir Makarov  <vmakarov@redhat.com>

	Backport from mainline
	2013-04-24  Vladimir Makarov  <vmakarov@redhat.com>

        PR rtl-optimizations/57046
        * lra-constraints (split_reg): Set up lra_risky_transformations_p
        for multi-reg splits.

2013-05-02  Vladimir Makarov  <vmakarov@redhat.com>

	Backport from mainline
	2013-04-24  Vladimir Makarov  <vmakarov@redhat.com>

        PR rtl-optimizations/57046
        * gcc.target/i386/pr57046.c: New test.


Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.target/i386/pr57046.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/lra-constraints.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
Comment 8 Easwaran Raman 2013-05-21 01:53:34 UTC
*** Bug 57088 has been marked as a duplicate of this bug. ***