63404 – [5 Regression] gcc 5 miscompiles linux block layer

Bug 63404 - [5 Regression] gcc 5 miscompiles linux block layer

Summary: [5 Regression] gcc 5 miscompiles linux block layer

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target (show other bugs)
Version:	5.0

Importance:	P1 normal
Target Milestone:	5.0
Assignee:	Jiong Wang

URL:
Keywords:	wrong-code

Duplicates (2):	63463 63481 (view as bug list)
Depends on:
Blocks:	63463
	Show dependency tree / graph

Reported:	2014-09-29 04:20 UTC by Andi Kleen
Modified:	2014-11-07 15:55 UTC (History)
CC List:	2 users (show)

See Also:
Host:
Target:	x86_64-linux
Build:
Known to work:
Known to fail:
Last reconfirmed:	2014-09-29 00:00:00

Attachments
not quite yet runnable test case (198.78 KB, application/x-gzip) 2014-09-29 04:25 UTC, Andi Kleen	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Andi Kleen 2014-09-29 04:20:10 UTC

When I boot a current Linux mainline kernel compiled with mainline gcc 
and the section fix patch applied I always get a crash at boot in the block layer.

gcc version 5.0.0 20140926 (experimental) (GCC) 

    1.318801] EXT4-fs (sda1): write access will be enabled during recovery
[    1.367592] ------------[ cut here ]------------
[    1.369061] kernel BUG at /home/andi/lsrc/linux/block/blk-flush2.c:80!
[    1.370910] invalid opcode: 0000 [#1] SMP 


I narrowed it down to one function. When only the function is compiled with gcc 4.9 the kernel boots.

Attach is a test case with only the function.
It doesn't quite run by itself yet, so the code has to be examined.

Comment 1 Andi Kleen 2014-09-29 04:25:56 UTC

Created attachment 33607 [details]
not quite yet runnable test case


In the real execution blk_flush_complete_seq always ends up in the default case in the switch and crashes.

Comment 2 Andi Kleen 2014-09-29 04:57:06 UTC

The switch is miscompiled and destroys the flags register in the middle of a comparison:

.LVL2:
        .loc 1 49 0
        cmpl    $2, %eax        #, seq
        je      .L5     #,
        shrb    $2, %r12b       #, D.32130      <-------- BAD1
        andl    $1, %r12d       #, D.32130      <-------- BAD2
        jbe     .L24    #,
        cmpl    $4, %eax        #, seq
        je      .L7     #,
        cmpl    $8, %eax        #, seq
        jne     .L4     #,


gcc 4.9 creates the same code except for BAD1/BAD2. These two
JBE relies on CF/ZF being preserved, but SHR can overwrite ZF/CF,
which breaks the JBE after the CMP

So somehow the backend lost track of these two flag bits.

Comment 3 Markus Trippelsdorf 2014-09-29 08:29:08 UTC

Started with r215563.

Comment 4 Jiong Wang 2014-09-29 10:48:59 UTC

sorry for causing the trouble.

the reason might be the "flag" is an implified register while it's not take into account in current shrink-wrap reg read/write analysis.

I will revert my patch temperarily if I couldn't find a proper fix today.

Comment 5 Jiong Wang 2014-09-29 11:14:07 UTC

we need to check the following 

   else if (GET_CODE == CLOBBER
            || GET_CODE (x) == USE
            || GET_CODE (x) == ASM_INPUT)

I will post the fix after pass x86 bootstrap and regression

Comment 6 Pat Haugen 2014-09-29 18:46:40 UTC

(In reply to Jiong Wang from comment #5)
> we need to check the following 
> 
>    else if (GET_CODE == CLOBBER
>             || GET_CODE (x) == USE
>             || GET_CODE (x) == ASM_INPUT)
> 
> I will post the fix after pass x86 bootstrap and regression

r215563 also introduced a miscompare on PowerPC for cpu2000 benchmark 254.gap. Applying your patch proposed on the gcc-patches ml for this bug fixes the issue.

Comment 7 Jiong Wang 2014-09-29 18:48:20 UTC

(In reply to Pat Haugen from comment #6)
> (In reply to Jiong Wang from comment #5)
> > we need to check the following 
> > 
> 
> r215563 also introduced a miscompare on PowerPC for cpu2000 benchmark
> 254.gap. Applying your patch proposed on the gcc-patches ml for this bug
> fixes the issue.

thanks for reporting this, sorry for causing trouble.

Comment 8 Jiong Wang 2014-09-29 18:50:25 UTC

(In reply to Pat Haugen from comment #6)
> (In reply to Jiong Wang from comment #5)
> > we need to check the following 
> > 
> >    else if (GET_CODE == CLOBBER
> >             || GET_CODE (x) == USE
> >             || GET_CODE (x) == ASM_INPUT)
> > 
> > I will post the fix after pass x86 bootstrap and regression
> 
> r215563 also introduced a miscompare on PowerPC for cpu2000 benchmark
> 254.gap. Applying your patch proposed on the gcc-patches ml for this bug
> fixes the issue.

and I am curious about whether there are any performance change since this insn sink change.

Comment 9 Dominique d'Humieres 2014-09-29 21:03:37 UTC

For the record the test gfortran.dg/typebound_operator_3.f03 also failed with -O1 and -m64 (see https://gcc.gnu.org/ml/gcc-regression/2014-09/msg00226.html). This is fixed by the patch at https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02568.html.

Comment 10 Pat Haugen 2014-10-01 20:45:10 UTC

(In reply to Jiong Wang from comment #8)
> and I am curious about whether there are any performance change since this
> insn sink change.

I built/ran cpu2000 and didn't see any difference outside the noise range. The number of shrink-wrapped procedures over the entire benchmark suite build went from 558 to 567.

Comment 11 Andrew Pinski 2014-10-08 04:41:57 UTC

*** Bug 63481 has been marked as a duplicate of this bug. ***

Comment 12 Richard Henderson 2014-10-10 15:56:38 UTC

Author: rth
Date: Fri Oct 10 15:56:07 2014
New Revision: 216096

URL: https://gcc.gnu.org/viewcvs?rev=216096&root=gcc&view=rev
Log:
PR target/63404

  * shrink-wrap.c (move_insn_for_shrink_wrap): Don't use single_set.
  Restrict the set of expressions we're willing to move.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/shrink-wrap.c

Comment 13 Markus Trippelsdorf 2014-10-17 12:14:06 UTC

Fixed.

Comment 14 Jiong Wang 2014-11-07 15:55:16 UTC

*** Bug 63463 has been marked as a duplicate of this bug. ***