Bug 44013 - [4.5 Regression] VTA produces wrong code
Summary: [4.5 Regression] VTA produces wrong code
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.5.0
: P3 normal
Target Milestone: ---
Assignee: Alexandre Oliva
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-06 12:57 UTC by hariharan.gcc
Modified: 2010-06-09 04:54 UTC (History)
3 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: picochip-unknown-none
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:


Attachments
The test sourcecode (799 bytes, text/plain)
2010-05-06 13:00 UTC, hariharan.gcc
Details
debug dump for combine stage (4.71 KB, text/plain)
2010-05-06 14:03 UTC, hariharan.gcc
Details
debug dump just before sched1 (2.96 KB, text/plain)
2010-05-06 14:05 UTC, hariharan.gcc
Details
sched1 output obtained with -fsched-verbose=8 (6.88 KB, text/plain)
2010-05-06 14:06 UTC, hariharan.gcc
Details
Patch that fixes the problem (425 bytes, patch)
2010-06-02 05:19 UTC, Alexandre Oliva
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description hariharan.gcc 2010-05-06 12:57:26 UTC
I first found this bug in GCC 4.5.0, but it is repeatable in mainline as of 159067. The attached preprocessed testcase, when compiled using "-O2 -S" will show the problem.

We have a basic block [bb 3] with the following piece of code.

[32] X(SI) = unspec_volatile (const_int 3)
[33] var_location (some_var) = X
[34] (subreg:SI (reg:DI Z) 0) = X
...
[41] Y(SI) = unspec_volatile (const_int 3)
[42] var_location (some_var) = Y
[43] (subreg:SI (reg:DI Z) 4) = Y

Combine combines 32->34 and 41->43 and  converts this into

[33] var_location (some_var) = unspec_volatile (const_int 3)
[34] (subreg:SI (reg:DI Z) 0) = unspec_volatile (const_int 3)
...
[42] var_location (some_var) = unspec_volatile (const_int 3)
[43] (subreg:SI (reg:DI Z) 4) = unspec_volatile (const_int 3

I am not sure if this is a valid transformation in itself. var_location debug instructions now dont just use registers, but they have unspec_volatile which is assumed to clobber all register/memory.

The scheduler dependency for this becomes 34->42->43 since the debug_insn in 42 actually clobbers everything. But, when scheduling instructions we ignore the debug_insn in 42 and hence the dependency is broken. 34 and 43 are both deemed ready, but 43 gets scheduled first which results in the two unspec_volatile instructions being reordered.

I am not sure which of the steps above is incorrect. Any pointers on this would be greatly appreciated.

Please let me know if you need any information.

Thanks
Hari
Comment 1 hariharan.gcc 2010-05-06 13:00:01 UTC
Created attachment 20584 [details]
The test sourcecode
Comment 2 Jakub Jelinek 2010-05-06 13:07:04 UTC
DEBUG_INSNs must not affect code generation (in scheduling case scheduling of non-DEBUG_INSNs).  Does the testcase fail with -fcompare-debug?
From your description it is unclear what is going on.  Perhaps you should attach snippets before and after scheduling and say what do you think is wrong and why.
Comment 3 hariharan.gcc 2010-05-06 14:02:42 UTC
This test does fail with -fcompare-debug.
The relevant part of this basic block just before sched1 is

(debug_insn 33 32 34 3 vta_bug.i:12 (var_location:SI converter$rawValue (unspec_volatile:SI [
            (const_int 3 [0x3])
        ] 8)) -1 (nil))

(insn 34 33 35 3 vta_bug.i:75 (set (subreg:SI (reg/v:DI 46 [ trchHeader ]) 0)
        (unspec_volatile:SI [
                (const_int 3 [0x3])
            ] 8)) 81 {commsGet} (nil))

(insn 38 37 39 3 vta_bug.i:75 (set (reg:HI 45 [ trchHeader$D1290$channelCodingEnum ])
        (lshiftrt:HI (subreg:HI (reg/v:DI 46 [ trchHeader ]) 0)
            (const_int 14 [0xe]))) 65 {lshrhi3} (nil))

(debug_insn 39 38 40 3 (var_location:QI trchHeader$D1290$channelCodingEnum (subreg:QI (reg:HI 45 [ trchHeader$D1290$channelCodingEnum ]) 0)) -1 (nil))

(debug_insn 40 39 41 3 (var_location:QI trchHeader$D1290$channelCodingEnum (subreg:QI (reg:HI 45 [ trchHeader$D1290$channelCodingEnum ]) 0)) -1 (nil))

(debug_insn 42 41 43 3 vta_bug.i:12 (var_location:SI converter$rawValue (unspec_volatile:SI [
            (const_int 3 [0x3])
        ] 8)) -1 (nil))

(insn 43 42 44 3 vta_bug.i:76 (set (subreg:SI (reg/v:DI 46 [ trchHeader ]) 4)
        (unspec_volatile:SI [
                (const_int 3 [0x3])
            ] 8)) 81 {commsGet} (nil))



The scheduler dependency for this bb looks like this

;;   --------------- forward dependences: ------------ 

;;   --- Region Dependences --- b 3 bb 0 
;;      insn  code    bb   dep  prio  cost   reservation
;;      ----  ----    --   ---  ----  ----   -----------
;;       34    81     3     0     2     1   slot1       : 61 42 40 39 38 
;;       38    65     3     1     1     1   slot0|slot1 : 61 42 40 39 
;;       39    -1     3     2     0     0   nothing     : 42 40 
;;       40    -1     3     3     0     0   nothing     : 42 
;;       42    -1     3     4     0     0   nothing     : 48 43 
;;       43    81     3     0     5     1   slot1       : 61 55 54 53 52 51 50 49 48 47 
;;       47    65     3     1     4     1   slot0|slot1 : 61 55 51 49 48 
;;       48    -1     3     3     0     0   nothing     : 55 49 
;;       49    -1     3     3     0     0   nothing     : 55 50 
;;       50    -1     3     2     0     0   nothing     : 55 51 
;;       51    -1     3     3     0     0   nothing     : 55 52 
;;       52    -1     3     2     0     0   nothing     : 55 53 
;;       53    -1     3     2     0     0   nothing     : 55 54 
;;       54    -1     3     2     0     0   nothing     : 56 55 
;;       55    83     3     2     3     1   slot1       : 61 59 58 57 56 
;;       56    -1     3     2     0     0   nothing     : 59 57 
;;       57    -1     3     2     0     0   nothing     : 59 58 
;;       58    -1     3     2     0     0   nothing     : 60 59 
;;       59    83     3     1     2     1   slot1       : 61 60 
;;       60    -1     3     2     0     0   nothing     : 
;;       61     7     3     6     1     1   (slot0+slot1+slot2) : 

;;              dependencies resolved: insn 34
;;              tick updated: insn 34 into ready
;;              dependencies resolved: insn 43
;;              tick updated: insn 43 into ready
;;      Advanced a state.
;;              Ready list after queue_to_ready:    43:15  34:10
;;              Ready list after ready_sort:    34:10  43:15
;;      Clock 0
;;      Ready list (t =   0):    34:10  43:15
;;              Chosen insn : 43
;;        0-->    43 r46#4=unspec/v[0x3] 8             :slot1
;;              resetting: debug insn 42


Note that insn 34 has forward dependency to 42 and 42 to 43. So we have 34->42->43. But, 42 is a debug instruction. While scheduling, note that both 34 and 43 are deemed ready and 43 gets scheduled first, resulting in wrong code.

I will attach the compiler dumps from 3 stages now.

Thanks
Hari

(In reply to comment #2)
> DEBUG_INSNs must not affect code generation (in scheduling case scheduling of
> non-DEBUG_INSNs).  Does the testcase fail with -fcompare-debug?
> From your description it is unclear what is going on.  Perhaps you should
> attach snippets before and after scheduling and say what do you think is wrong
> and why.
> 

Comment 4 hariharan.gcc 2010-05-06 14:03:50 UTC
Created attachment 20586 [details]
debug dump for combine stage
Comment 5 hariharan.gcc 2010-05-06 14:05:01 UTC
Created attachment 20587 [details]
debug dump just before sched1
Comment 6 hariharan.gcc 2010-05-06 14:06:15 UTC
Created attachment 20588 [details]
sched1 output obtained with -fsched-verbose=8
Comment 7 Alexandre Oliva 2010-06-02 05:08:42 UTC
Mine
Comment 8 Alexandre Oliva 2010-06-02 05:19:26 UTC
Created attachment 20804 [details]
Patch that fixes the problem

Here's the patch I'm regstrapping now.  It avoids relying on the implied dependency.
Comment 9 hariharan.gcc 2010-06-02 08:41:26 UTC
Thanks Alexandre. I can confirm that this patch works for this testcase. 

Cheers
Hari
Comment 10 Jakub Jelinek 2010-06-04 16:44:10 UTC
Subject: Bug 44013

Author: jakub
Date: Fri Jun  4 16:43:42 2010
New Revision: 160281

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=160281
Log:
	PR rtl-optimization/44013
	* sched-deps.c (add_dependence_list_and_free): Don't free lists
	when processing debug insns.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/sched-deps.c

Comment 11 Jakub Jelinek 2010-06-07 07:16:37 UTC
Fixed on the trunk.
Comment 12 Alexandre Oliva 2010-06-09 04:54:03 UTC
Fixed