This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Sched1 stability issue
- From: Kugan Vivekanandarajah <kugan dot vivekanandarajah at linaro dot org>
- To: gcc <gcc at gcc dot gnu dot org>
- Date: Wed, 4 Jul 2018 16:47:18 +1000
- Subject: Sched1 stability issue
Hi,
We noticed a difference in the code generated by aarch64 GCC 7.2 when
hosted on Linux vs. mingw. AFAIK, we are supposed to produce the same
output regardless of host.
For the testcase we have (quite large; I am trying to reduce it), the
difference comes from the sched1 pass. If I disable sched1, the
difference goes away.
Is this a known issue? Attached is the sched1 dump snippet showing the
difference.
Thanks,
Kugan
verify found no changes in insn with uid = 41.
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
Pass 0 for finding pseudo/allocno costs
r84 costs: CALLER_SAVE_REGS:0 GENERAL_REGS:0 FP_LO_REGS:20000
FP_REGS:20000 ALL_REGS:20000 MEM:8000
r83 costs: CALLER_SAVE_REGS:0 GENERAL_REGS:0 FP_LO_REGS:20000
FP_REGS:20000 ALL_REGS:20000 MEM:8000
r80 costs: CALLER_SAVE_REGS:0 GENERAL_REGS:0 FP_LO_REGS:10000
FP_REGS:10000 ALL_REGS:10000 MEM:8000
r79 costs: CALLER_SAVE_REGS:0 GENERAL_REGS:0 FP_LO_REGS:4000
FP_REGS:4000 ALL_REGS:10000 MEM:8000
r78 costs: CALLER_SAVE_REGS:0 GENERAL_REGS:0 FP_LO_REGS:4000
FP_REGS:4000 ALL_REGS:10000 MEM:8000
r77 costs: CALLER_SAVE_REGS:0 GENERAL_REGS:0 FP_LO_REGS:9000
FP_REGS:9000 ALL_REGS:10000 MEM:8000
Pass 1 for finding pseudo/allocno costs
r86: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r85: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r84: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r83: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r82: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r81: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r80: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r79: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r78: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r77: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r76: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r75: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r74: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r73: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r72: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r71: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r70: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r69: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r68: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r67: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r84 costs: GENERAL_REGS:0 FP_LO_REGS:20000 FP_REGS:20000
ALL_REGS:20000 MEM:8000
r83 costs: GENERAL_REGS:0 FP_LO_REGS:20000 FP_REGS:20000
ALL_REGS:20000 MEM:8000
r80 costs: GENERAL_REGS:0 FP_LO_REGS:10000 FP_REGS:10000
ALL_REGS:10000 MEM:8000
r79 costs: GENERAL_REGS:0 FP_LO_REGS:10000 FP_REGS:10000
ALL_REGS:10000 MEM:8000
r78 costs: GENERAL_REGS:0 FP_LO_REGS:10000 FP_REGS:10000
ALL_REGS:10000 MEM:8000
r77 costs: GENERAL_REGS:0 FP_LO_REGS:10000 FP_REGS:10000
ALL_REGS:10000 MEM:8000
;; ======================================================
;; -- basic block 2 from 3 to 48 -- before reload
;; ======================================================
;; 0--> b 0: i 24 r77=ap-0x40
:cortex_a53_slot_any:GENERAL_REGS+1(1)FP_REGS+0(0)
;; 0--> b 0: i 26 r78=0xffffffffffffffc8
:cortex_a53_slot_any:GENERAL_REGS+1(1)FP_REGS+0(0)
;; 1--> b 0: i 25 [sfp-0x10]=r77
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(-1)@FP_REGS+0(0)
------
-;; 1--> b 0: i 9 [ap-0x8]=x7
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(-1)@FP_REGS+0(0)
------
-;; 2--> b 0: i 22 [sfp-0x20]=ap
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(0)
+;; 1--> b 0: i 22 [sfp-0x20]=ap
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(0)@FP_REGS+0(0)
;; 2--> b 0: i 23 [sfp-0x18]=ap
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(0)
-;; 3--> b 0: i 27 [sfp-0x8]=r78
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
+;; 2--> b 0: i 27 [sfp-0x8]=r78
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
;; 3--> b 0: i 28 r79=0xffffffffffffff80
:cortex_a53_slot_any:GENERAL_REGS+1(1)FP_REGS+0(0)
-;; 4--> b 0: i 10 [ap-0xc0]=v0
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(0)@FP_REGS+0(-1)
+;; 3--> b 0: i 10 [ap-0xc0]=v0
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(0)@FP_REGS+0(-1)
;; 4--> b 0: i 29 [sfp-0x4]=r79
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
-;; 5--> b 0: i 11 [ap-0xb0]=v1
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(0)@FP_REGS+0(-1)
+;; 4--> b 0: i 11 [ap-0xb0]=v1
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(0)@FP_REGS+0(-1)
;; 5--> b 0: i 12 [ap-0xa0]=v2
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
-;; 6--> b 0: i 13 [ap-0x90]=v3
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
+;; 5--> b 0: i 13 [ap-0x90]=v3
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
;; 6--> b 0: i 14 [ap-0x80]=v4
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
-;; 7--> b 0: i 15 [ap-0x70]=v5
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
+;; 6--> b 0: i 15 [ap-0x70]=v5
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
;; 7--> b 0: i 16 [ap-0x60]=v6
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
-;; 8--> b 0: i 17 [ap-0x50]=v7
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
+;; 7--> b 0: i 17 [ap-0x50]=v7
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(0)FP_REGS+0(-1)
;; 8--> b 0: i 3 [ap-0x38]=x1
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(-1)@FP_REGS+0(0)
-;; 9--> b 0: i 4 [ap-0x30]=x2
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
+;; 8--> b 0: i 4 [ap-0x30]=x2
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
;; 9--> b 0: i 5 [ap-0x28]=x3
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
-;; 10--> b 0: i 6 [ap-0x20]=x4
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
+;; 9--> b 0: i 6 [ap-0x20]=x4
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
;; 10--> b 0: i 7 [ap-0x18]=x5
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
-;; 11--> b 0: i 8 [ap-0x10]=x6
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
+;; 10--> b 0: i 8 [ap-0x10]=x6
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
-------
+;; 11--> b 0: i 9 [ap-0x8]=x7
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-1)FP_REGS+0(0)
------
;; 12--> b 0: i 31 r80=asm_operands
:nothing:GENERAL_REGS+1(1)FP_REGS+0(0)
;; 13--> b 0: i 34 r83=[sfp-0x20]
:(cortex_a53_single_issue+cortex_a53_ls_agen),(cortex_a53_load+cortex_a53_slot0),cortex_a53_load:GENERAL_REGS+2(2)FP_REGS+0(0)
;; 13--> b 0: i 36 r84=[sfp-0x10]
:(cortex_a53_single_issue+cortex_a53_ls_agen),(cortex_a53_load+cortex_a53_slot0),cortex_a53_load:GENERAL_REGS+2(2)FP_REGS+0(0)
;; 14--> b 0: i 35 [sfp-0x40]=r83
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:@GENERAL_REGS+0(-2)@FP_REGS+0(0)
;; 14--> b 0: i 37 [sfp-0x30]=r84
:(cortex_a53_slot_any+cortex_a53_ls_agen),cortex_a53_store:GENERAL_REGS+0(-2)FP_REGS+0(0)
;; 15--> b 0: i 39 x1=sfp-0x40
:cortex_a53_slot_any:GENERAL_REGS+1(1)FP_REGS+0(0)
;; 15--> b 0: i 41 {x0=call [`__real_vprintf'];clobber x30;}
:(cortex_a53_slot_any+cortex_a53_branch):GENERAL_REGS+1(-1)FP_REGS+0(0)
;; 16--> b 0: i 43 asm_operands
:nothing:GENERAL_REGS+0(-1)FP_REGS+0(0)
;; 17--> b 0: i 48 use x0
:nothing:GENERAL_REGS+0(0)FP_REGS+0(0)
;; Ready list (final):
;; total time = 17
;; new head = 24
;; new tail = 48
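
For anyone comparing two such dumps by hand, the issue order in the snippet above can be extracted mechanically. The following is a minimal sketch, not part of the original report: it assumes the trace lines keep the ";; <cycle>--> b <n>: i <uid> ..." shape shown above (optionally prefixed by a "+" or "-" diff marker), and the names issue_order and first_divergence are hypothetical helpers introduced only for illustration. It reports where two schedules first diverge; it does not explain why they diverge.

```python
import re

# Matches scheduler trace lines such as ";; 0--> b 0: i 24 r77=ap-0x40".
# A leading "+" or "-" diff marker is tolerated (assumption based on the
# dump snippet in this message, not on any documented dump format).
ISSUE_RE = re.compile(r"^[+-]?;;\s*(\d+)-->\s*b\s*\d+:\s*i\s*(\d+)\s")


def issue_order(lines):
    """Return [(cycle, insn_uid), ...] in the order the trace lists them."""
    order = []
    for line in lines:
        m = ISSUE_RE.match(line)
        if m:
            order.append((int(m.group(1)), int(m.group(2))))
    return order


def first_divergence(a, b):
    """Return the index of the first differing entry, or None if identical."""
    for idx, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return idx
    # Same common prefix: the schedules differ only if one is longer.
    return None if len(a) == len(b) else min(len(a), len(b))


# Two short excerpts taken from the snippet above; which host produced
# which side is not stated in the report, so they are labelled a and b.
dump_a = [
    ";; 0--> b 0: i 24 r77=ap-0x40",
    ";; 1--> b 0: i 25 [sfp-0x10]=r77",
    ";; 1--> b 0: i 9 [ap-0x8]=x7",
]
dump_b = [
    ";; 0--> b 0: i 24 r77=ap-0x40",
    ";; 1--> b 0: i 25 [sfp-0x10]=r77",
    ";; 1--> b 0: i 22 [sfp-0x20]=ap",
]

idx = first_divergence(issue_order(dump_a), issue_order(dump_b))
print("first divergence at entry", idx)
```

With full dumps (for example, ones produced with -fdump-rtl-sched1 on each host), the same two functions can be applied to the lines of each file. Continuation lines that start with ":" are skipped because they do not match the pattern.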