This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: Gcc 3.1 performance regressions with respect to 2.95.3
- From: law at redhat dot com
- To: David Edelsohn <dje at watson dot ibm dot com>
- Cc: Michael Matz <matzmich at cs dot tu-berlin dot de>, gcc-patches at gcc dot gnu dot org
- Date: Tue, 28 May 2002 10:08:21 -0600
- Subject: Re: Gcc 3.1 performance regressions with respect to 2.95.3
- Reply-to: law at redhat dot com
In message <200203290708.CAA26070@makai.watson.ibm.com>, David Edelsohn writes:
> The reason that use of SCHED_GROUP_P causes the second scheduler
> pass to loop forever appears to be because SCHED_GROUP_P is not cleared in
> scheduling pass 2. SCHED_GROUP_P is not cleared when the instruction is
> consumed at the end of the first scheduler pass. sched_analyze() looks
> like it should clear it at the beginning of each pass and I will
> investigate that tomorrow
The code in sched_analyze didn't clear all the SCHED_GROUP_P bits like we
thought it should because sched_analyze is run one block at a time, as we
schedule each block, and we schedule predecessors before their successors.
So, sched_analyze wipes SCHED_GROUP_P on all the insns in the current
block -- however, we can look outside the current block into a successor
block (where SCHED_GROUP_P hasn't been cleared yet).
compute_forward_dependences calls group_leader on each insn in the current
block. All goes well until we call group_leader on the _last_ insn in the
current block.
group_leader calls "next_nonnote_insn", which gives us the first nonnote
insn in the _next_ block -- where we haven't wiped SCHED_GROUP_P yet because
we haven't called sched_analyze on the next block yet.
The RTL you cite below shows this pretty clearly:
> Before sched2, the instructions look like:
>
> (note 527 232 233 [bb 12] NOTE_INSN_BASIC_BLOCK)
>
> (insn 233 527 234 (set (reg:CC 68 cr0 [159])
> (compare:CC (reg:SI 10 r10 [157])
> (const_int 128 [0x80]))) 377 {*cmpsi_internal1} (nil)
> (expr_list:REG_DEAD (reg:SI 10 r10 [157])
> (nil)))
>
> (jump_insn 234 233 528 (set (pc)
> (if_then_else (ne (reg:CC 68 cr0 [159])
> (const_int 0 [0x0]))
> (label_ref 255)
> (pc))) 484 {*rs6000.md:12869} (insn_list 233 (nil))
> (expr_list:REG_DEAD (reg:CC 68 cr0 [159])
> (expr_list:REG_BR_PROB (const_int 7100 [0x1bbc])
> (nil))))
>
> (note 528 234 244 [bb 13] NOTE_INSN_BASIC_BLOCK)
>
> (note 244 528 246 NOTE_INSN_DELETED)
>
> (note 246 244 239 NOTE_INSN_DELETED)
>
> (insn/s 239 246 247 (parallel[
> (set (reg:SI 10 r10 [160])
> (and:SI (reg/f:SI 31 r31 [118])
> (const_int 256 [0x100])))
> (clobber (scratch:CC))
> ] ) 81 {andsi3} (nil)
> (expr_list:REG_UNUSED (scratch:CC)
> (expr_list:REG_NO_CONFLICT (reg/v:DI 30 r30 [118])
> (nil))))
>
> compute_forward_dependence examines insn 234 and group_leader() queries
> the next non-note instruction. In sched1, the next non-note instruction
> was the CLOBBER which is part of a SCHED_GROUP because the *following*
> instruction is marked. In sched2, the next non-note instruction is insn
> 239 which has SCHED_GROUP_P set for some reason.
The key is that insn 234 is the last insn in block #12, so group_leader will
start looking at the first real insn after insn 234. The first insn it will
find is in block #13, and we haven't yet cleared SCHED_GROUP_P on the insns
in that block.
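
The walk across the block boundary can be reproduced with a small toy model.
This is only a sketch, not GCC source: struct toy_insn and every toy_* function
are hypothetical stand-ins for the rtx chain, next_nonnote_insn, group_leader,
and the per-block clearing in sched_analyze. The insn numbers and block numbers
are taken from the dump above, and the sched_group field models SCHED_GROUP_P
(which the dump appears to show as the /s on insn 239).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an insn; not GCC's real rtx representation.  */
struct toy_insn {
    int uid;                /* insn number as printed in the dump */
    int block;              /* basic block the insn belongs to */
    bool is_note;           /* true for NOTE insns */
    bool sched_group;       /* models SCHED_GROUP_P */
    struct toy_insn *next;  /* next insn in the chain */
};

/* Models next_nonnote_insn: skips notes and ignores block boundaries.  */
static struct toy_insn *toy_next_nonnote(struct toy_insn *insn)
{
    assert(insn != NULL);
    for (insn = insn->next; insn != NULL && insn->is_note; insn = insn->next)
        ;
    return insn;
}

/* Models group_leader: keeps walking while the following insn is grouped.  */
static struct toy_insn *toy_group_leader(struct toy_insn *insn)
{
    struct toy_insn *prev;

    do {
        prev = insn;
        insn = toy_next_nonnote(insn);
    } while (insn != NULL && insn->sched_group);

    return prev;
}

/* Models the clearing done when a single block is analyzed.  */
static void toy_clear_block(struct toy_insn *head, int block)
{
    for (; head != NULL; head = head->next)
        if (head->block == block)
            head->sched_group = false;
}

/* Builds the chain from the dump, clears block 12 (and optionally block 13),
   then returns the uid that the group_leader model yields for insn 234.  */
static int toy_leader_of_234(bool also_clear_block_13)
{
    struct toy_insn chain[] = {
        {527, 12, true,  false, NULL},  /* NOTE_INSN_BASIC_BLOCK, bb 12 */
        {233, 12, false, false, NULL},  /* compare */
        {234, 12, false, false, NULL},  /* jump, last insn of bb 12 */
        {528, 13, true,  false, NULL},  /* NOTE_INSN_BASIC_BLOCK, bb 13 */
        {244, 13, true,  false, NULL},  /* NOTE_INSN_DELETED */
        {246, 13, true,  false, NULL},  /* NOTE_INSN_DELETED */
        {239, 13, false, true,  NULL},  /* flag left over from sched1 */
    };
    size_t n = sizeof chain / sizeof chain[0];

    for (size_t i = 0; i + 1 < n; i++)
        chain[i].next = &chain[i + 1];

    toy_clear_block(chain, 12);
    if (also_clear_block_13)
        toy_clear_block(chain, 13);

    return toy_group_leader(&chain[2])->uid;
}
```

With only block 12 cleared, toy_leader_of_234(false) walks past the notes and
lands on insn 239 in block 13. Once block 13 is cleared as well,
toy_leader_of_234(true) stays on insn 234.
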
So I think it's pretty clear now how SCHED_GROUP_P is being unexpectedly
left set. The trick now is ensuring it is properly cleared.
I like your code to clear it as it is consumed, but I want to mentally verify
that your code will indeed clear all occurrences of SCHED_GROUP_P. Your code
would also eliminate the need to clear SCHED_GROUP_P in sched_analyze.
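
The property to verify can be stated with another toy sketch. Again, this is
not the proposed patch: toy_consume, toy_schedule_block, and the other names
are hypothetical, and the model simply assumes that "consuming" an insn is the
one place where the flag gets cleared. The question is then whether any insn
can finish a pass without going through that path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an insn; sched_group models SCHED_GROUP_P.  */
struct toy_insn {
    int uid;
    int block;
    bool sched_group;
};

/* Hypothetical "insn is consumed by the scheduler" step: the flag is
   cleared here instead of in a per-block sweep.  */
static void toy_consume(struct toy_insn *insn)
{
    insn->sched_group = false;
}

/* Schedules one block by consuming every insn that belongs to it.  */
static void toy_schedule_block(struct toy_insn *insns, size_t n, int block)
{
    for (size_t i = 0; i < n; i++)
        if (insns[i].block == block)
            toy_consume(&insns[i]);
}

/* Counts insns whose flag is still set after a pass.  */
static size_t toy_count_stale(const struct toy_insn *insns, size_t n)
{
    size_t stale = 0;

    assert(insns != NULL);
    for (size_t i = 0; i < n; i++)
        if (insns[i].sched_group)
            stale++;
    return stale;
}

/* Runs one toy pass over blocks 12 and 13; skipping block 13 models an
   insn that never reaches the consume step.  */
static size_t toy_stale_after_pass(bool skip_block_13)
{
    struct toy_insn insns[] = {
        {233, 12, false},
        {234, 12, false},
        {239, 13, true},
    };
    size_t n = sizeof insns / sizeof insns[0];

    toy_schedule_block(insns, n, 12);
    if (!skip_block_13)
        toy_schedule_block(insns, n, 13);

    return toy_count_stale(insns, n);
}
```

In this model a full pass leaves no flag set, so the next pass starts clean
regardless of block order. Any insn that bypasses the consume step, as in
toy_stale_after_pass(true), still carries the flag into the next pass.
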
jeff