This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: DFA scheduler producing sub-optimal code.
- From: Vladimir Makarov <vmakarov at redhat dot com>
- To: Dan Towner <dant at picochip dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Wed, 06 Nov 2002 11:50:33 -0500
- Subject: Re: DFA scheduler producing sub-optimal code.
- References: <3DC8F596.40500@picochip.com>
Dan Towner wrote:
>
> Hi,
>
> I am using the DFA scheduler for a 16-bit VLIW. The VLIW has 3
> instruction slots, and one constant value slot. On one of my test cases,
> I get the following schedule (I've added lines to show VLIW packets):
>
> ;; 4--> 10 R9=[R5+0x2] :slot1,nothing
> ------------------------------------------------------------------------
> ;; 5--> 11 R10=[R5+0x4] :slot1,nothing
> ------------------------------------------------------------------------
> ;; 6--> 12 R11=[R5+0x6] :slot1,nothing
> ;; 6--> 22 R3=R9 :slot0|slot1
> ------------------------------------------------------------------------
> ;; 7--> 23 R4=R10 :slot0|slot1
> ------------------------------------------------------------------------
> ;; 8--> 24 R5=R11 :slot0|slot1
> ------------------------------------------------------------------------
> ;; 9--> 70 R0=FP+0x4 :(slot0+slot1+slot2+slotC)
>
> Notice that cycles 7 and 8 could be scheduled into either slot0 or
> slot1. The instructions have been `split' from an SI
> register-to-register move, which has been decomposed into its
> constituent sub-register moves. However, the two instructions in their
> new form use different registers.
>
> Why doesn't the scheduler combine these two instructions into the same
> cycle? I've tried using the sched-verbose option (see output below), but
> I can't see any reason why it has chosen a new cycle for insn 24.
>
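Most probably it is because the latency of insn #12 is 2 cycles, so insn #24
cannot be issued until its data is ready. Insn #12 is issued on cycle 6 and
your dump shows insn #24 queued with cost=2, which makes cycle 8 the earliest
cycle for it. Insn #23 depends on insn #11, issued on cycle 5, so it is ready
on cycle 7, one cycle before insn #24.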
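
The arithmetic can be sketched in a few lines of Python. This is only an
illustration, not GCC code: the names LOAD_LATENCY, loads and earliest_issue
are made up for the example, and the issue cycles and the cost of 2 are read
off the dump quoted in this message.

```python
# Illustrative sketch only; this is not how the GCC scheduler is implemented.
# All names here are hypothetical. The numbers come from the quoted dump.

LOAD_LATENCY = 2  # the "cost=2" printed when each load's dependence is resolved

# load insn -> (cycle on which it was issued, dependent register move insn)
loads = {
    10: (4, 22),
    11: (5, 23),
    12: (6, 24),
}


def earliest_issue(producer_cycle, latency):
    """Earliest cycle on which a consumer of the producer's result can issue."""
    return producer_cycle + latency


# Map each dependent move to the first cycle on which its data is available.
ready = {move: earliest_issue(cycle, LOAD_LATENCY) for cycle, move in loads.values()}

for move, cycle in sorted(ready.items()):
    print(f"insn {move}: earliest issue cycle {cycle}")
```

Under these assumptions insns 23 and 24 come out one cycle apart, so they
cannot share a packet unless insn 23 is delayed by a cycle.
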
> Thanks,
>
> Dan.
>
> ----------------------
>
> ;; Clock 4
> ;; Ready list (t = 4): 70 12 11 10
> ;; 4--> 10 R9=[R5+0x2] :slot1,nothing
> ;; dependences resolved: insn 22 into queue with cost=2
> ;; Ready-->Q: insn 22: queued for 2 cycles.
> ;; Ready list (t = 4): 70 12 11
> ;; Ready-->Q: insn 11: queued for 1 cycles.
> ;; Ready list (t = 4): 70 12
> ;; Ready-->Q: insn 12: queued for 1 cycles.
> ;; Ready list (t = 4): 70
> ;; Ready-->Q: insn 70: queued for 1 cycles.
> ;; Ready list (t = 4):
> ;; Q-->Ready: insn 70: moving to ready without stalls
> ;; Q-->Ready: insn 12: moving to ready without stalls
> ;; Q-->Ready: insn 11: moving to ready without stalls
> ;; Ready list after queue_to_ready: 11 12 70
> ;; Clock 5
> ;; Ready list (t = 5): 70 12 11
> ;; 5--> 11 R10=[R5+0x4] :slot1,nothing
> ;; dependences resolved: insn 23 into queue with cost=2
> ;; Ready-->Q: insn 23: queued for 2 cycles.
> ;; Ready list (t = 5): 70 12
> ;; Ready-->Q: insn 12: queued for 1 cycles.
> ;; Ready list (t = 5): 70
> ;; Ready-->Q: insn 70: queued for 1 cycles.
> ;; Ready list (t = 5):
> ;; Q-->Ready: insn 70: moving to ready without stalls
> ;; Q-->Ready: insn 12: moving to ready without stalls
> ;; Q-->Ready: insn 22: moving to ready without stalls
> ;; Ready list after queue_to_ready: 22 12 70
> ;; Clock 6
> ;; Ready list (t = 6): 70 22 12
> ;; 6--> 12 R11=[R5+0x6] :slot1,nothing
> ;; dependences resolved: insn 24 into queue with cost=2
> ;; Ready-->Q: insn 24: queued for 2 cycles.
> ;; Ready list (t = 6): 70 22
> ;; 6--> 22 R3=R9 :slot0|slot1
> ;; Ready list (t = 6): 70
> ;; Ready-->Q: insn 70: queued for 1 cycles.
> ;; Ready list (t = 6):
> ;; Q-->Ready: insn 70: moving to ready without stalls
> ;; Q-->Ready: insn 23: moving to ready without stalls
> ;; Ready list after queue_to_ready: 23 70
> ;; Clock 7
> ;; Ready list (t = 7): 70 23
> ;; 7--> 23 R4=R10 :slot0|slot1
> ;; Ready list (t = 7): 70
> ;; Ready-->Q: insn 70: queued for 1 cycles.
> ;; Ready list (t = 7):
> ;; Q-->Ready: insn 70: moving to ready without stalls
> ;; Q-->Ready: insn 24: moving to ready without stalls
> ;; Ready list after queue_to_ready: 24 70
> ;; Clock 8
> ;; Ready list (t = 8): 70 24
> ;; 8--> 24 R5=R11 :slot0|slot1
> ;; Ready list (t = 8): 70
> ;; Ready-->Q: insn 70: queued for 1 cycles.
> ;; Ready list (t = 8):
> ;; Q-->Ready: insn 70: moving to ready without stalls
> ;; Ready list after queue_to_ready: 70
> ;; Clock 9
> ;; Ready list (t = 9): 70
> ;; 9--> 70 R0=FP+0x4 :(slot0+slot1+slot2+slotC)
> ;; dependences resolved: insn 25 into queue with cost=1
> ;; Ready-->Q: insn 25: queued for 1 cycles.
> ;; Ready list (t = 9):
> ;; Q-->Ready: insn 25: moving to ready without stalls
> ;; Ready list after queue_to_ready: 25
>
> =============================================================================
> Daniel Towner
> picoChip Designs Ltd., Riverside Buildings, 108, Walcot Street, BATH,
> BA1 5BG
> dant@picochip.com
> 07786 702589