This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: DFA scheduler producing sub-optimal code.


Dan Towner wrote:
> 
> Hi,
> 
> I am using the DFA scheduler for a 16-bit VLIW. The VLIW has 3
> instruction slots, and one constant value slot. On one of my test cases,
> I get the following schedule (I've added lines to show VLIW packets):
> 
> ;;        4--> 10   R9=[R5+0x2]              :slot1,nothing
> ------------------------------------------------------------------------
> ;;        5--> 11   R10=[R5+0x4]             :slot1,nothing
> ------------------------------------------------------------------------
> ;;        6--> 12   R11=[R5+0x6]             :slot1,nothing
> ;;        6--> 22   R3=R9                    :slot0|slot1
> ------------------------------------------------------------------------
> ;;        7--> 23   R4=R10                   :slot0|slot1
> ------------------------------------------------------------------------
> ;;        8--> 24   R5=R11                   :slot0|slot1
> ------------------------------------------------------------------------
> ;;        9--> 70   R0=FP+0x4                :(slot0+slot1+slot2+slotC)
> 
> Notice that the instructions in cycles 7 and 8 could each be scheduled
> into either slot0 or slot1. The instructions have been `split' from an SI
> register-to-register move, which has been decomposed into its
> constituent sub-register moves. However, the two instructions in their
> new form use different registers.
> 
> Why doesn't the scheduler combine these two instructions into the same
> cycle? I've tried using the sched-verbose option (see output below), but
> I can't see any reason why it has chosen a new cycle for insn 24.
>

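Most probably it is because the latency of insn #12 is 2 cycles (the dump
shows insn 24 queued with cost=2).  Insn #12 is issued on cycle 6, so insn
#24 cannot be issued until cycle 8, when its data is ready.  Likewise insn
#23 depends on insn #11, issued on cycle 5, so it becomes ready on cycle 7,
one cycle earlier.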
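
To make the arithmetic concrete, here is a minimal Python sketch of the
constraint.  It is not GCC code: it assumes a load latency of 2 cycles (as
the "cost=2" lines in the dump suggest), takes the issue cycles from the
schedule quoted above, and uses made-up names throughout.

```python
# Illustrative model only; this is not how GCC's scheduler is implemented.
LOAD_LATENCY = 2  # assumed from "into queue with cost=2" in the dump

# Issue cycle of each load, taken from the schedule in the original message.
load_issue_cycle = {10: 4, 11: 5, 12: 6}

# Each move reads the register written by one load (consumer -> producer).
producer_of = {22: 10, 23: 11, 24: 12}


def earliest_issue_cycle(consumer):
    """Earliest cycle at which the consumer's input data is available."""
    producer = producer_of[consumer]
    return load_issue_cycle[producer] + LOAD_LATENCY


earliest = {insn: earliest_issue_cycle(insn) for insn in producer_of}
for insn, cycle in sorted(earliest.items()):
    print(f"insn {insn}: earliest cycle {cycle}")
```

Under these assumptions insn 23 is ready on cycle 7 and insn 24 only on
cycle 8, so the two moves cannot share a packet even though slot0 and slot1
are both free.  The loads themselves are one cycle apart because each of
them reserves slot1.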
 
> Thanks,
> 
> Dan.
> 
> ----------------------
> 
> ;;      Clock 4
> ;;      Ready list (t =  4):    70  12  11  10
> ;;        4--> 10   R9=[R5+0x2]                        :slot1,nothing
> ;;              dependences resolved: insn 22 into queue with cost=2
> ;;              Ready-->Q: insn 22: queued for 2 cycles.
> ;;      Ready list (t =  4):    70  12  11
> ;;              Ready-->Q: insn 11: queued for 1 cycles.
> ;;      Ready list (t =  4):    70  12
> ;;              Ready-->Q: insn 12: queued for 1 cycles.
> ;;      Ready list (t =  4):    70
> ;;              Ready-->Q: insn 70: queued for 1 cycles.
> ;;      Ready list (t =  4):
> ;;              Q-->Ready: insn 70: moving to ready without stalls
> ;;              Q-->Ready: insn 12: moving to ready without stalls
> ;;              Q-->Ready: insn 11: moving to ready without stalls
> ;;              Ready list after queue_to_ready:    11  12  70
> ;;      Clock 5
> ;;      Ready list (t =  5):    70  12  11
> ;;        5--> 11   R10=[R5+0x4]                       :slot1,nothing
> ;;              dependences resolved: insn 23 into queue with cost=2
> ;;              Ready-->Q: insn 23: queued for 2 cycles.
> ;;      Ready list (t =  5):    70  12
> ;;              Ready-->Q: insn 12: queued for 1 cycles.
> ;;      Ready list (t =  5):    70
> ;;              Ready-->Q: insn 70: queued for 1 cycles.
> ;;      Ready list (t =  5):
> ;;              Q-->Ready: insn 70: moving to ready without stalls
> ;;              Q-->Ready: insn 12: moving to ready without stalls
> ;;              Q-->Ready: insn 22: moving to ready without stalls
> ;;              Ready list after queue_to_ready:    22  12  70
> ;;      Clock 6
> ;;      Ready list (t =  6):    70  22  12
> ;;        6--> 12   R11=[R5+0x6]                       :slot1,nothing
> ;;              dependences resolved: insn 24 into queue with cost=2
> ;;              Ready-->Q: insn 24: queued for 2 cycles.
> ;;      Ready list (t =  6):    70  22
> ;;        6--> 22   R3=R9                              :slot0|slot1
> ;;      Ready list (t =  6):    70
> ;;              Ready-->Q: insn 70: queued for 1 cycles.
> ;;      Ready list (t =  6):
> ;;              Q-->Ready: insn 70: moving to ready without stalls
> ;;              Q-->Ready: insn 23: moving to ready without stalls
> ;;              Ready list after queue_to_ready:    23  70
> ;;      Clock 7
> ;;      Ready list (t =  7):    70  23
> ;;        7--> 23   R4=R10                             :slot0|slot1
> ;;      Ready list (t =  7):    70
> ;;              Ready-->Q: insn 70: queued for 1 cycles.
> ;;      Ready list (t =  7):
> ;;              Q-->Ready: insn 70: moving to ready without stalls
> ;;              Q-->Ready: insn 24: moving to ready without stalls
> ;;              Ready list after queue_to_ready:    24  70
> ;;      Clock 8
> ;;      Ready list (t =  8):    70  24
> ;;        8--> 24   R5=R11                             :slot0|slot1
> ;;      Ready list (t =  8):    70
> ;;              Ready-->Q: insn 70: queued for 1 cycles.
> ;;      Ready list (t =  8):
> ;;              Q-->Ready: insn 70: moving to ready without stalls
> ;;              Ready list after queue_to_ready:    70
> ;;      Clock 9
> ;;      Ready list (t =  9):    70
> ;;        9--> 70   R0=FP+0x4                          :(slot0+slot1+slot2+slotC)
> ;;              dependences resolved: insn 25 into queue with cost=1
> ;;              Ready-->Q: insn 25: queued for 1 cycles.
> ;;      Ready list (t =  9):
> ;;              Q-->Ready: insn 25: moving to ready without stalls
> ;;              Ready list after queue_to_ready:    25
> 
> =============================================================================
> Daniel Towner
> picoChip Designs Ltd., Riverside Buildings, 108, Walcot Street, BATH,
> BA1 5BG
> dant@picochip.com
> 07786 702589
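
For longer dumps, the same information can be pulled out mechanically.  The
following is a rough Python sketch, assuming the dump keeps the line shapes
seen in the message above ("N--> insn ..." and "dependences resolved: insn N
into queue with cost=C") and that the "> " quote prefixes have been stripped
first.  The regular expressions and the function name are illustrative and
not part of GCC.

```python
import re

# A few lines copied from the sched-verbose dump in the original message.
DUMP = """\
;;        4--> 10   R9=[R5+0x2]                        :slot1,nothing
;;              dependences resolved: insn 22 into queue with cost=2
;;        5--> 11   R10=[R5+0x4]                       :slot1,nothing
;;              dependences resolved: insn 23 into queue with cost=2
;;        6--> 12   R11=[R5+0x6]                       :slot1,nothing
;;              dependences resolved: insn 24 into queue with cost=2
"""

# "<clock>--> <insn>" marks an issued insn; the "resolved" line that follows
# names a dependent insn and the cost (latency) it has to wait.
ISSUE_RE = re.compile(r"^;;\s+(\d+)-->\s+(\d+)\s")
RESOLVED_RE = re.compile(
    r"dependences resolved: insn (\d+) into queue with cost=(\d+)"
)


def ready_cycles(dump):
    """Map each dependent insn to (producer insn, cycle it becomes ready)."""
    result = {}
    clock = producer = None
    for line in dump.splitlines():
        issued = ISSUE_RE.match(line)
        if issued:
            clock, producer = int(issued.group(1)), int(issued.group(2))
            continue
        resolved = RESOLVED_RE.search(line)
        if resolved and producer is not None:
            insn, cost = int(resolved.group(1)), int(resolved.group(2))
            result[insn] = (producer, clock + cost)
    return result


for insn, (producer, cycle) in sorted(ready_cycles(DUMP).items()):
    print(f"insn {insn} waits for insn {producer}, ready on cycle {cycle}")
```

Any dependent insn whose ready cycle is later than the cycle you expected
points at the producer whose latency is holding it back.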

