This is the mail archive of the mailing list for the GCC project.

Re: DFA for PPro, P2, P3

In message <>, Jan Hubicka writes:
 > > Now we have I1, which uses one instance of P01 and P0.  Those are precisely
 > > the resources we have available, so we go ahead and fire I1.  [ Note this
 > > is based on haifa's notion of cycles and issue rates, meaning the 4:1:1
 > > uop decoder template is not modeled. ]
 > > 
 > > 
 > > Clearly this is not good as we actually over-subscribed the P0 unit with
 > > two uops in a single cycle.  We over-subscribed the P0 unit because we
 > > have not accurately described the pipeline.   And yes, this actually happens.
 > No, what we do is to model p0 instruction to use both p0 unit and p01 unit,
 > so the p01 unit does not get over-subscribed.  Only one p01 instruction can
 > be issued then.  The description actually is accurate, just unnatural.
Highly unnatural.  Which hits one of the other DFA benefits -- the ability to
describe processor pipelines in a rational way. :-)
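
To make the trick above concrete, here is a minimal sketch of the counting
model in Python.  It is not GCC code: the unit names, the capacities, and the
helper can_issue_together are assumptions chosen for illustration.  The
"naive" table lets a p0-only uop reserve just p0, while the second table also
charges it one p01 slot, as Jan describes.

```python
from collections import Counter

# Hypothetical capacities: one p0 port, one p1 port, and two "p01" slots
# that stand for "either port".
CAPACITY = {"p0": 1, "p1": 1, "p01": 2}

# Naive description: a port-specific uop reserves only its own port.
NAIVE_RESERVES = {
    "p0_only": ("p0",),
    "p1_only": ("p1",),
    "either": ("p01",),
}

# The workaround: a port-specific uop also takes one p01 slot.
RESERVES = {
    "p0_only": ("p0", "p01"),
    "p1_only": ("p1", "p01"),
    "either": ("p01",),
}

def can_issue_together(uops, reserves=RESERVES):
    """Return True if all uops fit in one cycle under the counting model."""
    used = Counter()
    for kind in uops:
        used.update(reserves[kind])
    return all(used[unit] <= CAPACITY[unit] for unit in used)
```

With the naive table, ["p0_only", "either", "either"] is accepted even though
only two ports exist; with the workaround table it is rejected, and at most one
"either" uop can accompany a p0-only uop.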

 > The thing you may describe well using DFA and not using old scheme are the
 > decoders - ie PentiumPro has 3 decoders, where decoder 0 is able to decode
 > more than others so you need to order the triples of instructions issued
 > same cycle accordingly.
Yes, there are multiple ways we can approach this -- either by having a cpu
unit for each decoder or using the VLIW packing capabilities.  It's unclear
to me which will be best.
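
As a rough illustration of why instruction order matters for the decoders,
the following Python sketch counts decode cycles under a simplified 4:1:1
template.  The function decode_cycles and its rules are assumptions for
illustration only: decoding is in program order, decoder 0 accepts an
instruction of 1 to 4 uops, decoders 1 and 2 accept single-uop instructions
only, and anything larger than 4 uops is out of scope.

```python
def decode_cycles(uop_counts):
    """Count decode cycles for instructions given in program order.

    Simplified 4:1:1 template: decoder 0 takes one instruction of up to
    4 uops; decoders 1 and 2 each take a single-uop instruction.
    """
    cycles = 0
    i = 0
    n = len(uop_counts)
    while i < n:
        if not 1 <= uop_counts[i] <= 4:
            raise ValueError("this sketch only handles 1 to 4 uops")
        cycles += 1
        i += 1  # decoder 0 takes the next instruction
        # Decoders 1 and 2 accept the following instructions only if each
        # is a single uop; otherwise the rest waits for the next cycle.
        for _ in range(2):
            if i < n and uop_counts[i] == 1:
                i += 1
            else:
                break
    return cycles
```

Under these assumptions the sequence [2, 1, 1, 2, 1, 1] decodes in 2 cycles,
while the same six instructions ordered as [1, 2, 1, 2, 1, 2] take 4, which is
the kind of ordering constraint a scheduler description would need to capture.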

