This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: DFA for PPro, P2, P3
- From: law at redhat dot com
- To: Jan Hubicka <jh at suse dot cz>
- Cc: gcc at gcc dot gnu dot org
- Date: Fri, 03 May 2002 08:51:09 -0600
- Subject: Re: DFA for PPro, P2, P3
- Reply-to: law at redhat dot com
In message <20020503084721.GI22728@atrey.karlin.mff.cuni.cz>, Jan Hubicka writes:
> > Now we have I1, which uses one instance of P01 and P0. Those are precisely
> > the resources we have available, so we go ahead and fire I1. [ Note this
> > is based on haifa's notion of cycles and issue rates, meaning the 4:1:1
> > uop decoder template is not modeled. ]
> >
> >
> > Clearly this is not good as we actually over-subscribed the P0 unit with
> > two uops in a single cycle. We over-subscribed the P0 unit because we
> > have not accurately described the pipeline. And yes, this actually happens.
>
> No, what we do is model a p0 instruction as using both the p0 unit and the
> p01 unit, so the p01 unit does not get over-subscribed. Only one p01
> instruction can be issued then. The description actually is accurate, just
> unnatural.
Highly unnatural. Which hits one of the other DFA benefits -- the ability to
describe processor pipelines in a rational way. :-)
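
A minimal sketch of the reservation trick discussed above, assuming a per-cycle model with one "p0" unit and a two-slot "p01" pool. The table names NAIVE and PAIRED and the function fits_in_one_cycle are hypothetical illustrations, not taken from GCC's machine descriptions. It only shows why charging a p0 instruction against both units keeps the pool from being over-subscribed.

```python
# Capacity of each modeled unit per cycle: one P0 port, plus a two-slot
# "p01" pool standing for "either port 0 or port 1".
CAPACITY = {"p0": 1, "p01": 2}

# Hypothetical reservation tables (assumptions for illustration only).
# NAIVE: a p0-only instruction reserves just the p0 unit.
# PAIRED: a p0-only instruction also takes one slot of the p01 pool.
NAIVE = {"p0_only": {"p0": 1}, "either": {"p01": 1}}
PAIRED = {"p0_only": {"p0": 1, "p01": 1}, "either": {"p01": 1}}


def fits_in_one_cycle(kinds, table):
    """Return True if every instruction kind can reserve its units in one cycle."""
    used = {unit: 0 for unit in CAPACITY}
    for kind in kinds:
        for unit, count in table[kind].items():
            used[unit] += count
            if used[unit] > CAPACITY[unit]:
                return False
    return True


group = ["p0_only", "either", "either"]  # three uops, but only two ports
print(fits_in_one_cycle(group, NAIVE))   # the naive table accepts the group
print(fits_in_one_cycle(group, PAIRED))  # the paired table rejects it
```

With the paired table, a p0 instruction leaves only one p01 slot free, which matches the "only one p01 instruction can be issued then" behaviour described in the quoted text.
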
> What you can describe well with the DFA, and not with the old scheme, are the
> decoders, i.e. the PentiumPro has 3 decoders, where decoder 0 can decode
> more than the others, so you need to order the triples of instructions issued
> in the same cycle accordingly.
Yes, there are multiple ways we can approach this -- either by having a cpu
unit for each decoder or by using the VLIW packing capabilities. It's unclear
to me which will be best.
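
To illustrate why instruction order matters for the decoders, here is a rough sketch of a 4:1:1 template under simplifying assumptions: strictly in-order decode, decoder 0 accepting instructions of 1 to 4 uops, decoders 1 and 2 accepting only single-uop instructions, and microcoded instructions and fetch alignment ignored. The function name decode_cycles is hypothetical and this is not how GCC models the decoders.

```python
def decode_cycles(uop_counts):
    """Count cycles needed to decode instructions in order under a 4:1:1 template."""
    cycles = 0
    i = 0
    n = len(uop_counts)
    while i < n:
        if not 1 <= uop_counts[i] <= 4:
            raise ValueError("sketch only handles 1..4 uops per instruction")
        cycles += 1
        i += 1  # decoder 0 takes the first instruction of the group
        # Decoders 1 and 2 accept only single-uop instructions; a larger
        # instruction ends the group and waits for decoder 0 next cycle.
        slots = 2
        while slots and i < n and uop_counts[i] == 1:
            i += 1
            slots -= 1
    return cycles


print(decode_cycles([2, 1, 1]))  # multi-uop instruction first
print(decode_cycles([1, 1, 2]))  # same instructions, multi-uop one last
```

Under these assumptions the same three instructions decode in fewer cycles when the multi-uop instruction comes first, which is the ordering constraint a scheduler would want to capture.
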
jeff