This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



DFA lookahead confusion



Reading the documentation for target hook:

TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD

I have a hard time coming to conclusions :-)

Alpha and Pentium are using a value that seems to be dependent upon
the width of the processor.

However, as I read the documentation, there is no direct correlation
between the width of the processor and the value to use for DFA lookahead.
Even if processor width is 2 (as on some Alphas) you can still get
better schedules when using larger values for DFA lookahead.

For example, consider a ready list ordered like this:

	INT_OP0
	INT_OP1
	INT_OP2
	INT_OP3
	FPU_OP0
	FPU_OP1	/* inputs depend upon results of FPU_OP0 */
	FPU_OP2	/* inputs depend upon results of FPU_OP1 */
	FPU_OP3	/* inputs depend upon results of FPU_OP2 */

Let us assume that the processor can issue 2 instructions per cycle,
that there are two INT units and one FPU unit, and that FPU latency is
1 cycle (yes, I know this processor is stupidly designed; it's just for
example purposes :-)

An optimal schedule would be something like:

	INT_OP0
	FPU_OP0
	INT_OP1
	FPU_OP1	/* inputs depend upon results of FPU_OP0 */
	INT_OP2
	FPU_OP2	/* inputs depend upon results of FPU_OP1 */
	INT_OP3
	FPU_OP3	/* inputs depend upon results of FPU_OP2 */
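
To put numbers on the difference between the ready-list order and the
interleaved order above, here is a small stand-alone C sketch of the toy
machine.  It is not GCC code: the names (struct insn, count_cycles,
ready_order, interleaved) are made up for illustration, and it assumes
strict in-order issue with a result usable one cycle after its producer
issues.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INSNS 64
#define WIDTH     2   /* insns issued per cycle (toy machine) */
#define INT_UNITS 2
#define FPU_UNITS 1
#define LATENCY   1   /* cycles before a result can be consumed */

enum unit_kind { INT_UNIT, FPU_UNIT };

/* Hypothetical insn record; not a GCC data structure.  */
struct insn
{
  enum unit_kind unit;  /* functional unit the insn needs */
  int producer;         /* index of an earlier insn it depends on, or -1 */
};

/* Issue INSNS strictly in the given order and return the number of
   cycles used.  An insn that cannot issue in the current cycle (no
   slot, no free unit, or input not ready) starts a new cycle.  */
int
count_cycles (const struct insn *insns, size_t n)
{
  int issue_cycle[MAX_INSNS];
  int cycle = 1, issued = 0, int_used = 0, fpu_used = 0;

  assert (n <= MAX_INSNS);
  for (size_t i = 0; i < n; i++)
    {
      int p = insns[i].producer;
      assert (p < (int) i);   /* producers must come first */
      for (;;)
        {
          int dep_ok = p < 0 || issue_cycle[p] + LATENCY <= cycle;
          int unit_ok = (insns[i].unit == INT_UNIT
                         ? int_used < INT_UNITS
                         : fpu_used < FPU_UNITS);
          if (issued < WIDTH && dep_ok && unit_ok)
            break;
          cycle++;
          issued = int_used = fpu_used = 0;
        }
      issue_cycle[i] = cycle;
      issued++;
      if (insns[i].unit == INT_UNIT)
        int_used++;
      else
        fpu_used++;
    }
  return n ? cycle : 0;
}

/* The ready list as given: four INT ops, then the FPU chain.  */
const struct insn ready_order[8] = {
  { INT_UNIT, -1 }, { INT_UNIT, -1 }, { INT_UNIT, -1 }, { INT_UNIT, -1 },
  { FPU_UNIT, -1 }, { FPU_UNIT, 4 }, { FPU_UNIT, 5 }, { FPU_UNIT, 6 }
};

/* The interleaved order: one INT op paired with one FPU op.  */
const struct insn interleaved[8] = {
  { INT_UNIT, -1 }, { FPU_UNIT, -1 }, { INT_UNIT, -1 }, { FPU_UNIT, 1 },
  { INT_UNIT, -1 }, { FPU_UNIT, 3 }, { INT_UNIT, -1 }, { FPU_UNIT, 5 }
};
```

Under these assumptions count_cycles (ready_order, 8) gives 6 cycles and
count_cycles (interleaved, 8) gives 4, so issuing the INT ops first
costs two extra cycles on this machine.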

The documentation suggests that if DFA lookahead is set to a
non-positive value, the scheduler will begin issuing like this:

	INT_OP0
	INT_OP1

That already uses up all the issue slots for the first cycle, and we
aren't computing FPU results early enough to get an optimal schedule.
Now let us set DFA lookahead to 2.  The scheduler will try:

	INT_OP0
	INT_OP1

and

	INT_OP1
	INT_OP0

which still is not what we want.  It seems that choosing a DFA lookahead
value of 5 or 6 would result in the desired schedule, because only then
would the scheduler even consider the FPU operations for the first
cycle.
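
The effect of the window size can be checked with another stand-alone C
sketch.  It encodes my reading of the hook, not the actual haifa-sched
algorithm: each cycle the scheduler sees only the first `window' entries
still on the list, and the function returns the fewest cycles any choice
within that visibility limit can reach.  All names here (best_cycles,
best_cycles_from, can_issue) are hypothetical, and treating a window
smaller than the issue width as equal to the issue width is a modelling
choice meant to mirror the no-lookahead case described above.

```c
#define N_INSNS     8
#define ISSUE_WIDTH 2

/* Ready list from the example, in priority order: 4 INT ops, 4 FPU ops.  */
static const int is_fpu[N_INSNS] = { 0, 0, 0, 0, 1, 1, 1, 1 };
/* Producer index for each insn, or -1; this models the FPU chain.  */
static const int dep[N_INSNS] = { -1, -1, -1, -1, -1, 4, 5, 6 };

/* Can insn I issue in a cycle that starts with the insns in MASK
   already issued?  FPU latency of 1 means the producer must have
   issued in an earlier cycle.  */
static int
can_issue (unsigned mask, int i)
{
  return dep[i] < 0 || (mask & (1u << dep[i])) != 0;
}

/* Fewest cycles needed to issue every insn not yet in MASK when only
   the first WINDOW unissued list entries are visible each cycle.  */
int
best_cycles_from (unsigned mask, int window)
{
  int visible[N_INSNS];
  int n_visible = 0, best = N_INSNS + 1;

  if (mask == (1u << N_INSNS) - 1)
    return 0;
  if (window < ISSUE_WIDTH)
    window = ISSUE_WIDTH;   /* assumption: behaves like plain greedy fill */

  for (int i = 0; i < N_INSNS && n_visible < window; i++)
    if (!(mask & (1u << i)))
      visible[n_visible++] = i;

  for (int a = 0; a < n_visible; a++)
    {
      int x = visible[a];
      if (!can_issue (mask, x))
        continue;
      /* Issue X alone this cycle.  */
      int c = 1 + best_cycles_from (mask | (1u << x), window);
      if (c < best)
        best = c;
      /* Or pair X with a later visible insn; only one FPU unit exists.  */
      for (int b = a + 1; b < n_visible; b++)
        {
          int y = visible[b];
          if (!can_issue (mask, y) || (is_fpu[x] && is_fpu[y]))
            continue;
          c = 1 + best_cycles_from (mask | (1u << x) | (1u << y), window);
          if (c < best)
            best = c;
        }
    }
  return best;
}

int
best_cycles (int window)
{
  return best_cycles_from (0, window);
}
```

In this model best_cycles (2) is 6, best_cycles (3) and best_cycles (4)
are 5, and only best_cycles (5) or larger reaches the 4-cycle schedule,
which matches the argument that FPU_OP0 has to be visible in the first
cycle.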

Look at the strong wording in the Alpha implementation of DFA
lookahead:


/* How many alternative schedules to try.  This should be as wide as the
   scheduling freedom in the DFA, but no wider.  Making this value too
   large results extra work for the scheduler.  */

This comment seems to contradict reality.  Making this value wider
than the scheduling freedom of the DFA does yield benefits, and huge
ones in some cases.

The same language is used in the x86 Pentium implementation.

