Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips
Richard Biener
rguenther@suse.de
Fri Oct 25 12:57:00 GMT 2013
On Fri, 25 Oct 2013, Jan Hubicka wrote:
> > > OK, so it is about 2%. Did you check whether you need lookahead even in the early pass (before reload)? My guess would be so, but if not, it could cut the cost in half. For -Ofast/-O3 it looks reasonable to me, but we will need to announce it on the ML. For other settings I think we need to work on more improvements or cut the expenses.
> >
> > Yes, it is required before reload.
> >
> > I have another idea worth considering. Could we enable lookahead with the value 4 (pre-reload) by default? This would cut the build-time cost exponentially.
> > I have done some measurements on the build time of some benchmarks (mentioned below) with lookahead value 4. The 2% increase in build time with value 8 is now almost gone.
> >
> > benchmark     dfa4     no_lookahead
> >
> > perlbench     191s     193s
> > bzip2         19s      19s
> > gcc           429s     429s
> > mcf           3s       3s
> > gobmk         116s     115s
> > hmmer         60s      60s
> > sjeng         18s      17s
> > libquantum    6s       6s
> > h264ref       107s     107s
> > omnetpp       128s     128s
> > astar         7s       7s
> > bwaves        5s       5s
> > gamess        1964s    1957s
> > milc          18s      18s
> > GemsFDTD      273s     272s
> >
> > Lookahead value 4 also helps because the modified decoder model in bdver3.md is only two cycles deep (though in hardware it is actually 4 cycles deep). This means that we can look another two levels deep for a better schedule.
> > GemsFDTD still retains the performance boost of around 6-7% with value 4.
> >
> > Let me know your thoughts.
>
> This seems reasonable. I would go for lookahead of 4 for now and 8 for -Ofast
> and we can tune things based on the experience with this setting incrementally.
> Uros, Richard, what do you think?
Well, certainly -O3 not -Ofast.
Richard.
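
For reference, the build-time table quoted above can be totalled to see the aggregate cost of lookahead 4. The following is only a rough sketch: the numbers are copied from that table, and the variable names are illustrative, not taken from any GCC tooling.

```python
# Build times in seconds, copied from the table quoted above.
# Each entry maps benchmark -> (lookahead 4, no lookahead).
# The names used here are illustrative only.
build_times = {
    "perlbench": (191, 193),
    "bzip2": (19, 19),
    "gcc": (429, 429),
    "mcf": (3, 3),
    "gobmk": (116, 115),
    "hmmer": (60, 60),
    "sjeng": (18, 17),
    "libquantum": (6, 6),
    "h264ref": (107, 107),
    "omnetpp": (128, 128),
    "astar": (7, 7),
    "bwaves": (5, 5),
    "gamess": (1964, 1957),
    "milc": (18, 18),
    "GemsFDTD": (273, 272),
}

# Sum each column, then express the difference relative to the baseline.
total_dfa4 = sum(dfa4 for dfa4, _ in build_times.values())
total_none = sum(none for _, none in build_times.values())
overhead_pct = 100.0 * (total_dfa4 - total_none) / total_none

print(f"lookahead 4:  {total_dfa4}s")
print(f"no lookahead: {total_none}s")
print(f"overhead:     {overhead_pct:.2f}%")
```

On these figures the totals are 3344s against 3336s, an overhead of about 0.24%, which is in line with the statement that the 2% increase seen with value 8 is almost gone. The timings have one-second resolution, so the per-benchmark differences are within rounding noise.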