This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: profile-driven optimization and the linker?
On Thu, 2004-11-25 at 11:59 -0800, Dan Kegel wrote:
> Jeffrey A Law wrote:
> >>http://gcc.gnu.org/news/profiledriven.html doesn't mention
> >>having ld reorder sections to improve locality of reference
> >>of frequently called functions. Has that technique
> >>been tried with gcc and binutils?
> >
> > Yes. I had some success with it a while back.
>
> Was that on static executables, or dynamically linked ones?
I've done both at different times with different linker technologies.
>
> > With ELF it's not terribly hard since we can arrange for
> > each function to be put in its own section. Thus you get
> > to do function-level reordering by just twiddling a linker
> > map. In fact, that's where -ffunction-sections name from.
> >
> > We had some hacks to gprof which would take profile data
> > and build a linker map. That's the code that really needs
> > to be improved. I never understood the gprof code well
> > enough to do it right.
>
> I may take a whack at this in my copious free time.
> Are any such hacks online, or shall I just start from scratch?
I believe we put all the code into gprof, though I haven't followed
it over time and it may have been removed.
Basically there was two pieces to gprof. The first within gprof
itself to produce an ordering of functions. That code will
probably need to be rewritten (I never liked the way I did it
and I never got the coalescing of information step done).
The second piece was a simple shell script which took the ordering
produced by gprof and created a linker script. Nothing too radical
here.
> I guess further discussion should be on the binutils mailing list.
Probably. From a compiler standpoint I think all the pieces are in
place.
> I first heard of this technique about 1993 or so, when
> Lee Hasiuk implemented it on MS-DOS.
I did a lot of this kind of stuff in the early 90s -- back when
a.out was still popular. With some assembler help we were able
to do function level reordering even on a.out files. There's
some awful memories... I have no idea if OMOS lives on or not.
The gcc/gas/gprof stuff was done in the mid 90s.
> http://mail.kde.org/pipermail/kde-optimize/2003-January/000108.html
> describes something similar, but which only aimed at
> increasing *disk* cache locality when paging in functions from
> disk for the first call.
We were primarily measuring TLB effects with ours -- page locality
for disk access falls out from the obvious algorithms. Though IIRC
we actually did better by first removing all the single call nodes
from the graph and putting them into one contigious region, then
applying the greedy algorithm on the rest of the call graph.
> FWIW, Alan Cox mentioned the technique a couple weeks ago,
> http://marc.theaimsgroup.com/?l=linux-kernel&m=109935741612533&w=2
> but didn't mention any code.
I've heard of GNU Rope, but I've never seen any code...
jeff