[patch] Tree level array prefetching

Zdenek Dvorak rakdver@atrey.karlin.mff.cuni.cz
Tue Jun 14 15:38:00 GMT 2005


Hello,

> > this patch implements prefetching on tree level.  It is the updated and
> > upgraded version of the prefetching pass I have developed on lno branch
> > about a year ago.  I am not sure whether this type of patch is suitable
> > at the current stage (most likely not), but anyway, comments are
> > welcome.
> > 
> > Description of how the pass works can be found at the beginning of
> > tree-ssa-loop-prefetch.c.  Basically we find memory references, check
> > for reuses to determine those that do not need to be prefetched and
> > those that do not need to be prefetched in every iteration, then
> > we unroll the loop as necessary and insert the prefetch instructions
> > (calls to __builtin_prefetch).
> > 
> > The patch does not remove the rtl profiling pass in order to keep it
> > shorter.  It also includes quite a few changes that are necessary or
> > useful to make other optimizers handle loops after unrolling and with
> > prefetch instructions (updating of frequencies after loop versioning and
> > unrolling, making the order of blocks after unrolling more sensible,
> > change to tree-outof-ssa to prevent TER from increasing register pressure
> > too much in unrolled loops, nicer names for temporary variables created
> > by store motion, etc.).  I will submit those separately, as they are
> > interesting regardless of this patch.
> > 
> > The patch was bootstrapped & regtested on i686 and x86_64 with the pass
> > enabled.  Below are the results (compared with the old rtl prefetching
> > pass) of spec2000 on athlon; it seems to be a clear win on specint
> > (with the only noticeable regression on crafty), and performs reasonably
> > on specfp (although there are significant regressions on a few tests;
> > I tried to investigate a few of these, and they are caused by reasons
> > that I was not able to fix, like the fact that the register allocator
> > sometimes does not manage to assign registers in an unrolled loop
> > as well as it does in a non-unrolled one).
> 
> I tried patching this on the mainline, and it didn't apply.  Would it be
> possible to port the patch to the mainline so it can be judged independently?
> I don't know how much you depend on stuff from the branch you were using,
> whether it would be simple or hard to do.

the patch is of course against mainline; probably it just conflicts with
some change from the last few days.  If you want, I will send you the updated
version privately.
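
To make the quoted description more concrete, here is a rough hand-written
sketch of the shape of code such a transformation aims for.  It is not output
of the patch.  The names sum_with_prefetch, PREFETCH_AHEAD and UNROLL, and
their values (64-byte cache lines, 4-byte ints), are assumptions chosen only
for illustration.  The bounds check before the prefetch is there only to keep
the address expression valid in portable C.

```c
#include <stddef.h>

/* Hypothetical tuning values, for illustration only. */
#define UNROLL         16  /* assumed: ints per 64-byte cache line */
#define PREFETCH_AHEAD 64  /* assumed: how many elements ahead to prefetch */

long sum_with_prefetch(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;

    /* Main loop processes one assumed cache line per iteration, so a
       single prefetch per iteration is enough; prefetching on every
       element would touch the same line repeatedly.  */
    for (; i + UNROLL <= n; i += UNROLL) {
        /* Guard keeps the address inside the array in this sketch.  */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0 /* read */, 3);

        /* Stands in for the unrolled body; written as a loop for brevity.  */
        for (size_t j = 0; j < UNROLL; j++)
            sum += a[i + j];
    }

    /* Remainder iterations that do not fill a whole unrolled step.  */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

The prefetch only hints at a future access and does not change the result, so
the function returns the same sum as a plain loop over the array.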

Zdenek


