This is the mail archive of the
mailing list for the GCC project.
Re: prefetch revisited
> In your patch you pass an address, and offset, and an int to
Note that the patch is really more a toy than a real patch.
For greedy prefetching according to the article, one needs to figure out
what type the fetched pointer points to (this should be doable
with memref tracking) and prefetch all recursive pointers in the structure.
This is where I see a problem. I believe nothing in C guarantees that I can access
all fields of a structure when I access one of them, but I need to check the
standard, so at the moment I do only the one level of prefetching needed to get some
performance effect, except for trivial testcases such as an unrolled memset.
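
A minimal sketch of one-level prefetching of the recursive pointers in a
structure. It is an illustration only, not code from the patch: struct node
and tree_sum are hypothetical names, and it assumes the __builtin_prefetch
builtin of current GCC rather than the gen_prefetch interface discussed here.

```c
#include <stddef.h>

/* Hypothetical node type, used only for this illustration. */
struct node {
    int value;
    struct node *left;
    struct node *right;
};

/* Sum a binary tree, prefetching the nodes one level below the
   current one before descending into them. */
long tree_sum(const struct node *n)
{
    if (n == NULL)
        return 0;

    /* Only the pointer values n->left and n->right are read here.
       The GCC documentation says a data prefetch does not fault on an
       invalid address, so a NULL child is harmless. */
    __builtin_prefetch(n->left, 0, 3);
    __builtin_prefetch(n->right, 0, 3);

    return n->value + tree_sum(n->left) + tree_sum(n->right);
}
```

Note that the sketch only prefetches through fields it reads anyway, which
sidesteps the question of whether untouched fields may be accessed.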
> gen_prefetch; how is the offset used, and what is the int? IA-64 has a
This is because the pattern is hardcoded for i386, where the integer
selects the place in the cache hierarchy where I want the data to be loaded.
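
As a rough illustration of such an integer hint, here is a sketch that
assumes the __builtin_prefetch builtin of current GCC, whose third argument
is a locality value from 0 to 3. The wrapper prefetch_hint is a hypothetical
name, not part of the patch. The hint must be a compile-time constant, so a
run-time level has to be dispatched to constant calls.

```c
/* Hypothetical wrapper: map a run-time level to a constant locality hint.
   0 means no temporal locality; 3 means keep the data in all cache levels.
   On x86 with SSE, GCC typically emits prefetchnta, prefetcht2,
   prefetcht1 and prefetcht0 for hints 0 to 3 respectively. */
static inline void prefetch_hint(const void *p, int level)
{
    switch (level) {
    case 0:  __builtin_prefetch(p, 0, 0); break;
    case 1:  __builtin_prefetch(p, 0, 1); break;
    case 2:  __builtin_prefetch(p, 0, 2); break;
    default: __builtin_prefetch(p, 0, 3); break;
    }
}
```

A prefetch is only a hint, so the calls have no effect on program results.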
> base-update form of prefetch, which adds an offset value to the base
> register. Is your offset for something like that? What do you do with
> the offset on a machine that doesn't support a base-offset form of
I am not passing the offset itself; I am offsetting the memory address. On IA-64,
which can't offset the address, the code will just abort, but that can be easily fixed
by legitimizing the address and emitting the whole sequence.
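
To illustrate prefetching an offset address rather than passing a separate
offset, here is a small sketch. The names sum_ahead and AHEAD are
hypothetical, the distance of 16 elements is an arbitrary assumption, and
the code uses the __builtin_prefetch builtin of current GCC, not the
gen_prefetch interface from the patch.

```c
#include <stddef.h>

#define AHEAD 16  /* hypothetical prefetch distance, in elements */

/* Sum an array while prefetching the element AHEAD iterations away.
   The offset is ordinary address arithmetic; the prefetch itself only
   sees a single address. */
long sum_ahead(const long *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        /* Stay inside the array so the address expression is valid. */
        if (i + AHEAD < n)
            __builtin_prefetch(&a[i + AHEAD], 0, 3);
        s += a[i];
    }
    return s;
}
```

On a target without a base-plus-offset addressing mode, the compiler is
expected to compute the address into a register first.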
> I'm cleaning up my current changes so I can post a preliminary version
> of the prefetch infrastructure patch for review, with support for ia64
> and i386 (sse and 3dnow!). I'll put the update of your loop
> optimization prefetch support in a separate patch to show how it's used.
> Here's a preliminary description of the prefetch RTL information, based
> on support in IA-64 and i386 variants; there might be fewer if some of
> these aren't practical to use, or more if other machines have nifty
> capabilities that GCC can exploit.
> (prefetch addr off rw temploc cachelev)
> Represents prefetch of memory at address addr plus offset off.
I believe all you need is a single address. GCC already knows how to offset
addresses when needed, and I believe there is nothing that makes prefetch
special in this case.
> (Is the offset added before or after the prefetch?)
> The other operands specify which capabilities of the machine's
> prefetch support to use.
One problem is how to specify these flags properly in RTL. RTL already
has a flag field in each construct, but using it is, uhm, hackish. Perhaps
adding them just as const ints in the pattern is the easiest way to go.
Otherwise the proposal looks fine to me.