This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: preliminary patch: prefetch support for i386
On Sat, Dec 01, 2001 at 12:54:37AM +0100, Jan Hubicka wrote:
> > On Fri, Nov 30, 2001 at 09:18:19PM +0100, Jan Hubicka wrote:
> > > first of all, thanks for the patch. It is something I really
> > > wanted to do for a long time.
> >
> > I know, I'm hoping that the existence of a general framework will
> > inspire you to update your prefetch optimizations for arrays in loops
>
> I will install your updated patch to cfg-branch. As 3.1 goes into feature
> freeze on the 15th, we have some time before the 3.2.x release. I hope that
> we will have a working CFG-based loop optimizer by then, and a working AST
> loop optimizer as well. The prefetch code should then recognize the
> possibilities at tree level and emit prefetches at a lower level, most probably.
First I'd like to update it again; the term "straddle" appears in a few
variable names and should be replaced with "stride", and I think there
were some other things I didn't get to yet.
Is that patch likely to be stable enough to go into 3.1, with specific
performance tweaks added later? The bulk of the code is only used with
a new optimization option, so it shouldn't hurt anything that doesn't
use that option.
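
As a rough illustration of the "arrays in loops" case mentioned above, a
hand-written equivalent of what such an optimization might emit could look
like the sketch below. The function name and the PREFETCH_AHEAD distance are
hypothetical and not taken from the patch; the three-argument form of
__builtin_prefetch (address, read/write flag, locality hint) is the one
documented for GCC.

```c
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead of the current
   index to prefetch.  A real choice depends on the target CPU. */
#define PREFETCH_AHEAD 16

/* Sum n doubles, issuing a read prefetch a fixed distance ahead. */
double sum_with_prefetch(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Stay inside the array so the address expression is valid. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 1);
        total += a[i];
    }
    return total;
}
```

The prefetch only changes cache behaviour; the result is the same as for the
plain loop, which is why such tweaks can be added or tuned later without
affecting correctness.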
> > and perhaps greedy prefetching of addresses in pointers! Please let me
>
> I still have the primitive code to do that. What is missing is to recognize
> the pointers in structures whose addresses are fetched and prefetch them.
> I am not sure this indirection is correct in C. May I assume that if
> I have a pointer to a structure and I know that the program reads some of its
> fields, the other fields are accessible too?
The new prefetch rtl code is defined to use only non-faulting prefetch
instructions and this optimization will not be turned on by default, so
C correctness isn't an issue, is it?
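
A minimal sketch of what greedy pointer prefetching amounts to, written by
hand for a linked list. The struct and function names are hypothetical. It
relies on the property stated above, which GCC also documents for
__builtin_prefetch: prefetching an invalid address does not fault, provided
the address expression itself is valid to evaluate.

```c
#include <stddef.h>

/* Hypothetical list node, used only for illustration. */
struct node {
    struct node *next;
    int value;
};

/* Sum the values in a list, prefetching each successor node early. */
long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
        /* p->next may be NULL at the tail; a non-faulting prefetch
           makes that harmless, so no extra check is needed. */
        __builtin_prefetch(p->next, 0, 1);
        total += p->value;
    }
    return total;
}
```

Note that the code only reads p->next, a field the loop reads anyway; the
prefetch never dereferences the pointer in the C sense.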
>
> Having separate switches looks confusing to me too, as Joe, the user, probably
> doesn't know which flavour of SSE, 3dNOW and other features his CPU supports.
Right.
> > > I remember that a property of the SSE prefetch is that it is a nop on
> > > older CPUs, so I guess it should be controlled by -mcpu instead of -march.
> >
> > It might be best to not generate prefetch instructions for CPUs where
> > they are nops, but then again if there is a call to __builtin_prefetch
> > we could assume that the programmer really wants them. Even as nops,
> > though, they make the code larger without adding anything.
>
> What I was wondering about is a switch like -mcpu=pentium4 saying optimize
> for pentium4, but do not use anything incompatible with i386. That can
> still generate the prefetch instructions for SSE, so the setting should
> depend on CPU selection, while the 3dNOW prefetch is an invalid instruction
> on earlier CPUs, so it must depend on ARCH selection.
Yes, that makes sense.
> > > Also, when writing a program, how will I be informed whether the
> > > prefetch builtin is supported or not?
> >
> > As things are now, by looking at the generated code. I had thought it
> > was a feature to silently treat __builtin_prefetch as a nop on targets
> > that don't have data prefetch support, but perhaps a warning would be
> > appropriate. There aren't other builtins that are safe to use when not
> > supported, so I didn't have an example to follow.
>
> I guess silently ignoring them is OK. The prefetch builtin is more a hint
> that the compiler may or may not use. Perhaps we can have a
> warning as an option, but I would disable it by default, as code written
> for a machine with prefetches should compile on machines w/o prefetches,
> and there should be a way to make it compile w/o warnings; otherwise that
> would need ifdefs.
I've thought about this some more and think that silently doing nothing
is the right thing to do on a target that doesn't support prefetch.
I'll add that to the __builtin_prefetch documentation before
resubmitting that patch.
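
To illustrate the consequence for user code: if the builtin silently expands
to nothing on targets without prefetch support, then source built with GCC
needs no per-target ifdefs. The sketch below assumes that behaviour; the
PREFETCH_READ macro and first_element function are hypothetical names, and
the fallback branch exists only for compilers that do not provide the
builtin at all.

```c
/* With GCC the builtin can be used unconditionally; it is treated as a
   hint and may generate no code on targets without data prefetch. */
#if defined(__GNUC__)
#define PREFETCH_READ(addr) __builtin_prefetch((addr), 0, 3)
#else
/* Hypothetical fallback for compilers without the builtin. */
#define PREFETCH_READ(addr) ((void)(addr))
#endif

/* Return the first element, hinting that it will be read. */
int first_element(const int *a)
{
    PREFETCH_READ(a);
    return a[0];
}
```

Either way the function returns the same value; only the generated
instructions differ between targets.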
Janis