This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: P3 SSE/MMX support: adding the patterns


On Wed, Sep 06, 2000 at 02:41:16PM +0100, Bernd Schmidt wrote:
> > Perhaps we ought to make calls.c generate these?  Seems silly to
> > have to define nine variants of the same thing.
> 
> It is silly, but the easiest thing to do.  I can think of two ways to
> improve this: either implement a macro mechanism for md files, or fix
> emit_push_insn so that it uses pushxx as a named pattern, and falls back
> to add/move when it fails.  Both are likely to be quite a bit of extra
> work (the latter because we may need to change a lot of ports).

I wouldn't have guessed that fixing emit_push_insn would require
changing lots of ports.  In any case it's no big deal, just something
that's sorta irritated me for a while.

> This pattern is generated by the loadhps/storehps builtins, both of which
> ensure that one argument is a MEM.

Ok.

> I have to admit I'm bewildered by the ia64 "mf" pattern.  It creates
>   (mem:BLK (mem:BLK (scratch)))
> and I fail to see the point of the two nested MEMs.

Heh.  The nested mems are actually a cut and paste error.  But
discounting that, we've got a read and a write to unspecified
volatile memory, which should alias with everything, and so prevent
any memory reference from crossing it.
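
For readers more at home in C than in RTL, the closest source-level
analogue is probably an empty asm with a "memory" clobber, which GCC
likewise treats as touching memory it knows nothing about. A minimal
sketch, assuming GCC's extended asm syntax; `payload`, `ready`,
`publish` and `consume` are hypothetical names, and unlike the ia64
"mf" instruction this restrains only the compiler, not the hardware:

```c
#include <assert.h>

/* Hypothetical shared state, for illustration only. */
static int payload;
static int ready;

/* Compiler-only barrier: an empty asm that claims to clobber memory,
   so the compiler may not move loads or stores across it.  It emits
   no fence instruction. */
#define COMPILER_BARRIER() __asm__ __volatile__("" : : : "memory")

/* Store the payload, then set the flag; the barrier keeps the
   compiler from reordering the two stores. */
void publish(int value)
{
    payload = value;
    COMPILER_BARRIER();
    ready = 1;
}

/* Read the payload; the caller must have seen publish() complete. */
int consume(void)
{
    assert(ready);
    COMPILER_BARRIER();
    return payload;
}
```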

> Isn't the point where the prefetch is added rather critical for getting
> good performance?  At least I think we should disallow scheduling memory
> references across a prefetch,

Yeah, it should be early enough, but not too early.  But think of it
the other way around -- with the volatile, the prefetch can't move up
either.  And really, the prefetch should percolate up to the first
pipeline bubble after its address is ready.

> or we might end up moving the prefetch after the "real"
> memory reference that it belongs to.

If the real memory reference is that close, you're wasting your time
with the prefetch.  You should be prefetching memory 4 to 8 iterations
in front of where you're working.  Remember, the point is to overcome
the 35 cycle wait for L2/3 cache or the 100 cycle wait for main memory.
That's a lot of time, which implies you've got to put the prefetch
well in advance of when the data will be needed.
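
As a rough illustration of prefetching several iterations ahead, here
is a minimal sketch at the C level. It assumes GCC's
`__builtin_prefetch` builtin, which is not part of the patch under
discussion; `PREFETCH_DISTANCE` and `sum_with_prefetch` are
hypothetical names, and the distance of 8 is simply the upper end of
the 4 to 8 range mentioned above, not a measured value:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many iterations ahead to prefetch.
   The right value depends on the machine and the loop body. */
#define PREFETCH_DISTANCE 8

/* Sum n doubles, requesting a[i + PREFETCH_DISTANCE] while working
   on a[i].  A tuned loop would prefetch once per cache line rather
   than once per element; this keeps the idea visible instead. */
double sum_with_prefetch(const double *a, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* Only prefetch addresses that lie inside the array. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        sum += a[i];
    }
    return sum;
}
```

The second argument (0) marks the access as a read and the third (3)
asks for the data to be kept in all cache levels; both are hints that
the target is free to ignore.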



r~
