This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: An unusual Performance approach using Synthetic registers

From: Andy Walker <ja_walker at earthlink dot net>
To: Tom Lord <lord at emf dot net>
Cc: dewar at gnat dot com,gcc at gcc dot gnu dot org
Date: Sun, 5 Jan 2003 23:37:47 -0600
Subject: Re: An unusual Performance approach using Synthetic registers
References: <20030105113840.BF53CF28C4@nile.gnat.com> <200301051224.EAA22286@emf.net>

Thank you.  You have clearly stated some things that I have been implying.

On Sunday 05 January 2003 06:24 am, Tom Lord wrote:
>        dewar:
>
> 	This is a bit of an odd statement. In practice on a machine
> 	like the x86, the current stack frame will typically be
> 	resident in L1 cache, and that's where the register allocator
> 	spills to. What some of us still don't see is the difference
> 	in final resulting code between your "synthetic registers" and
> 	normal spill locations from the register allocator.
>
>
> Register spills clearly don't equal synthetic registers.
>
> Presumably, the number of locations dedicated to register spills never
> exceeds (approximately) the maximum number of simultaneously live
> _intermediate_ values minus the number of general purpose registers.
> Any non-intermediate value (i.e., one that has a main memory
> location), rather than being spilled, will be written to its location.
> If that value is later re-used, it will be retrieved from memory.
>
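
The estimate in the paragraph above can be written down directly. This is a minimal sketch of that bound only; the function and parameter names are made up for illustration and are not GCC terminology, and a real allocator's spill count depends on much more than these two numbers.

```c
#include <stddef.h>

/* Rough upper bound on spill slots, as estimated in the quoted text:
   the live intermediate values that do not fit in the general purpose
   registers.  Names are hypothetical, chosen for this sketch. */
size_t estimated_spill_slots(size_t max_live_intermediates,
                             size_t gp_registers)
{
    if (max_live_intermediates <= gp_registers)
        return 0;   /* everything fits in registers, nothing to spill */
    return max_live_intermediates - gp_registers;
}
```

The point of the comparison is that the number of synthetic registers is not tied to this bound; it can be chosen larger.
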
> The number of synthetic registers can be much larger than the number
> of simultaneously live intermediate values.
>
> So, with synthetic registers, some values that are not intermediates
> can be retained (in synthetic registers).  Without synthetic
> registers, the next time those values are used, they have to be
> fetched from (non-special) memory.
>
> In other words, with synthregs, the CPU can ship some value off to
> memory and not care how long it takes to get there or to get back from
> there -- because it also ships it off to the synthreg, which it
> hypothetically has faster access to.
>
> In practice, that means that synthregs will store some values in
> memory twice: once in the location the program text says they go in;
> again in the synthetic register.  If the synthetic register is indeed
> cache-favored, maybe there's a performance win there -- and if so, a
> register allocator is the right algorithm to decide which values to
> keep duplicated in synthetic registers (so the proposed implementation
> strategy is sensible).
>
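
The "stored twice" behaviour can be sketched by hand in C. This is only an analogue of what a compiler might emit, assuming the synthetic registers are a small fixed array in memory; SYNTHREG_COUNT, the array, and both function names are hypothetical and do not come from the proposal.

```c
#include <stddef.h>

#define SYNTHREG_COUNT 32   /* assumed size; the thread does not fix one */

/* Hypothetical block of synthetic registers: ordinary memory that is
   touched often enough that it should tend to stay in cache. */
static long synthreg[SYNTHREG_COUNT];

/* Write a non-intermediate value to its home location, as the program
   text requires, and keep a duplicate in a synthetic register. */
void store_with_synthreg(long *home, size_t slot, long value)
{
    *home = value;           /* the location the program says it goes in */
    synthreg[slot] = value;  /* the duplicate that later uses read from */
}

/* A later use reads the duplicate instead of going back to the home
   location. */
long reload_from_synthreg(size_t slot)
{
    return synthreg[slot];
}
```

The sketch ignores aliasing: if the home location is modified through another pointer, the duplicate goes stale, so whatever decides which values to duplicate would have to account for that.
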
> (Another weird interaction is intermediate values that can be
> recalculated -- I don't know if GCC ever makes that trade-off --
> if it does, it needs to be tuned for synthregs.)
>
> So, does that hypothesis (that synthreg access is faster than general
> memory access) hold?  Quite possibly.  For example, a re-used synthreg
> inherits cache-presence (at all levels, not just L1) from the previous
> uses.  Synthregs may win for some apps for more than just L1 reasons.
>
> This brings in new alignment issues, too.  If you can, you might want
> to make sure that your allocator locates its metadata where it will
> cache-collide with the synthregs, to help push allocated memory out of
> those locations (presuming here that allocator meta-data is relatively
> infrequently accessed).  It's probably not all that hard to do this
> "by accident".  Just in general: do things to protect the
> cache-presence of the synthregs.
>
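
Whether two addresses "cache-collide" can be checked with a little arithmetic. The following is a minimal sketch assuming a set-associative cache indexed purely by address bits; the function names and the geometry parameters are illustrative, and real hardware may index by physical address, in which case a virtual address only determines the set within a page.

```c
#include <stdint.h>

/* Set index of an address in a cache with the given line size (bytes)
   and number of sets.  Assumes the cache is indexed by plain address
   bits; both parameters must be nonzero. */
uint64_t cache_set_index(uint64_t addr, uint64_t line_size,
                         uint64_t num_sets)
{
    return (addr / line_size) % num_sets;
}

/* Nonzero if the two addresses map to the same cache set and therefore
   compete for the same ways. */
int cache_collides(uint64_t a, uint64_t b, uint64_t line_size,
                   uint64_t num_sets)
{
    return cache_set_index(a, line_size, num_sets)
        == cache_set_index(b, line_size, num_sets);
}
```

With 64-byte lines and 128 sets, addresses that differ by a multiple of 8192 bytes land in the same set, so an allocator that wanted its metadata to occupy the colliding addresses would place it at such offsets from the synthreg block.
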
> It might eventually lead to some hw advances: give synthregs with
> absolute locations cache preference.  Or, if synthregs are on the
> stack, give locations near the frame pointer cache preference (or is
> that done already?).
>
> I'd therefore guess it will be a very system-specific optimization --
> but that it will win often enough to be useful.  And given what I
> understand about trends in architecture, the cases in which it will
> win will sharply increase over time.
>
> No?
>
> -t
>
> p.s.: arch foo thinking about non-disruptive ways to improve gcc's
>       rev ctl practices:
>
>       http://lists.fifthvision.net/pipermail/arch-users/2003-January/001856.html
>
>       and some of the follow-ups.   It's a pretty "noisy" list,
>       though.

References:
- Re: An unusual Performance approach using Synthetic registers
  - From: Robert Dewar
- Re: An unusual Performance approach using Synthetic registers
  - From: Tom Lord