This is the mail archive of the
mailing list for the GCC project.
Re: gcc feature request / RFC: extra clobbered regs
- From: "H. Peter Anvin" <hpa at zytor dot com>
- To: Andy Lutomirski <luto at kernel dot org>, gcc at gcc dot gnu dot org, "linux-kernel at vger dot kernel dot org" <linux-kernel at vger dot kernel dot org>, Linus Torvalds <torvalds at linux-foundation dot org>, Ingo Molnar <mingo at kernel dot org>, Thomas Gleixner <tglx at linutronix dot de>
- Date: Tue, 30 Jun 2015 14:32:23 -0700
- Subject: Re: gcc feature request / RFC: extra clobbered regs
- References: <CALCETrX6j9vBZR7RirXf8usz1Y4f-1TnVaYTVg0_PgQCeWZnRg at mail dot gmail dot com>
On 06/30/2015 02:22 PM, Andy Lutomirski wrote:
> Hi all-
> I'm working on a massive set of cleanups to Linux's syscall handling.
> We currently have a nasty optimization in which we don't save rbx,
> rbp, r12, r13, r14, and r15 on x86_64 before calling C functions.
> This works, but it makes the code a huge mess. I'd rather save all
> regs in asm and then call C code.
> Unfortunately, this will add five cycles (on SNB) to one of the
> hottest paths in the kernel. To counteract it, I have a gcc feature
> request that might not be all that crazy. When writing C functions
> intended to be called from asm, what if we could do:
> __attribute__((extra_clobber("rbx", "rbp", "r12", "r13", "r14",
> "r15"))) void func(void);
> This will save enough pushes and pops that it could easily give us our
> five cycles back and then some. It's also easy to be compatible with
> old GCC versions -- we could just omit the attribute, since preserving
> a register is always safe.
> Thoughts? Is this totally crazy? Is it easy to implement?
> (I'm not necessarily suggesting that we do this for the syscall bodies
> themselves. I want to do it for the entry and exit helpers, so we'd
> still lose the five cycles in the full fast-path case, but we'd do
> better in the slower paths, and the slower paths are becoming
> increasingly important in real workloads.)
Some gcc targets have done this in the past. There are command-line
options to do that, but using attributes you have to handle cross-ABI calls.
However, I don't see this being done in the upstream gcc.
Keep in mind the runway that we'll need, though.
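
For readers of the archive, the fallback Andy describes (omit the attribute on compilers that lack it, since preserving a register is always safe) might be sketched as below. This is only an illustration: extra_clobber is the attribute proposed in this thread, not one known to exist in any released GCC, and HAVE_EXTRA_CLOBBER_ATTR, ASM_CALLED and entry_helper are hypothetical names invented for the example.

```c
/* Sketch only. extra_clobber is the attribute *proposed* in this thread;
 * HAVE_EXTRA_CLOBBER_ATTR is a hypothetical feature-test macro that a
 * build system would define only if the compiler supported it. */
#ifdef HAVE_EXTRA_CLOBBER_ATTR
#define ASM_CALLED \
    __attribute__((extra_clobber("rbx", "rbp", "r12", "r13", "r14", "r15")))
#else
/* Fallback: no attribute, so the compiler saves and restores the
 * callee-saved registers as the normal ABI requires. Always correct,
 * just not as fast. */
#define ASM_CALLED
#endif

/* Hypothetical helper meant to be called from hand-written asm that has
 * already saved every register. */
ASM_CALLED long entry_helper(long nr);

long entry_helper(long nr)
{
    return nr + 1; /* placeholder body for illustration */
}
```

With the macro left undefined, as it would be on any current compiler, the declaration is plain C and behaves exactly as before; the asm caller does not need to change either way.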