This is the mail archive of the
mailing list for the GCC project.
Re: [RFC] Design for flag bit outputs from asms
- From: Gabriel Paubert <paubert at iram dot es>
- To: Richard Henderson <rth at redhat dot com>
- Cc: Peter Zijlstra <peterz at infradead dot org>, Linus Torvalds <torvalds at linux-foundation dot org>, Vladimir Makarov <vmakarov at redhat dot com>, Jakub Jelinek <jakub at redhat dot com>, Ingo Molnar <mingo at kernel dot org>, "H. Peter Anvin" <hpa at zytor dot com>, Thomas Gleixner <tglx at linutronix dot de>, Linux Kernel Mailing List <linux-kernel at vger dot kernel dot org>, Borislav Petkov <bp at alien8 dot de>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Tue, 5 May 2015 11:01:05 +0200
- Subject: Re: [RFC] Design for flag bit outputs from asms
- Authentication-results: sourceware.org; auth=none
- References: <20150501151630 dot GH5029 at twins dot programming dot kicks-ass dot net> <CA+55aFwBP9QjpRK50pdVHmc086-+QPCthJRUs8Gq5qJBnXqnJQ at mail dot gmail dot com> <20150501163329 dot GU1751 at tucnak dot redhat dot com> <5543CDC0 dot 6010206 at redhat dot com> <CA+55aFxOd6mJcezgoLHN9Zgds-CsJqsx4Jgkp9OP1xUf11727Q at mail dot gmail dot com> <20150502123958 dot GK5029 at twins dot programming dot kicks-ass dot net> <5547C992 dot 9000703 at redhat dot com>
On Mon, May 04, 2015 at 12:33:38PM -0700, Richard Henderson wrote:
> (3) Note that ppc is both easier and more complicated.
> There we have 8 4-bit registers, although most of the integer
> non-comparisons only write to CR0. And the vector non-comparisons
> only write to CR1, though of course that's of less interest in the
> context of kernel code.
Actually, vector (AltiVec) instructions write to CR6. Standard FPU
instructions optionally write to CR1, but the value written does not depend
directly on the result of the last instruction; it is instead an accrued
exception status.
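To make the CR layout above concrete, here is a minimal sketch in portable C
that decodes a 32-bit CR image such as the one mfcr returns. The helper names
(cr_field, cr0_result_is_zero, and so on) are hypothetical and only for
illustration; the bit positions assume the usual layout in which CR0 is the
most significant nibble and each field reads LT, GT, EQ, SO from its most
significant bit down.

```c
#include <assert.h>
#include <stdint.h>

/* Bit masks inside one 4-bit CR field, most significant bit first. */
enum { CR_LT = 8, CR_GT = 4, CR_EQ = 2, CR_SO = 1 };

/* Extract field n (0..7) from a 32-bit CR image.
   CR0 is the most significant nibble, CR7 the least significant. */
static unsigned cr_field(uint32_t cr, unsigned n)
{
    assert(n < 8);
    return (cr >> (4 * (7 - n))) & 0xFu;
}

/* Integer record ("dot") forms: CR0 = LT, GT, EQ, SO,
   with the result compared against zero. */
static int cr0_result_is_zero(uint32_t cr)
{
    return (cr_field(cr, 0) & CR_EQ) != 0;
}

/* AltiVec record-form compares: CR6 = all-true, 0, all-false, 0. */
static int cr6_all_true(uint32_t cr)
{
    return (cr_field(cr, 6) & CR_LT) != 0;
}

static int cr6_all_false(uint32_t cr)
{
    return (cr_field(cr, 6) & CR_EQ) != 0;
}

/* FPU record forms: CR1 = FX, FEX, VX, OX copied from the FPSCR,
   i.e. accrued exception status, not a comparison of the result.
   The top bit of the field is FX (exception summary). */
static int cr1_fp_exception_summary(uint32_t cr)
{
    return (cr_field(cr, 1) & 8u) != 0;
}
```

Note that only the CR0 helper says anything about the value just computed;
the CR1 helper reports sticky exception state, which is why CR1 is a poor fit
for an "operate and test" idiom.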
> For the purposes of cr0, the same scheme could certainly work, although
> the hook would not insert a hard register use, but rather a pseudo to
> be allocated to cr0 (constraint "x").
Yes, but we might also want to leave the choice of a cr register to the compiler.
> That said, it's my understanding that "dot insns", setting cr0 are
> expensive in current processor generations.
Not that much, if I read power7.md and power8.md correctly:
zero (P7) or one (P8) additional clock for the common instructions
(add/sub and logical), and no other penalty, so they are likely a win.
Shift/rotate/sign extensions seem to have more decoding restrictions:
the recording ("dot") forms are "cracked" and use 2 integer units.
> There's also a lot less
> of the x86-style "operate and set a flag based on something useful".
But there is at least one important case, which I have occasionally wished
I had: the conditional stores.
The overflow bit might also be useful, not really
for the kernel, but for applications (and mfxer is slow).
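As an illustration of the application-side use of an overflow flag, here is a
small sketch that detects overflow of a register-width signed add. It uses
GCC's __builtin_add_overflow (available since GCC 5, also in Clang) as a
portable stand-in; it is not the asm flag-output mechanism discussed in this
thread, and the wrapper name add_reports_overflow is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when a + b overflows a signed long. This is the kind of
   condition an overflow-recording add would leave in XER[OV] on PowerPC.
   *sum always receives the wrapped (two's complement) result. */
static bool add_reports_overflow(long a, long b, long *sum)
{
    /* Assumption: compiler provides this builtin (GCC >= 5 or Clang). */
    return __builtin_add_overflow(a, b, sum);
}
```

With a flag output from the add itself, the compiler could branch on the
overflow bit directly instead of reading XER back through mfxer.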