This is the mail archive of the
mailing list for the GCC project.
Re: how to keep a hard register across multiple instructions?
- From: David Kang <dkang at isi dot edu>
- To: Jeff Law <law at redhat dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Mon, 3 Nov 2014 13:51:55 -0800 (PST)
- Subject: Re: how to keep a hard register across multiple instructions?
Thank you for the tips.
I tried the following split condition:
"reload_completed && FP_REG_P (operands)"
But the registers still change after the split.
How can I specify "after register allocation" in the split condition?
----- Original Message -----
> From: "Jeff Law" <email@example.com>
> To: "David Kang" <firstname.lastname@example.org>, email@example.com
> Sent: Monday, November 3, 2014 11:21:58 AM
> Subject: Re: how to keep a hard register across multiple instructions?
> On 10/31/14 16:01, David Kang wrote:
> > Hi,
> > I'm a newbie in gcc porting.
> > The architecture that I'm porting gcc to has a hardware FPU.
> > But the compiler has to generate code which builds an FPU instruction
> > in an integer register
> > at run-time and writes the value to the FPU command register.
> > To make a single FPU instruction, three instructions are needed.
> > Two instructions build the FPU instruction in a 32-bit (cmd,
> > operands, operands, operands) format.
> > Here operands are the FPU register numbers, which can be 0 ~ 32.
> > As an example, f3 = f1 + f2 can be encoded as (code of 'add', 2, 1,
> > 3).
> > And the third instruction writes it to an FPU command register.
> > The architecture can issue up to 3 instructions at a time.
> > The difficulty lies in that we need to know the FPU register
> > number
> > for those operands to generate the FPU instruction.
> > The easiest but lowest-performance implementation is to generate
> > those three instructions
> > from a single "define_insn" as three consecutive instructions.
> > However, we lose all possible bundling of those 3 instructions with
> > other instructions for optimization.
> > So, I'm trying to find a better way.
> > I used "define_insn_and_split" and split a single FPU instruction
> > into 3 instructions like this:
> > (Here I assume register r10 is used, but it can be any integer
> > register.)
> > operands[0] = plus (operands[1], operands[2])
> > ==>
> > (1) r10 <- lower half of FPU instruction using
> > (code of 'add', operands[2], operands[1], operands[0])
> > (2) r10 <- r10 | upper half of FPU instruction using (code of 'add',
> > operands[2], operands[1], operands[0])
> > (3) (FPU cmd register) <- r10
> > The problem is that gcc catches that operands is used before
> > the 3rd instruction,
> > and allocates two different hard registers for (1,2) instructions
> > and (3) instruction.
> > So, when the code is generated, the first two instructions
> > assume the wrong registers
> > for the operands.
> > This happens especially frequently when '-unroll' option is used.
> > So, I wonder whether there is a way to tell gcc to use the same hard
> > registers for
> > the operands across those three instructions.
> > Is it possible?
> > Or would there be any better way to generate efficient FPU code?
> > I would appreciate any advice or pointer to further information.
> Use a define_insn_and_split, but only split it after register
> allocation & reloading.
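
To make the three-step sequence from the quoted message concrete, here is a minimal C sketch of the arithmetic involved. The thread does not give the real bit layout, so the 8-bit opcode field, the three 8-bit register fields, the opcode value FPU_OP_ADD, and all function names are assumptions chosen only for illustration. It follows the example encoding (code of 'add', 2, 1, 3) and shows why steps (1) and (2) already depend on the final FPU register numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode value; the real 'add' code is not given in the thread. */
enum { FPU_OP_ADD = 0x01 };

/* Assumed layout of the 32-bit command word (illustration only):
   bits 31..24 = cmd, 23..16 = second source, 15..8 = first source,
   7..0 = destination.  This mirrors (cmd, 2, 1, 3) for f3 = f1 + f2. */
static uint32_t fpu_encode(uint32_t cmd, uint32_t src2, uint32_t src1,
                           uint32_t dst)
{
    /* Each field must fit in its assumed 8-bit slot. */
    assert(cmd <= 0xffu && src2 <= 0xffu && src1 <= 0xffu && dst <= 0xffu);
    return (cmd << 24) | (src2 << 16) | (src1 << 8) | dst;
}

/* Step (1): the lower 16 bits, loaded into the integer register. */
static uint32_t fpu_low_half(uint32_t word)
{
    return word & 0x0000ffffu;
}

/* Step (2): the upper 16 bits, OR-ed into the same integer register. */
static uint32_t fpu_high_half(uint32_t word)
{
    return word & 0xffff0000u;
}

/* Steps (1) and (2) together.  The returned value is what step (3)
   would write to the FPU command register. */
static uint32_t fpu_build_in_register(uint32_t cmd, uint32_t src2,
                                      uint32_t src1, uint32_t dst)
{
    uint32_t word = fpu_encode(cmd, src2, src1, dst);
    uint32_t r10 = fpu_low_half(word);  /* (1) r10 <- lower half      */
    r10 |= fpu_high_half(word);         /* (2) r10 <- r10 | upper half */
    return r10;
}
```

Under these assumed field widths, fpu_build_in_register (FPU_OP_ADD, 2, 1, 3) gives 0x01020103. Both halves embed the register numbers as constants, so any register change made after the halves are emitted leaves the command word stale.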
Dr. Dong-In "David" Kang