This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check



On 09/11/15 08:14, Nikolai Bozhenov wrote:


On 11/06/2015 08:16 PM, Kyrill Tkachov wrote:

On 06/11/15 17:09, Kyrill Tkachov wrote:

On 06/11/15 17:07, Nikolai Bozhenov wrote:
On 11/06/2015 04:46 PM, Ramana Radhakrishnan wrote:
Hi!

I faced the same issue but I had somewhat different RTL for the consumer:

     (insn 20 15 21 2 (set (reg/i:SI 0 r0)
             (minus:SI (subreg:SI (reg:DI 117) 4)
                 (mult:SI (reg:SI 123)
                     (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})

where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
cycle in this case?
If the accumulator can be forwarded (i.e. it is a SImode register), there is no reason why a subreg:SI (reg:DI) would not be forwarded.

The subreg:SI is an artifact before register allocation, thus it's a representation issue that the patch is fixing here unless I misunderstand your question.

I mean, in my example it is not the multiplication result that is
forwarded but its upper part. So, shouldn't we check that offset in a
subreg expression is zero? Or is it ok to forward only the upper part
of a multiplication?

Could you please post the full RTL instruction we're talking about here as it appears in the scheduler dump?
So that we're all on the same page about which case we're talking about.


Sorry, I missed the instruction above.
This subreg is just a pre-register-allocation representation of the instruction and will go away after reload.
This function only has a real effect in post-reload scheduling, since the final register numbers are
known only then.


I see. aarch_accumulator_forwarding always returns 0 for virtual
registers. But isn't it overly pessimistic to assume that accumulator
forwarding is never possible at sched1? I wonder if it would be better
to be more optimistic about register allocation outcome. I mean, in
case of virtual registers we could assume forwarding from A to B if B
is the only consumer of A's result. Something like this:

    if (REGNO (dest) >= FIRST_VIRTUAL_REGISTER
        || REGNO (accumulator) >= FIRST_VIRTUAL_REGISTER)
      return (DF_REG_USE_COUNT (REGNO (dest)) == 1)
        && (DF_REF_INSN (DF_REG_USE_CHAIN (REGNO (dest))) == consumer);
    else
      return REGNO (dest) == REGNO (accumulator);
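
As an illustration only, the heuristic proposed above can be modelled outside of GCC with a minimal, self-contained C sketch. Every name prefixed with toy_ or TOY_ is hypothetical and merely stands in for the GCC macros used in the fragment (FIRST_VIRTUAL_REGISTER, DF_REG_USE_COUNT, DF_REG_USE_CHAIN); this is not GCC code, and it says nothing about whether the heuristic would pay off in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for FIRST_VIRTUAL_REGISTER: in this toy model,
   register numbers below this value are hard registers.  */
#define TOY_FIRST_VIRTUAL_REGISTER 16

/* Toy summary of the use information for one register.  */
struct toy_reg
{
  unsigned regno;      /* register number */
  unsigned use_count;  /* number of insns that read the register */
  int sole_user_insn;  /* id of the using insn; meaningful only when
                          use_count == 1 */
};

/* Return true if the toy model assumes that DEST, written by the
   producer, is forwarded to ACCUMULATOR, read by CONSUMER_INSN.  */
bool
toy_accumulator_forwarding (const struct toy_reg *dest,
                            const struct toy_reg *accumulator,
                            int consumer_insn)
{
  assert (dest != NULL && accumulator != NULL);

  /* Before register allocation: be optimistic if the consumer is the
     only user of the producer's result.  */
  if (dest->regno >= TOY_FIRST_VIRTUAL_REGISTER
      || accumulator->regno >= TOY_FIRST_VIRTUAL_REGISTER)
    return dest->use_count == 1 && dest->sole_user_insn == consumer_insn;

  /* After register allocation: require the same hard register.  */
  return dest->regno == accumulator->regno;
}
```

In this model a pseudo register with a single use by the consumer is treated as forwardable, while hard registers fall back to the plain register-number comparison.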


Interesting...
As far as I know sched1 tries to minimise live ranges before register allocation rather than trying to
perform the most exact pipeline modelling, since we only know the exact registers used in sched2.

What you're proposing is a heuristic, so it would need benchmarking results and analysis to be considered.

Thanks,
Kyrill


Thanks,
Nikolai


