[PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

Tamar Christina Tamar.Christina@arm.com
Tue Sep 18 13:12:00 GMT 2018


Hi All,

I'm looking for a backport of this patch to GCC8.

Ok for backport?

Thanks,
Tamar

> -----Original Message-----
> From: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
> Sent: Wednesday, August 15, 2018 15:25
> To: Tamar Christina <Tamar.Christina@arm.com>; Thomas Preudhomme
> <thomas.preudhomme@linaro.org>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Ramana Radhakrishnan
> <Ramana.Radhakrishnan@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; nickc@redhat.com
> Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by
> enabling the FP16 pattern unconditionally.
> 
> Hi Tamar,
> 
> On 26/07/18 12:01, Tamar Christina wrote:
> > Hi Thomas,
> >
> > > -----Original Message-----
> > > From: Thomas Preudhomme <thomas.preudhomme@linaro.org>
> > > Sent: Thursday, July 26, 2018 09:29
> > > To: Tamar Christina <Tamar.Christina@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Ramana
> Radhakrishnan
> > > <Ramana.Radhakrishnan@arm.com>; Richard Earnshaw
> > > <Richard.Earnshaw@arm.com>; nickc@redhat.com; Kyrylo Tkachov
> > > <Kyrylo.Tkachov@arm.com>
> > > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by
> > > enabling the FP16 pattern unconditionally.
> > >
> > > Hi Tamar,
> > >
> > > On Wed, 25 Jul 2018 at 16:28, Tamar Christina
> > > <Tamar.Christina@arm.com>
> > > wrote:
> > > >
> > > > Hi Thomas,
> > > >
> > > > Thanks for the review!
> > > >
> > > > > >
> > > > > > I don't believe the TARGET_FP16 guard to be needed, because
> > > > > > the pattern doesn't actually generate code and requires
> > > > > > another pattern for that, and a reg to reg move should always
> > > > > > be possible anyway. So allowing the force to register here is
> > > > > > safe and it allows the compiler to generate a correct error
> > > > > > instead of ICEing in an
> > > infinite loop.
> > > > >
> > > > > How about subreg to subreg move? Doesn't that expand to more
> > > > > insns (subreg to reg and reg to subreg)? Couldn't you improve
> > > > > the logic to check that there is actually a mode change so that
> > > > > if there isn't (like moving from one subreg to another) just expand to
> a single move?
> > > > >
> > > >
> > > > Yes, but that is not a new issue. My patch is simply removing the
> > > > TARGET_FP16 restrictions and merging two patterns that should be
> > > > one
> > > using an iterator and nothing more.
> > > >
> > > > The redundant mov is already there and a different issue than the
> > > > ICE I'm
> > > trying to fix.
> > >
> > > It's there for movv4hf and movv6hf but your patch extends this
> > > problem to movv2sf and movv4sf as well.
> >
> > I don't understand how it can. My patch just replaces one pattern for
> > V4HF and one for V8HF with one pattern operating on VH.
> >
> > ;; Vector modes for 16-bit floating-point support.
> > (define_mode_iterator VH [V8HF V4HF])
> >
> > My pattern has absolutely no effect on V2SF and V4SF or any of the other
> modes.
> >
> > >
> > > >
> > > > None of the code inside the expander is needed at all, the code
> > > > really only has an effect on subreg to subreg moves, as
> > > > `force_reg` doesn't do
> > > anything when it's argument is already a reg.
> > > >
> > > > The comment in the expander (which was already there) is wrong.
> > > > The
> > > > *reason* the ICE is fixed isn't because of the `force_reg`. It's
> > > > because of the mere presence of the expander itself. The expander
> > > > matches the standard mov$a optab and so this prevents
> > > emit_move_insn_1 from doing the move by subwords as it finds a
> > > pattern that's able to do the move.
> > >
> > > Could you then fix the comment in your patch as well? I hadn't
> > > understood the force_reg was not key here. You might want to update
> > > the following sentence from your patch description if you are going
> > > to include it in your commit message:
> >
> > I'll update the comment in the patch. The cover letter won't be
> > included in the commit, But it does accurately reflect the current
> > state of affairs. The patch will do the force_reg, It's just not the reason it
> works.
> >
> > >
> > > The way this is worked around in the back-end is that we have move
> > > patterns in neon.md that usually just force the register instead of
> > > checking with the back-end.
> > >
> > > "The way this is worked around (..) that just force the register" is
> > > what led me to believe the force_reg was important.
> > >
> > > >
> > > > The expander however always falls through and doesn’t stop RTL
> > > > generation. You could remove all the code in there and have it
> > > > properly match the *neon_mov instructions which will do the right
> > > > thing later at code generation time and avoid the redundant moves.
> > > > My
> > > guess is the original `force_reg` was copied from the other patterns
> > > like `movti` and the existing `mov<mode>`. There It makes sense
> > > because the operands can be MEM or anything general_operand.
> > > >
> > > > However the redundant moves are a different problem than what I'm
> > > > trying to solve here. So I think that's another patch which
> > > > requires further
> > > testing.
> > >
> > > I was just thinking of restricting when does the force_reg happens
> > > but if it can be removed completely I agree it should probably be
> > > done in a separate patch.
> > >
> > > Oh by the way, is there something that prevent those expander to
> > > ever be used with a memory operand? Because the GCC internals
> > > contains the following piece for mov standard pattern (bold marks added
> by me):
> > >
> > > "Second, these patterns are not used solely in the RTL generation
> > > pass. Even the reload pass can generate move insns to copy values
> > > from stack slots into temporary registers. When it does so, one of
> > > the operands is a hard register and the other is an operand that can need
> to be reloaded into a register.
> > > Therefore, when given such a pair of operands, the pattern must
> > > generate RTL which needs no reloading and needs no temporary
> > > registers—no registers other than the operands. For example, if you
> > > support the pattern with a define_ expand, then in such a case the
> > > define_expand *mustn’t call
> > > force_reg* or any other such function which might generate new
> > > pseudo registers."
> >
> > When used during expand the operand predicates are checked, this
> prevents that case.
> > When used during reload can_create_pseudo_p () guards the actions in
> the patterns.
> >
> > gcc/rtl.h:#define can_create_pseudo_p() (!reload_in_progress &&
> > !reload_completed)
> >
> > So during or after reload it would just be an empty fall through
> > pattern, so it won't do anything to MEM or anything else.
> >
> 
> This is ok for trunk.
> Thanks for looking at it as well Thomas,
> 
> Kyrill
> 
> > Regards,
> > Tamar
> >
> > >
> > > Best regards,
> > >
> > > Thomas
> > >
> > > >
> > > > Regards,
> > > > Tamar
> > > >
> > > > > Best regards,
> > > > >
> > > > > Thomas
> > > > >
> > > > > >
> > > > > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed
> > > > > > without introducing any regressions while fixing
> > > > > >
> > > > > > gcc.dg/vect/vect-nop-move.c execution test
> > > > > > g++.dg/torture/vshuf-v2si.C   -O3 -g execution test
> > > > > > g++.dg/torture/vshuf-v4si.C   -O3 -g execution test
> > > > > > g++.dg/torture/vshuf-v8hi.C   -O3 -g execution test
> > > > > >
> > > > > > Regtested on armeb-none-eabi and no regressions.
> > > > > > Bootstrapped on arm-none-linux-gnueabihf and no issues.
> > > > > >
> > > > > >
> > > > > > Ok for trunk?
> > > > > >
> > > > > > Thanks,
> > > > > > Tamar
> > > > > >
> > > > > > gcc/
> > > > > > 2018-07-23  Tamar Christina <tamar.christina@arm.com>
> > > > > >
> > > > > >         PR target/84711
> > > > > >         * config/arm/arm.c (arm_can_change_mode_class):
> > > > > > Disallow
> > > subreg.
> > > > > >         * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> > > > > >         (mov<mov>): ..this and enable unconditionally.
> > > > > >
> > > > > > --



More information about the Gcc-patches mailing list