102211 – [12 regression] ICE introduced by r12-3277

Bug 102211 - [12 regression] ICE introduced by r12-3277

Summary: [12 regression] ICE introduced by r12-3277

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target (show other bugs)
Version:	12.0

Importance:	P3 normal
Target Milestone:	12.0
Assignee:	Not yet assigned to anyone

URL:
Keywords:	build, ice-on-valid-code

Depends on:	102154
Blocks:
	Show dependency tree / graph

Reported:	2021-09-06 02:53 UTC by Hongtao.liu
Modified:	2022-01-17 13:33 UTC (History)
CC List:	5 users (show)

See Also:
Host:
Target:	risv64
Build:
Known to work:
Known to fail:
Last reconfirmed:	2021-09-08 00:00:00

Attachments
proposed RISC-V backend solution (3.43 KB, patch) 2021-09-13 17:35 UTC, Jim Wilson	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Hongtao.liu 2021-09-06 02:53:39 UTC

cat test.c

float
foo (float a, long b)
{
  union{float a[2];
    long b;}c;
  c.b = b;
  return c.a[0] * a;
}


gcc test.c -O2 -S

test.c:8:1: error: this is the insn:
(insn 12 19 13 2 (set (reg/i:SF 42 fa0)
        (mult:SF (reg:SF 77)
            (subreg:SF (reg:DI 80 [78]) 0))) "test.c":8:1 17 {mulsf3}
     (expr_list:REG_DEAD (reg:DI 80 [78])
        (expr_list:REG_DEAD (reg:SF 77)
            (nil))))
during RTL pass: reload

Comment 1 Hongtao.liu 2021-09-06 04:38:20 UTC

But it's ok for

float
foo (float a, long b)
{
  union{float a[2];
    long b;}c;
  c.b = b;
  return c.a[0];
}


foo:
	fmv.w.x	fa0,a0
	ret

Which means movement between gpr and float reg is allowed.

Comment 2 Hongtao.liu 2021-09-06 04:50:05 UTC

According to *movsi_internal and *movdi_64bit, SImode, and DImode can be placed into FP_REGS, but in riscv_hard_regno_mode_ok, SImode/DImode is not allowed to be allocated as FP_REGS, the mismatch here caues the ICE.

Simple hack as

modified   gcc/config/riscv/riscv.c
@@ -4553,7 +4553,9 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
 	return false;
 
       if (GET_MODE_CLASS (mode) != MODE_FLOAT
-	  && GET_MODE_CLASS (mode) != MODE_COMPLEX_FLOAT)
+	  && GET_MODE_CLASS (mode) != MODE_COMPLEX_FLOAT
+	  && mode != E_SImode
+	  && mode != E_DImode)
 	return false;
 
       /* Only use callee-saved registers if a potential callee is guaranteed

Solved the issue.

foo:
	fmv.d.x	fa5,a0
	fmul.s	fa0,fa0,fa5

Comment 3 Andreas Schwab 2021-09-07 08:00:00 UTC

This also breaks bootstrap.

Comment 4 Jim Wilson 2021-09-08 05:55:06 UTC

Yes, moving SI/DI values to FP regs is OK. However, RISC-V requires that FP values in FP registers be stored NaN-boxed. So an SFmode value in a 64-bit FP reg has the upper 32-bits of all ones, and the lower 32-bits is the value. Thus if accessed as a 64-bit value, you get a NaN. The hardware may trap if you access a 32-bit value which is not properly NaN-boxed. Using qemu to check this may not be good enough, as last time I looked at qemu it wasn't handling NaN-boxing correctly, but this was over a year ago, so maybe it has been fixed since. I don't know.

Anyways, this code sequence is OK
foo:
fmv.w.x fa0,a0
ret
because we are moving a 32-bit SImode value to an FP reg and then treating it as SFmode, and the 32-bit move will properly NaN-box the SFmode value.

This code sequence is not OK
foo:
fmv.d.x fa5,a0
fmul.s fa0,fa0,fa5
because we are moving a 64-bit DImode value to an FP reg and then treating it as SFmode, which is not OK because the value won't be NaN-boxed and may trap at run time.

validate_subreg used to prevent the bad subreg from being created.

I would think that TARGET_CAN_CHANGE_MODE_CLASS could help here, but it isn't being called inside general_operand when called from fwprop1 where the bad substitution happens. Because we have a pseudo-register, and it is only called for hard registers.

I don't see a way to fix this as a backend change with current validate_subreg, other than by replacing register_operand with riscv_register_operand, and putting the subreg check I need inside riscv_register_operand. And likewise for any other affected predicate, like move_operand. This will be a big change, though a lot of it will be mechanical. As an optimization, we can continue to use register_operand in any pattern that can't use FP registers.

As a middle end change, I need a new hook in general_operand to reject subregs that we can't support on RISC-V.

Or maybe re-add the check I need to validate_subreg as a hook, so it can be conditionally enabled for RISC-V.

We can allow (subreg:SF (reg:DI)) if it gets allocated to an integer register. It is only when it is allocated to an FP register that it can't work. I don't know offhand if that can be described. But disallowing the subreg always for RISC-V is simpler and also works.

Comment 5 Jim Wilson 2021-09-09 05:31:07 UTC

I have a WIP fix that lets me build newlib and glibc via riscv-gnu-toolchain.  I haven't tried a bootstrap yet.  I created a new predicate that uses the small bit of deleted code I need from validate_subregs, and then modified most patterns with an f constraint to use it.

+;; This predicate should be used for any pattern that has constraints that
+;; accept FP registers.
+(define_predicate "f_register_operand"
+  (match_operand 0 "register_operand")
+{
+  /* We can't allow a subreg that changes size in an FP reg, as that violates
+     the NaN-boxing requirement.  Copied from old validate_subregs code.  */
+  if (GET_CODE (op) == SUBREG)
+    {
+      machine_mode imode = GET_MODE (SUBREG_REG (op));
+      machine_mode omode = GET_MODE (op);
+      if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
+	{
+	  poly_uint64 isize = GET_MODE_SIZE (imode);
+	  poly_uint64 osize = GET_MODE_SIZE (omode);
+
+	  if (! (known_eq (isize, osize)
+		 /* LRA can use subreg to store a floating point value in
+		    an integer mode.  Although the floating point and the
+		    integer modes need the same number of hard registers,
+		    the size of floating point mode can be less than the
+		    integer mode.  LRA also uses subregs for a register
+		    should be used in different mode in on insn.  */
+		 || lra_in_progress))
+	    return false;
+	}
+    }
+  return true;
+})

I haven't fixed the move patterns yet.  I need a few more new predicates for that.

The riscv_hard_regno_mode_ok function looks odd as Hongtao Liu mentioned.  The mov*i patterns have f constraints, but we don't allow FP values in integer registers here.  I think this is a missed optimization, or maybe good register allocation creates some no-op moves to hide the problem.  My current patch does not need a riscv_hard_regno_mode_ok change to work.

The riscv_can_change_mode_class also looks a little odd.  The MIPS one that we copied doesn't allow mode changes in FP regs, but the alpha one does allow mode changes in FP regs if the modes have the same size.  I think the alpha one can work for RISC-V and that this is another missed optimization, though again maybe good regalloc hides the problem with no-op moves.

But I will worry about the possible missed optimizations after I get bootstrap working again.

Comment 6 Segher Boessenkool 2021-09-09 23:30:14 UTC

(In reply to Jim Wilson from comment #4)
> Yes, moving SI/DI values to FP regs is OK.  However, RISC-V requires that FP
> values in FP registers be stored NaN-boxed.  So an SFmode value in a 64-bit
> FP reg has the upper 32-bits of all ones, and the lower 32-bits is the
> value.  Thus if accessed as a 64-bit value, you get a NaN.

On Power, a (scalar) SF value is usually stored in DF format.  It is the insns
that force the outputs to SP, the inputs in general can be anything.

> The hardware may
> trap if you access a 32-bit value which is not properly NaN-boxed.

We used to have such games as well :-)  Thankfully it was largely transparent.

> Using
> qemu to check this may not be good enough, as last time I looked at qemu it
> wasn't handling NaN-boxing correctly, but this was over a year ago, so maybe
> it has been fixed since.  I don't know.

QEMU in general optimises for speed, not for correct emulation.  If you have
inputs that are invalid it will have more surprising outputs.

> This code sequence is not OK
> foo:
> 	fmv.d.x	fa5,a0
> 	fmul.s	fa0,fa0,fa5
> because we are moving a 64-bit DImode value to an FP reg and then treating
> it as SFmode, which is not OK because the value won't be NaN-boxed and may
> trap at run time.

A situation very similar to the Power problem.

> I would think that TARGET_CAN_CHANGE_MODE_CLASS could help here, but it
> isn't being called inside general_operand when called from fwprop1 where the
> bad substitution happens.  Because we have a pseudo-register, and it is only
> called for hard registers.

The documentation says

===
@cindex @code{TARGET_CAN_CHANGE_MODE_CLASS} and subreg semantics
The rules above apply to both pseudo @var{reg}s and hard @var{reg}s.
If the semantics are not correct for particular combinations of
@var{m1}, @var{m2} and hard @var{reg}, the target-specific code
must ensure that those combinations are never used.  For example:

@smallexample
TARGET_CAN_CHANGE_MODE_CLASS (@var{m2}, @var{m1}, @var{class})
@end smallexample

must be false for every class @var{class} that includes @var{reg}.
===

so the code does not do what the documentation says?

> I don't see a way to fix this as a backend change with current
> validate_subreg, other than by replacing register_operand with
> riscv_register_operand, and putting the subreg check I need inside
> riscv_register_operand.  And likewise for any other affected predicate, like
> move_operand.  This will be a big change, though a lot of it will be
> mechanical.  As an optimization, we can continue to use register_operand in
> any pattern that can't use FP registers.

The first thing that will have to be done is to restore the status quo, to
make your, my, and all other affected targets work again (there very likely
are more, the problems are all in not-so-very normal cases, not to mention
not all targets are tested so heavily -- which reinforces my argument that
there should have been testsuite cases added that trap this on all targets).

After that, yeah, our backends should be improved.  That requires some new
stuff in the middle end as well afaics, there really are reasons Power and
RISC-V both did such terrible thing -- but it certainly should not be done
with a knife at the throat, this is some serious re-engineering, it cannot
be done at a snap of the fingers.

> As a middle end change, I need a new hook in general_operand to reject
> subregs that we can't support on RISC-V.

That may be a good design that is suitable for others as well, it is quite
nicely general.  Mike, will that do all we need for the SF subregs as well?

A little problem with this is most of our operands do not inherit from
general_operand.  A somewhat bigger problem is: what about predicates, do
those not need changes as well, for good machine code quality?

> Or maybe re-add the check I need to validate_subreg as a hook, so it can be
> conditionally enabled for RISC-V.

Yeah...  But it should be re-enabled *by default*, to start with.

> We can allow (subreg:SF (reg:DI)) if it gets allocated to an integer
> register.  It is only when it is allocated to an FP register that it can't
> work.  I don't know offhand if that can be described.  But disallowing the
> subreg always for RISC-V is simpler and also works.

(subreg:SF (reg:DI)) does two things at once: an actual subreg, and a bit_cast.
We should not express both of those with the same rtx code, since we do not
allow subregs of subregs (and that is a good thing!)

Comment 7 Andrew Waterman 2021-09-09 23:50:13 UTC

On Tue, Sep 7, 2021 at 10:55 PM wilson at gcc dot gnu.org via Gcc-bugs
<gcc-bugs@gcc.gnu.org> wrote:
>
> The hardware may trap if
> you access a 32-bit value which is not properly NaN-boxed.

I don't think the following affects the resolution of the ticket, but
just for the record, trapping is _not_ an option the ISA permits.  The
only legal outcome from using an improperly NaN-boxed value as an
argument to an instruction that requires NaN-boxed inputs is that the
canonical NaN is substituted in its place.

Comment 8 GCC Commits 2021-09-10 21:55:58 UTC

The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:57b7c432cce893e1ba60d9b94a9606df6b419379

commit r12-3457-g57b7c432cce893e1ba60d9b94a9606df6b419379
Author: liuhongt <hongtao.liu@intel.com>
Date:   Fri Sep 10 20:02:25 2021 +0800

    Revert "Get rid of all float-int special cases in validate_subreg."
    
    This reverts commit d2874d905647a1d146dafa60199d440e837adc4d.
    
    PR target/102254
    PR target/102154
    PR target/102211

Comment 9 Jim Wilson 2021-09-13 17:35:50 UTC

Created attachment 51456 [details]
proposed RISC-V backend solution

Comment 10 Jim Wilson 2021-09-13 17:43:49 UTC

I attached a patch which is my proposed solution to the RISC-V backend.  It adds a new f_register_operand predicate and modifies patterns that use the f constraint to use it instead of register_operand.

This was tested with an default language gcc build, glibc build, and glibc check on an unmatched running OpenEmbedded.  And an all language gcc build, glibc build, and glibc check on an unleashed running Fedora with an old 4.15 kernel.  Both succeeded as well as expected.  I'll be trying gcc check next.

Meanwhile, the validate_subregs patch was reverted, so we don't immediately need this to fix build errors.  But it still might be useful if validate_subregs changes again.  Or if it happens to give better code, though I think it won't do anything if validate_subregs is rejecting the subregs we are checking for in f_register_operand.

Comment 11 Richard Biener 2022-01-17 13:33:59 UTC

Fixed since reverted.