[PATCH/RFC] LRA: Don't emit move for substituted CONSTATNT_P operand [PR116170]

Mon Aug 12 08:55:07 GMT 2024

"Kewen.Lin" <linkw@linux.ibm.com> writes:
> Hi,
>
> Commit r15-2084 exposes one ICE in LRA.  Firstly, before
> r15-2084 KFmode has 126 bit precision while V1TImode has 128
> bit precision, so the subreg (subreg:V1TI (reg:KF 131) 0) is
> paradoxical_subreg_p, which stops some passes from doing
> some optimization.  After r15-2084, KFmode has the same mode
> precision as V1TImode, passes are able to optimize more, but
> it causes this ICE in LRA as described below:
>
> For insn 106 (set (mem:V1TI ...) (subreg:V1TI (reg:KF 133) 0)),
> which matches pattern
>
> (define_insn "*vsx_le_perm_store_<mode>"
>   [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
>         (match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]

Is it well-formed to have "+" on an operand that is only semantically
an input?  df and most passes would consider it to be a pure input
and so would disregard any write that is intended to take place.

E.g. if we have:

  (set (reg:M R2) (reg:M R1)) ;; R1 dead
  (set (reg:M R3) (reg:M R2))

then it would be valid to change that to:

  (set (reg:M R2) (reg:M R1))
  (set (reg:M R3) (reg:M R1))

without considering the "+" on the input to the first instruction.

But if we do stick with the patch...

>   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
>    && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
>   "@
>    #
>    #"
>   [(set_attr "type" "vecstore,store")
>    (set_attr "length" "12,8")
>    (set_attr "isa" "<VSisa>,*")])
>
> LRA makes equivalence substitution on r133 with const double
> (const_double:KF 0.0), selects alternative 0 and fixes up
> operand 1 for constraint "wa", because operand 1 is OP_INOUT,
> so it considers assigning back to it as well, that is:
>
>   lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);
>
> But because old has been changed to const_double in equivalence
> substitution, the move is actually assigning to const_double,
> which is invalid and cause ICE.
>
> Considering reg:KF 133 is equivalent with (const_double:KF 0.0)
> even though this operand is OP_INOUT, IMHO there should not be
> any following uses of reg:KF 133, otherwise it doesn't have the
> chance to be equivalent to (const_double:KF 0.0).  From this
> perspective, I think we can guard the lra_emit_move with
> nonimmediate_operand to exclude such case.
>
> Does it sound reasonable?
>
> btw, I also tried with disallowing equivalence substitution with
> CONSTANT_P value if the corresponding operand is OP_INOUT or
> OP_OUT, it can also fix this issue, but with more thinking it
> seems not necessary to stop such substitution if we can handle it
> later as above.
>
> Bootstrapped and regtested on x86_64-redhat-linux and
> powerpc64{,le}-linux-gnu.
>
> BR,
> Kewen
> -----
> 	PR rtl-optimization/116170
>
> gcc/ChangeLog:
>
> 	* lra-constraints.cc (curr_insn_transform): Don't emit move back to
> 	old operand if it's nonimmediate_operand.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/powerpc/pr116170.c: New test.
> ---
>  gcc/lra-constraints.cc                      |  3 ++-
>  gcc/testsuite/gcc.target/powerpc/pr116170.c | 18 ++++++++++++++++++
>  2 files changed, 20 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr116170.c
>
> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
> index 92b343fa99a..024c85c87d9 100644
> --- a/gcc/lra-constraints.cc
> +++ b/gcc/lra-constraints.cc
> @@ -4742,7 +4742,8 @@ curr_insn_transform (bool check_only_p)
>  	    }
>  	  *loc = new_reg;
>  	  if (type != OP_IN
> -	      && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)
> +	      && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX
> +	      && nonimmediate_operand (old, GET_MODE (old)))

...IMO it would be better to use CONSTANT_P instead.  It's sometimes
useful to accept constants in specific instructions only, meaning that
they aren't general enough to be immediate_operands.

Thanks,
Richard

>  	    {
>  	      start_sequence ();
>  	      lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr116170.c b/gcc/testsuite/gcc.target/powerpc/pr116170.c
> new file mode 100644
> index 00000000000..6f6ca0f1ae9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr116170.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target ppc_float128_sw } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -fstack-protector-strong -ffloat-store" } */
> +
> +/* Verify there is no ICE.  */
> +
> +int a, d;
> +_Float128 b, c;
> +void
> +e ()
> +{
> +  int f = 0;
> +  if (a)
> +    if (b || c)
> +      f = 1;
> +  if (d)
> +    e (f ? 0 : b);
> +}
> --
> 2.43.5