[PATCH, rs6000] Fix PR target/67808, LRA ICE on double to long double conversion

Mon Oct 5 17:12:00 GMT 2015

On Fri, Oct 02, 2015 at 02:04:48PM -0500, Peter Bergner wrote:
> PR67808 exposes a problem with the constraints in the *extenddftf2_internal
> pattern, in that it allows TFmode operands to occupy Altivec registers
> which they are not allowed to do.  Reload was able to work around the
> problem, but LRA is more pedantic and it caused it to go into an infinite
> spill loop until it ICEd.  The following patch from Mike changes the TFmode
> output operand to use the "d" constraint instead of "ws".  It also allows
> using the "ws" constraint for the two input operands, since that is allowed
> for DFmode operands.
> 
> This passed bootstraps (with reload on by default and lra on by default)
> and shows no testsuite regressions.  Is this ok for trunk?
> 
> The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for
> that too, assuming my bootstrap/regtesting there are clean?
> 
> Peter
> 
> 
> gcc/
> 	PR target/67808
> 	* config/rs6000/rs6000.md (*extenddftf2_internal): Fix constraints.
> 
> gcc/testsuite/
> 
> 	* gcc.target/powerpc/pr67808.c: New test.
> 

In looking at the constraints in more detail, after the patch we have the
following alternatives:

  #1:	op0 = m,  op1 = ws, op2 = ws
  #2:	op0 = Y,  op1 = r,  op2 = r
  #3:	op0 = d,  op1 = md, op2 = j
  #4:	op0 = d,  op1 = md, op2 = m
  #5:	op0 = &d, op1 = md, op2 = ws

I.e.

  #1:	Store result, input in VSX register, 0.0 in VSX register (VSX only)
  #2:	Store result, input in GPR register, 0.0 in GPR register
  #3:	Result in FPR register, input in FPR or memory, 0.0 direct (VSX only)
  #4:	Result in FPR register, input in FPR or memory, 0.0 in memory
  #5:	Result in FPR reg (no overlap), input in FPR/memory, 0.0 in VSX reg

So, the non-VSX case (were ws is NO_REGS) only deals with alternatives #2 and
#4.

I think (but I don't have a test case) that alternative #1 is potentially a
problem if the input register is ever allocated to an Altivec register and the
address mode is reg+offset (in which case we would not be able to form the
address after the insn is split post-reload.

I have attached a better version of the patch.

This gives the constraints:

  #1:	op0 = m,  op1 = d,  op2 = d
  #2:	op0 = Y,  op1 = r,  op2 = r
  #3:	op0 = d,  op1 = ws, op2 = j
  #4:	op0 = d,  op1 = md, op2 = m
  #5:	op0 = &d, op1 = m,  op2 = md

I.e.

  #1:	Store result, input in FPR register, 0.0 in FPR register
  #2:	Store result, input in GPR register, 0.0 in GPR register
  #3:	Result in FPR reg, input in VSX reg, 0.0 direct (VSX only)
  #4:	Result in FPR reg, input in FPR/memory, 0.0 in memory
  #5:	Result in FPR reg, input in FPR/memory, 0.0 in FPR/memory (no overlap)

[gcc]
2015-10-05  Peter Bergner <bergner@vnet.ibm.com>
	    Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/67808
	* config/rs6000/rs6000.md (extenddftf2_internal): Fix up
	constraints.

[gcc/testsuite]
2015-10-05  Peter Bergner <bergner@vnet.ibm.com>

	PR target/67808
	* gcc.target/powerpc/pr67808.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
-------------- next part --------------
Index: gcc/config/rs6000/rs6000.md
===================================================================

--- gcc/config/rs6000/rs6000.md	(revision 228495)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6505,9 +6505,9 @@ (define_expand "extenddftf2_fprs"
 })
 
 (define_insn_and_split "*extenddftf2_internal"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,ws,d,&d")
-       (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,md")))
-   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,d"))]
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,d,d,&d")
+       (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,ws,md,md")))
+   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,md"))]
   "!TARGET_IEEEQUAD
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
    && TARGET_LONG_DOUBLE_128"
Index: gcc/testsuite/gcc.target/powerpc/pr67808.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr67808.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr67808.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O1 -mvsx -mlra" } */
+
+/* PR 67808: LRA ICEs on simple double to long double conversion test case */
+
+void
+foo (long double *ldb1, double *db1)
+{
+  *ldb1 = *db1;
+}