Bug 34091 - [4.2 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392
Summary: [4.2 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.2.2
: P1 normal
Target Milestone: 4.2.3
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2007-11-14 08:51 UTC by Martin Michlmayr
Modified: 2007-12-10 03:26 UTC (History)
5 users (show)

See Also:
Host:
Target: hppa-linux-gnu
Build:
Known to work: 4.1.2
Known to fail:
Last reconfirmed: 2007-11-24 18:59:54


Attachments
preprocessed source (59.71 KB, application/octet-stream)
2007-11-14 08:52 UTC, Martin Michlmayr
Details
Reduced testcase (1.44 KB, text/plain)
2007-11-14 08:53 UTC, Martin Michlmayr
Details
pa-cannot-4.3.d.8 (4.42 KB, text/plain)
2007-11-24 19:27 UTC, dave
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Michlmayr 2007-11-14 08:51:17 UTC
gcc 4.1 can compile the attached source without any problems, but gcc-4.2
(at -O) and trunk (20070916, at -O2) ICE.

paer% gcc-4.2 -c -O s_texfilter.i
swrast/s_texfilter.c: In function ‘sample_lambda_2d’:
swrast/s_texfilter.c:1420: error: insn does not satisfy its constraints:
(insn 2621 1258 1259 96 (set (mem/c:HI (plus:SI (reg/f:SI 30 %r30)
                (const_int -474 [0xfffffe26])) [0 S2 A16])
        (reg:HI 68 %fr22)) 53 {*pa.md:3126} (nil)
    (nil))
swrast/s_texfilter.c:1420: internal compiler error: in reload_cse_simplify_operands, at postreload.c:392
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
For Debian GNU/Linux specific bug reporting instructions,
see <URL:file:///usr/share/doc/gcc-4.2/README.Bugs>.
paer%

paer% /usr/lib/gcc-snapshot/bin/gcc -c -O2 s_texfilter.i
swrast/s_texfilter.c: In function 'sample_lambda_2d':
swrast/s_texfilter.c:1420: error: insn does not satisfy its constraints:
(insn 2415 1100 1137 102 swrast/s_texfilter.c:164 (set (mem/c:HI (plus:SI (reg/f:SI 30 %r30)
                (const_int -458 [0xfffffe36])) [601 S2 A16])
        (reg:HI 78 %fr27)) 53 {*pa.md:3184} (nil))
swrast/s_texfilter.c:1420: internal compiler error: in reload_cse_simplify_operands, at postreload.c:395
Please submit a full bug report,
Comment 1 Martin Michlmayr 2007-11-14 08:52:28 UTC
Forgot to say that this is Debian bug http://bugs.debian.org/451047
Just for the reference.
Comment 2 Martin Michlmayr 2007-11-14 08:52:54 UTC
Created attachment 14551 [details]
preprocessed source
Comment 3 Martin Michlmayr 2007-11-14 08:53:33 UTC
Created attachment 14552 [details]
Reduced testcase
Comment 4 dave 2007-11-14 15:57:52 UTC
Subject: Re:   New: [4.2/4.3 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392

> gcc 4.1 can compile the attached source without any problems, but gcc-4.2
> (at -O) and trunk (20070916, at -O2) ICE.

There is no significant difference in the backend between 4.1, 4.2
and the trunk in the handling of QImode and HImode in the floating
registers.  What's changed is reload.

> paer% gcc-4.2 -c -O s_texfilter.i
> swrast/s_texfilter.c: In function ‘sample_lambda_2d’:
> swrast/s_texfilter.c:1420: error: insn does not satisfy its constraints:
> (insn 2621 1258 1259 96 (set (mem/c:HI (plus:SI (reg/f:SI 30 %r30)
>                 (const_int -474 [0xfffffe26])) [0 S2 A16])
>         (reg:HI 68 %fr22)) 53 {*pa.md:3126} (nil)
>     (nil))
> swrast/s_texfilter.c:1420: internal compiler error: in
> reload_cse_simplify_operands, at postreload.c:392

The simplest fix is probably not to allow QImode and HImode values
in the floating point registers as there's no instructions that operate
on them.

Dave
Comment 5 dave 2007-11-22 03:12:54 UTC
Subject: Re:   New: [4.2/4.3 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392

> > paer% gcc-4.2 -c -O s_texfilter.i
> > swrast/s_texfilter.c: In function ‘sample_lambda_2d’:
> > swrast/s_texfilter.c:1420: error: insn does not satisfy its constraints:
> > (insn 2621 1258 1259 96 (set (mem/c:HI (plus:SI (reg/f:SI 30 %r30)
> >                 (const_int -474 [0xfffffe26])) [0 S2 A16])
> >         (reg:HI 68 %fr22)) 53 {*pa.md:3126} (nil)
> >     (nil))
> > swrast/s_texfilter.c:1420: internal compiler error: in
> > reload_cse_simplify_operands, at postreload.c:392
> 
> The simplest fix is probably not to allow QImode and HImode values
> in the floating point registers as there's no instructions that operate
> on them.

I have implemented this, but it doesn't fix the problem.  There
are problems in the backend handling spills for floating floating point
instructions.  The issue has been around for a long time and never
resolved completely.  It's papered over by register copies to
the general registers, and usually we don't get into trouble since
the architecture has quite a few registers.

I made an initial stab at fixing this by tightening up GO_IF_LEGITIMATE_ADDRESS
but I have run into problems with pseudos not being allocated hard registers.
This probably means I don't have the change quite right.  I also have a
problem with paradoxical SUBREGS when the inner register is spilled.  I'm
not clear on how this is to be handled on a big endian target with strict
alignment.  The documentation says reload is supposed to prevent this from
happening, but it doesn't seem to.  I see this with the testcase from this
PR.  It's combine that creates the paradoxical SUBREG.  Tracing back
why reload doesn't allocate a hard register is tough...

I'm going to try another approach.  Use a lax definition for
GO_IF_LEGITIMATE_ADDRESS and try to fix things up using pa_secondary_reload.

I view this as a critical "target" bug.  However, if we find a fix, I
don't think it should be applied to 4.2 and earlier since it's very likely
to break something else.

Dave
Comment 6 Eric Botcazou 2007-11-24 18:59:54 UTC
With a cross-compiler.
Comment 7 dave 2007-11-24 19:27:35 UTC
Subject: Re:  [4.2/4.3 Regression] ICE in
	reload_cse_simplify_operands, at postreload.c:392

The attached patch seems to fix the problem on the trunk.  However,
it doesn't work on 4.2.  It makes the problem reported in PR middle-end/32889
worse and I get bootstrap failure in stage2.

I've experimented with reload patterns to handle (mem (plus reg const))
but this doesn't seem to help with 32889.  So, I think there's a problem
with the handling of reg_equiv_alt_mem_list in 4.2.

Dave
Comment 8 dave 2007-11-24 19:27:35 UTC
Created attachment 14631 [details]
pa-cannot-4.3.d.8
Comment 9 Eric Botcazou 2007-11-24 20:31:48 UTC
> This probably means I don't have the change quite right.  I also have a
> problem with paradoxical SUBREGS when the inner register is spilled.  I'm
> not clear on how this is to be handled on a big endian target with strict
> alignment.  The documentation says reload is supposed to prevent this from
> happening, but it doesn't seem to.  I see this with the testcase from this
> PR.  It's combine that creates the paradoxical SUBREG.

Here is how the SPARC port deals with this specific case:

/* Return the register class of a scratch register needed to load IN into
   a register of class CLASS in MODE.

   We need a temporary when loading/storing a HImode/QImode value
   between memory and the FPU registers.  This can happen when combine puts
   a paradoxical subreg in a float/fix conversion insn.

   We need a temporary when loading/storing a DFmode value between
   unaligned memory and the upper FPU registers.  */

#define SECONDARY_INPUT_RELOAD_CLASS(CLASS, MODE, IN)		\

[...]

#define SECONDARY_OUTPUT_RELOAD_CLASS(CLASS, MODE, IN)		\
Comment 10 dave 2007-11-24 21:31:09 UTC
Subject: Re:  [4.2/4.3 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392

>    We need a temporary when loading/storing a HImode/QImode value
>    between memory and the FPU registers.  This can happen when combine puts
>    a paradoxical subreg in a float/fix conversion insn.

This looks like an alternative solution to the HImode/QImode problem,
although the SECONDARY_INPUT_RELOAD_CLASS and SECONDARY_OUTPUT_RELOAD_CLASS
macros are now obsolete.  Guess the same could be done with
TARGET_SECONDARY_RELOAD.

Is there a reason why you want to load/store HImode/QImode values in
the FPU registers on sparc?  On the pa, there aren't any insns that
operate on them.  So, the only reason we supported this was because
these modes are tied to SImode and DImode.  Not supporting FPU loads
and stores of HImode/QImode values allowed me to eliminates some
duplication of patterns.

I haven't seen the paradoxical subreg in a float/fix conversion
insns with the current patch.  I did see this in some of the first
versions of pa_cannot_change_mode_class.  I think I eliminated
this problem by prevent mode changes in the FP registers:

  if (MAYBE_FP_REG_CLASS_P (class))
      return true;

Due you think this problem is latent?

I stopped using secondary memory on the 32-bit targets because
FPU loads and stores only support 5-bit immediates.  Because of this,
the port was using SECONDARY_MEMORY_NEEDED_RTX and a free stack
slot in the frame marker of the caller.  However, this failed in
the rare case when a const function got placed between the load
and store.

I moved the SECONDARY_MEMORY_NEEDED define as I saw some code
while debugging that wasn't optimised away on the 32-bit port.

Dave
Comment 11 Eric Botcazou 2007-11-24 21:47:51 UTC
> Is there a reason why you want to load/store HImode/QImode values in
> the FPU registers on sparc?  On the pa, there aren't any insns that
> operate on them.  So, the only reason we supported this was because
> these modes are tied to SImode and DImode.

Same on SPARC, there are no HImode/QImode move insns either.

> I haven't seen the paradoxical subreg in a float/fix conversion
> insns with the current patch.  I did see this in some of the first
> versions of pa_cannot_change_mode_class.  I think I eliminated
> this problem by prevent mode changes in the FP registers:
> 
>   if (MAYBE_FP_REG_CLASS_P (class))
>       return true;
> 
> Due you think this problem is latent?

On SPARC or on the PA even with CANNOT_CHANGE_MODE_CLASS?
Comment 12 dave 2007-11-24 22:14:07 UTC
Subject: Re:  [4.2/4.3 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392

> > I haven't seen the paradoxical subreg in a float/fix conversion
> > insns with the current patch.  I did see this in some of the first
> > versions of pa_cannot_change_mode_class.  I think I eliminated
> > this problem by prevent mode changes in the FP registers:
> > 
> >   if (MAYBE_FP_REG_CLASS_P (class))
> >       return true;
> > 
> > Due you think this problem is latent?
> 
> On SPARC or on the PA even with CANNOT_CHANGE_MODE_CLASS?

I meant on the PA using the CANNOT_CHANGE_MODE_CLASS patch.

What I saw when mode changes were allowed was something like
the following:

set (subreg:SI (mem:HI)) (fix:SI (fix:SF))

where the mem was a spilled HImode pseudo.  At the time,
I thought I had stopped mode changes to QImode/HImode in
the FPU registers but I believe they got tied to DImode.

Dave
Comment 13 Eric Botcazou 2007-11-24 23:36:45 UTC
> What I saw when mode changes were allowed was something like
> the following:
> 
> set (subreg:SI (mem:HI)) (fix:SI (fix:SF))
> 
> where the mem was a spilled HImode pseudo.  At the time,
> I thought I had stopped mode changes to QImode/HImode in
> the FPU registers but I believe they got tied to DImode.

My understanding is that your CANNOT_CHANGE_MODE_CLASS change will prevent
the problem from arising.  It's a stronger counter-measure than that of the
SPARC port.  The CANNOT_CHANGE_MODE_CLASS of the SPARC port is exactly the
same as that of the unpatched PA port (look at the comment above it :-) and
was added relatively recently, years after SECONDARY_INPUT_RELOAD_CLASS and
SECONDARY_OUTPUT_RELOAD_CLASS were defined.


Comment 14 John David Anglin 2007-12-05 17:19:13 UTC
The regression was introduced by the following change:

2005-11-21  Jan Hubicka  <jh@suse.cz>

        PR tree-optimization/24653
        * tree-ssa-ccp.c (ccp_fold): Strip down useless conversions.

I have to think the target reload issues that the above exposes were latent.
The ICE has morphed a bit over time.  This is the error at revision 107304.

dave@gsyprf11:~/gcc_test$ ../gcc-4.3/objdir/gcc/stage1/xgcc -B../gcc-4.3/objdir/gcc/stage1/ -S -O2 s_texfilter.c
s_texfilter.c: In function 'sample_lambda_2d':
s_texfilter.c:166: error: unable to find a register to spill in class 'FP_REGS'
s_texfilter.c:166: error: this is the insn:
(insn 235 257 238 20 (set (subreg:SI (reg:HI 235) 0)
        (fix:SI (fix:DF (reg:DF 68 %fr22 [234])))) 95 {fix_truncdfsi2} (insn_list:REG_DEP_TRUE 233 (insn_list:REG_DEP_ANTI 227 (nil)))
    (expr_list:REG_DEAD (reg:DF 68 %fr22 [234])
        (nil)))
s_texfilter.c:166: internal compiler error: in spill_failure, at reload1.c:1901

There are two target issues involved here.

1) We can't allow mode paradoxical subregs in FP_REGS as this can't be handled
if the pseudo is spilled.

2) pa_secondary_reload() requests a secondary scratch register reload for
essentially everything when CLASS is FP_REGS.  However, reload is treating
this reload as optional, resulting in spill failures and out of range REG+D
addresses.  We do this because there is an asymmetry between the offsets
allowed for integer loads and stores, and floating point loads and stores.
Integer loads and stores support 14-bit offsets while PA 1.x floating point
loads and stores only support 5-bit offsets.  It is a *MAJOR* compromise to
restrict integer loads and stores to 5-bit offsets.  So, the target has always
used secondary reloads to rewrite floating-point loads and stores into a
form that provides correct code during register elimination. 
Comment 15 Eric Botcazou 2007-12-05 17:42:26 UTC
> 2) pa_secondary_reload() requests a secondary scratch register reload for
> essentially everything when CLASS is FP_REGS.  However, reload is treating
> this reload as optional, resulting in spill failures and out of range REG+D
> addresses. 

You should request a secondary reload when you need one, like on the SPARC.
Currently the only return value of pa_secondary_reload is NO_REGS.
Comment 16 dave 2007-12-05 19:01:04 UTC
Subject: Re:  [4.2/4.3 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392

> You should request a secondary reload when you need one, like on the SPARC.
> Currently the only return value of pa_secondary_reload is NO_REGS.

The sparc hasn't been updated to use the new TARGET_SECONDARY_RELOAD
mechanism.  It is my understanding that setting sri->icode and
returing NO_REGS requests a secondary reload with a scratch register
meeting the constraint for operand[2] in the specified reload pattern.

It's true that we currently never request and intermediate register.
This would be specified by returning a different register class
(e.g., GENERAL_REGS).  Returning GENERAL_REGS for some of the cases
covered here

  /* Handle out of range displacement for integer mode loads/stores of
     FP registers.  */
  if (((regno >= FIRST_PSEUDO_REGISTER || regno == -1)
       && GET_MODE_CLASS (mode) == MODE_INT
       && FP_REG_CLASS_P (class))
      || (class == SHIFT_REGS
	  && FP_REG_CLASS_P (REGNO_REG_CLASS (regno)))))
    {
      sri->icode = in_p ? reload_in_optab[mode] : reload_out_optab[mode];
      return NO_REGS;
    }

can "fix" this PR.  However, copying through the general registers
is less efficient than the code currently generated by emit_move_sequence.
Further, I find that I still need the current reload code when x is
a SUBREG.  Requesting a general register intermediate doesn't work.

In theory, we shouldn't have these problems when generating PA 2.0 code
since integer and floating point loads/stores both support 14-bit offsets,
but in practice the mode change limitations force use to do a secondary reload.

This patch fixes this PR and PR 32889, and there are no regressions
with any PA target that I test on.  However, I'm worried because this whole
process requires that secondary reloads not be optional.  We also have
an asymmetry in the handling of (subreg (mem)) and (mem).

Index: config/pa/pa.md
===================================================================
--- config/pa/pa.md	(revision 130587)
+++ config/pa/pa.md	(working copy)
@@ -3183,60 +3183,11 @@
 
 (define_insn ""
   [(set (match_operand:HI 0 "move_dest_operand"
-	 		  "=r,r,r,r,r,Q,!*q,!r,!*f,?r,?*f")
-	(match_operand:HI 1 "move_src_operand"
-			  "r,J,N,K,RQ,rM,!rM,!*q,!*fM,*f,r"))]
-  "(register_operand (operands[0], HImode)
-    || reg_or_0_operand (operands[1], HImode))
-   && !TARGET_SOFT_FLOAT
-   && !TARGET_64BIT"
-  "@
-   copy %1,%0
-   ldi %1,%0
-   ldil L'%1,%0
-   {zdepi|depwi,z} %Z1,%0
-   ldh%M1 %1,%0
-   sth%M0 %r1,%0
-   mtsar %r1
-   {mfctl|mfctl,w} %sar,%0
-   fcpy,sgl %f1,%0
-   {fstws|fstw} %1,-16(%%sp)\n\t{ldws|ldw} -16(%%sp),%0
-   {stws|stw} %1,-16(%%sp)\n\t{fldws|fldw} -16(%%sp),%0"
-  [(set_attr "type" "move,move,move,shift,load,store,move,move,move,fpstore_load,store_fpload")
-   (set_attr "pa_combine_type" "addmove")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,8,8")])
-
-(define_insn ""
-  [(set (match_operand:HI 0 "move_dest_operand"
-	 		  "=r,r,r,r,r,Q,!*q,!r,!*f")
-	(match_operand:HI 1 "move_src_operand"
-			  "r,J,N,K,RQ,rM,!rM,!*q,!*fM"))]
-  "(register_operand (operands[0], HImode)
-    || reg_or_0_operand (operands[1], HImode))
-   && !TARGET_SOFT_FLOAT
-   && TARGET_64BIT"
-  "@
-   copy %1,%0
-   ldi %1,%0
-   ldil L'%1,%0
-   {zdepi|depwi,z} %Z1,%0
-   ldh%M1 %1,%0
-   sth%M0 %r1,%0
-   mtsar %r1
-   {mfctl|mfctl,w} %sar,%0
-   fcpy,sgl %f1,%0"
-  [(set_attr "type" "move,move,move,shift,load,store,move,move,move")
-   (set_attr "pa_combine_type" "addmove")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4")])
-
-(define_insn ""
-  [(set (match_operand:HI 0 "move_dest_operand"
 	 		  "=r,r,r,r,r,Q,!*q,!r")
 	(match_operand:HI 1 "move_src_operand"
 			  "r,J,N,K,RQ,rM,!rM,!*q"))]
   "(register_operand (operands[0], HImode)
-    || reg_or_0_operand (operands[1], HImode))
-   && TARGET_SOFT_FLOAT"
+    || reg_or_0_operand (operands[1], HImode))"
   "@
    copy %1,%0
    ldi %1,%0
@@ -3356,60 +3307,11 @@
 
 (define_insn ""
   [(set (match_operand:QI 0 "move_dest_operand"
-			  "=r,r,r,r,r,Q,!*q,!r,!*f,?r,?*f")
-	(match_operand:QI 1 "move_src_operand"
-			  "r,J,N,K,RQ,rM,!rM,!*q,!*fM,*f,r"))]
-  "(register_operand (operands[0], QImode)
-    || reg_or_0_operand (operands[1], QImode))
-   && !TARGET_SOFT_FLOAT
-   && !TARGET_64BIT"
-  "@
-   copy %1,%0
-   ldi %1,%0
-   ldil L'%1,%0
-   {zdepi|depwi,z} %Z1,%0
-   ldb%M1 %1,%0
-   stb%M0 %r1,%0
-   mtsar %r1
-   {mfctl|mfctl,w} %%sar,%0
-   fcpy,sgl %f1,%0
-   {fstws|fstw} %1,-16(%%sp)\n\t{ldws|ldw} -16(%%sp),%0
-   {stws|stw} %1,-16(%%sp)\n\t{fldws|fldw} -16(%%sp),%0"
-  [(set_attr "type" "move,move,move,shift,load,store,move,move,move,fpstore_load,store_fpload")
-   (set_attr "pa_combine_type" "addmove")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,8,8")])
-
-(define_insn ""
-  [(set (match_operand:QI 0 "move_dest_operand"
-			  "=r,r,r,r,r,Q,!*q,!r,!*f")
-	(match_operand:QI 1 "move_src_operand"
-			  "r,J,N,K,RQ,rM,!rM,!*q,!*fM"))]
-  "(register_operand (operands[0], QImode)
-    || reg_or_0_operand (operands[1], QImode))
-   && !TARGET_SOFT_FLOAT
-   && TARGET_64BIT"
-  "@
-   copy %1,%0
-   ldi %1,%0
-   ldil L'%1,%0
-   {zdepi|depwi,z} %Z1,%0
-   ldb%M1 %1,%0
-   stb%M0 %r1,%0
-   mtsar %r1
-   {mfctl|mfctl,w} %%sar,%0
-   fcpy,sgl %f1,%0"
-  [(set_attr "type" "move,move,move,shift,load,store,move,move,move")
-   (set_attr "pa_combine_type" "addmove")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4")])
-
-(define_insn ""
-  [(set (match_operand:QI 0 "move_dest_operand"
 			  "=r,r,r,r,r,Q,!*q,!r")
 	(match_operand:QI 1 "move_src_operand"
 			  "r,J,N,K,RQ,rM,!rM,!*q"))]
   "(register_operand (operands[0], QImode)
-    || reg_or_0_operand (operands[1], QImode))
-   && TARGET_SOFT_FLOAT"
+    || reg_or_0_operand (operands[1], QImode))"
   "@
    copy %1,%0
    ldi %1,%0
Index: config/pa/pa-protos.h
===================================================================
--- config/pa/pa-protos.h	(revision 130587)
+++ config/pa/pa-protos.h	(working copy)
@@ -175,6 +175,9 @@
 					 unsigned HOST_WIDE_INT,
 					 unsigned int);
 extern void pa_hpux_asm_output_external (FILE *, tree, const char *);
+extern bool pa_cannot_change_mode_class (enum machine_mode, enum machine_mode,
+					 enum reg_class);
+extern bool pa_modes_tieable_p (enum machine_mode, enum machine_mode);
 
 extern const int magic_milli[];
 extern int shadd_constant_p (int);
Index: config/pa/pa-64.h
===================================================================
--- config/pa/pa-64.h	(revision 130587)
+++ config/pa/pa-64.h	(working copy)
@@ -84,3 +84,17 @@
    want aggregates padded down.  */
 
 #define PAD_VARARGS_DOWN (!AGGREGATE_TYPE_P (type))
+
+/* In the PA architecture, it is not possible to directly move data
+   between GENERAL_REGS and FP_REGS.  On the 32-bit port, we use the
+   location at SP-16 because PA 1.X only supports 5-bit immediates for
+   floating-point loads and stores.  We don't expose this location in
+   the RTL to avoid scheduling related problems.  For example, the
+   store and load could be separated by a call to a pure or const
+   function which has no frame and this function might also use SP-16.
+   We have 14-bit immediates on the 64-bit port, so we use secondary
+   memory for the copies.  */
+#define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE) \
+  (MAYBE_FP_REG_CLASS_P (CLASS1) != FP_REG_CLASS_P (CLASS2)		\
+   || MAYBE_FP_REG_CLASS_P (CLASS2) != FP_REG_CLASS_P (CLASS1))
+
Index: config/pa/pa.c
===================================================================
--- config/pa/pa.c	(revision 130587)
+++ config/pa/pa.c	(working copy)
@@ -5684,30 +5684,66 @@
   if (regno >= FIRST_PSEUDO_REGISTER || GET_CODE (x) == SUBREG)
     regno = true_regnum (x);
 
-  /* Handle out of range displacement for integer mode loads/stores of
-     FP registers.  */
-  if (((regno >= FIRST_PSEUDO_REGISTER || regno == -1)
-       && GET_MODE_CLASS (mode) == MODE_INT
-       && FP_REG_CLASS_P (class))
-      || (class == SHIFT_REGS && (regno <= 0 || regno >= 32)))
+  /* In order to allow 14-bit displacements in integer loads and stores,
+     we need to prevent reload from generating out of range integer mode
+     loads and stores to the floating point registers.  Previously, we
+     used to call for a secondary reload and have emit_move_sequence()
+     fix the instruction sequence.  However, reload occasionally wouldn't
+     generate the reload and we would end up with an invalid REG+D memory
+     address.  So, now we use an intermediate general register for most
+     memory loads and stores.  */
+  if ((regno >= FIRST_PSEUDO_REGISTER || regno == -1)
+      && GET_MODE_CLASS (mode) == MODE_INT
+      && FP_REG_CLASS_P (class))
     {
+      /* Reload passes (mem:SI (reg/f:DI 30 %r30) when it wants to check
+	 the secondary reload needed for a pseudo.  It never passes a
+	 REG+D address.  */
+      if (GET_CODE (x) == MEM)
+	{
+	  x = XEXP (x, 0);
+
+	  /* We don't need an intermediate for indexed and LO_SUM DLT
+	     memory addresses.  When INT14_OK_STRICT is true, it might
+	     appear that we could directly allow register indirect
+	     memory addresses.  However, this doesn't work because we
+	     don't support SUBREGs in floating-point register copies
+	     and reload doesn't tell us when it's going to use a SUBREG.  */
+	  if (IS_INDEX_ADDR_P (x)
+	      || IS_LO_SUM_DLT_ADDR_P (x))
+	    return NO_REGS;
+
+	  /* Otherwise, we need an intermediate general register.  */
+	  return GENERAL_REGS;
+	}
+
+      /* Request a secondary reload with a general scratch register
+	 for everthing else.  ??? Could symbolic operands be handled
+	 directly when generating non-pic PA 2.0 code?  */
       sri->icode = in_p ? reload_in_optab[mode] : reload_out_optab[mode];
       return NO_REGS;
     }
 
+  /* We need a secondary register (GPR) for copies between the SAR
+     and anything other than a general register.  */
+  if (class == SHIFT_REGS && (regno <= 0 || regno >= 32))
+    {
+      sri->icode = in_p ? reload_in_optab[mode] : reload_out_optab[mode];
+      return NO_REGS;
+    }
+
   /* A SAR<->FP register copy requires a secondary register (GPR) as
      well as secondary memory.  */
   if (regno >= 0 && regno < FIRST_PSEUDO_REGISTER
-      && ((REGNO_REG_CLASS (regno) == SHIFT_REGS && FP_REG_CLASS_P (class))
-	  || (class == SHIFT_REGS
-	      && FP_REG_CLASS_P (REGNO_REG_CLASS (regno)))))
+      && (REGNO_REG_CLASS (regno) == SHIFT_REGS
+      && FP_REG_CLASS_P (class)))
     {
       sri->icode = in_p ? reload_in_optab[mode] : reload_out_optab[mode];
       return NO_REGS;
     }
 
   /* Secondary reloads of symbolic operands require %r1 as a scratch
-     register when we're generating PIC code and the operand isn't
+     register when we're generating PIC code and when the operand isn't
      readonly.  */
   if (GET_CODE (x) == HIGH)
     x = XEXP (x, 0);
@@ -9567,4 +9603,62 @@
 }
 #endif
 
+/* Return true if a change from mode FROM to mode TO for a register
+   in register class CLASS is invalid.  */
+
+bool
+pa_cannot_change_mode_class (enum machine_mode from, enum machine_mode to,
+			     enum reg_class class)
+{
+  if (from == to)
+    return false;
+
+  /* Reject changes to/from complex and vector modes.  */
+  if (COMPLEX_MODE_P (from) || VECTOR_MODE_P (from)
+      || COMPLEX_MODE_P (to) || VECTOR_MODE_P (to))
+    return true;
+      
+  if (GET_MODE_SIZE (from) == GET_MODE_SIZE (to))
+    return false;
+
+  /* There is no way to load QImode or HImode values directly from
+     memory.  SImode loads to the FP registers are not zero extended.
+     On the 64-bit target, this conflicts with the definition of
+     LOAD_EXTEND_OP.  Thus, we can't allow changing between modes
+     with different sizes in the floating-point registers.  */
+  if (MAYBE_FP_REG_CLASS_P (class))
+    return true;
+
+  /* HARD_REGNO_MODE_OK places modes with sizes larger than a word
+     in specific sets of registers.  Thus, we cannot allow changing
+     to a larger mode when it's larger than a word.  */
+  if (GET_MODE_SIZE (to) > UNITS_PER_WORD
+      && GET_MODE_SIZE (to) > GET_MODE_SIZE (from))
+    return true;
+
+  return false;
+}
+
+/* Returns TRUE if it is a good idea to tie two pseudo registers
+   when one has mode MODE1 and one has mode MODE2.
+   If HARD_REGNO_MODE_OK could produce different values for MODE1 and MODE2,
+   for any hard reg, then this must be FALSE for correct output.
+   
+   We should return FALSE for QImode and HImode because these modes
+   are not ok in the floating-point registers.  However, this prevents
+   tieing these modes to SImode and DImode in the general registers.
+   So, this isn't a good idea.  We rely on HARD_REGNO_MODE_OK and
+   CANNOT_CHANGE_MODE_CLASS to prevent these modes from being used
+   in the floating-point registers.  */
+
+bool
+pa_modes_tieable_p (enum machine_mode mode1, enum machine_mode mode2)
+{
+  /* Don't tie modes in different classes.  */
+  if (GET_MODE_CLASS (mode1) != GET_MODE_CLASS (mode2))
+    return false;
+
+  return true;
+}
+
 #include "gt-pa.h"
Index: config/pa/pa.h
===================================================================
--- config/pa/pa.h	(revision 130587)
+++ config/pa/pa.h	(working copy)
@@ -349,7 +349,7 @@
    If HARD_REGNO_MODE_OK could produce different values for MODE1 and MODE2,
    for any hard reg, then this must be 0 for correct output.  */
 #define MODES_TIEABLE_P(MODE1, MODE2) \
-  (GET_MODE_CLASS (MODE1) == GET_MODE_CLASS (MODE2))
+  pa_modes_tieable_p (MODE1, MODE2)
 
 /* Specify the registers used for certain standard purposes.
    The values of these macros are register numbers.  */
@@ -497,17 +497,6 @@
 #define MAYBE_FP_REG_CLASS_P(CLASS) \
   reg_classes_intersect_p ((CLASS), FP_REGS)
 
-/* On the PA it is not possible to directly move data between
-   GENERAL_REGS and FP_REGS.  On the 32-bit port, we use the
-   location at SP-16.  We don't expose this location in the RTL to
-   avoid scheduling related problems.  For example, the store and
-   load could be separated by a call to a pure or const function
-   which has no frame and uses SP-16.  */
-#define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE)			\
-  (TARGET_64BIT								\
-   && (MAYBE_FP_REG_CLASS_P (CLASS1) != FP_REG_CLASS_P (CLASS2)		\
-       || MAYBE_FP_REG_CLASS_P (CLASS2) != FP_REG_CLASS_P (CLASS1)))
-
 
 /* Stack layout; function entry, exit and calling.  */
 
@@ -1121,6 +1110,24 @@
    && REG_OK_FOR_BASE_P (XEXP (OP, 0))			\
    && GET_CODE (XEXP (OP, 1)) == UNSPEC)
 
+/* Nonzero if 14-bit offsets can be used for all loads and stores.
+   This is not possible when generating PA 1.x code as floating point
+   loads and stores only support 5-bit offsets.  Note that we do not
+   forbid the use of 14-bit offsets in GO_IF_LEGITIMATE_ADDRESS.
+   Instead, we use pa_secondary_reload() to reload integer mode
+   REG+D memory addresses used in floating point loads and stores.
+
+   FIXME: the ELF32 linker clobbers the LSB of the FP register number
+   in PA 2.0 floating-point insns with long displacements.  This is
+   because R_PARISC_DPREL14WR and other relocations like it are not
+   yet supported by GNU ld.  For now, we reject long displacements
+   on this target.  */
+
+#define INT14_OK_STRICT \
+  (TARGET_SOFT_FLOAT                                                   \
+   || TARGET_DISABLE_FPREGS                                            \
+   || (TARGET_PA_20 && !TARGET_ELF32))
+
 /* The macros REG_OK_FOR..._P assume that the arg is a REG rtx
    and check its validity for a certain class.
    We have two alternate definitions for each of them.
@@ -1139,16 +1146,18 @@
 /* Nonzero if X is a hard reg that can be used as an index
    or if it is a pseudo reg.  */
 #define REG_OK_FOR_INDEX_P(X) \
-(REGNO (X) && (REGNO (X) < 32 || REGNO (X) >= FIRST_PSEUDO_REGISTER))
+  (REGNO (X) && (REGNO (X) < 32 || REGNO (X) >= FIRST_PSEUDO_REGISTER))
+
 /* Nonzero if X is a hard reg that can be used as a base reg
    or if it is a pseudo reg.  */
 #define REG_OK_FOR_BASE_P(X) \
-(REGNO (X) && (REGNO (X) < 32 || REGNO (X) >= FIRST_PSEUDO_REGISTER))
+  (REGNO (X) && (REGNO (X) < 32 || REGNO (X) >= FIRST_PSEUDO_REGISTER))
 
 #else
 
 /* Nonzero if X is a hard reg that can be used as an index.  */
 #define REG_OK_FOR_INDEX_P(X) REGNO_OK_FOR_INDEX_P (REGNO (X))
+
 /* Nonzero if X is a hard reg that can be used as a base reg.  */
 #define REG_OK_FOR_BASE_P(X) REGNO_OK_FOR_BASE_P (REGNO (X))
 
@@ -1200,12 +1209,8 @@
 
    We treat a SYMBOL_REF as legitimate if it is part of the current
    function's constant-pool, because such addresses can actually be
-   output as REG+SMALLINT. 
+   output as REG+SMALLINT.  */
 
-   Note we only allow 5-bit immediates for access to a constant address;
-   doing so avoids losing for loading/storing a FP register at an address
-   which will not fit in 5 bits.  */
-
 #define VAL_5_BITS_P(X) ((unsigned HOST_WIDE_INT)(X) + 0x10 < 0x20)
 #define INT_5_BITS(X) VAL_5_BITS_P (INTVAL (X))
 
@@ -1232,7 +1237,8 @@
   ((TARGET_64BIT && (MODE) == DImode)					\
    || (MODE) == SImode							\
    || (MODE) == HImode							\
-   || (!TARGET_SOFT_FLOAT && ((MODE) == DFmode || (MODE) == SFmode)))
+   || (MODE) == SFmode							\
+   || (MODE) == DFmode)
 
 /* These are the modes that we allow for unscaled indexing.  */
 #define MODE_OK_FOR_UNSCALED_INDEXING_P(MODE) \
@@ -1240,11 +1246,13 @@
    || (MODE) == SImode							\
    || (MODE) == HImode							\
    || (MODE) == QImode							\
-   || (!TARGET_SOFT_FLOAT && ((MODE) == DFmode || (MODE) == SFmode)))
+   || (MODE) == SFmode							\
+   || (MODE) == DFmode)
 
 #define GO_IF_LEGITIMATE_ADDRESS(MODE, X, ADDR) \
 {									\
-  if ((REG_P (X) && REG_OK_FOR_BASE_P (X))				\
+  if ((REG_P (X) && REG_OK_FOR_BASE_P (X)				\
+       && REGNO (X) != STACK_POINTER_REGNUM)				\
       || ((GET_CODE (X) == PRE_DEC || GET_CODE (X) == POST_DEC		\
 	   || GET_CODE (X) == PRE_INC || GET_CODE (X) == POST_INC)	\
 	  && REG_P (XEXP (X, 0))					\
@@ -1274,20 +1282,10 @@
 			   || (INTVAL (index) % 8) == 0))		\
 		   /* Similarly, the base register for SFmode/DFmode	\
 		      loads and stores with long displacements must	\
-		      be aligned.					\
-									\
-		      FIXME: the ELF32 linker clobbers the LSB of	\
-		      the FP register number in PA 2.0 floating-point	\
-		      insns with long displacements.  This is because	\
-		      R_PARISC_DPREL14WR and other relocations like	\
-		      it are not supported.  For now, we reject long	\
-		      displacements on this target.  */			\
+		      be aligned.  */					\
 		   || (((MODE) == SFmode || (MODE) == DFmode)		\
-		       && (TARGET_SOFT_FLOAT				\
-			   || (TARGET_PA_20				\
-			       && !TARGET_ELF32				\
-			       && (INTVAL (index)			\
-				   % GET_MODE_SIZE (MODE)) == 0)))))	\
+		       && INT14_OK_STRICT				\
+		       && (INTVAL (index) % GET_MODE_SIZE (MODE)) == 0))) \
 	       || INT_5_BITS (index)))					\
 	goto ADDR;							\
       if (!TARGET_DISABLE_INDEXING					\
@@ -1387,7 +1385,7 @@
   rtx new, temp = NULL_RTX;						\
 									\
   mask = (GET_MODE_CLASS (MODE) == MODE_FLOAT				\
-	  ? (TARGET_PA_20 && !TARGET_ELF32 ? 0x3fff : 0x1f) : 0x3fff);	\
+	  ? (INT14_OK_STRICT ? 0x3fff : 0x1f) : 0x3fff);		\
 									\
   if (optimize && GET_CODE (AD) == PLUS)				\
     temp = simplify_binary_operation (PLUS, Pmode,			\
@@ -1409,9 +1407,10 @@
 	newoffset = offset & ~mask;					\
 									\
       /* Ensure that long displacements are aligned.  */		\
-      if (!VAL_5_BITS_P (newoffset)					\
-	  && GET_MODE_CLASS (MODE) == MODE_FLOAT)			\
-	newoffset &= ~(GET_MODE_SIZE (MODE) -1);			\
+      if (mask == 0x3fff						\
+	  && (GET_MODE_CLASS (MODE) == MODE_FLOAT			\
+	      || (TARGET_64BIT && (MODE) == DImode)))			\
+	newoffset &= ~(GET_MODE_SIZE (MODE) - 1);			\
 									\
       if (newoffset != 0 && VAL_14_BITS_P (newoffset))			\
 	{								\
Index: config/pa/pa32-regs.h
===================================================================
--- config/pa/pa32-regs.h	(revision 130587)
+++ config/pa/pa32-regs.h	(working copy)
@@ -172,8 +172,7 @@
 #define VALID_FP_MODE_P(MODE)						\
   ((MODE) == SFmode || (MODE) == DFmode					\
    || (MODE) == SCmode || (MODE) == DCmode				\
-   || (MODE) == QImode || (MODE) == HImode || (MODE) == SImode		\
-   || (TARGET_PA_11 && (MODE) == DImode))
+   || (MODE) == SImode || (TARGET_PA_11 && (MODE) == DImode))
 
 /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE.
 
@@ -288,6 +287,11 @@
   {0x00000000, 0x00000000, 0x01000000},	/* SHIFT_REGS */		\
   {0xfffffffe, 0xffffffff, 0x01ffffff}}	/* ALL_REGS */
 
+/* Defines invalid mode changes.  */
+
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
+  pa_cannot_change_mode_class (FROM, TO, CLASS)
+
 /* Return the class number of the smallest class containing
    reg number REGNO.  This could be a conditional expression
    or could index an array.  */
Index: config/pa/pa64-regs.h
===================================================================
--- config/pa/pa64-regs.h	(revision 130587)
+++ config/pa/pa64-regs.h	(working copy)
@@ -156,8 +156,7 @@
 #define VALID_FP_MODE_P(MODE)						\
   ((MODE) == SFmode || (MODE) == DFmode					\
    || (MODE) == SCmode || (MODE) == DCmode				\
-   || (MODE) == QImode || (MODE) == HImode || (MODE) == SImode		\
-   || (MODE) == DImode)
+   || (MODE) == SImode || (MODE) == DImode)
 
 /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE.
    On the HP-PA, the cpu registers can hold any mode.  We
@@ -242,18 +241,11 @@
   {0x00000000, 0x10000000},	/* SHIFT_REGS */		\
   {0xfffffffe, 0x1fffffff}}	/* ALL_REGS */
 
-/* Defines invalid mode changes.
+/* Defines invalid mode changes.  */
 
-   SImode loads to floating-point registers are not zero-extended.
-   The definition for LOAD_EXTEND_OP specifies that integer loads
-   narrower than BITS_PER_WORD will be zero-extended.  As a result,
-   we inhibit changes from SImode unless they are to a mode that is
-   identical in size.  */
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
+  pa_cannot_change_mode_class (FROM, TO, CLASS)
 
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)		\
-  ((FROM) == SImode && GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)       \
-   ? reg_classes_intersect_p (CLASS, FP_REGS) : 0)
-
 /* Return the class number of the smallest class containing
    reg number REGNO.  This could be a conditional expression
    or could index an array.  */


Comment 17 John David Anglin 2007-12-09 18:02:32 UTC
Subject: Bug 34091

Author: danglin
Date: Sun Dec  9 18:02:08 2007
New Revision: 130725

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=130725
Log:
	PR middle-end/32889
	PR target/34091
	* pa.md: Consolidate HImode and QImode move patterns into one pattern
	each, eliminating floating-point alternatives.
	* pa-protos.h (pa_cannot_change_mode_class, pa_modes_tieable_p):
	Declare functions.
	* pa-64.h (SECONDARY_MEMORY_NEEDED): Define here.
	* pa.c (pa_secondary_reload): Use an intermediate general register
	for copies to/from floating-point register classes.  Simplify code
	SHIFT_REGS class.  Provide additional comments.
	(pa_cannot_change_mode_class, pa_modes_tieable_p): New functions.
	* pa.h (MODES_TIEABLE_P): Use pa_modes_tieable_p.
	(SECONDARY_MEMORY_NEEDED): Delete define.
	(INT14_OK_STRICT): Define.
	(MODE_OK_FOR_SCALED_INDEXING_P): Allow SFmode and DFmode when using
	soft float.
	(MODE_OK_FOR_UNSCALED_INDEXING_P): Likewise.
	(GO_IF_LEGITIMATE_ADDRESS): Use INT14_OK_STRICT in REG+D case for
	SFmode and DFmode.
	(LEGITIMIZE_RELOAD_ADDRESS): Use INT14_OK_STRICT in mask selection.
	Align DImode offsets when generating 64-bit code.
	* pa32-regs.h (VALID_FP_MODE_P): Remove QImode and HImode.
	(CANNOT_CHANGE_MODE_CLASS): Define.
	* pa64-regs.h (VALID_FP_MODE_P): Remove QImode and HImode.
	(CANNOT_CHANGE_MODE_CLASS): Define using pa_cannot_change_mode_class.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/pa/pa-64.h
    trunk/gcc/config/pa/pa-protos.h
    trunk/gcc/config/pa/pa.c
    trunk/gcc/config/pa/pa.h
    trunk/gcc/config/pa/pa.md
    trunk/gcc/config/pa/pa32-regs.h
    trunk/gcc/config/pa/pa64-regs.h

Comment 18 Jakub Jelinek 2007-12-09 19:50:11 UTC
Can this be closed now?  Should the testcase (e.g. the #c3 one) go into the testsuite?
Comment 19 dave 2007-12-09 20:39:00 UTC
Subject: Re:  [4.2/4.3 Regression] ICE in reload_cse_simplify_operands, at postreload.c:392

> Can this be closed now?  Should the testcase (e.g. the #c3 one) go into the
> testsuite?

I'm still pondering what to do about 4.2.  See my comments to gcc-patches:
http://gcc.gnu.org/ml/gcc-patches/2007-12/msg00423.html
If I had more confidence in the change, I would say yes.

I believe adding the testcase is a good idea and will work on a change
to add it.

Dave
Comment 20 Jakub Jelinek 2007-12-09 21:31:17 UTC
Fixed for 4.3 then.
Comment 21 John David Anglin 2007-12-10 03:17:38 UTC
Subject: Bug 34091

Author: danglin
Date: Mon Dec 10 03:17:24 2007
New Revision: 130735

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=130735
Log:
	PR middle-end/32889
	PR target/34091
	* pa.md: Consolidate HImode and QImode move patterns into one pattern
	each, eliminating floating-point alternatives.
	* pa-protos.h (pa_cannot_change_mode_class, pa_modes_tieable_p):
	Declare functions.
	* pa-64.h (SECONDARY_MEMORY_NEEDED): Define here.
	* pa.c (pa_secondary_reload): Use an intermediate general register
	for copies to/from floating-point register classes.  Simplify code
	SHIFT_REGS class.  Provide additional comments.
	(pa_cannot_change_mode_class, pa_modes_tieable_p): New functions.
	* pa.h (MODES_TIEABLE_P): Use pa_modes_tieable_p.
	(SECONDARY_MEMORY_NEEDED): Delete define.
	(INT14_OK_STRICT): Define.
	(MODE_OK_FOR_SCALED_INDEXING_P): Allow SFmode and DFmode when using
	soft float.
	(MODE_OK_FOR_UNSCALED_INDEXING_P): Likewise.
	(GO_IF_LEGITIMATE_ADDRESS): Use INT14_OK_STRICT in REG+D case for
	SFmode and DFmode.
	(LEGITIMIZE_RELOAD_ADDRESS): Use INT14_OK_STRICT in mask selection.
	Align DImode offsets when generating 64-bit code.
	* pa32-regs.h (VALID_FP_MODE_P): Remove QImode and HImode.
	(CANNOT_CHANGE_MODE_CLASS): Define.
	* pa64-regs.h (VALID_FP_MODE_P): Remove QImode and HImode.
	(CANNOT_CHANGE_MODE_CLASS): Define using pa_cannot_change_mode_class.


Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/config/pa/pa-64.h
    branches/gcc-4_2-branch/gcc/config/pa/pa-protos.h
    branches/gcc-4_2-branch/gcc/config/pa/pa.c
    branches/gcc-4_2-branch/gcc/config/pa/pa.h
    branches/gcc-4_2-branch/gcc/config/pa/pa.md
    branches/gcc-4_2-branch/gcc/config/pa/pa32-regs.h
    branches/gcc-4_2-branch/gcc/config/pa/pa64-regs.h

Comment 22 John David Anglin 2007-12-10 03:26:18 UTC
Fixed by patch.
Comment 23 John David Anglin 2007-12-14 01:02:13 UTC
Subject: Bug 34091

Author: danglin
Date: Fri Dec 14 01:01:58 2007
New Revision: 130927

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=130927
Log:
	PR target/34091
	* gcc.c-torture/compile/pr34091.c: New test.


Added:
    trunk/gcc/testsuite/gcc.c-torture/compile/pr34091.c
Modified:
    trunk/gcc/testsuite/ChangeLog