Bug 18641

Summary:	[4.0 Regression] Another ICE caused by reload of a pseudo reg into f0 for a DImode expr
Product:	gcc	Reporter:	Fariborz Jahanian <fjahanian>
Component:	middle-end	Assignee:	Ulrich Weigand <uweigand>
Status:	RESOLVED FIXED
Severity:	normal	CC:	aj, dje, gcc-bugs, micis, pinskia, rth
Priority:	P2	Keywords:	ice-on-valid-code
Version:	4.0.0
Target Milestone:	4.0.0
Host:	powerpc-apple-darwin7.0.0	Target:	powerpc-apple-darwin7.0.0
Build:	powerpc-apple-darwin7.0.0	Known to work:	3.4.0 3.3.2
Known to fail:	4.0.0	Last reconfirmed:	2004-11-24 01:27:57

Description Fariborz Jahanian 2004-11-24 01:16:33 UTC

This is similar to PR/152866. In the following test case compiled with -O0
gcc-4.0 produces following patter in reload phase:

(insn 68 47 67 7 (set (reg:DI 32 f0)
        (const_int 4294967295 [0xffffffff])) 354 {*movdi_internal32} (nil)
    (nil))

This pattern cause ICE in gen_reg_rtx.

This is the usual problem. Reload decides to use a float register for a 'long long' expression, a
constant in this case because this is legit. for powerpc. But ppc patterns cannot handle it.

/* Test case */
void crc()
{
    int  toread;
    long long nleft;
    unsigned char buf[(128 * 1024)];

    nleft = 0;
    while (toread = (nleft < (2147483647 * 2U + 1U)) ? nleft: (2147483647 * 2U + 1U) )
        ;
}

Comment 1 Andrew Pinski 2004-11-24 01:27:57 UTC

Confirmed, looking into a little more.

Comment 2 Andrew Pinski 2004-11-24 01:53:18 UTC

Back in 20041007, we got a different ICE:
t.c: In function 'crc':
t.c:11: error: unrecognizable insn:
(insn 88 47 89 (set (subreg:SI (reg:DI 32 f0) 0)
        (const_int 0 [0x0])) -1 (nil)
    (nil))
t.c:11: internal compiler error: in extract_insn, at /recog.c:2034
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

Comment 3 Andrew Pinski 2004-11-24 01:55:19 UTC

This is a regression from 3.4.0.

Comment 4 Andrew Pinski 2004-11-24 01:59:57 UTC

It worked with:
3.3.2 20030908
3.4 20030525
3.5-tree-ssa 20030620 (merged 20030525)
3.5-tree-ssa 20030928 (merged 20030923)
3.5.0 20040620
3.5.0 20040808

But it failed with:
4.0.0 20040924

Comment 5 Andrew Pinski 2004-11-24 02:04:05 UTC

(In reply to comment #4)
> But it failed with:
> 4.0.0 20040924
And anything after that.

Comment 6 Andrew Pinski 2004-11-24 05:07:18 UTC

3.5.0 20040829 fails also.

Comment 7 Andrew Pinski 2004-11-24 05:55:59 UTC

It fails in:
3.5.0 20040824
But passes in:
3.5.0 20040822

Comment 8 Andrew Pinski 2004-11-24 06:01:42 UTC

Reverting:
2004-08-22  Ulrich Weigand  <uweigand@de.ibm.com>

        * reload.c (find_reloads_address): Make return value tri-state.
        Return -1 if LEGITIMIZE_RELOAD_ADDRESS succeeded.
        (find_reloads): Assume that reloaded addresses match 'o' or
        EXTRA_MEMORY_CONSTRAINT constraints only if find_reloads_address
        returned 1 (not -1).  Omit optional reloads for address operands
        only if find_reloads_address returned 1 (not -1).

Fixes the ICE.

Comment 9 Andrew Pinski 2004-11-24 06:05:19 UTC

*** Bug 17606 has been marked as a duplicate of this bug. ***

Comment 10 Andrew Pinski 2004-11-24 06:08:30 UTC

I should mention I also see this in a testcase in Ada acats testsuite on ppc-darwin.

Comment 11 Ulrich Weigand 2004-11-24 14:50:13 UTC

We have here the following situation before reload:
(insn 27 46 28 7 (set (reg:DI 118 [ D.1118 ])
        (const_int 4294967295 [0xffffffff])) 313 {*movdi_internal32} (nil)
    (nil))
where reg 118 gets assigned a stack slot, and due to the large
stack frame size this slot is not directly addressable.

Now, when reverting my patch, what happens is that LEGITIMIZE_RELOAD_ADDRESS
finds a way to construct the stack slot address, and the constant is reloaded
into a register (pair):

(insn 67 46 66 7 (set (reg:DI 2 r2)
        (const_int 4294967295 [0xffffffff])) 313 {*movdi_internal32} (nil)
    (nil))

(insn 66 67 27 7 (set (reg:SI 9 r9)
        (plus:SI (reg/f:SI 30 r30)
            (const_int 131072 [0x20000]))) 64 {*addsi3_internal1} (nil)
    (nil))

(insn 27 66 28 7 (set (mem:DI (plus:SI (reg:SI 9 r9)
                (const_int 40 [0x28])) [0 D.1118+0 S8 A8])
        (reg:DI 2 r2)) 313 {*movdi_internal32} (nil)
    (nil))

Note that insn 27 now uses the r -> o alternative of *movdi_internal32;
this is allowed only if the address is offsettable.

Now, that address is the one that was constructed by 
LEGITIMIZE_RELOAD_ADDRESS.  Unfortunately, reload common code has
no way to actually look at what LEGITIMIZE_RELOAD_ADDRESS does,
so that it could decide whether or not the address constructed
is in fact an offsettable one or not.

Before my 2004-08-22 patch, reload simply always assumed the address
is offsettable, due to an apparent bug in LEGITIMIZE_RELOAD_ADDRESS
handling: reload assumed that L_R_A would always completely replace
the address by a single base register (which of course implies the
address is offsettable).  This bug caused problems on s390.

My patch removed this erroneous assumption that L_R_A always completly
replaces the address; as we don't know anything further we then have
to make the conservative assumption that addresses constructed by L_R_A
are never offsettable.  Thus reload doesn't accept the r -> o alternative
of *movdi_internal32 any more; the one it chooses instead is f -> m.

This implies reloading the constant into a floating point register,
and that's what reload goes ahead and does:

(insn 67 46 66 7 (set (reg:DI 32 f0)
        (const_int 4294967295 [0xffffffff])) 313 {*movdi_internal32} (nil)
    (nil))

(insn 66 67 27 7 (set (reg:SI 2 r2)
        (plus:SI (reg/f:SI 30 r30)
            (const_int 131072 [0x20000]))) 64 {*addsi3_internal1} (nil)
    (nil))

(insn 27 66 28 7 (set (mem:DI (plus:SI (reg:SI 2 r2)
                (const_int 40 [0x28])) [0 D.1118+0 S8 A8])
        (reg:DI 32 f0)) 313 {*movdi_internal32} (nil)
    (nil))

However, the insn 67 emitted thus is not actually implemented by
any of the alternatives or splitters; thus the crash later on.

This would appear to be a latent bug in the rs6000 back end.
Reload insns generated during the gen_reload phase *must* be
implemented by the backend.  I'm not familiar enough with the
platform to suggested the best way to do so; one obvious option
would be force the constant to memory using either from within
the movdi expander or via a secondary input reload.


The next question is whether the new code, if implemented 
correctly, is better or worse than the old code -- again this
is a rs6000 back-end issue.  

However, one middle-end question remains: while it is obviously
wrong for reload to assume that L_R_A always results in a simple
base register, at least on rs6000 is appears to be the case that
L_R_A always results in an *offsettable* address.  If this is true,
we might be missing optimizations by not exploiting that knowledge
in reload any more.

One way might be to extend the L_R_A interface in a way that would
allow the back end to inform the middle end about properties of
the address is has constructed; I'm not sure how this interface 
should look in detail.

The other option would be to simply go back to having the middle end
assume the L_R_A constructed addresses are always offsettable.  This
could be implemented by something like the following patch:
Index: reload.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/reload.c,v
retrieving revision 1.258
diff -c -p -r1.258 reload.c
*** reload.c    9 Nov 2004 17:29:02 -0000       1.258
--- reload.c    24 Nov 2004 14:49:04 -0000
*************** find_reloads (rtx insn, int replace, int
*** 3189,3196 ****
                     && ((ind_levels ? offsettable_memref_p (operand)
                          : offsettable_nonstrict_memref_p (operand))
                         /* A reloaded address is offsettable because it is now
!                           just a simple register indirect.  */
!                        || address_reloaded[i] == 1))
                    || (REG_P (operand)
                        && REGNO (operand) >= FIRST_PSEUDO_REGISTER
                        && reg_renumber[REGNO (operand)] < 0
--- 3189,3198 ----
                     && ((ind_levels ? offsettable_memref_p (operand)
                          : offsettable_nonstrict_memref_p (operand))
                         /* A reloaded address is offsettable because it is now
!                           just a simple register indirect.  Addresses built
!                           via LEGITIMIZE_RELOAD_ADDRESS must always be
!                           offsettable as well.  */
!                        || address_reloaded[i]))
                    || (REG_P (operand)
                        && REGNO (operand) >= FIRST_PSEUDO_REGISTER
                        && reg_renumber[REGNO (operand)] < 0

(This requirement should then be documented in the L_R_A docs as well.)

Comment 12 David Edelsohn 2004-11-24 18:18:59 UTC

Allowing the middle-end to know that the L_R_A address is offsettable looks 
like a better solution to me.  The design is an issue for RTH.  One 
possibility is a target macro to decide if L_R_A addresses should be assumed 
offsettable:

#if LRA_OFFSETTABLE
      || address_reloaded[i] > 0
#else
      || address_reloaded[i] == 1
#endif
      )

Finer granlarity information from LRA is more complicated.

Comment 13 Richard Henderson 2004-11-25 20:31:41 UTC

I think the most ideal solution would be to tell L_R_A whether the result is 
required to be offsettable.  It can then fail the transformation if it cannot
produce the required offsettable result, and then normal reload things will
happen to fix things up.

How feasable is this with the current code base?

Comment 14 Ulrich Weigand 2004-11-25 21:53:18 UTC

This is not very feasible, since whether or not the address is required to be offsettable depends on which alternative is selected; and the current flow of find_reloads does all the address reloading (including L_R_A) *before* even looking  at the alternatives.  So we currently first make all addresses valid (for the most general type of address the target supports), then select an alternative, and if that alternative requires a more strict form of address, we simply reload into a base register.  This is of course somewhat inefficient, but changing it would require major rework of find_reloads.

Comment 15 Ulrich Weigand 2004-11-25 21:57:31 UTC

(Trying again, this time hopefully with proper formatting.)

This is not very feasible, since whether or not the address
is required to be offsettable depends on which alternative is
selected; and the current flow of find_reloads does all the
address reloading (including L_R_A) *before* even looking
at the alternatives.

So we currently first make all addresses valid (for the most
general type of address the target supports), then select an
alternative, and if that alternative requires a more strict
form of address, we simply reload into a base register.

This is of course somewhat inefficient, but changing it would
require major rework of find_reloads.

Comment 16 Richard Henderson 2004-11-25 22:44:11 UTC

Then I think that we have to assume that the result of L_R_A is not offsettable.

Even in the case of rs6000, I believe the definition is only *usually*
offsettable, but that this is not a 100% iron-clad guarantee.  In particular,
an offset like 0x17fff would get rendered as 0x10000 + 0x7fff, and there's no
room left in the low part for any further offsetting.

Comment 17 Laurent GUERBY 2004-11-28 18:12:39 UTC

(In reply to comment #10)
> I should mention I also see this in a testcase in Ada acats testsuite on
ppc-darwin.

For the record this is ACATS c761010 and it's ppc only (works on x86 and
x86_64), from Andrew's log:

/Users/pinskia/src/local2/gcc/objdir/gcc/xgcc -c
-B/Users/pinskia/src/local2/gcc/objdir/gcc/ -gnatws -O2
-I/Users/pinskia/src/local2/gcc/objdir/gcc/testsuite/ada/acats/support
c761010_1-var_strings-types.adb
c761010_1-var_strings-types.ads: In function 'C761010_1.Var_Strings.Types._Elabs':
c761010_1-var_strings-types.ads:2: error: insn does not satisfy its constraints:
(insn 1498 1466 109 6 (set (reg:DI 32 f0)
        (const_int 8 [0x8])) 313 {*movdi_internal32} (nil)
    (nil))
+===========================GNAT BUG DETECTED==============================+
| 4.0.0 20041127 (experimental) (powerpc-apple-darwin7.6.0) GCC error:     |
| in reload_cse_simplify_operands, at postreload.c:391                     |
| Error detected at c761010_1-var_strings-types.adb:103:68                 |

Comment 18 Fariborz Jahanian 2004-12-01 22:07:36 UTC

Regardless of how we fix this specific problem; by reverting Ulrich's patch to find_reloads_address, 
making the small change he proposed in find_reloads, or something else, there remains the
problem each time a 64-bit integer constant is loaded into an FPR. This is a cronic problem which
we need to address. Now is as good as ever. Being a newby in this area please bear with me.

I see a couple of solutions:

1) Do not use FPR for a 64-bit constant integers. This is indeed what happens when reverting Ulrich's
    patch. What benefit do we gain by allowing use of FPR for these cases? Don't we always need to
    eventually load the constant into a pair of GPRs, via going to memory first. What are the cases where
    using FPR is beneficial (to reduce register pressure is one answer, but then we still need to go
    to memory and back to GPRs for any useful operation).

2) Handle this special case in the splitter which is used. But this requires going to memory. Can this
    be done in the splitter? This seems to be a better solution if 1) cannot be disallowed.

Comment 19 Ulrich Weigand 2004-12-02 16:14:47 UTC

As to 1), this can be achieved by marking the 'f' alternatives
in the *movdi_internal32 pattern as '!*f'; this will prevent 
both regclass and reload from using the alternatives unless
there was a floating-point register involved already before
reload.

As to 2), the proper way to go via memory would be to return
NO_REGS from PREFERRED_RELOAD_CLASS when asked how to load
a constant into (a superclass of) FLOAT_REGS.  Then reload
takes care of force_const_mem and everything related.

Comment 20 David Edelsohn 2004-12-05 06:09:05 UTC

For option (2), are you suggesting changing PREFERRED_RELOAD_CLASS from

((GET_CODE (X) == CONST_DOUBLE && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT)
 ? NO_REGS : ...)

to

((GET_CODE (X) == CONST_DOUBLE && reg_classes_intersect_p ((CLASS), FLOAT_REGS))
 ? NO_REGS : ...)

to reload any CONST_DOUBLE into an FPR via memory?

Comment 21 Ulrich Weigand 2004-12-05 15:45:49 UTC

Basically yes, except that it's not only about CONST_DOUBLE,
but also CONST_INT.  Maybe even:

((CONSTANT_P (X) && reg_classes_intersect_p ((CLASS), FLOAT_REGS))
 ? NO_REGS : ...)

Comment 22 David Edelsohn 2004-12-06 17:16:44 UTC

Subject: Re:  [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr 

	The following patch implements the two suggestions.  It fixes the
ICE on AIX, but continues to produce an ICE on Mac OS X.

	Also, the CONST_INT case only should be relevant in 64-bit mode.

Index: rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.348
diff -c -p -r1.348 rs6000.h
*** rs6000.h	27 Nov 2004 22:45:24 -0000	1.348
--- rs6000.h	6 Dec 2004 17:14:18 -0000
*************** enum reg_class
*** 1397,1404 ****
   */
  
  #define PREFERRED_RELOAD_CLASS(X,CLASS)			\
!   (((GET_CODE (X) == CONST_DOUBLE			\
!      && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT)	\
      ? NO_REGS 						\
      : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT 	\
         && (CLASS) == NON_SPECIAL_REGS)			\
--- 1397,1404 ----
   */
  
  #define PREFERRED_RELOAD_CLASS(X,CLASS)			\
!   (((CONSTANT_P (X)					\
!      && reg_classes_intersect_p ((CLASS), FLOAT_REGS))	\
      ? NO_REGS 						\
      : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT 	\
         && (CLASS) == NON_SPECIAL_REGS)			\
Index: rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.336
diff -c -p -r1.336 rs6000.md
*** rs6000.md	1 Dec 2004 17:18:38 -0000	1.336
--- rs6000.md	6 Dec 2004 17:14:18 -0000
***************
*** 8496,8502 ****
  ; List r->r after r->"o<>", otherwise reload will try to reload a
  ; non-offsettable address by using r->r which won't make progress.
  (define_insn "*movdi_internal32"
!   [(set (match_operand:DI 0 "nonimmediate_operand" "=o<>,r,r,f,f,m,r")
  	(match_operand:DI 1 "input_operand" "r,r,m,f,m,f,IJKnGHF"))]
    "! TARGET_POWERPC64
     && (gpc_reg_operand (operands[0], DImode)
--- 8496,8502 ----
  ; List r->r after r->"o<>", otherwise reload will try to reload a
  ; non-offsettable address by using r->r which won't make progress.
  (define_insn "*movdi_internal32"
!   [(set (match_operand:DI 0 "nonimmediate_operand" "=o<>,r,r,!*f,!*f,m,r")
  	(match_operand:DI 1 "input_operand" "r,r,m,f,m,f,IJKnGHF"))]
    "! TARGET_POWERPC64
     && (gpc_reg_operand (operands[0], DImode)
***************
*** 8567,8573 ****
  }")
  
  (define_insn "*movdi_internal64"
!   [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,??f,f,m,r,*h,*h")
  	(match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,f,m,f,*h,r,0"))]
    "TARGET_POWERPC64
     && (gpc_reg_operand (operands[0], DImode)
--- 8567,8573 ----
  }")
  
  (define_insn "*movdi_internal64"
!   [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,!*f,!*f,m,r,*h,*h")
  	(match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,f,m,f,*h,r,0"))]
    "TARGET_POWERPC64
     && (gpc_reg_operand (operands[0], DImode)

Comment 23 Andrew Pinski 2004-12-06 17:29:16 UTC

T(In reply to comment #22)
> Subject: Re:  [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr 
> 
>         The following patch implements the two suggestions.  It fixes the
> ICE on AIX, but continues to produce an ICE on Mac OS X.

That is because darwin defines its own PREFERRED_RELOAD_CLASS, why I don't know.

Comment 24 Fariborz Jahanian 2004-12-06 17:55:55 UTC

I applied the patch to fsf-mainline (including darwin.h) and it worked for me.
I will do the bootstrap, dejagnu testing and let you know how it went. - Thanks.

Comment 25 Fariborz Jahanian 2004-12-06 23:32:31 UTC

David's patch (including darwin.h patch attached here) successufully bootstrapped, dejagnu tested
on apple-ppc-darwin. Please apply the patch to mainline.

Index: darwin.h
===============================================================
====
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin.h,v
retrieving revision 1.72
diff -c -p -r1.72 darwin.h
*** darwin.h    27 Nov 2004 22:45:22 -0000      1.72
--- darwin.h    6 Dec 2004 17:56:34 -0000
*************** do {                                                                    \
*** 344,351 ****
  
  #undef PREFERRED_RELOAD_CLASS
  #define PREFERRED_RELOAD_CLASS(X,CLASS)                               \
!   ((GET_CODE (X) == CONST_DOUBLE                              \
!     && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT)           \
     ? NO_REGS                                                  \
     : ((GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == HIGH)    \
        && reg_class_subset_p (BASE_REGS, (CLASS)))             \
--- 344,351 ----
  
  #undef PREFERRED_RELOAD_CLASS
  #define PREFERRED_RELOAD_CLASS(X,CLASS)                               \
!   ((CONSTANT_P (X)                                            \
!       && reg_classes_intersect_p ((CLASS), FLOAT_REGS))        \
     ? NO_REGS                                                  \
     : ((GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == HIGH)    \
        && reg_class_subset_p (BASE_REGS, (CLASS)))             \

Comment 26 David Edelsohn 2004-12-07 03:10:29 UTC

Subject: Re:  [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr 

	There are two open questions:

1) Do we want to change the register preferencing?

2) Should we need to return NO_REGS for FLOAT_MODE constant loaded into
GPRs? 

David

Comment 27 Andrew Pinski 2004-12-07 15:11:30 UTC

*** Bug 18864 has been marked as a duplicate of this bug. ***

Comment 28 Alan Modra 2004-12-07 23:26:29 UTC

FWIW, http://gcc.gnu.org/ml/gcc-patches/2004-12/msg00451.html cures this problem
by teaching the rs6000 back-end how to handle multi-register access to
non-offsettable memory.

Comment 29 GCC Commits 2004-12-11 17:37:34 UTC

Subject: Bug 18641

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	dje@gcc.gnu.org	2004-12-11 17:37:26

Modified files:
	gcc            : ChangeLog 
	gcc/config/rs6000: darwin.h rs6000.h rs6000.md 

Log message:
	2004-12-11  David Edelsohn  <edelsohn@gnu.org>
	Ulrich Weigand  <uweigand@de.ibm.com>
	
	PR target/18641
	* config/rs6000/darwin.h (PREFERRED_RELOAD_CLASS): Reload all
	constants into all register classes intersecting with FLOAT_REGS
	via memory.
	* config/rs6000/rs6000.h (PREFERRED_RELOAD_CLASS): Same.
	* config/rs6000/rs6000.md (movdi_internal32): Ignore FPRs when
	choosing register preferences.
	(movdi_internal64): Same.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.6782&r2=2.6783
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/rs6000/darwin.h.diff?cvsroot=gcc&r1=1.73&r2=1.74
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/rs6000/rs6000.h.diff?cvsroot=gcc&r1=1.348&r2=1.349
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/rs6000/rs6000.md.diff?cvsroot=gcc&r1=1.336&r2=1.337

Comment 30 David Edelsohn 2004-12-11 17:55:09 UTC

Patch applied.