Summary: | [4.0 Regression] Another ICE caused by reload of a pseudo reg into f0 for a DImode expr | ||
---|---|---|---|
Product: | gcc | Reporter: | Fariborz Jahanian <fjahanian> |
Component: | middle-end | Assignee: | Ulrich Weigand <uweigand> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | aj, dje, gcc-bugs, micis, pinskia, rth |
Priority: | P2 | Keywords: | ice-on-valid-code |
Version: | 4.0.0 | ||
Target Milestone: | 4.0.0 | ||
Host: | powerpc-apple-darwin7.0.0 | Target: | powerpc-apple-darwin7.0.0 |
Build: | powerpc-apple-darwin7.0.0 | Known to work: | 3.4.0 3.3.2 |
Known to fail: | 4.0.0 | Last reconfirmed: | 2004-11-24 01:27:57 |
Description
Fariborz Jahanian
2004-11-24 01:16:33 UTC
Confirmed, looking into a little more. Back in 20041007, we got a different ICE: t.c: In function 'crc': t.c:11: error: unrecognizable insn: (insn 88 47 89 (set (subreg:SI (reg:DI 32 f0) 0) (const_int 0 [0x0])) -1 (nil) (nil)) t.c:11: internal compiler error: in extract_insn, at /recog.c:2034 Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://gcc.gnu.org/bugs.html> for instructions. This is a regression from 3.4.0. It worked with: 3.3.2 20030908 3.4 20030525 3.5-tree-ssa 20030620 (merged 20030525) 3.5-tree-ssa 20030928 (merged 20030923) 3.5.0 20040620 3.5.0 20040808 But it failed with: 4.0.0 20040924 (In reply to comment #4) > But it failed with: > 4.0.0 20040924 And anything after that. 3.5.0 20040829 fails also. It fails in: 3.5.0 20040824 But passes in: 3.5.0 20040822 Reverting: 2004-08-22 Ulrich Weigand <uweigand@de.ibm.com> * reload.c (find_reloads_address): Make return value tri-state. Return -1 if LEGITIMIZE_RELOAD_ADDRESS succeeded. (find_reloads): Assume that reloaded addresses match 'o' or EXTRA_MEMORY_CONSTRAINT constraints only if find_reloads_address returned 1 (not -1). Omit optional reloads for address operands only if find_reloads_address returned 1 (not -1). Fixes the ICE. *** Bug 17606 has been marked as a duplicate of this bug. *** I should mention I also see this in a testcase in Ada acats testsuite on ppc-darwin. We have here the following situation before reload: (insn 27 46 28 7 (set (reg:DI 118 [ D.1118 ]) (const_int 4294967295 [0xffffffff])) 313 {*movdi_internal32} (nil) (nil)) where reg 118 gets assigned a stack slot, and due to the large stack frame size this slot is not directly addressable. Now, when reverting my patch, what happens is that LEGITIMIZE_RELOAD_ADDRESS finds a way to construct the stack slot address, and the constant is reloaded into a register (pair): (insn 67 46 66 7 (set (reg:DI 2 r2) (const_int 4294967295 [0xffffffff])) 313 {*movdi_internal32} (nil) (nil)) (insn 66 67 27 7 (set (reg:SI 9 r9) (plus:SI (reg/f:SI 30 r30) (const_int 131072 [0x20000]))) 64 {*addsi3_internal1} (nil) (nil)) (insn 27 66 28 7 (set (mem:DI (plus:SI (reg:SI 9 r9) (const_int 40 [0x28])) [0 D.1118+0 S8 A8]) (reg:DI 2 r2)) 313 {*movdi_internal32} (nil) (nil)) Note that insn 27 now uses the r -> o alternative of *movdi_internal32; this is allowed only if the address is offsettable. Now, that address is the one that was constructed by LEGITIMIZE_RELOAD_ADDRESS. Unfortunately, reload common code has no way to actually look at what LEGITIMIZE_RELOAD_ADDRESS does, so that it could decide whether or not the address constructed is in fact an offsettable one or not. Before my 2004-08-22 patch, reload simply always assumed the address is offsettable, due to an apparent bug in LEGITIMIZE_RELOAD_ADDRESS handling: reload assumed that L_R_A would always completely replace the address by a single base register (which of course implies the address is offsettable). This bug caused problems on s390. My patch removed this erroneous assumption that L_R_A always completly replaces the address; as we don't know anything further we then have to make the conservative assumption that addresses constructed by L_R_A are never offsettable. Thus reload doesn't accept the r -> o alternative of *movdi_internal32 any more; the one it chooses instead is f -> m. This implies reloading the constant into a floating point register, and that's what reload goes ahead and does: (insn 67 46 66 7 (set (reg:DI 32 f0) (const_int 4294967295 [0xffffffff])) 313 {*movdi_internal32} (nil) (nil)) (insn 66 67 27 7 (set (reg:SI 2 r2) (plus:SI (reg/f:SI 30 r30) (const_int 131072 [0x20000]))) 64 {*addsi3_internal1} (nil) (nil)) (insn 27 66 28 7 (set (mem:DI (plus:SI (reg:SI 2 r2) (const_int 40 [0x28])) [0 D.1118+0 S8 A8]) (reg:DI 32 f0)) 313 {*movdi_internal32} (nil) (nil)) However, the insn 67 emitted thus is not actually implemented by any of the alternatives or splitters; thus the crash later on. This would appear to be a latent bug in the rs6000 back end. Reload insns generated during the gen_reload phase *must* be implemented by the backend. I'm not familiar enough with the platform to suggested the best way to do so; one obvious option would be force the constant to memory using either from within the movdi expander or via a secondary input reload. The next question is whether the new code, if implemented correctly, is better or worse than the old code -- again this is a rs6000 back-end issue. However, one middle-end question remains: while it is obviously wrong for reload to assume that L_R_A always results in a simple base register, at least on rs6000 is appears to be the case that L_R_A always results in an *offsettable* address. If this is true, we might be missing optimizations by not exploiting that knowledge in reload any more. One way might be to extend the L_R_A interface in a way that would allow the back end to inform the middle end about properties of the address is has constructed; I'm not sure how this interface should look in detail. The other option would be to simply go back to having the middle end assume the L_R_A constructed addresses are always offsettable. This could be implemented by something like the following patch: Index: reload.c =================================================================== RCS file: /cvs/gcc/gcc/gcc/reload.c,v retrieving revision 1.258 diff -c -p -r1.258 reload.c *** reload.c 9 Nov 2004 17:29:02 -0000 1.258 --- reload.c 24 Nov 2004 14:49:04 -0000 *************** find_reloads (rtx insn, int replace, int *** 3189,3196 **** && ((ind_levels ? offsettable_memref_p (operand) : offsettable_nonstrict_memref_p (operand)) /* A reloaded address is offsettable because it is now ! just a simple register indirect. */ ! || address_reloaded[i] == 1)) || (REG_P (operand) && REGNO (operand) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO (operand)] < 0 --- 3189,3198 ---- && ((ind_levels ? offsettable_memref_p (operand) : offsettable_nonstrict_memref_p (operand)) /* A reloaded address is offsettable because it is now ! just a simple register indirect. Addresses built ! via LEGITIMIZE_RELOAD_ADDRESS must always be ! offsettable as well. */ ! || address_reloaded[i])) || (REG_P (operand) && REGNO (operand) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO (operand)] < 0 (This requirement should then be documented in the L_R_A docs as well.) Allowing the middle-end to know that the L_R_A address is offsettable looks like a better solution to me. The design is an issue for RTH. One possibility is a target macro to decide if L_R_A addresses should be assumed offsettable: #if LRA_OFFSETTABLE || address_reloaded[i] > 0 #else || address_reloaded[i] == 1 #endif ) Finer granlarity information from LRA is more complicated. I think the most ideal solution would be to tell L_R_A whether the result is required to be offsettable. It can then fail the transformation if it cannot produce the required offsettable result, and then normal reload things will happen to fix things up. How feasable is this with the current code base? This is not very feasible, since whether or not the address is required to be offsettable depends on which alternative is selected; and the current flow of find_reloads does all the address reloading (including L_R_A) *before* even looking at the alternatives. So we currently first make all addresses valid (for the most general type of address the target supports), then select an alternative, and if that alternative requires a more strict form of address, we simply reload into a base register. This is of course somewhat inefficient, but changing it would require major rework of find_reloads. (Trying again, this time hopefully with proper formatting.) This is not very feasible, since whether or not the address is required to be offsettable depends on which alternative is selected; and the current flow of find_reloads does all the address reloading (including L_R_A) *before* even looking at the alternatives. So we currently first make all addresses valid (for the most general type of address the target supports), then select an alternative, and if that alternative requires a more strict form of address, we simply reload into a base register. This is of course somewhat inefficient, but changing it would require major rework of find_reloads. Then I think that we have to assume that the result of L_R_A is not offsettable. Even in the case of rs6000, I believe the definition is only *usually* offsettable, but that this is not a 100% iron-clad guarantee. In particular, an offset like 0x17fff would get rendered as 0x10000 + 0x7fff, and there's no room left in the low part for any further offsetting. (In reply to comment #10) > I should mention I also see this in a testcase in Ada acats testsuite on ppc-darwin. For the record this is ACATS c761010 and it's ppc only (works on x86 and x86_64), from Andrew's log: /Users/pinskia/src/local2/gcc/objdir/gcc/xgcc -c -B/Users/pinskia/src/local2/gcc/objdir/gcc/ -gnatws -O2 -I/Users/pinskia/src/local2/gcc/objdir/gcc/testsuite/ada/acats/support c761010_1-var_strings-types.adb c761010_1-var_strings-types.ads: In function 'C761010_1.Var_Strings.Types._Elabs': c761010_1-var_strings-types.ads:2: error: insn does not satisfy its constraints: (insn 1498 1466 109 6 (set (reg:DI 32 f0) (const_int 8 [0x8])) 313 {*movdi_internal32} (nil) (nil)) +===========================GNAT BUG DETECTED==============================+ | 4.0.0 20041127 (experimental) (powerpc-apple-darwin7.6.0) GCC error: | | in reload_cse_simplify_operands, at postreload.c:391 | | Error detected at c761010_1-var_strings-types.adb:103:68 | Regardless of how we fix this specific problem; by reverting Ulrich's patch to find_reloads_address, making the small change he proposed in find_reloads, or something else, there remains the problem each time a 64-bit integer constant is loaded into an FPR. This is a cronic problem which we need to address. Now is as good as ever. Being a newby in this area please bear with me. I see a couple of solutions: 1) Do not use FPR for a 64-bit constant integers. This is indeed what happens when reverting Ulrich's patch. What benefit do we gain by allowing use of FPR for these cases? Don't we always need to eventually load the constant into a pair of GPRs, via going to memory first. What are the cases where using FPR is beneficial (to reduce register pressure is one answer, but then we still need to go to memory and back to GPRs for any useful operation). 2) Handle this special case in the splitter which is used. But this requires going to memory. Can this be done in the splitter? This seems to be a better solution if 1) cannot be disallowed. As to 1), this can be achieved by marking the 'f' alternatives in the *movdi_internal32 pattern as '!*f'; this will prevent both regclass and reload from using the alternatives unless there was a floating-point register involved already before reload. As to 2), the proper way to go via memory would be to return NO_REGS from PREFERRED_RELOAD_CLASS when asked how to load a constant into (a superclass of) FLOAT_REGS. Then reload takes care of force_const_mem and everything related. For option (2), are you suggesting changing PREFERRED_RELOAD_CLASS from ((GET_CODE (X) == CONST_DOUBLE && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT) ? NO_REGS : ...) to ((GET_CODE (X) == CONST_DOUBLE && reg_classes_intersect_p ((CLASS), FLOAT_REGS)) ? NO_REGS : ...) to reload any CONST_DOUBLE into an FPR via memory? Basically yes, except that it's not only about CONST_DOUBLE, but also CONST_INT. Maybe even: ((CONSTANT_P (X) && reg_classes_intersect_p ((CLASS), FLOAT_REGS)) ? NO_REGS : ...) Subject: Re: [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr The following patch implements the two suggestions. It fixes the ICE on AIX, but continues to produce an ICE on Mac OS X. Also, the CONST_INT case only should be relevant in 64-bit mode. Index: rs6000.h =================================================================== RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v retrieving revision 1.348 diff -c -p -r1.348 rs6000.h *** rs6000.h 27 Nov 2004 22:45:24 -0000 1.348 --- rs6000.h 6 Dec 2004 17:14:18 -0000 *************** enum reg_class *** 1397,1404 **** */ #define PREFERRED_RELOAD_CLASS(X,CLASS) \ ! (((GET_CODE (X) == CONST_DOUBLE \ ! && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT) \ ? NO_REGS \ : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT \ && (CLASS) == NON_SPECIAL_REGS) \ --- 1397,1404 ---- */ #define PREFERRED_RELOAD_CLASS(X,CLASS) \ ! (((CONSTANT_P (X) \ ! && reg_classes_intersect_p ((CLASS), FLOAT_REGS)) \ ? NO_REGS \ : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT \ && (CLASS) == NON_SPECIAL_REGS) \ Index: rs6000.md =================================================================== RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v retrieving revision 1.336 diff -c -p -r1.336 rs6000.md *** rs6000.md 1 Dec 2004 17:18:38 -0000 1.336 --- rs6000.md 6 Dec 2004 17:14:18 -0000 *************** *** 8496,8502 **** ; List r->r after r->"o<>", otherwise reload will try to reload a ; non-offsettable address by using r->r which won't make progress. (define_insn "*movdi_internal32" ! [(set (match_operand:DI 0 "nonimmediate_operand" "=o<>,r,r,f,f,m,r") (match_operand:DI 1 "input_operand" "r,r,m,f,m,f,IJKnGHF"))] "! TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) --- 8496,8502 ---- ; List r->r after r->"o<>", otherwise reload will try to reload a ; non-offsettable address by using r->r which won't make progress. (define_insn "*movdi_internal32" ! [(set (match_operand:DI 0 "nonimmediate_operand" "=o<>,r,r,!*f,!*f,m,r") (match_operand:DI 1 "input_operand" "r,r,m,f,m,f,IJKnGHF"))] "! TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) *************** *** 8567,8573 **** }") (define_insn "*movdi_internal64" ! [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,??f,f,m,r,*h,*h") (match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,f,m,f,*h,r,0"))] "TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) --- 8567,8573 ---- }") (define_insn "*movdi_internal64" ! [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,m,r,r,r,r,!*f,!*f,m,r,*h,*h") (match_operand:DI 1 "input_operand" "r,m,r,I,L,nF,R,f,m,f,*h,r,0"))] "TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) T(In reply to comment #22) > Subject: Re: [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr > > The following patch implements the two suggestions. It fixes the > ICE on AIX, but continues to produce an ICE on Mac OS X. That is because darwin defines its own PREFERRED_RELOAD_CLASS, why I don't know. I applied the patch to fsf-mainline (including darwin.h) and it worked for me. I will do the bootstrap, dejagnu testing and let you know how it went. - Thanks. David's patch (including darwin.h patch attached here) successufully bootstrapped, dejagnu tested on apple-ppc-darwin. Please apply the patch to mainline. Index: darwin.h =============================================================== ==== RCS file: /cvs/gcc/gcc/gcc/config/rs6000/darwin.h,v retrieving revision 1.72 diff -c -p -r1.72 darwin.h *** darwin.h 27 Nov 2004 22:45:22 -0000 1.72 --- darwin.h 6 Dec 2004 17:56:34 -0000 *************** do { \ *** 344,351 **** #undef PREFERRED_RELOAD_CLASS #define PREFERRED_RELOAD_CLASS(X,CLASS) \ ! ((GET_CODE (X) == CONST_DOUBLE \ ! && GET_MODE_CLASS (GET_MODE (X)) == MODE_FLOAT) \ ? NO_REGS \ : ((GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == HIGH) \ && reg_class_subset_p (BASE_REGS, (CLASS))) \ --- 344,351 ---- #undef PREFERRED_RELOAD_CLASS #define PREFERRED_RELOAD_CLASS(X,CLASS) \ ! ((CONSTANT_P (X) \ ! && reg_classes_intersect_p ((CLASS), FLOAT_REGS)) \ ? NO_REGS \ : ((GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == HIGH) \ && reg_class_subset_p (BASE_REGS, (CLASS))) \ Subject: Re: [4.0 Regression] Another ICE caused by reload of a psuedo reg into f0 for a DImode expr There are two open questions: 1) Do we want to change the register preferencing? 2) Should we need to return NO_REGS for FLOAT_MODE constant loaded into GPRs? David *** Bug 18864 has been marked as a duplicate of this bug. *** FWIW, http://gcc.gnu.org/ml/gcc-patches/2004-12/msg00451.html cures this problem by teaching the rs6000 back-end how to handle multi-register access to non-offsettable memory. Subject: Bug 18641 CVSROOT: /cvs/gcc Module name: gcc Changes by: dje@gcc.gnu.org 2004-12-11 17:37:26 Modified files: gcc : ChangeLog gcc/config/rs6000: darwin.h rs6000.h rs6000.md Log message: 2004-12-11 David Edelsohn <edelsohn@gnu.org> Ulrich Weigand <uweigand@de.ibm.com> PR target/18641 * config/rs6000/darwin.h (PREFERRED_RELOAD_CLASS): Reload all constants into all register classes intersecting with FLOAT_REGS via memory. * config/rs6000/rs6000.h (PREFERRED_RELOAD_CLASS): Same. * config/rs6000/rs6000.md (movdi_internal32): Ignore FPRs when choosing register preferences. (movdi_internal64): Same. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.6782&r2=2.6783 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/rs6000/darwin.h.diff?cvsroot=gcc&r1=1.73&r2=1.74 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/rs6000/rs6000.h.diff?cvsroot=gcc&r1=1.348&r2=1.349 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/rs6000/rs6000.md.diff?cvsroot=gcc&r1=1.336&r2=1.337 Patch applied. |