Summary: | [4.2 Regression] likely codegen bug | ||
---|---|---|---|
Product: | gcc | Reporter: | John Regehr <regehr> |
Component: | target | Assignee: | Michael Matz <matz> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | bonzini, cnstar9988, fang, gcc-bugs, matz, rguenth, uros, uweigand |
Priority: | P1 | Keywords: | wrong-code |
Version: | 4.4.0 | ||
Target Milestone: | 4.3.2 | ||
URL: | http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00722.html | ||
Host: | i686-pc-linux-gnu | Target: | i686-pc-linux-gnu |
Build: | i686-pc-linux-gnu | Known to work: | 3.4.6 4.0.3 4.0.4 |
Known to fail: | 4.0.2 4.1.0 4.3.1 | Last reconfirmed: | 2008-06-24 11:42:22 |
Description
John Regehr
2008-06-24 04:14:49 UTC
This worked with: Using built-in specs. Target: i386-apple-darwin8.11.1 Configured with: /Users/apinski/src/local/gcc/configure --prefix=/Users/apinski/local-gcc --disable-multilib Thread model: posix gcc version 4.4.0 20080622 (experimental) [trunk revision 137020] (GCC) Fails since 4.1.0, still broken on the trunk. We expand 25 return left << right; to sall %cl, %ecx but we initialized %ecx from movb %dl, %cl so later the comparison against zero fails due to the upper part of ecx being uninitialized. Actually this may be a reload/df problem -- in reload we have ;; bb 4 artificial_defs: { } ;; bb 4 artificial_uses: { u-1(6){ }u-1(7){ }} ;; lr in 1 [dx] 6 [bp] 7 [sp] 20 [frame] ;; lr use 1 [dx] 6 [bp] 7 [sp] ;; lr def 2 [cx] 17 [flags] ;; live in 1 [dx] 6 [bp] 7 [sp] 20 [frame] ;; live gen 2 [cx] ;; live kill 17 [flags] ;; Pred edge 3 [50.0%] (fallthru) (note:HI 14 13 15 4 [bb 4] NOTE_INSN_BASIC_BLOCK) (insn:HI 15 14 65 4 t.c:25 (parallel [ (set (reg/v:SI 2 cx [orig:58 p_48.12 ] [58]) (ashift:SI (reg/v:SI 2 cx [orig:58 p_48.12 ] [58]) (reg:QI 2 cx))) (clobber (reg:CC 17 flags)) ]) 459 {*ashlsi3_1} (nil)) but cx should be live in as it is used by the ashift. Wrong code on primary arch. Simplified testcase (fails at -Os -m32): /* PR target/36613 */ extern void abort (void); static inline int lshifts (int val, int cnt) { if (val < 0) return val; return val << cnt; } static inline unsigned int lshiftu (unsigned int val, unsigned int cnt) { if (cnt >= sizeof (unsigned int) * __CHAR_BIT__ || val > ((__INT_MAX__ * 2U) >> cnt)) return val; return val << cnt; } static inline int rshifts (int val, unsigned int cnt) { if (val < 0 || cnt >= sizeof (int) * __CHAR_BIT__) return val; return val >> cnt; } int foo (unsigned int val) { return rshifts (1 + val, lshifts (lshiftu (val, val), 1)); } int main (void) { if (foo (1) != 0) abort (); return 0; } Blaeh. The bug is in code that is there since the dawn of revision control, under a comment that starts with "... This is tricky ..." and ends with "I am not sure whether the algorithm here is always right ...". One thing after another. The problem is in insn 15 (with Jakubs testcase), that effectively is: p58 <- p63 << p63:QI It so happens that p63 is allocated to $edx and p58 to $ecx. This is all fine and would result in such insn: %ecx = %edx << %dl Then find_reloads comes and determines that the insn as is is invalid: 1) The output and first input have to match and 2) the second input needs to be in So, it generates the first reload to make op0 and op1 match. in = inreg = (reg:SI dx) out = outreg = (reg:SI cx) inmode = outmode = SImode class = GENERAL_REGS type = RELOAD_OTHER perfectly fine. find_reloads will also already set reg_rtx (i.e. the register that is used to carry out the reload) to (reg:SI cx), also quite fine and correct. Then it tries to generate another reload to fit the "c" constraint for operand 2: in = (subreg:QI (reg:SI dx)) out = 0 inmode = QImode class = CREG type = RELOAD_FOR_INPUT Fairly early push_reload will substitute in (still a SUBREG) with the real hardreg: in = (reg:QI dx). Then it tries to find a mergable reload, and indeed finds the first one. That is also correct, the classes overlap, the already assigned reg_rtx is member of that class (CREG) the types of reloads are okay and so on. Okay, so we don't generate a second reload for this operand, but simply reuse the first one, so we have to merge the info of this to-be reload with the one we have already. That for instance would widen the existing mode if our new reload would be wider. See push_reload around line 1369 for what exactly that merging does. Among the things it does, it also overwrites .in: if (in != 0) { ... rld[i].in = in; rld[i].in_reg = in_reg; } This might look strange, but in itself is okay. It looks strange, because our reload now looks like: in = inreg = (reg:QI dx) out = outreg = (reg:SI cx) inmode = outmode = SImode class = CREG type = RELOAD_OTHER reg_rtx = (reg:SI cx) Note in particular that the .in reg is a QImode register, while the inmode (and outmode) are still SImode. This in itself is still not incorrect, if this reload would be emitted as a SImode register move (from dx to cx), as requested by the modes of this reload all would work. I.e. the .in (and hence .out) registers are not to be used as real register references, but merely as container for the hardreg _numbers_ we want to use. Now, if we're going to emit this reload we go over do_input_reload, and right at the beginning we have this: [ here old = rl->in; ] if (old && reg_rtx) { enum machine_mode mode; /* Determine the mode to reload in. This is very tricky because we have three to choose from. ... I am not sure whether the algorithm here is always right, but it does the right things in those cases. */ mode = GET_MODE (old); if (mode == VOIDmode) mode = rl->inmode; So, it determines the mode _from the .in member_ by default, and only uses the mode from the reload when that one is a constant. Argh! This means we are now emitting a QImode move only loading %cl from %dl, whereas we would need to load all of %ecx from %edx. We emit this wrong reload insn, which then later get's removed because we already have another reload %cl <- %dl in the basic block before this one, which we are inheriting then. This just confuses the analysis somewhat, but the problem really is this too narrow reload. This code originally comes from do_input_reload, which in turn has it from emit_reload_insns, and had this comment already in r120 of reload1.c . I see basically two ways to fix this: 1) make push_reload "merge" also the .in member, instead of just overwriting it. By merging I mean that if rld.in and in are both registers (in which case they have the same numbers already, as that is checked by find_reusable_reload) that leave in the larger of the two. Same for .out. 2) Change do_input_reload (and also do_output_reload) to listen to inmode and outmode first, before looking at the mode of .in. The comment above this looks scary and lists some problematic cases, but given it's age I'm not at all sure if these indeed are still a problem. I actually think that inmode and outmode are currently the most precise descriptions of what this reload is about, unlike .in and .out whose significance lies more or less in the reg number only. (2) might be the harder change, as it would require also reinterpreting anything in .in or .out in the mode, in case they don't match. I checked that (1) fixes the testcase, maybe I'm regstrapping that, it's the simpler change. Closing 4.1 branch. Michael, any news? I submitted a patch here: http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00722.html but got no feedback or review. I'll have a look tomorrow ... Subject: Bug 36613 Author: matz Date: Wed Aug 6 15:34:45 2008 New Revision: 138807 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138807 Log: PR target/36613 * reload.c (push_reload): Merge in,out,in_reg,out_reg members for reused reload, instead of overwriting them. * gcc.target/i386/pr36613.c: New testcase. Added: trunk/gcc/testsuite/gcc.target/i386/pr36613.c Modified: trunk/gcc/ChangeLog trunk/gcc/reload.c trunk/gcc/testsuite/ChangeLog Do you plan to commit this to 4.3 as well? Subject: Re: [4.2/4.3 Regression] likely codegen bug
On Mon, 11 Aug 2008, jakub at gcc dot gnu dot org wrote:
> ------- Comment #13 from jakub at gcc dot gnu dot org 2008-08-11 08:12 -------
> Do you plan to commit this to 4.3 as well?
Ulrich asked for some time on the trunk (we have built all of our
packages against a patched 4.3 tree now with no appearant problems as
well).
Richard.
(In reply to comment #14) > Ulrich asked for some time on the trunk (we have built all of our > packages against a patched 4.3 tree now with no appearant problems as > well). OK, in that case I have no further concern. I'll leave it up to you as release manager to decide when you want it to go into 4.3 ... committed as r138955 into 4_3 branch. Subject: Bug 36613 Author: matz Date: Mon Aug 11 16:22:00 2008 New Revision: 138955 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138955 Log: PR target/36613 * reload.c (push_reload): Merge in,out,in_reg,out_reg members for reused reload, instead of overwriting them. * gcc.target/i386/pr36613.c: New testcase. Added: branches/gcc-4_3-branch/gcc/testsuite/gcc.target/i386/pr36613.c Modified: branches/gcc-4_3-branch/gcc/ChangeLog branches/gcc-4_3-branch/gcc/reload.c branches/gcc-4_3-branch/gcc/testsuite/ChangeLog |