Bug 36613 - [4.2 Regression] likely codegen bug
Summary: [4.2 Regression] likely codegen bug
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.4.0
: P1 normal
Target Milestone: 4.3.2
Assignee: Michael Matz
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2008-06-24 04:14 UTC by John Regehr
Modified: 2008-08-11 16:23 UTC (History)
8 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work: 3.4.6 4.0.3 4.0.4
Known to fail: 4.0.2 4.1.0 4.3.1
Last reconfirmed: 2008-06-24 11:42:22


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description John Regehr 2008-06-24 04:14:49 UTC
This is seen on svn 137045.

[regehr@babel tmp27]$ current-gcc -O -fwrapv small.c -o small
[regehr@babel tmp27]$ ./small
0
[regehr@babel tmp27]$ current-gcc -Os -fwrapv small.c -o small
[regehr@babel tmp27]$ ./small
2

[regehr@babel gcc-current]$ current-gcc -v       
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../configure --program-prefix=current- --enable-languages=c,c++ --prefix=/home/regehr
Thread model: posix
gcc version 4.4.0 20080623 (experimental) (GCC) 





#include <limits.h>
#include <stdio.h>

static inline int
lshift_s_s(int left, int right)
{
	if ((left < 0)
		|| (right < 0)
		|| (right >= sizeof(int)*CHAR_BIT)
		|| (left > (INT_MAX >> right))) {
		/* Avoid undefined behavior. */
		return left;
	}
	return left << right;
}

static inline unsigned int
lshift_u_u(unsigned int left, unsigned int right)
{
	if ((right >= sizeof(unsigned int)*CHAR_BIT)
		|| (left > (UINT_MAX >> right))) {
		/* Avoid undefined behavior. */
		return left;
	}
	return left << right;
}

static inline int
rshift_s_u(int left, unsigned int right)
{
	if ((left < 0)
		|| (right >= sizeof(int)*CHAR_BIT)) {
		/* Avoid implementation-defined and undefined behavior. */
		return left;
	}
	return left >> right;
}

int func_47(unsigned int p_48 ) ;
int func_47(unsigned int p_48 ) 
{ 
  int tmp = lshift_u_u(p_48, p_48);
  int tmp___0 = lshift_s_s(tmp, 1);
  int tmp___1 = rshift_s_u(1 + p_48, tmp___0);
  return (tmp___1);
}

int main(void) 
{ 
  int x = func_47(1);
  printf("%d\n", x);
  return 0;
}
Comment 1 Andrew Pinski 2008-06-24 04:32:41 UTC
This worked with:
Using built-in specs.
Target: i386-apple-darwin8.11.1
Configured with: /Users/apinski/src/local/gcc/configure --prefix=/Users/apinski/local-gcc --disable-multilib
Thread model: posix
gcc version 4.4.0 20080622 (experimental) [trunk revision 137020] (GCC) 

Comment 2 Richard Biener 2008-06-24 11:42:22 UTC
Fails since 4.1.0, still broken on the trunk.
Comment 3 Richard Biener 2008-06-24 12:28:45 UTC
We expand

25        return left << right;

to

        sall    %cl, %ecx

but we initialized %ecx from

        movb    %dl, %cl

so later the comparison against zero fails due to the upper part of ecx
being uninitialized.
Comment 4 Richard Biener 2008-06-24 12:40:59 UTC
Actually this may be a reload/df problem -- in reload we have

;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u-1(6){ }u-1(7){ }}
;; lr  in        1 [dx] 6 [bp] 7 [sp] 20 [frame]
;; lr  use       1 [dx] 6 [bp] 7 [sp]
;; lr  def       2 [cx] 17 [flags]
;; live  in      1 [dx] 6 [bp] 7 [sp] 20 [frame]
;; live  gen     2 [cx]
;; live  kill    17 [flags]

;; Pred edge  3 [50.0%]  (fallthru)
(note:HI 14 13 15 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(insn:HI 15 14 65 4 t.c:25 (parallel [
            (set (reg/v:SI 2 cx [orig:58 p_48.12 ] [58])
                (ashift:SI (reg/v:SI 2 cx [orig:58 p_48.12 ] [58])
                    (reg:QI 2 cx)))
            (clobber (reg:CC 17 flags))
        ]) 459 {*ashlsi3_1} (nil))

but cx should be live in as it is used by the ashift.
Comment 5 Richard Biener 2008-06-25 11:51:26 UTC
Wrong code on primary arch.
Comment 6 Jakub Jelinek 2008-06-25 12:00:38 UTC
Simplified testcase (fails at -Os -m32):
/* PR target/36613 */

extern void abort (void);

static inline int
lshifts (int val, int cnt)
{
  if (val < 0)
    return val;
  return val << cnt;
}

static inline unsigned int
lshiftu (unsigned int val, unsigned int cnt)
{
  if (cnt >= sizeof (unsigned int) * __CHAR_BIT__
      || val > ((__INT_MAX__ * 2U) >> cnt))
    return val;
  return val << cnt;
}

static inline int
rshifts (int val, unsigned int cnt)
{
  if (val < 0 || cnt >= sizeof (int) * __CHAR_BIT__)
    return val;
  return val >> cnt;
}

int
foo (unsigned int val)
{
  return rshifts (1 + val, lshifts (lshiftu (val, val), 1));
}

int
main (void)
{
  if (foo (1) != 0)
    abort ();
  return 0;
}
Comment 7 Michael Matz 2008-07-01 19:17:21 UTC
Blaeh.  The bug is in code that is there since the dawn of revision control,
under a comment that starts with "... This is tricky ..." and ends with
"I am not sure whether the algorithm here is always right ...".

One thing after another.  The problem is in insn 15 (with Jakubs testcase),
that effectively is:
   p58 <- p63 << p63:QI
It so happens that p63 is allocated to $edx and p58 to $ecx.  This is all
fine and would result in such insn:  %ecx = %edx << %dl

Then find_reloads comes and determines that the insn as is is invalid:
1) The output and first input have to match and
2) the second input needs to be in

So, it generates the first reload to make op0 and op1 match.
  in = inreg = (reg:SI dx)
  out = outreg = (reg:SI cx)
  inmode = outmode = SImode
  class = GENERAL_REGS
  type = RELOAD_OTHER
perfectly fine.  find_reloads will also already set reg_rtx (i.e. the register
that is used to carry out the reload) to (reg:SI cx), also quite fine and
correct.

Then it tries to generate another reload to fit the "c" constraint for operand
2:
  in = (subreg:QI (reg:SI dx))
  out = 0
  inmode = QImode
  class = CREG
  type = RELOAD_FOR_INPUT
Fairly early push_reload will substitute in (still a SUBREG) with the real
hardreg: in = (reg:QI dx).  Then it tries to find a mergable reload, and
indeed finds the first one.  That is also correct, the classes overlap,
the already assigned reg_rtx is member of that class (CREG) the types of
reloads are okay and so on.

Okay, so we don't generate a second reload for this operand, but simply reuse
the first one, so we have to merge the info of this to-be reload with the one
we have already.  That for instance would widen the existing mode if our new
reload would be wider.  See push_reload around line 1369 for what exactly
that merging does.  Among the things it does, it also overwrites .in:

   if (in != 0) {
     ...
     rld[i].in = in;
     rld[i].in_reg = in_reg;
   }

This might look strange, but in itself is okay.  It looks strange, because
our reload now looks like:
  in = inreg = (reg:QI dx)
  out = outreg = (reg:SI cx)
  inmode = outmode = SImode
  class = CREG
  type = RELOAD_OTHER
  reg_rtx = (reg:SI cx)

Note in particular that the .in reg is a QImode register, while the inmode
(and outmode) are still SImode.  This in itself is still not incorrect,
if this reload would be emitted as a SImode register move (from dx to cx),
as requested by the modes of this reload all would work.

I.e. the .in (and hence .out) registers are not to be used as real register
references, but merely as container for the hardreg _numbers_ we want to use.

Now, if we're going to emit this reload we go over do_input_reload,
and right at the beginning we have this:

  [ here old = rl->in; ]
  if (old && reg_rtx)
    {
      enum machine_mode mode;
      /* Determine the mode to reload in.
         This is very tricky because we have three to choose from.
      ...
         I am not sure whether the algorithm here is always right,
         but it does the right things in those cases.  */
      mode = GET_MODE (old);
      if (mode == VOIDmode)
        mode = rl->inmode;

So, it determines the mode _from the .in member_ by default, and only uses
the mode from the reload when that one is a constant.  Argh!  This means we
are now emitting a QImode move only loading %cl from %dl, whereas we would
need to load all of %ecx from %edx.  We emit this wrong reload insn, which
then later get's removed because we already have another reload %cl <- %dl
in the basic block before this one, which we are inheriting then.  This just
confuses the analysis somewhat, but the problem really is this too narrow
reload.
This code originally comes from do_input_reload, which in turn has it from
emit_reload_insns, and had this comment already in r120 of reload1.c .

I see basically two ways to fix this:
1) make push_reload "merge" also the .in member, instead of just overwriting
   it.  By merging I mean that if rld.in and in are both registers (in which
   case they have the same numbers already, as that is checked by
   find_reusable_reload) that leave in the larger of the two.  Same for
   .out.

2) Change do_input_reload (and also do_output_reload) to listen
   to inmode and outmode first, before looking at the mode of .in.
   The comment above this looks scary and lists some problematic cases, but
   given it's age I'm not at all sure if these indeed are still a problem.
   I actually think that inmode and outmode are currently the most precise
   descriptions of what this reload is about, unlike .in and .out whose
   significance lies more or less in the reg number only.

(2) might be the harder change, as it would require also reinterpreting
anything in .in or .out in the mode, in case they don't match.

I checked that (1) fixes the testcase, maybe I'm regstrapping that, it's the
simpler change.
Comment 8 Joseph S. Myers 2008-07-04 22:42:25 UTC
Closing 4.1 branch.
Comment 9 Paolo Bonzini 2008-07-31 12:58:44 UTC
Michael, any news?
Comment 10 Michael Matz 2008-07-31 16:13:01 UTC
I submitted a patch here:
  http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00722.html
but got no feedback or review.
Comment 11 Ulrich Weigand 2008-07-31 19:31:38 UTC
I'll have a look tomorrow ...
Comment 12 Michael Matz 2008-08-06 15:36:07 UTC
Subject: Bug 36613

Author: matz
Date: Wed Aug  6 15:34:45 2008
New Revision: 138807

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138807
Log:
        PR target/36613

        * reload.c (push_reload): Merge in,out,in_reg,out_reg members
        for reused reload, instead of overwriting them.

        * gcc.target/i386/pr36613.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr36613.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/reload.c
    trunk/gcc/testsuite/ChangeLog

Comment 13 Jakub Jelinek 2008-08-11 08:12:37 UTC
Do you plan to commit this to 4.3 as well?
Comment 14 rguenther@suse.de 2008-08-11 09:48:46 UTC
Subject: Re:  [4.2/4.3 Regression] likely codegen bug

On Mon, 11 Aug 2008, jakub at gcc dot gnu dot org wrote:

> ------- Comment #13 from jakub at gcc dot gnu dot org  2008-08-11 08:12 -------
> Do you plan to commit this to 4.3 as well?

Ulrich asked for some time on the trunk (we have built all of our
packages against a patched 4.3 tree now with no appearant problems as 
well).

Richard.
Comment 15 Ulrich Weigand 2008-08-11 15:12:11 UTC
(In reply to comment #14)
> Ulrich asked for some time on the trunk (we have built all of our
> packages against a patched 4.3 tree now with no appearant problems as 
> well).

OK, in that case I have no further concern.  I'll leave it up to you as
release manager to decide when you want it to go into 4.3 ...
Comment 16 Michael Matz 2008-08-11 16:22:34 UTC
committed as r138955 into 4_3 branch.
Comment 17 Michael Matz 2008-08-11 16:23:22 UTC
Subject: Bug 36613

Author: matz
Date: Mon Aug 11 16:22:00 2008
New Revision: 138955

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138955
Log:
        PR target/36613

        * reload.c (push_reload): Merge in,out,in_reg,out_reg members
        for reused reload, instead of overwriting them.

        * gcc.target/i386/pr36613.c: New testcase.

Added:
    branches/gcc-4_3-branch/gcc/testsuite/gcc.target/i386/pr36613.c
Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/reload.c
    branches/gcc-4_3-branch/gcc/testsuite/ChangeLog