Bug 32575 - [4.2 regression] With -ftree-vrp miscompiles a single line of code in SQLite
Summary: [4.2 regression] With -ftree-vrp miscompiles a single line of code in SQLite
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.2.0
: P1 normal
Target Milestone: 4.3.0
Assignee: Not yet assigned to anyone
URL:
Keywords: alias, wrong-code
Depends on:
Blocks:
 
Reported: 2007-07-01 22:07 UTC by D. Richard Hipp
Modified: 2014-02-16 13:12 UTC (History)
10 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.3.0
Known to fail: 4.2.5
Last reconfirmed:


Attachments
Add a param to squelch runaway memory consumption (1.84 KB, patch)
2007-11-01 14:02 UTC, Nick Clifton
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description D. Richard Hipp 2007-07-01 22:07:03 UTC
A bug reported against SQLite appears to be a case of GCC 4.3.0
miscompiling a single line of code within SQLite.  The problem only
appears with -O2 or -Os.  The problem goes away if we add the
-fno-tree-vrp option.  The original bug report can be found at

   http://www.sqlite.org/cvstrac/tktview?tn=2469

The line of code that is miscompiled is found in the source file
named vdbe.c (version 1.635) on line 4309.

  4308  for(j=0; j<nRoot; j++){
  4309    aRoot[j] = pTos[-j].u.i;
  4310  }
  4311  aRoot[j] = 0;

By setting a breakpoint on line 4311 and examining the values
of aRoot[] one finds that all nRoot entries of aRoot[] are being
filled from pTos[0].u.i instead of being filled from pTos[0].u.i,
pTos[-1].u.i, pTos[-2].u.i, and so forth as the loop intends.

I will be happy to supply any additional debugging information
that might help in fixing this problem (such as vdbe.s files 
compiled both with and without -fno-tree-vrp).  I regret that 
I have so far been unable to replicate this problem in a small 
test program.
Comment 1 Andrew Pinski 2007-07-01 22:16:43 UTC
Can you at least provide the preprocessed source of vdbe.c ?
Comment 2 D. Richard Hipp 2007-07-01 22:52:28 UTC
(In reply to comment #1)
> Can you at least provide the preprocessed source of vdbe.c ?
> 

Certainly.  But before I do, I just noticed that I gave the
wrong GCC version number in the bug report.  The error is 
in GCC 4.2.0, not 4.3.0 as reported.  I have not attempted
to compile or test version 4.3.0.

I have prepared a tarball containing the complete SQLite
source code that can be compiled using a single command
such as:

    gcc *.c -ldl -lpthread

There is a README file in the tarball that describes this
compilation step and which tells how to test (with a second
command) whether or not the bug is present in your build.
You can download the tarball from:

    http://www.sqlite.org/gccbug32575.tar.gz



Comment 3 D. Richard Hipp 2007-07-22 11:43:35 UTC
Follow-up comments to the original bug report in SQLite
(see the link shown above) report that the same problem
exists in GCC 4.2.1.  A work-around for SQLite was
devised, which was to change a single line of code:

    -  aRoot[j] = pTos[-j].u.i;
    +  aRoot[j] = (pTos-j)->u.i;

There are many other places in the SQLite code that
have similar constructs, but this one instance seems
to be the only one that gives GCC trouble.
Comment 4 Jakub Jelinek 2007-08-28 14:55:20 UTC
On the trunk I'm just seeing
*** in database main ***
Page 2 is never used
That seems to be because of miscompiled sqlite3SelectNew function with
-O2 -fstrict-aliasing, with -O2 -fno-strict-aliasing that works.

Here is a reduced self-contained testcase for that:

extern void abort (void);

struct S
{
  void *s1;
  unsigned char s2, s3, s4, s5, s6, s7;
  char s8;
  void *s9, *s10, *s11, *s12, *s13;
  struct S *s14, *s15;
  void *s16, *s17;
  int s18, s19, s20[3];
};

__attribute__((noinline))
void *foo (int x, int y)
{
  static struct S s;
  if (x != sizeof (struct S) || y != 1)
    abort ();
  return &s;
}

__attribute__((noinline))
void bar (struct S *p)
{
  asm volatile ("" : "=m" (*p) : "m" (*p));
}

__attribute__((noinline))
void *baz1 (void *x, void *y,void *z)
{
  if (y || z)
    abort ();
  return x;
}

__attribute__((noinline))
void *baz2 (int x, void *y, void *z, void *a)
{
  if (x || y || z || a)
    abort ();
  return (void *) 0;
}

__attribute__((noinline))
struct S *test (void *a, void *b, void *c, void *d, void *e, void *f, int g, void *h, void *i)
{
  struct S *p, q;
  p = foo (sizeof (*p), 1);
  if (p == 0)
    {
      p = &q;
      __builtin_memset (p, 0, sizeof (*p));
    }
  if (a == 0)
    a = baz1(0, baz2(107,0,0,0), 0);
  p->s1 = a;
  p->s9 = b;
  p->s10 = c;
  p->s11 = d;
  p->s12 = e;
  p->s13 = f;
  p->s3 = g;
  p->s2 = 110;
  p->s16 = h;
  p->s17 = i;
  p->s18 = -1;
  p->s19 = -1;
  p->s20[0] = -1;
  p->s20[1] = -1;
  p->s20[2] = -1;
  if (p == &q)
    {
      bar(p);
      p = 0;
    }
  return p;
}

int
main (void)
{
  int a;
  int b;
  struct S *z = test ((void *) &a, (void *) &b, 0, 0, 0, 0, 0, 0, 0);
  if (z == 0)
    abort ();
  if (z->s1 != (void *) &a || z->s2 != 110 || z->s3 || z->s4)
    abort ();
  if (z->s5 || z->s6 || z->s7 || z->s8)
    abort ();
  if (z->s9 != (void *) &b || z->s10 || z->s11 || z->s12)
    abort ();
  if (z->s13 || z->s14 || z->s15 || z->s16)
    abort ();
  if (z->s17 || z->s18 != -1 || z->s19 != -1)
    abort ();
  if (z->s20[0] != -1 || z->s20[1] != -1 || z->s20[2] != -1)
    abort ();
  return 0;
}
Comment 5 Jakub Jelinek 2007-08-28 15:16:13 UTC
Even more simplified testcase:

extern void abort (void);

struct S
{
  void *s1, *s2;
  unsigned char s3, s4, s5;
};

__attribute__((noinline))
void *foo (void)
{
  static struct S s;
  return &s;
}

__attribute__((noinline))
void *bar ()
{
  return (void *) 0;
}

__attribute__((noinline))
struct S *test (void *a, void *b)
{
  struct S *p, q;
  p = foo ();
  if (p == 0)
    {
      p = &q;
      __builtin_memset (p, 0, sizeof (*p));
    }
  if (a == 0)
    a = bar ();
  p->s1 = a;
  p->s2 = b;
  if (p == &q)
    p = 0;
  return p;
}

int
main (void)
{
  int a;
  int b;
  struct S *z = test ((void *) &a, (void *) &b);
  if (z == 0 || z->s1 != (void *) &a || z->s2 != (void *) &b || z->s3 || z->s4)
    abort ();
  return 0;
}
Comment 6 Jakub Jelinek 2007-08-28 15:58:28 UTC
if (a == 0) a = bar (); isn't necessary either.

salias has:

  # BLOCK 2 freq:10000
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  # VUSE <qD.2026_12(D), SMT.25D.2079_13(D)> { qD.2026 SMT.25D.2079 }
  D.2027_3 = foo ();
  pD.2025_4 = (struct S *) D.2027_3;
  if (pD.2025_4 == 0B)
    goto <bb 3>;
  else
    goto <bb 4>;
  # SUCC: 3 [7.3%]  (true,exec) 4 [92.7%]  (false,exec)

  # BLOCK 3 freq:735
  # PRED: 2 [7.3%]  (true,exec)
  # qD.2026_15 = VDEF <qD.2026_12(D)>
  # SMT.25D.2079_16 = VDEF <SMT.25D.2079_13(D)>
  # SMT.26D.2080_17 = VDEF <SMT.26D.2080_14(D)> { qD.2026 SMT.25D.2079 SMT.26D.2080 }
  __builtin_memset (&qD.2026, 0, 24);
  # SUCC: 4 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:10000
  # PRED: 2 [92.7%]  (false,exec) 3 [100.0%]  (fallthru,exec)
  # qD.2026_11 = PHI <qD.2026_12(D)(2), qD.2026_15(3)>
  # pD.2025_1 = PHI <pD.2025_4(2), &qD.2026(3)>
  # qD.2026_18 = VDEF <qD.2026_11> { qD.2026 }
  pD.2025_1->s1D.2008 = aD.2021_6(D);
  # qD.2026_19 = VDEF <qD.2026_18> { qD.2026 }
  pD.2025_1->s2D.2009 = bD.2022_7(D);

Shouldn't the VDEFs be a PHI of some SMT and qD?  pD.2025_1 can either be what
foo returned, or it can point to the automatic variable q.
Comment 7 Daniel Berlin 2007-09-05 11:50:47 UTC
Subject: Re:  [4.2/4.3 regression] With -ftree-vrp miscompiles a single line of code in SQLite

On 28 Aug 2007 15:58:29 -0000, jakub at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #6 from jakub at gcc dot gnu dot org  2007-08-28 15:58 -------
> if (a == 0) a = bar (); isn't necessary either.
>
> salias has:
>
>   # BLOCK 2 freq:10000
>   # PRED: ENTRY [100.0%]  (fallthru,exec)
>   # VUSE <qD.2026_12(D), SMT.25D.2079_13(D)> { qD.2026 SMT.25D.2079 }
>   D.2027_3 = foo ();
>   pD.2025_4 = (struct S *) D.2027_3;
>   if (pD.2025_4 == 0B)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>   # SUCC: 3 [7.3%]  (true,exec) 4 [92.7%]  (false,exec)
>
>   # BLOCK 3 freq:735
>   # PRED: 2 [7.3%]  (true,exec)
>   # qD.2026_15 = VDEF <qD.2026_12(D)>
>   # SMT.25D.2079_16 = VDEF <SMT.25D.2079_13(D)>
>   # SMT.26D.2080_17 = VDEF <SMT.26D.2080_14(D)> { qD.2026 SMT.25D.2079
> SMT.26D.2080 }
>   __builtin_memset (&qD.2026, 0, 24);
>   # SUCC: 4 [100.0%]  (fallthru,exec)
>
>   # BLOCK 4 freq:10000
>   # PRED: 2 [92.7%]  (false,exec) 3 [100.0%]  (fallthru,exec)
>   # qD.2026_11 = PHI <qD.2026_12(D)(2), qD.2026_15(3)>
>   # pD.2025_1 = PHI <pD.2025_4(2), &qD.2026(3)>
>   # qD.2026_18 = VDEF <qD.2026_11> { qD.2026 }
>   pD.2025_1->s1D.2008 = aD.2021_6(D);
>   # qD.2026_19 = VDEF <qD.2026_18> { qD.2026 }
>   pD.2025_1->s2D.2009 = bD.2022_7(D);
>
> Shouldn't the VDEFs be a PHI of some SMT and qD?
For VDEF/VUSE, you will never have a PHI of anything other than
multiple versions of the same SMT/virtual variable.

The above looks right to me at a glance.
It is probably pruning the result using TBAA which is what p->s isn't
thought to access the SMT.
Comment 8 Richard Biener 2007-09-07 14:14:10 UTC
Both testcases no longer fail for me on the trunk - do you remember the revision
you had them reduced on?
Comment 9 Jakub Jelinek 2007-09-10 18:38:59 UTC
Fails e.g. with 127816, 127870 up to 127926, the bug goes away
at least on the #c5 testcase with
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=127927
Comment 10 Mark Mitchell 2007-10-09 19:20:38 UTC
Change target milestone to 4.2.3, as 4.2.2 has been released.
Comment 11 Nick Clifton 2007-11-01 14:02:49 UTC
Created attachment 14451 [details]
Add a param to squelch runaway memory consumption
Comment 12 Nick Clifton 2007-11-01 14:05:36 UTC
Hi Guys,

  I have uploaded a patch for a possible workaround for this problem.  It adds a new param (max-partial-antic-length) which with its default value will stop the tree-pre optimization from eating up all the memory on the host system.  It also adds a testcase for the PR and documents the new param.  Tested without regressions on an x86_64-linux-gnu toolchain.

  What do you think, should this patch be applied ?

Cheers
  Nick
Comment 13 Jakub Jelinek 2007-11-01 14:34:27 UTC
Nick, your patch is most probably fixing PR32540 rather than PR32575, doesn't it?
Comment 14 Nick Clifton 2007-11-01 14:57:35 UTC
Subject: Re:  [4.2/4.3 regression] With -ftree-vrp
 miscompiles a single line of code in SQLite

Hi Jakub,

> Nick, your patch is most probably fixing PR32540 rather than PR32575, doesn't
> it?

Doh.  Yes.  I will fix that in my local sources so that if the patch is 
approved the correct number is used.

Cheers
   Nick


Comment 15 Richard Biener 2007-11-01 15:05:15 UTC
Indeed it does.  But while this is a workaround that works, the problem is in
excessive phi translation which we could stop here (untested! just a wild guess!):

static bool
compute_partial_antic_aux (basic_block block,
                           bool block_has_abnormal_pred_edge)
{
...
              if (phi_nodes (bprime))
                {

and limit the # of PHI args seen here instead?
Comment 16 Jakub Jelinek 2007-11-08 12:54:00 UTC
Continuing after the #c11 .. #c15 unrelated PR32540 comments with the original bug:

I've built -r127926 and -r127927 cc1 and it seems this wasn't fixed just by
a side-effect.
Although p can be either retval of foo or &q, we had in 127926:
  # pD.2027_1 = PHI <pD.2027_4(2), &qD.2028(3)>
  # qD.2028_18 = VDEF <qD.2028_11> { qD.2028 }
  pD.2027_1->s1D.2010 = aD.2023_6(D);
while 127927 has:
  # pD.2027_1 = PHI <pD.2027_4(2), &qD.2028(3)>
  # qD.2028_19 = VDEF <qD.2028_11>
  # SMT.26D.2082_20 = VDEF <SMT.26D.2082_12> { qD.2028 SMT.26D.2082 }
  pD.2027_1->s1D.2010 = aD.2023_6(D);

-pD.2027_1, name memory tag: NMT.27D.2083, is dereferenced, points-to vars: { q }
+pD.2027_1, name memory tag: NMT.27D.2083, is dereferenced, points-to vars: { q SMT.26 }

So guess I'll just submit the testcase for inclusion on the trunk and then we can close this PR.
Comment 17 Jakub Jelinek 2007-11-08 13:08:05 UTC
Subject: Bug 32575

Author: jakub
Date: Thu Nov  8 13:07:54 2007
New Revision: 129998

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129998
Log:
	PR tree-optimization/32575
	* gcc.c-torture/execute/20071108-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/20071108-1.c
Modified:
    trunk/gcc/testsuite/ChangeLog

Comment 18 Jakub Jelinek 2007-11-08 13:11:49 UTC
I have also retested with the http://www.sqlite.org/gccbug32575.tar.gz
tarball and on the trunk it passes just fine with -O2.
Comment 19 Joseph S. Myers 2008-02-01 16:54:31 UTC
4.2.3 is being released now, changing milestones of open bugs to 4.2.4.
Comment 20 Joseph S. Myers 2008-05-19 20:23:20 UTC
4.2.4 is being released, changing milestones to 4.2.5.
Comment 21 Joseph S. Myers 2009-03-30 22:10:53 UTC
Closing 4.2 branch, fixed in 4.3.
Comment 22 Jackie Rosen 2014-02-16 13:12:30 UTC Comment hidden (spam)