User account creation filtered due to spam.

Bug 30567 - [4.2 Regression] -O3 optimizer bug
Summary: [4.2 Regression] -O3 optimizer bug
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.2.0
: P1 normal
Target Milestone: 4.2.0
Assignee: Richard Biener
URL:
Keywords: alias, wrong-code
Depends on:
Blocks: 30088
  Show dependency treegraph
 
Reported: 2007-01-24 00:08 UTC by Ralf W. Grosse-Kunstleve
Modified: 2007-07-18 09:45 UTC (History)
9 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.1.3 4.3.0
Known to fail:
Last reconfirmed: 2007-04-26 10:02:33


Attachments
standalone reproducer (9.18 KB, application/x-msdownload)
2007-01-24 00:10 UTC, Ralf W. Grosse-Kunstleve
Details
minimal reproducer (248 bytes, text/plain)
2007-02-26 07:54 UTC, Ralf W. Grosse-Kunstleve
Details
prototype patch (1.04 KB, patch)
2007-04-25 15:04 UTC, Richard Biener
Details | Diff
a patch I like more (901 bytes, patch)
2007-04-25 16:42 UTC, Richard Biener
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ralf W. Grosse-Kunstleve 2007-01-24 00:08:04 UTC
Platform:
  Fedora Core release 4 (Stentz)
  Linux sharptail.lbl.gov 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686 i686 i386 GNU/Linux

% g++ -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /net/rosie/scratch2/rwgk/gcc-4_2-branch/configure --prefix=/net/cci-filer1/vol1/tmp/rwgk/gcc_4_2-branch_2007_01_22_0959_fc4_i686 --enable-languages=c,c++,fortran
Thread model: posix
gcc version 4.2.0 20070122 (prerelease)

I'll upload a standalone reproducer. Sorry it is not minimal, but it is already heavily reduced from the original code and has only standard header dependencies.

To reproduce the problem:

g++ -fPIC -O0 ~/dbg.cpp ; ./a.out
1
g++ -fPIC -O3 ~/dbg.cpp ; ./a.out
0
g++ -fPIC -O3 ~/dbg.cpp -DSCITBX_MAT3_TRACE_SIMPLE ; ./a.out
1
g++ -O3 ~/dbg.cpp ; ./a.out
1

The output should be "1" in all cases.

The error occurs in the rot_mx::type() function near the end of the reproducer. Inserting std::cout statements in this function makes the error go away. Therefore I don't know what exactly is going wrong.

By chance I noticed that the SCITBX_MAT3_TRACE_SIMPLE change (see code) also works around the problem. However, the trace() function always works if called directly.

I noticed the error the first time on Dec 29, 2006 under 32-bit Fedora 6. Therefore I believe it is not Fedora 4 or 6 specific, but a general 32-bit optimizer problem.

The gcc svn versions from Dec 29 and Jan 22 both work flawlessly on a 64-bit platfrom (Fedora 5).

I hope you can take it from here. Let me know if you need any other information.
Comment 1 Ralf W. Grosse-Kunstleve 2007-01-24 00:10:33 UTC
Created attachment 12945 [details]
standalone reproducer
Comment 2 Andrew Pinski 2007-01-24 06:57:16 UTC
This works for me with "4.2.0 20061204" and "4.3.0 20070122":
[pinskia@celery pr30567]$ ~/gcc-4.2/bin/g++ -O3 -fPIC dbg.cpp
[pinskia@celery pr30567]$ !./
./a.out
1
Comment 3 Ralf W. Grosse-Kunstleve 2007-01-28 20:03:33 UTC
I've repeated my test with

  g++ (GCC) 4.2.0 20070128 (prerelease)
  SVN revision: 121258

on three platforms:

  x86_64 Fedora Core release 5 (Bordeaux)
    % g++ -fPIC -O3 dbg_gcc_bugzilla_30567.cpp
    % ./a.out
    1

  i686 Fedora Core release 4 (Stentz)
    % g++ -fPIC -O3 dbg_gcc_bugzilla_30567.cpp
    % ./a.out
    0

  Fedora Core release 6 (Zod)
    % g++ -fPIC -O3 dbg_gcc_bugzilla_30567.cpp
    % ./a.out
    0

I.e. 64-bit is still OK, 32-bit is still broken.

Could someone please try to reproduce with the current version of the
4.2 branch? I'm worried that 4.2 is released with this bug, and that we
have to deal with it for years to come...

BTW: the system gcc versions are:

  x86_64 Fedora Core release 5 (Bordeaux)
    g++ (GCC) 4.1.0 20060304 (Red Hat 4.1.0-3)

  i686 Fedora Core release 4 (Stentz)
    g++ (GCC) 4.0.0 20050519 (Red Hat 4.0.0-8)

  i686 Fedora Core release 6 (Zod)
    g++ (GCC) 4.1.1 20061011 (Red Hat 4.1.1-30)
Comment 4 Ralf W. Grosse-Kunstleve 2007-02-03 20:42:28 UTC
I've repeated my test with

  g++ (GCC) 4.2.0 20070203 (prerelease)
  SVN revision: 121547

on two platforms:

  x86_64 Fedora Core release 5 (Bordeaux)
  Fedora Core release 6 (Zod)

The results are still the same as last weekend, i.e. the 32-bit optimizer
is still broken.

Since my reproducer compiles+links in less than a second, wouldn't it be
feasable to run the automatic regression hunt? According to Andrew's
result we just have to concentrate on the few weeks between 2006-12-04
and 2007-01-24.
Comment 5 Wolfgang Bangerth 2007-02-11 04:07:58 UTC
I also can't reproduce this with

bangerth/x> /tmp/bangerth/bin/bin/c++ -v            
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../svn/configure --enable-checking --enable-languages=c,c++ --prefix=/tmp/bangerth/bin --with-gmp=/tmp/bangerth/bin --with-mpfr=/tmp/bangerth/bin
Thread model: posix
gcc version 4.3.0 20070125 (experimental)

Since none of us can reproduce it, there's very little we can do :-(

Can you verify whether the testcase works with current mainline for you,
i.e. whether the problem is confined to the 4.2.x branch?

W.
Comment 6 Ralf W. Grosse-Kunstleve 2007-02-11 05:30:37 UTC
I immediately believe that Andrew's and Wolfgang's findings are accurate, but I never claimed that the mainline has a problem. I never even tried it.

My interest it to make sure that our code works with any new gcc release, since that's what the OS makers pick up, and then we are stuck with the remaining bugs for 5+ years.

It looks like a gcc 4.2 release is imminent, therefore I'm testing with the corresponding branch.

From other bug reports I know that you have a "regression hunt" procedure. Is there any way I can submit my reproducer to the hunter? We have a fairly small time bracket already, given by Andrew's 4.2 test and the day this bug was opened. Therefore it would seem straightforward to find the checkin which caused the problem.

I'll repeat my test with the current 4.2 branch and post my results here.
Comment 7 Wolfgang Bangerth 2007-02-12 00:02:35 UTC
(In reply to comment #6)
> I immediately believe that Andrew's and Wolfgang's findings are accurate, but I
> never claimed that the mainline has a problem. I never even tried it.

I didn't want to imply that there was no problem. It just appeared as if
neither Andrew nor I had a recent 4.2.x build around.

The person who has the infrastructure for finding regressions is Janis. Janis,
are you in a position of confirming this PR and finding where on the branch
the problem was introduced? (The PR gives pretty specific dates already.)

W.

> 
> My interest it to make sure that our code works with any new gcc release, since
> that's what the OS makers pick up, and then we are stuck with the remaining
> bugs for 5+ years.
> 
> It looks like a gcc 4.2 release is imminent, therefore I'm testing with the
> corresponding branch.
> 
> From other bug reports I know that you have a "regression hunt" procedure. Is
> there any way I can submit my reproducer to the hunter? We have a fairly small
> time bracket already, given by Andrew's 4.2 test and the day this bug was
> opened. Therefore it would seem straightforward to find the checkin which
> caused the problem.
> 
> I'll repeat my test with the current 4.2 branch and post my results here.
> 
Comment 8 Ralf W. Grosse-Kunstleve 2007-02-12 05:23:35 UTC
I'm in the process of narrowing down the revision bracket the really hard way (make bootstrap; make; make install for each revision, using a binary search).
Currently my best bracket is:

119819 fails
119788 works

I should have it nailed down to the exact revision in the next 12 hours or so.

BTW: I checked the most current revision (yesterday) and it still fails.
Comment 9 Ralf W. Grosse-Kunstleve 2007-02-12 15:47:48 UTC
My binary search (using the gcc-4_2-branch) stopped here:

  119790 OK
  119791 fails

The corresponding commit was:

% svn log -r 119791
------------------------------------------------------------------------
r119791 | dberlin | 2006-12-12 07:31:26 -0800 (Tue, 12 Dec 2006) | 6 lines

2006-12-12  Daniel Berlin  <dberlin@dberlin.org>

        * tree-ssa-structalias.c (handle_ptr_arith): Return false when we
        can't handle the pointer arithmetic.


------------------------------------------------------------------------
Comment 10 Wolfgang Bangerth 2007-02-12 16:03:06 UTC
Daniel, any idea?
Comment 11 Daniel Berlin 2007-02-12 16:37:37 UTC
(In reply to comment #10)
> Daniel, any idea?
> 

None.
This change actually made us more conservative with points-to, it certainly won't cause *more* things to be optimized away.
Comment 12 Ralf W. Grosse-Kunstleve 2007-02-26 02:26:10 UTC
Daniel Berlin wrote:
> This change actually made us more conservative with points-to, it certainly
> won't cause *more* things to be optimized away.

Was the change supposed to fix a certain problem?
If not I suggest the two lines should be backed out since they clearly cause
a regression on the 4.2 branch.
My suggestion is motivated by another experiment I did with the latest
SVN (revision 122315):

Procedure:
  configure; make bootstrap; make; make install
  followed by
  1. running all our unit tests (1700 seconds of CPU time Xeon 2.8GHz)
  2. trying the reproducer with g++ -fPIC -O3 dbg.cpp; ./a.out

Fedora 5, x86_64:
  With SVN "as is" and with the two lines in question backed out
    1. all our unit tests are fine
    2. ./a.out prints "1" (OK)

Fedora 6, i686:
  With SVN "as is"
    1. all our unit tests are fine
    2. ./a.out prints "0" (failure)
  With the two lines in question backed out
    1. all our unit tests are fine
    2. ./a.out prints "1" (OK)

Here is my local patch to back out the two lines introduced in
SVN revision 119791:

Index: gcc/tree-ssa-structalias.c
===================================================================
--- gcc/tree-ssa-structalias.c  (revision 122315)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -3303,8 +3303,8 @@
     {
       rhsoffset = TREE_INT_CST_LOW (op1) * BITS_PER_UNIT;
     }
-  else
-    return false;
+/*else
+    return false; */
 
   for (i = 0; VEC_iterate (ce_s, lhsc, i, c); i++)
     for (j = 0; VEC_iterate (ce_s, temp, j, c2); j++)
Comment 13 Daniel Berlin 2007-02-26 02:42:23 UTC
Subject: Re:  -fPIC -O3 optimizer bug (32-bit target only)

On 26 Feb 2007 02:26:12 -0000, rwgk at yahoo dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #12 from rwgk at yahoo dot com  2007-02-26 02:26 -------
> Daniel Berlin wrote:
> > This change actually made us more conservative with points-to, it certainly
> > won't cause *more* things to be optimized away.
>
> Was the change supposed to fix a certain problem?
Yes.
http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00817.html
It also fixed vect-101 testcase, in addition to 464.h264ref from spec2006.
It initially caused a failure in 403.gcc, but this was later found to
be a latent bug elsewhere.
I imagine yours is the same thing (a latent bug elsewhere).

If you revert it, you will start failing these regression tests again.

> If not I suggest the two lines should be backed out since they clearly cause
> a regression on the 4.2 branch.
Then whatever backing that out papers over should be fixed.

> Here is my local patch to back out the two lines introduced in
> SVN revision 119791:

Sorry, but i'm not going to accept a patch that clearly does the wrong
thing, and will cause existing regression tests to fail, so that we
can hide some other latent bug.
If you want to try pointing out why you think what the patch does is
incorrect, i'm happy to work through it with you.
Comment 14 Ralf W. Grosse-Kunstleve 2007-02-26 03:37:36 UTC
> Yes.
> http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00817.html
> It also fixed vect-101 testcase, in addition to 464.h264ref from spec2006.
> It initially caused a failure in 403.gcc, but this was later found to
> be a latent bug elsewhere.

OK.

> I imagine yours is the same thing (a latent bug elsewhere).

I was thinking the same thing.

> If you want to try pointing out why you think what the patch does is
> incorrect, i'm happy to work through it with you.

I'm just a user sounding an alarm bell. Your "latent bug somewhere
else" idea makes total sense to me.

I'm a little frustrated that all my efforts haven't even lead to
confirming the bug, and I'm afraid that it gets swept under the rug
and my time was wasted. If you could at least confirm the bug
I'd be motivated to strip down my reproducer some more if that
could help resolving this problem.
Comment 15 Daniel Berlin 2007-02-26 04:38:22 UTC
I'll happily confirm I can reproduce it on my i686-pc-linux-gnu machine
Comment 16 Ralf W. Grosse-Kunstleve 2007-02-26 07:54:50 UTC
Created attachment 13110 [details]
minimal reproducer

I got it down to 29 lines. There aren't any includes, defines are typedefs left.
Attempts to further reduce the code change the result.
The command line I used during the reduction process was:

g++ -fPIC -O3 dbg_minimal.cpp && ./a.out ; echo $status ; g++ -fPIC -O0 dbg_minimal.cpp && ./a.out ; echo $status

The resulting output is:
0
1

Here 0 indicates failure.

I really hope this helps you pin-pointing the bug!

For completeness:
% g++ --version
g++ (GCC) 4.2.0 20070225 (prerelease)

This is SVN revision 122315.
Comment 17 H.J. Lu 2007-04-18 20:00:25 UTC
Testcase in comment #16 failed on ia32, x86-64 and ia64 with -O3. 051t.alias3
dump looks strange. 102t.final_cleanup dump is:

[hjl@gnu-2 bad]$ cat x.cc.102t.final_cleanup

;; Function int main() (main)

int main() ()
{
  int SR.19;

<bb 2>:
  return SR.19 == 1 == 0;

}
[hjl@gnu-2 bad]$
Comment 18 H.J. Lu 2007-04-18 20:02:25 UTC
(In reply to comment #17)
> Testcase in comment #16 failed on ia32, x86-64 and ia64 with -O3. 051t.alias3
> dump looks strange. 102t.final_cleanup dump is:
> 
> [hjl@gnu-2 bad]$ cat x.cc.102t.final_cleanup
> 
> ;; Function int main() (main)
> 
> int main() ()
> {
>   int SR.19;
> 
> <bb 2>:
>   return SR.19 == 1 == 0;
> 
> }
> [hjl@gnu-2 bad]$
> 

I used a modified version to return 0 when success:

[hjl@gnu-2 30567]$ cat x.cc
template <typename T>
struct const_ref
{
  const T* begin;
  const_ref(const T* b) : begin(b) {}
};

template <typename T>
T sum(const_ref<T> const& a)
{
  T result = 0;
  for(unsigned i=0;i<1;i++) result += a.begin[i];
  return result;
}

struct tiny_plain
{
  int elems[2];
  tiny_plain() { elems[0]=1; }
};

struct vec3 : tiny_plain {};

struct mat3
{
  int type() const { return sum(const_ref<int>(vec3().elems)) == 1; }
};

int main() { return mat3().type() ? 0 :1; }
[hjl@gnu-2 30567]$
Comment 19 H.J. Lu 2007-04-18 21:47:05 UTC
For

for(unsigned i=0;i<1;i++) result += a.begin[i];

x.ii.004t.gimple looks like

  const int * D.2454;
  long unsigned int D.2455;
  long unsigned int D.2456;
  const int * D.2457;
  const int * D.2458;
    D.2455 = (long unsigned int) i;
    D.2456 = D.2455 * 4;
    D.2457 = (const int *) D.2456;
    D.2458 = D.2454 + D.2457;

Why array index is a pointer?

Comment 20 H.J. Lu 2007-04-18 22:11:26 UTC
build_array_ref calls

    return build_indirect_ref (cp_build_binary_op (PLUS_EXPR, ar, ind),
                               "array indexing");

with ar as pointer and ind as integer. cp_build_binary_op turns ind into
pointer and we wind up with

    D.2455 = (long unsigned int) i;
    D.2456 = D.2455 * 4;
    D.2457 = (const int *) D.2456;
    D.2458 = D.2454 + D.2457;
Comment 21 Richard Biener 2007-04-18 22:47:45 UTC
Code in comment #19 is perfectly correct.  I can reproduce the difference with
the attached testcase and -O -fstrict-aliasing -finline-functions.  -fno-strict-aliasing fixes it.
Comment 22 Richard Biener 2007-04-18 22:56:23 UTC
And this is the bug:

  #   SFT.0_29 = V_MAY_DEF <SFT.0_27>;
  #   SFT.1_30 = V_MAY_DEF <SFT.1_28>;
  this_7->elems[0] = 1;

the following is supposed to read it.

  #   VUSE <SMT.11_31>;
  D.2148_18 = *D.2147_17;

it's interesting that

Points-to sets

this_7 = { D.2141 D.2141.32 }
D.2147_17 = { ANYTHING D.2141 D.2141.32 }

but

Pointed-to sets for pointers in int main()

this_7, name memory tag: NMT.12, is dereferenced, points-to vars: { SFT.0 SFT.1 }
D.2147_17, is dereferenced, points-to anything
Comment 23 Richard Biener 2007-04-21 18:07:50 UTC
This is a regression.  Danny?
Comment 24 Richard Biener 2007-04-21 18:38:23 UTC
Actually the handle_ptr_arith change made a difference as we (luckily?)
for D.2147_17 = D.2144_14 + D.2146_16

  D.2144_14 = a_11->begin;
  D.2145_15 = i_1 * 4;
  D.2146_16 = (const int *) D.2145_15;
  D.2147_17 = D.2144_14 + D.2146_16;

add a constraint for offset zero (that's pure luck) for D.2144_14.  So, the
difference in constraints good vs. bad is

@@ -26,10 +26,10 @@
 D.2144_14 = *a_11
 D.2146_16 = &ANYTHING
 D.2147_17 = D.2144_14
+D.2147_17 = D.2146_16
 ESCAPED_VARS = D.2092_25

 Collapsing static cycles and doing variable substitution:
-Collapsing D.2147_17 into D.2144_14
 Collapsing this_7 into D.2140_6
 Collapsing NONLOCAL.8 into ESCAPED_VARS

@@ -54,7 +54,7 @@
 a_11 = { D.2142 }
 D.2144_14 = { D.2141 D.2141.32 }
 D.2146_16 = { ANYTHING }
-D.2147_17 = { D.2141 D.2141.32 }
+D.2147_17 = { ANYTHING D.2141 D.2141.32 }
 D.2092_25 = { }

 main: Total number of aliased vops: 3


Unrelated to this the following resolves possible problems in handle_ptr_arith:

Index: tree-ssa-structalias.c
===================================================================
--- tree-ssa-structalias.c      (revision 124018)
+++ tree-ssa-structalias.c      (working copy)
@@ -3287,7 +3287,7 @@ handle_ptr_arith (VEC (ce_s, heap) *lhsc
   unsigned int i = 0;
   unsigned int j = 0;
   VEC (ce_s, heap) *temp = NULL;
-  unsigned int rhsoffset = 0;
+  unsigned HOST_WIDE_INT rhsoffset = 0;

   if (TREE_CODE (expr) != PLUS_EXPR
       && TREE_CODE (expr) != MINUS_EXPR)
@@ -3298,8 +3298,10 @@ handle_ptr_arith (VEC (ce_s, heap) *lhsc

   get_constraint_for (op0, &temp);
   if (POINTER_TYPE_P (TREE_TYPE (op0))
-      && TREE_CODE (op1) == INTEGER_CST
-      && TREE_CODE (expr) == PLUS_EXPR)
+      && host_integerp (op1, 0)
+      && TREE_CODE (expr) == PLUS_EXPR
+      /* Make sure the pointer offset is positive.  */
+      && tree_int_cst_msb (op1) == 0)
     {
       rhsoffset = TREE_INT_CST_LOW (op1) * BITS_PER_UNIT;
     }

though that last multiplication may still overflow.
Comment 25 Daniel Berlin 2007-04-25 03:14:07 UTC
(In reply to comment #23)
> This is a regression.  Danny?
> 

It actually should get assigned anything as a points-to set, so the "bad" constraints are correct.

We should also always get correct aliasing even if everything was assigned ANYTHING as a variable.
Sadly, we don't in 4.2.0, as this case shows.

I'm not going to get a chance to look at this for at least week, I have something taking priority this week.

If you want to tackle it before then, I would suggest trying to see why we decide not to give the SFT's to the variable marked with anything.

Comment 26 Richard Biener 2007-04-25 14:22:12 UTC
We fail to add the SFTs to the may_alias set of SMT.11, so add_virtual_operand
sees NULL may_aliases and doesn't add SFTs as clobbered.

I believe compute_flow_insensitive_aliasing is the culprit here as one can easily
see that if we enter the

  for (i = 0; i < ai->num_pointers; i++)
    {
      size_t j;
      struct alias_map_d *p_map1 = ai->pointers[i];
      tree tag1 = var_ann (p_map1->var)->symbol_mem_tag;
      bitmap may_aliases1 = p_map1->may_aliases;

      if (PTR_IS_REF_ALL (p_map1->var))
        continue;

      for (j = i + 1; j < ai->num_pointers; j++)
        {
          struct alias_map_d *p_map2 = ai->pointers[j];
          tree tag2 = var_ann (p_map2->var)->symbol_mem_tag;
          bitmap may_aliases2 = p_map2->may_aliases;


loop with may_aliases2 empty it will stay so.  So as a fix I suggest to
add may aliases symmetrically like with

Index: tree-ssa-alias.c
===================================================================
*** tree-ssa-alias.c    (revision 124151)
--- tree-ssa-alias.c    (working copy)
*************** compute_flow_insensitive_aliasing (struc
*** 1287,1292 ****
--- 1287,1294 ----
          struct alias_map_d *p_map2 = ai->pointers[j];
          tree tag2 = var_ann (p_map2->var)->symbol_mem_tag;
          bitmap may_aliases2 = p_map2->may_aliases;
+         unsigned int k;
+         bitmap_iterator bi;
  
          if (PTR_IS_REF_ALL (p_map2->var))
            continue;
*************** compute_flow_insensitive_aliasing (struc
*** 1301,1323 ****
            continue;
  
          if (!bitmap_empty_p (may_aliases2))
!           {
!             unsigned int k;
!             bitmap_iterator bi;
! 
!             /* Add all the aliases for TAG2 into TAG1's alias set.
!                FIXME, update grouping heuristic counters.  */
!             EXECUTE_IF_SET_IN_BITMAP (may_aliases2, 0, k, bi)
!               add_may_alias (tag1, referenced_var (k));
!             bitmap_ior_into (may_aliases1, may_aliases2);
!           }
          else
!           {
!             /* Since TAG2 does not have any aliases of its own, add
!                TAG2 itself to the alias set of TAG1.  */
!             add_may_alias (tag1, tag2);
!             bitmap_set_bit (may_aliases1, DECL_UID (tag2));
!           }
        }
      }
    
--- 1303,1325 ----
            continue;
  
          if (!bitmap_empty_p (may_aliases2))
!           /* Add all the aliases for TAG2 into TAG1's alias set.
!              FIXME, update grouping heuristic counters.  */
!           EXECUTE_IF_SET_IN_BITMAP (may_aliases2, 0, k, bi)
!             add_may_alias (tag1, referenced_var (k));
!         else
!           add_may_alias (tag1, tag2);
! 
!         if (!bitmap_empty_p (may_aliases1))
!           /* Add all the aliases for TAG2 into TAG1's alias set.
!              FIXME, update grouping heuristic counters.  */
!           EXECUTE_IF_SET_IN_BITMAP (may_aliases1, 0, k, bi)
!             add_may_alias (tag2, referenced_var (k));
          else
!           add_may_alias (tag2, tag1);
! 
!         bitmap_ior_into (may_aliases2, may_aliases1);
!         bitmap_ior_into (may_aliases1, may_aliases2);
        }
      }
    


but I have no clue what I am doing here.  't looks like big mess.

I prepare the workaround as alternative ;)
Comment 27 Richard Biener 2007-04-25 14:42:48 UTC
The workaround doesn't work.  I'll test the patch in comment #26, otherwise
I'm out of ideas and clue on how it should work.
Comment 28 Richard Biener 2007-04-25 15:04:26 UTC
Created attachment 13439 [details]
prototype patch

I'm testing this one.  It'll make aliasing slower and more conservative, so I
bet it's not the "right" or the "best" fix.
Comment 29 Richard Biener 2007-04-25 16:32:41 UTC
Oh, btw. why the may_aliases bitmap for SMT.11 is empty on entry to the loops computing the transitive closure (it tries to, right?) is that SFT.0 is not
considered stored to in the first nested loop over pointers and addressable
vars.  Shouldn't we add the vars NMT aliases to ai->written_vars in
compute_flow_sensitive_aliasing?  This also seems to fix the problem:

Index: tree-ssa-alias.c
===================================================================
*** tree-ssa-alias.c    (revision 124151)
--- tree-ssa-alias.c    (working copy)
*************** compute_flow_sensitive_aliasing (struct 
*** 1142,1152 ****
         one).  Note that only pointers that have been dereferenced will
         have a name memory tag.  */
        if (pi->name_mem_tag && pi->pt_vars)
!       EXECUTE_IF_SET_IN_BITMAP (pi->pt_vars, 0, j, bi)
!         {
!           add_may_alias (pi->name_mem_tag, referenced_var (j));
!           add_may_alias (v_ann->symbol_mem_tag, referenced_var (j));
!         }
      }
  }
  
--- 1142,1159 ----
         one).  Note that only pointers that have been dereferenced will
         have a name memory tag.  */
        if (pi->name_mem_tag && pi->pt_vars)
!       {
!         bool stored_to = bitmap_bit_p (ai->dereferenced_ptrs_store,
!                                        DECL_UID (SSA_NAME_VAR (ptr)));
!         EXECUTE_IF_SET_IN_BITMAP (pi->pt_vars, 0, j, bi)
!           {
!             add_may_alias (pi->name_mem_tag, referenced_var (j));
!             if (stored_to)
!               bitmap_set_bit (ai->written_vars,
!                               DECL_UID (referenced_var (j)));
!             add_may_alias (v_ann->symbol_mem_tag, referenced_var (j));
!           }
!       }
      }
  }
  
Comment 30 Richard Biener 2007-04-25 16:42:06 UTC
Created attachment 13441 [details]
a patch I like more

this one attached, bootstrap/testing in progress.
Comment 31 Daniel Berlin 2007-04-25 17:01:23 UTC
Subject: Re:  [4.2 Regression] -O3 optimizer bug

On 25 Apr 2007 15:32:41 -0000, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #29 from rguenth at gcc dot gnu dot org  2007-04-25 16:32 -------
> Oh, btw. why the may_aliases bitmap for SMT.11 is empty on entry to the loops
> computing the transitive closure (it tries to, right?) is that SFT.0 is not
> considered stored to in the first nested loop over pointers and addressable
> vars.  Shouldn't we add the vars NMT aliases to ai->written_vars in
> compute_flow_sensitive_aliasing?

This is a better fix.
The other "fix" is just a hack.  Aliasing is not symmetric, so we
shouldn't make the two points-to sets equal.

However, I bet if you look at where we initially add written vars,
you'll discover we are missing it there for some reason, and don't
need to post-add it like you do here.

I think i fixed something like this in 4.3 where we didn't consider
stores to globals or arguments to be written or somesuch when we were
initially generating written vars.
Comment 32 Richard Biener 2007-04-25 17:13:36 UTC
No idea.  The only place I found was setup_pointers_and_addressables, but that
hits the path only if

      /* Add pointer variables that have been dereferenced to the POINTERS
         array and create a symbol memory tag for them.  */
      if (POINTER_TYPE_P (TREE_TYPE (var)))
        {

but we don't have a pointer through which SMT we do store to SFT.0, and we
don't seem to care about NMTs and their aliases in this loop at all.

(I'm lost here now, I'll post the patch and the result of bootstrap & test
tomorrow -- I'd love to see a better approach to fixing this ;)
Comment 33 Daniel Berlin 2007-04-25 19:45:35 UTC
I think richi said on IRC that the following backport from 4.3 will fix it (if so, it's the correct fix here)

Index: tree-ssa-structalias.c
===================================================================
--- tree-ssa-structalias.c	(revision 122853)
+++ tree-ssa-structalias.c	(working copy)
@@ -3228,7 +3228,8 @@ update_alias_info (tree stmt, struct ali
 	  /* If the statement makes a function call, assume
 	     that pointer OP will be dereferenced in a store
 	     operation inside the called function.  */
-	  if (get_call_expr_in (stmt))
+	  if (get_call_expr_in (stmt)
+              || stmt_escape_type == ESCAPE_STORED_IN_GLOBAL)
 	    {
 	      bitmap_set_bit (ai->dereferenced_ptrs_store, DECL_UID (var));
 	      pi->is_dereferenced = 1;
Comment 34 Richard Biener 2007-04-26 10:02:33 UTC
I'll bootstrap & regtest that thing and commit it.
Comment 35 Richard Biener 2007-04-26 12:15:45 UTC
Subject: Bug 30567

Author: rguenth
Date: Thu Apr 26 12:15:16 2007
New Revision: 124184

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124184
Log:
2007-04-26  Richard Guenther  <rguenther@suse.de>
	Daniel Berlin  <dberlin@dberlin.org>

	PR tree-optimization/30567
	* tree-ssa-structalias.c (update_alias_info): Record dereference
	also if ESCAPE_STORED_IN_GLOBAL.

	* g++.dg/other/pr30567.C: New testcase.

Added:
    branches/gcc-4_2-branch/gcc/testsuite/g++.dg/other/pr30567.C
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_2-branch/gcc/tree-ssa-structalias.c

Comment 36 Richard Biener 2007-04-26 12:16:42 UTC
Fixed.
Comment 37 Richard Biener 2007-04-26 16:50:47 UTC
Subject: Bug 30567

Author: rguenth
Date: Thu Apr 26 16:50:32 2007
New Revision: 124191

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124191
Log:
2007-04-26  Richard Guenther  <rguenther@suse.de>
        Daniel Berlin  <dberlin@dberlin.org>

        PR tree-optimization/30567
        * g++.dg/other/pr30567.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/other/pr30567.C
Modified:
    trunk/gcc/testsuite/ChangeLog