Bug 39509 - [4.4 Regression] bad optimization(?) pure virtual function call with -O2
Summary: [4.4 Regression] bad optimization(?) pure virtual function call with -O2
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.3.2
: P2 normal
Target Milestone: 4.4.7
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
: 39604 42394 (view as bug list)
Depends on: 39604
Blocks:
  Show dependency treegraph
 
Reported: 2009-03-20 03:02 UTC by Eric Niebler
Modified: 2012-03-13 13:13 UTC (History)
9 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.1.1, 4.5.2, 4.6.2, 4.7.0
Known to fail: 4.2.4, 4.3.3, 4.4.0
Last reconfirmed: 2009-05-11 12:04:41


Attachments
tarred, gzipped preprocessed c++ source file (329.48 KB, application/x-compressed-tar)
2009-03-20 03:03 UTC, Eric Niebler
Details
unincluded testcase (133.10 KB, application/octet-stream)
2009-05-11 12:03 UTC, Richard Biener
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Niebler 2009-03-20 03:02:06 UTC
When attached code is compiled with "g++ -01 -x c++ main.i" the result executes file. When compiled with -O2 it segfaults with "pure virtual function call". This may be a case of an overeager optimization.
Comment 1 Eric Niebler 2009-03-20 03:03:35 UTC
Created attachment 17501 [details]
tarred, gzipped preprocessed c++ source file
Comment 2 Eric Niebler 2009-03-20 04:04:27 UTC
Additional information: adding "__attribute__((noinline))" to the constructor for xpression_adaptor (line 82452) makes the problem go away. Definitely looks like an optimization problem to me.
Comment 3 Andrew Pinski 2009-04-16 20:54:50 UTC
What target are you compiling on?   I could not link this on i686-darwin:
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&)
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned int)
___assert_fail


Can you give the output of gcc -v ?
Comment 4 Eric Niebler 2009-04-21 01:10:48 UTC
I'm sorry to say that I no longer have access to the Linux machine on which I gcc-4.3 installed. I have only my windows laptop now, and I can't for the life of me get gcc-4.3 working under cygwin. I can, however, refer you to the original bug report, filed against a library in Boost:

https://svn.boost.org/trac/boost/ticket/2655

For reference, here is the source code (requires Boost to be in the path, but uses the header-only parts of Boost and so does not require you to build or link to any part of Boost).

{{{
#include <string> 
#include <boost/xpressive/xpressive_static.hpp> 
using std::string; 
namespace xpr = boost::xpressive;

int main()
{
    string text = "at";
    
    xpr::sregex r1 = xpr::as_xpr('a');
    xpr::sregex r2 = 'b' >> r1;
    xpr::sregex r3 = r2 | r1;

    xpr::regex_replace(text, r3, string(""));
    return 0;
}
}}}
Comment 5 Jonathan Wakely 2009-05-11 11:15:56 UTC
I can reproduce this on x86_64-unknown-linux-gnu at -O2 with 4.3.2 and 4.5-20090402 snapshot, and at -O3 with 4.2.2

I can't reproduce it with 4.1.2



Comment 6 Jonathan Wakely 2009-05-11 11:23:19 UTC
I was using Boost 1.37.0 with:

Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.5-20090402/configure --prefix=/dev/shm/wakelyjo/insroot/4.5-20090402 --enable-languages=c,c++ --disable-bootstrap --disable-checking --with-gmp=/dev/shm/wakelyjo/stage --with-mpfr=/dev/shm/wakelyjo/stage
Thread model: posix
gcc version 4.5.0 20090402 (experimental) (GCC)

and

Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.3.2/configure --prefix=/opt/gcc/32-bit/4.3.2 --enable-languages=c,c++,fortran,objc,obj-c++ --with-gnu-as --with-as=/opt/binutils/32-bit/2.18/bin/as --with-gnu-ld --with-ld=/opt/binutils/32-bit/2.18/bin/ld --with-gmp=/var/tmp/stage --with-mpfr=/var/tmp/stage --build=i686-pc-linux-gnu --with-arch=pentium4 --enable-shared --enable-threads=posix --enable-nls --enable-libstdcxx-debug --enable-__cxa_atexit
Thread model: posix
gcc version 4.3.2 (GCC)
Comment 7 Richard Biener 2009-05-11 12:03:22 UTC
Created attachment 17852 [details]
unincluded testcase
Comment 8 Richard Biener 2009-05-11 12:04:41 UTC
4.1 and 4.2 cannot compile the testcase.
Comment 9 Jonathan Wakely 2009-05-11 12:42:32 UTC
The testcase will have included lots of code from Boost headers conditionally based on __GNUC__ etc. so it's not surprising it doesn't compile with 4.1 or 4.2

If wanted I can produce an unincluded version using 4.1 and 4.2, but I assume there's no point as those branches are closed.

Comment 10 rguenther@suse.de 2009-05-11 12:44:50 UTC
Subject: Re:  bad optimization(?) pure virtual function
 call with -O2

On Mon, 11 May 2009, jwakely dot gcc at gmail dot com wrote:

> ------- Comment #9 from jwakely dot gcc at gmail dot com  2009-05-11 12:42 -------
> The testcase will have included lots of code from Boost headers conditionally
> based on __GNUC__ etc. so it's not surprising it doesn't compile with 4.1 or
> 4.2
> 
> If wanted I can produce an unincluded version using 4.1 and 4.2, but I assume
> there's no point as those branches are closed.

Right.  I was trying to check if this is a regression or not.

Richard.
Comment 11 Jonathan Wakely 2009-05-11 13:09:10 UTC
It does seem to be a regression.  I preprocessed the program in comment 4 with GCC version 4.1.2 20071124 (Red Hat 4.1.2-42), then ran uninclude to produce a standalone testcase and compiled it with different versions of GCC

GCC 4.1.1 (FSF) at -O3 doesn't call the pure virtual.
GCC 4.1.2 (Red Hat) at -O3 doesn't call the pure virtual.
GCC 4.2.2 (FSF) at -O3 calls the pure virtual.
GCC 4.3.2 (FSF) can't compile the program (again, due to Boost's heavy use of preprocessor checks)

Do you want that unincluded testcase attached?
Comment 12 rguenther@suse.de 2009-05-11 13:10:51 UTC
Subject: Re:  bad optimization(?) pure virtual function
 call with -O2

On Mon, 11 May 2009, jwakely dot gcc at gmail dot com wrote:

> ------- Comment #11 from jwakely dot gcc at gmail dot com  2009-05-11 13:09 -------
> It does seem to be a regression.  I preprocessed the program in comment 4 with
> GCC version 4.1.2 20071124 (Red Hat 4.1.2-42), then ran uninclude to produce a
> standalone testcase and compiled it with different versions of GCC
> 
> GCC 4.1.1 (FSF) at -O3 doesn't call the pure virtual.
> GCC 4.1.2 (Red Hat) at -O3 doesn't call the pure virtual.
> GCC 4.2.2 (FSF) at -O3 calls the pure virtual.
> GCC 4.3.2 (FSF) can't compile the program (again, due to Boost's heavy use of
> preprocessor checks)
> 
> Do you want that unincluded testcase attached?

No, that's enough info.

Richard.
Comment 13 Jakub Jelinek 2009-05-12 18:56:10 UTC
In 4.4.0/x86_64-linux at -O2 at least the problem seems to be that in:
_ZNK5boost9xpressive6detail17xpression_adaptorINS1_16static_xpressionINS1_17alternate_matcherINS1_15alternates_listINS3_INS1_13regex_matcherIN9__gnu_cxx17__normal_iteratorIPKcSsEEEENS3_INS1_21alternate_end_matcherENS1_7no_nextEEEEENS5_ISG_NS_6fusion3nilEEEEENS0_16cpp_regex_traitsIcEEEENS3_INS1_11end_matcherESE_EEEENS1_12matchable_exISB_EEE5matchERNS1_11match_stateISB_EE
adaptor is destructed before the push_context_match call. In *.optimized dump we have:
<bb 9>:
  adaptor.D.166953._vptr.matchable = &_ZTVN5boost9xpressive6detail17xpression_adaptorINS_17reference_wrapperIKNS1_17stacked_xpressionINS1
_16static_xpressionINS1_11end_matcherENS1_7no_nextEEENS5_INS1_21alternate_end_matcherES7_EEEEEENS1_9matchableIN9__gnu_cxx17__normal_itera
torIPKcSsEEEEEE[2];
  adaptor.xpr_.t_ = (const struct stacked_xpression *) (const struct stacked_xpression &) &this->xpr_.D.142582.alternates_.D.142415.cdr.D
.142229.car.next_;
  adaptor.D.166953._vptr.matchable = &_ZTVN5boost9xpressive6detail9matchableIN9__gnu_cxx17__normal_iteratorIPKcSsEEEE[2];
  D.187198 = push_context_match (&this->xpr_.D.142582.alternates_.D.142415.cdr.D.142229.car.D.142156.impl_, state, &adaptor.D.166953);
  goto <bb 11>;
Comment 14 Jakub Jelinek 2009-05-12 19:06:21 UTC
In *.pre it still looks correct:
  adaptor.D.166953._vptr.matchable = &_ZTVN5boost9xpressive6detail17xpression_adaptorINS_17reference_wrapperIKNS1_17stacked_xpressionINS1
_16static_xpressionINS1_11end_matcherENS1_7no_nextEEENS5_INS1_21alternate_end_matcherES7_EEEEEENS1_9matchableIN9__gnu_cxx17__normal_itera
torIPKcSsEEEEEE[2];
  adaptor.xpr_.t_ = D.187226_39;
  D.187223_40 = &this_1(D)->xpr_.D.142582.alternates_.D.142415.cdr.D.142229.car.D.142156.impl_;
  D.187198_41 = push_context_match (D.187223_40, state_4(D), &adaptor.D.166953);

and

  adaptor.D.166953._vptr.matchable = &_ZTVN5boost9xpressive6detail17xpression_adaptorINS_17reference_wrapperIKNS1_17stacked_xpressionINS1
_16static_xpressionINS1_11end_matcherENS1_7no_nextEEENS5_INS1_21alternate_end_matcherES7_EEEEEENS1_9matchableIN9__gnu_cxx17__normal_itera
torIPKcSsEEEEEE[2];
  adaptor.xpr_.t_ = D.187218_33;
  D.187221_34 = &this_1(D)->xpr_.D.142582.alternates_.D.142415.car.D.142156.impl_;
  D.187216_35 = push_context_match (D.187221_34, state_4(D), &adaptor.D.166953);

but *.sink already breaks the first push_context_match call:

  D.187227_38 = &this_1(D)->xpr_.D.142582.alternates_.D.142415.cdr.D.142229.car.next_;
  D.187226_39 = (const struct stacked_xpression &) D.187227_38;
  adaptor.D.166953._vptr.matchable = &_ZTVN5boost9xpressive6detail17xpression_adaptorINS_17reference_wrapperIKNS1_17stacked_xpressionINS1
_16static_xpressionINS1_11end_matcherENS1_7no_nextEEENS5_INS1_21alternate_end_matcherES7_EEEEEENS1_9matchableIN9__gnu_cxx17__normal_itera
torIPKcSsEEEEEE[2];
  adaptor.xpr_.t_ = D.187226_39;
  D.187223_40 = &this_1(D)->xpr_.D.142582.alternates_.D.142415.cdr.D.142229.car.D.142156.impl_;
  adaptor.D.166953._vptr.matchable = &_ZTVN5boost9xpressive6detail9matchableIN9__gnu_cxx17__normal_iteratorIPKcSsEEEE[2];
  D.187198_41 = push_context_match (D.187223_40, state_4(D), &adaptor.D.166953);
Comment 15 Jakub Jelinek 2009-05-12 19:34:43 UTC
Ah, there are two different adaptor variables after inlining, so this might as well be a dup of PR39604.
Comment 16 Richard Biener 2009-05-21 10:35:57 UTC
Which means it should work with -fno-tree-sink.
Comment 17 Richard Biener 2009-08-04 12:29:59 UTC
GCC 4.3.4 is being released, adjusting target milestone.
Comment 18 Richard Biener 2009-12-16 20:18:53 UTC
*** Bug 42394 has been marked as a duplicate of this bug. ***
Comment 19 Richard Biener 2009-12-16 20:19:52 UTC
PR42394 has a smaller testcase (but it doesn't fail with trunk).  It works
with -fno-tree-sink.
Comment 20 Andrew Pinski 2010-03-08 22:50:21 UTC
*** Bug 39604 has been marked as a duplicate of this bug. ***
Comment 21 Richard Biener 2010-05-22 18:13:29 UTC
GCC 4.3.5 is being released, adjusting target milestone.
Comment 22 Richard Biener 2011-06-27 12:12:39 UTC
4.3 branch is being closed, moving to 4.4.7 target.
Comment 23 Andrew Pinski 2011-12-08 03:24:18 UTC
I think this has been fixed on the trunk by:
2011-11-08  Michael Matz  <matz@suse.de>

...
        * tree.h (TREE_CLOBBER_P): New macro.
...
Comment 24 Richard Biener 2011-12-08 08:55:29 UTC
Works with 4.5 and 4.6 because tree-ssa-sink.c is basically "broken" for
memory there (after alias improvements merge).  That was fixed for 4.7
which was in turn broken again for this testcase which indeed was fixed
by Michas patch.

Thus, a 4.4. regression only now.
Comment 25 Jakub Jelinek 2012-03-13 13:13:06 UTC
"Fixed" by dumber sinking in 4.5/4.6 and fixed for real in 4.7+. 4.4 is no longer supported.