The following portion of shared_ptr_test.cpp: #include <boost/shared_ptr.hpp> #include <boost/detail/lightweight_test.hpp> int main() { boost::shared_ptr<int> pi(new int); boost::shared_ptr<void> pv(pi); boost::shared_ptr<int> pi2 = boost::static_pointer_cast<int>(pv); BOOST_TEST(pi.get() == pi2.get()); BOOST_TEST(!(pi < pi2 || pi2 < pi)); BOOST_TEST(pi.use_count() == 3); BOOST_TEST(pv.use_count() == 3); BOOST_TEST(pi2.use_count() == 3); return boost::report_errors(); } (using the current Boost CVS HEAD that will become 1.33 soon) fails the three use_count() tests with g++ 4.0.0 -O3, but not with -O0, 1, 2. It seems that a reference count increment is being optimized away. This does not happen if BOOST_TEST is replaced with assert.
Created attachment 8871 [details] Preprocessed source
I can confirm this error for 4.0 on x86 and x86-64 but not with 4.1
-O2 -finline-functions shows the errors, -O3 -fno-inline-functions does not.
Using 4.0.1 20050509 on x86-64 Linux -O2 -finline-functions -finline-limit=79 is OK -O2 -finline-functions -finline-limit=80 is not
Also, I *do* still see the failure if I replace BOOST_TEST() with assert(). This still fails with -finline-functions -finline-limit=80 #include <boost/shared_ptr.hpp> #include <cassert> int main() { boost::shared_ptr<int> pi(new int); boost::shared_ptr<void> pv(pi); boost::shared_ptr<int> pi2 = boost::static_pointer_cast<int>(pv); assert(pi.get() == pi2.get()); assert(pi.use_count() == 3); // fails here return 0; } *however* if I remove the first assertion it succeeds.
Hmm, -fno-sra works around the issue.
Hmm, I am starting to think there is an alias bug in 4.0.0 somewhere.
The difference between the mainline and the 4.0.0 branch is the following additional SRA in 4.0.x: +Initial instantiation for pv + pv.pn.pi_ -> pv$pn$pi_ + pv.px -> pv$px +Initial instantiation for pi2 + pi2.pn.pi_ -> pi2$pn$pi_ + pi2.px -> pi2$px +
Also the other major difference is: mainline: - # pi_503 = V_MAY_DEF <pi_504>; - pi.pn.pi_ = this_45; 4.0.0: + # TMT.242_3 = V_MAY_DEF <TMT.242_66>; + pi.pn.pi_ = this_99; See how we used a TMT in 4.0.0 case and pi on the mainline.
This seems to be an RTL bug in 4.0. None of the tree optimizers seem to be doing anything wrong with this code. However, applying this patch to the test case: --- sp_test.ii 2005-06-06 12:02:57.000000000 -0400 +++ pr21528.cc 2005-06-07 15:44:49.297961371 -0400 @@ -3152,7 +3152,9 @@ inline int atomic_exchange_and_add( int inline void atomic_increment( int * pw ) { - +#if defined HACK + *pw = *pw + 1; +#else __asm__ ( "lock\n\t" @@ -3161,6 +3163,7 @@ inline void atomic_increment( int * pw ) "m"( *pw ): "cc" ); +#endif } inline int atomic_conditional_increment( int * pw ) works around the problem. The bug seems to be introduced by the RTL optimizers. The second call to atomic_increment is being removed in .CE1. Here's some IRC analysis: <rth> dnovillo: pong <dnovillo> rth: been looking at 21528 (4.0 regression) for a while now and afaict none of the tree optimizers are doing anything wrong. i think it may be an inline asm thing: <dnovillo> inline void atomic_increment( int * pw ) <dnovillo> { <dnovillo> #if defined HACK <dnovillo> *pw = *pw + 1; <dnovillo> #else <dnovillo> __asm__ <dnovillo> ( <dnovillo> "lock\n\t" <dnovillo> "incl %0": <dnovillo> "=m"( *pw ): <dnovillo> "m"( *pw ): <dnovillo> "cc" <dnovillo> ); <dnovillo> #endif <dnovillo> } <dnovillo> if i compile with -DHACK, it works. Otherwise, it fails. <dnovillo> (at -O3) <pinskia> hmm is there a V_MAY_DEF for the inline-asm? <rth> dnovillo: inline asm looks ok. you've not logged anything to the pr that looks useful. or wrong pr number? <dnovillo> rth: not yet. first i wanted to know if the inline looked fine. <dnovillo> the tree dumps are identical modulo the __asm__ bits, the .s files only differ in: <dnovillo> --- 00bad/pr21528.s 2005-06-07 15:24:37.651935299 -0400 <dnovillo> +++ 00good/pr21528.s 2005-06-07 15:25:11.150401900 -0400 <dnovillo> @@ -384,10 +384,7 @@ main: <dnovillo> movl -16(%ebp), %esi <dnovillo> testl %esi, %esi <dnovillo> je .L57 <dnovillo> -#APP <dnovillo> - lock <dnovillo> - incl 4(%esi) <dnovillo> -#NO_APP <dnovillo> + addl $2, 4(%esi) <dnovillo> testl %esi, %esi <dnovillo> je .L57 <dnovillo> cmpl $3, 4(%esi) <dnovillo> which is odd. <rth> that is odd. <dnovillo> it's as if one __asm__ call is being lost <rth> well, is it? <dnovillo> not in the tree optimizers, for sure. gimme a minute. <DannyB> It's like thousands of inline lock routines shouted out all at once <DannyB> and then vanished <rth> actually, no it isn't. <rth> i don't believe in anthromorphic lock routines. <DannyB> so where did the lock go <DannyB> oh <DannyB> :) <DannyB> just wait till we have biological computers <rth> there's an extra couple of letters in that word. i'll let you fix them up on your end. <pinskia> in .expand, there is only one lock incl <pinskia> :q <dnovillo> well, the trees look fine at .final_cleanup <dnovillo> pw = &pv$pn$pi_->use_count_; | pw = &pv$pn$pi_->use_count_; <dnovillo> # TMT.247_999 = V_MAY_DEF <TMT.247| # TMT.247_1003 = V_MAY_DEF <TMT.2 <dnovillo> __asm__("lock\n\tincl %0":"=m" *pw:"| *pw = *pw + 1; <dnovillo> [ ... ] | <dnovillo> pw = &pv$pn$pi_->use_count_; | pw = &pv$pn$pi_->use_count_; <dnovillo> # TMT.247_998 = V_MAY_DEF <TMT.247| # TMT.247_1002 = V_MAY_DEF <TMT.2 <dnovillo> __asm__("lock\n\tincl %0":"=m" *pw:"| *pw = *pw + 1; <pinskia> never mind looking at the wrong dump <dnovillo> what's the rtl dump switch? <pinskia> -da or -fdump-rtl-all <rth> -fdump-tree-expand. <dnovillo> It disappears in pr21528.cc.11.ce1 <dnovillo> (the second __asm__ ("lock"), that is) <rth> hmm. cse bug invalidating memories? <rth> dump state into the pr and i'll look at it <dnovillo> will do. thanks.
Subject: Bug 21528 CVSROOT: /cvs/gcc Module name: gcc Changes by: rth@gcc.gnu.org 2005-06-07 23:45:09 Modified files: gcc : ChangeLog rtlanal.c Log message: PR rtl-opt/21528 * rtlanal.c (reg_overlap_mentioned_p) <MEM>: Handle 'E' formats. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9078&r2=2.9079 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/rtlanal.c.diff?cvsroot=gcc&r1=1.215&r2=1.216
Subject: Bug 21528 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-4_0-branch Changes by: rth@gcc.gnu.org 2005-06-07 23:48:31 Modified files: gcc : ChangeLog rtlanal.c Log message: PR rtl-opt/21528 * rtlanal.c (reg_overlap_mentioned_p) <MEM>: Handle 'E' formats. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.7592.2.282&r2=2.7592.2.283 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/rtlanal.c.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.211&r2=1.211.8.1
Fixed.