Bug 21528 - [4.0 Regression] Boost shared_ptr_test.cpp fails with -O3
Summary: [4.0 Regression] Boost shared_ptr_test.cpp fails with -O3
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.0.0
: P2 normal
Target Milestone: 4.0.1
Assignee: Richard Henderson
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2005-05-12 08:40 UTC by Peter Dimov
Modified: 2005-07-23 22:49 UTC (History)
3 users (show)

See Also:
Host:
Target: i686-pc-linux-gnu
Build:
Known to work: 3.4.3 4.1.0
Known to fail: 4.0.0 4.0.1
Last reconfirmed:


Attachments
Preprocessed source (77.17 KB, application/octet-stream)
2005-05-12 08:42 UTC, Peter Dimov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Dimov 2005-05-12 08:40:57 UTC
The following portion of shared_ptr_test.cpp:

#include <boost/shared_ptr.hpp>
#include <boost/detail/lightweight_test.hpp>

int main()
{
    boost::shared_ptr<int> pi(new int);
    boost::shared_ptr<void> pv(pi);

    boost::shared_ptr<int> pi2 = boost::static_pointer_cast<int>(pv);
    BOOST_TEST(pi.get() == pi2.get());
    BOOST_TEST(!(pi < pi2 || pi2 < pi));
    BOOST_TEST(pi.use_count() == 3);
    BOOST_TEST(pv.use_count() == 3);
    BOOST_TEST(pi2.use_count() == 3);

    return boost::report_errors();
}

(using the current Boost CVS HEAD that will become 1.33 soon) fails the three
use_count() tests with g++ 4.0.0 -O3, but not with -O0, 1, 2. It seems that a
reference count increment is being optimized away. This does not happen if
BOOST_TEST is replaced with assert.
Comment 1 Peter Dimov 2005-05-12 08:42:03 UTC
Created attachment 8871 [details]
Preprocessed source
Comment 2 Jonathan Wakely 2005-05-12 09:12:53 UTC
I can confirm this error for 4.0 on x86 and x86-64 but not with 4.1

Comment 3 Jonathan Wakely 2005-05-12 09:36:28 UTC
-O2 -finline-functions shows the errors, -O3 -fno-inline-functions does not.
Comment 4 Jonathan Wakely 2005-05-12 11:56:22 UTC
Using 4.0.1 20050509 on x86-64 Linux

-O2 -finline-functions -finline-limit=79 is OK

-O2 -finline-functions -finline-limit=80 is not


Comment 5 Jonathan Wakely 2005-05-12 12:22:56 UTC
Also, I *do* still see the failure if I replace BOOST_TEST() with assert().

This still fails with -finline-functions -finline-limit=80

#include <boost/shared_ptr.hpp>
#include <cassert>

int main()
{
    boost::shared_ptr<int> pi(new int);
    boost::shared_ptr<void> pv(pi);

    boost::shared_ptr<int> pi2 = boost::static_pointer_cast<int>(pv);
    assert(pi.get() == pi2.get());
    assert(pi.use_count() == 3);     // fails here

    return 0;
}

*however* if I remove the first assertion it succeeds.
Comment 6 Andrew Pinski 2005-05-12 13:17:47 UTC
Hmm, -fno-sra works around the issue.
Comment 7 Andrew Pinski 2005-05-12 13:28:26 UTC
Hmm, I am starting to think there is an alias bug in 4.0.0 somewhere.
Comment 8 Andrew Pinski 2005-05-12 13:35:30 UTC
The difference between the mainline and the 4.0.0 branch is the following additional SRA in 4.0.x:
+Initial instantiation for pv
+  pv.pn.pi_ -> pv$pn$pi_
+  pv.px -> pv$px
+Initial instantiation for pi2
+  pi2.pn.pi_ -> pi2$pn$pi_
+  pi2.px -> pi2$px
+
Comment 9 Andrew Pinski 2005-05-12 13:38:18 UTC
Also the other major difference is:
mainline:
-  #   pi_503 = V_MAY_DEF <pi_504>;
-  pi.pn.pi_ = this_45;

4.0.0:
+  #   TMT.242_3 = V_MAY_DEF <TMT.242_66>;
+  pi.pn.pi_ = this_99;

See how we used a TMT in 4.0.0 case and pi on the mainline.
Comment 10 Diego Novillo 2005-06-07 19:48:28 UTC
This seems to be an RTL bug in 4.0.  None of the tree optimizers seem to be
doing anything wrong with this code.  However, applying this patch to the test case:

--- sp_test.ii  2005-06-06 12:02:57.000000000 -0400
+++ pr21528.cc  2005-06-07 15:44:49.297961371 -0400
@@ -3152,7 +3152,9 @@ inline int atomic_exchange_and_add( int
 inline void atomic_increment( int * pw )
 {

-
+#if defined HACK
+  *pw = *pw + 1;
+#else
     __asm__
     (
         "lock\n\t"
@@ -3161,6 +3163,7 @@ inline void atomic_increment( int * pw )
         "m"( *pw ):
         "cc"
     );
+#endif
 }

 inline int atomic_conditional_increment( int * pw )

works around the problem.  The bug seems to be introduced by the RTL optimizers.
 The second call to atomic_increment is being removed in .CE1.  Here's some IRC
analysis:

<rth> dnovillo: pong
<dnovillo> rth: been looking at 21528 (4.0 regression) for a while now and
afaict none of the tree optimizers are doing anything wrong.  i think it may be
an inline asm thing:
<dnovillo> inline void atomic_increment( int * pw )
<dnovillo> {
<dnovillo> #if defined HACK
<dnovillo>   *pw = *pw + 1;
<dnovillo> #else
<dnovillo>     __asm__
<dnovillo>     (
<dnovillo>         "lock\n\t"
<dnovillo>         "incl %0":
<dnovillo>         "=m"( *pw ):
<dnovillo>         "m"( *pw ):
<dnovillo>         "cc"
<dnovillo>     );
<dnovillo> #endif
<dnovillo> }
<dnovillo> if i compile with -DHACK, it works.  Otherwise, it fails.
<dnovillo> (at -O3)
<pinskia> hmm is there a V_MAY_DEF for the inline-asm?
<rth> dnovillo: inline asm looks ok.  you've not logged anything to the pr that
looks useful.  or wrong pr number?
<dnovillo> rth: not yet.  first i wanted to know if the inline looked fine.
<dnovillo> the tree dumps are identical modulo the __asm__ bits, the .s files
only differ in:
<dnovillo> --- 00bad/pr21528.s     2005-06-07 15:24:37.651935299 -0400
<dnovillo> +++ 00good/pr21528.s    2005-06-07 15:25:11.150401900 -0400
<dnovillo> @@ -384,10 +384,7 @@ main:
<dnovillo>         movl    -16(%ebp), %esi
<dnovillo>         testl   %esi, %esi
<dnovillo>         je      .L57
<dnovillo> -#APP
<dnovillo> -       lock
<dnovillo> -       incl 4(%esi)
<dnovillo> -#NO_APP
<dnovillo> +       addl    $2, 4(%esi)
<dnovillo>         testl   %esi, %esi
<dnovillo>         je      .L57
<dnovillo>         cmpl    $3, 4(%esi)
<dnovillo> which is odd.
<rth> that is odd.
<dnovillo> it's as if one __asm__ call is being lost
<rth> well, is it?
<dnovillo> not in the tree optimizers, for sure.  gimme a minute.
<DannyB> It's like thousands of inline lock routines shouted out all at once
<DannyB> and then vanished
<rth> actually, no it isn't.
<rth> i don't believe in anthromorphic lock routines.
<DannyB> so where did the lock go
<DannyB> oh
<DannyB> :)
<DannyB> just wait till we have biological computers
<rth> there's an extra couple of letters in that word.  i'll let you fix them up
on your end.
<pinskia> in .expand, there is only one lock incl
<pinskia> :q
<dnovillo> well, the trees look fine at .final_cleanup
<dnovillo>     pw = &pv$pn$pi_->use_count_;        |    pw = &pv$pn$pi_->use_count_;
<dnovillo>     #   TMT.247_999 = V_MAY_DEF <TMT.247|    #   TMT.247_1003 =
V_MAY_DEF <TMT.2
<dnovillo>     __asm__("lock\n\tincl %0":"=m" *pw:"|    *pw = *pw + 1;
<dnovillo>     [ ... ]                             |
<dnovillo>     pw = &pv$pn$pi_->use_count_;        |    pw = &pv$pn$pi_->use_count_;
<dnovillo>     #   TMT.247_998 = V_MAY_DEF <TMT.247|    #   TMT.247_1002 =
V_MAY_DEF <TMT.2
<dnovillo>     __asm__("lock\n\tincl %0":"=m" *pw:"|    *pw = *pw + 1;
<pinskia> never mind looking at the wrong dump
<dnovillo> what's the rtl dump switch?
<pinskia> -da or -fdump-rtl-all
<rth> -fdump-tree-expand.
<dnovillo> It disappears in pr21528.cc.11.ce1
<dnovillo> (the second __asm__ ("lock"), that is)
<rth> hmm.  cse bug invalidating memories?
<rth> dump state into the pr and i'll look at it
<dnovillo> will do.  thanks.
Comment 11 GCC Commits 2005-06-07 23:45:35 UTC
Subject: Bug 21528

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2005-06-07 23:45:09

Modified files:
	gcc            : ChangeLog rtlanal.c 

Log message:
	PR rtl-opt/21528
	* rtlanal.c (reg_overlap_mentioned_p) <MEM>: Handle 'E' formats.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9078&r2=2.9079
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/rtlanal.c.diff?cvsroot=gcc&r1=1.215&r2=1.216

Comment 12 GCC Commits 2005-06-07 23:48:45 UTC
Subject: Bug 21528

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-4_0-branch
Changes by:	rth@gcc.gnu.org	2005-06-07 23:48:31

Modified files:
	gcc            : ChangeLog rtlanal.c 

Log message:
	PR rtl-opt/21528
	* rtlanal.c (reg_overlap_mentioned_p) <MEM>: Handle 'E' formats.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.7592.2.282&r2=2.7592.2.283
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/rtlanal.c.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.211&r2=1.211.8.1

Comment 13 Richard Henderson 2005-06-07 23:49:30 UTC
Fixed.