Bug 10823

Summary: [3.3/3.4 regression] wrong code after first cse pass
Product: gcc Reporter: gawrilow
Component: rtl-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED DUPLICATE    
Severity: normal CC: gcc-bugs
Priority: P3 Keywords: wrong-code
Version: 3.3   
Target Milestone: 3.3.1   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2003-05-27 17:49:16
Attachments: minkowski_test.cc
Last reduced testcase

Description gawrilow 2003-05-16 16:56:01 UTC
The first cse pass garbles the class LV constructor code inlined into the fetch() function.  Compare the insn 405, 406 in the 08.null and 09.cse dumps.  The _start field of the target is copied from the _size field of the source, which is obviuosly wrong.

I hope, this time the test case is sufficiently short.  The original program that crashed due to this bug has over 80K lines after cpp.  Further reducing makes the bug disappear, especially removing the explicitly defined copy constructor of class Gen (if this helps...)

The parts embraced in #ifdef DEMO are useful only for the bug demonstration, e.g. in a test suite, and can be safely ignored when debugging.

BTW, as I analyzed the assembler output, some other optimization deficiences have been revealed. If you mean, I should submit them as separate bugs, or they are no bugs at all, please let me know. Here they are:

1. The placement operator new is not required to check his argument for being non-null. Neither the constructors are. Nevertheless, gcc emits a test and a branch.

2. Even if this test would be required, it should have been eliminated by the previous optimization pass (null pointer elimination), as the same pointer is used to access the 'exists' field prior to the test.  Or, alternatively, the null pointer elimination should be repeated after cse, if this failure is caused by the expressions looking too different at the beginning.

3. The constructor argument in the placement new expression is a temporary object returned by the inlined function get(). It would be much better for the run-time efficiency if the target address (area) would be passed directly to get(). Currently the temporary object is created separately and then copied into the area.  I think, the Standard allows to eliminate such temporaries together with copy constructor calls, doesn't it?

Release:
3.3

Environment:
RedHat 7.3 on an Athlon PC

../gcc-3.3/configure --enable-languages=c++ --enable-shared --with-cpu=athlon --enable-version-specific-runtime-libs --with-gxx-include-dir=/usr/site-local/share/include/c++/3.3 --prefix=/usr/site-local/share --exec-prefix=/usr/site-local

How-To-Repeat:
Compile the attached program with additional -DDEMO option, then run it.  If compiled with -O1, it produces the correct output string '0 3 0 3'.  When compiled with -O2 or higher, the output will be '0 3 3 3'.
Comment 1 Andrew Pinski 2003-05-27 17:49:16 UTC
I can confirm this one on the mainline (20030527) with -march=i686 on x86-linux.
This is a regression from `gcc version 2.95.3 20010125 (prerelease)'.
But on ppc-darwin (mainline 20030526), I get the correct output at -O3 (do not know 
why).
Comment 2 Eric Botcazou 2003-06-25 12:48:10 UTC
Note that the testcase gives the correct ouput with -O2 -fno-strict-aliasing so
a specialist of the C++ aliasing rules might want to take a look at the code.
Comment 3 Wolfgang Bangerth 2003-06-25 14:29:06 UTC
Regarding the others questions you have: yes, please file separate reports for
unrelated problems. It's very hard to track multiple bugs reporting in the same
PR.

Thanks
  Wolfgang
Comment 4 Wolfgang Bangerth 2003-06-25 14:41:20 UTC
I'm taking a look at the aliasing problem right now. Just to let you know
and to avoid duplication of work.
W.
Comment 5 Wolfgang Bangerth 2003-06-25 15:14:45 UTC
Your code is faulty: in this function
   LV get() const {
      return op(*static_cast<const ITP&>(*this), *second);
   }
the first operand to op.operator() is constructed by calling
   Uop::result_type Uev::operator* () const {
      return op(*p);
   }
with result_type being of type RV. Thus, it's returned object is a 
temporary. However, op.Bop::operator() takes a reference and passes
it on. This doesn't work.

I could make the code work as expected also in the presence of
strict aliasing, by replacing the first function above (in the
original code) by
   LV get() const {
     static RV rv = *static_cast<const ITP&>(*this);
     return op(rv, *second);
   }

I attach my last reduced testcase to the PR later.

W.
Comment 6 Wolfgang Bangerth 2003-06-25 15:16:30 UTC
Created attachment 4281 [details]
Last reduced testcase
Comment 7 Wolfgang Bangerth 2003-06-25 15:26:16 UTC
By the way, since you are a frequent submitter: if you prepare reduced
testcases, it is helpful to make all classes structs, since then you can
drop public/private/protected. Next step is to get rid of accessor
functions, since now everything is public. At this point, your testcase had
already shrunk to about half its size!

After a few more steps, you'll want to get rid of const in as many
places as possible. This was when I realized that one of the arguments
didn't bind any more, which means that it must be a temporary -- because 
temporaries _can_ bind to _const_ references, but not nonconst references.
From there, it was an easy run through the defences for touchdown :-)

W.
Comment 8 Wolfgang Bangerth 2003-07-17 23:44:04 UTC
I got this mail from the submitter of the PR:

>>>>> "bangerth" == bangerth at dealii dot org <gcc-bugzilla@gcc.gnu.org> writes:

    bangerth> bangerth at dealii dot org changed:

    bangerth>            What |Removed |Added
    bangerth>
----------------------------------------------------------------------------
    bangerth> Status|NEW |RESOLVED Resolution| |INVALID

Sorry for the late reaction. Our institute's file server has crashed,
I'd got to spent some time to put him back on its legs...

    bangerth> Your code is faulty: in this function 

    bangerth>   LV get() const { return op(*static_cast<const ITP&>(*this),
*second); }

    bangerth> the first operand to op.operator() is constructed by calling

    bangerth>   Uop::result_type Uev::operator* () const { return op(*p); }

    bangerth> with result_type being of type RV. Thus, it's
    bangerth> returned object is a temporary. However,
    bangerth> op.Bop::operator() takes a reference and passes it
    bangerth> on. This doesn't work.

I think, your analysis is not entirely correct.  The first operand of
op.operator() is indeed a temporary, but it is guaranteed to stay
alive up to the next sequence point, that is, end of the return
statement.  The Bop::operator() passes the reference to this temporary
RV to the constructor of LV, which makes a copy of it and stores it in
the new object, being finally returned from get() (or possibly copied
one more time if the return value optimization doesn't work.)  At any
rate, when the sequence point is reached, the temporary RV is not used
any more.

And, after all, even if this code had been an example of wrong usage
of references (of that kind I'm seeing regularly by our students), it
had to lead to a program crash when accessing an invalid reference.  The
(miscompiled) code of the LV constructor reads instead the wrong field in RV
(s2._size instead of s2._start) when copying from the temporary into
the new LV object.

Your solution with a static variable is not applicable here, since the
result of get() must depend on the actual contents of the Bev object,
and the latter is by no means a solitary object. The test case is just
VERY simplified compared to the original program.

I hope, you revisit this case and eventually revert the RESOLUTION, so
that the two days I've spent reading the assembler output and RTL
dumps were not in vain :-).

Another evidence of this problem being a gcc bug is the fact that when
I simplify the case a little bit further, namely throw away the
inheritance Seq : Gen, the problem disappears and the code becomes correct!
But the class Gen is absolutely empty, and should not affect the alias
analisys in any way?!

Thank you in advance for your attention and patience!

Ewgenij Gawrilow

PS:

By the way, what is this mysterious strict aliasing exactly?
Is it described formally anywhere?  Is it merely the list in [3.10] (15) of
ANSI C++ or something more elaborated?
Comment 9 Wolfgang Bangerth 2003-07-18 00:01:57 UTC
Subject: Re:  [3.3/3.4 regression] wrong code after first cse pass


>     bangerth> Your code is faulty: in this function
>
>     bangerth>   LV get() const { return op(*static_cast<const ITP&>(*this),
> *second); }
>
>     bangerth> the first operand to op.operator() is constructed by calling
>
>     bangerth>   Uop::result_type Uev::operator* () const { return op(*p); }
>
>     bangerth> with result_type being of type RV. Thus, it's
>     bangerth> returned object is a temporary. However,
>     bangerth> op.Bop::operator() takes a reference and passes it
>     bangerth> on. This doesn't work.
>
> I think, your analysis is not entirely correct.  The first operand of
> op.operator() is indeed a temporary, but it is guaranteed to stay
> alive up to the next sequence point, that is, end of the return
> statement.  The Bop::operator() passes the reference to this temporary
> RV to the constructor of LV, which makes a copy of it [...]

Hm, that seems right. I think I thought this called the implemented 
constructor of RV, but it of course just calls the copy constructor.


> And, after all, even if this code had been an example of wrong usage
> of references (of that kind I'm seeing regularly by our students), it
> had to lead to a program crash when accessing an invalid reference.  The
> (miscompiled) code of the LV constructor reads instead the wrong field in
> RV (s2._size instead of s2._start) when copying from the temporary into the
> new LV object.

No, that's not quite right: when the compiler re-uses the stack space of a 
temporary, it might place other data there. When you later access is, 
expecting that the old object were there, you might just get "wrong" data. In 
general, violations of lifetimes of temporaries just invoke undefined 
behavior -- which might be wrong data or crashes, depending on circumstances.


> I hope, you revisit this case and eventually revert the RESOLUTION, so
> that the two days I've spent reading the assembler output and RTL
> dumps were not in vain :-).

I'm not yet convinced of legality of the code, but I'm willing to hear. What 
we would need is a _simpler_ testcase than your original one. I remember that 
it was _very_ hard to trace the whole thing due to the use of operator() and 
that all names somehow sounded similar. If you can come up with a somewhat 
clearer testcase, I'm sure we'll revisit the issue. Feel free to take my 
reduced testcase as a basis if you think it is valid.

I think the thing is just that the code is too complicated to reduce for 
someone who is not familiar what it is actually doing. We just need you help 
in this -- try to make it as small and self-explanatory as you can.


> Another evidence of this problem being a gcc bug is the fact that when
> I simplify the case a little bit further, namely throw away the
> inheritance Seq : Gen, the problem disappears and the code becomes correct!

No, that's the same thing as above: if you invoke undefined behavior, you may 
get anything, including the behavior you expected.


> By the way, what is this mysterious strict aliasing exactly?
> Is it described formally anywhere?  Is it merely the list in [3.10] (15) of
> ANSI C++ or something more elaborated?

In short, yes. In practice, it's often hard to find out, but usually people 
are doing casts in the wrong way.

W.

-------------------------------------------------------------------------
Wolfgang Bangerth              email:            bangerth@ices.utexas.edu
                               www: http://www.ices.utexas.edu/~bangerth/

Comment 10 Andrew Pinski 2005-06-05 08:36:44 UTC
Reopening to ...
Comment 11 Andrew Pinski 2005-06-05 08:36:59 UTC
Mark as a dup of bug 21920

*** This bug has been marked as a duplicate of 21920 ***