Bug 60734 - Undefined behavior in g++-v4/bits/stl_tree.h
Summary: Undefined behavior in g++-v4/bits/stl_tree.h
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.9.0
: P3 normal
Target Milestone: 4.8.4
Assignee: Jonathan Wakely
Depends on:
Reported: 2014-04-01 20:24 UTC by Markus Trippelsdorf
Modified: 2015-10-22 13:01 UTC (History)
4 users (show)

See Also:
Known to work:
Known to fail:
Last reconfirmed: 2014-04-02 00:00:00


Note You need to log in before you can comment on or make changes to this bug.
Description Markus Trippelsdorf 2014-04-01 20:24:07 UTC
While debugging a gold linker issue I came across the following
-fsanitize=undefined runtime error:

/usr/lib64/gcc/x86_64-pc-linux-gnu/4.9.0/include/g++-v4/bits/stl_tree.h:741:25: runtime error: downcast of address 0x7fff4c8e1d80 with insufficient space for an object of type '_Rb_tree_node<std::pair<const std::basic_string<char>, gold::Output_segment *> >'
0x7fff4c8e1d80: note: pointer points here
 00 00 00 00  00 00 00 00 00 00 00 00  d0 2c 8d 02 00 00 00 00  10 2d 8d 02 00 00 00 00  50 2d 8d 02

      { return iterator(static_cast<_Link_type>(&this->_M_impl._M_header)); }

The following message: 
also points to the issue.

» _M_header is an _Rb_tree_node_base, which is smaller than an
_Rb_tree_node<_Tp>. Usually, this would be OK -- you can reinterpret_cast
between pointers of different types pretty much arbitrarily -- but because
_Rb_tree_node_base is a base class of _Rb_tree_node<_Tp>, this is a
static_cast, and the ruling wording is 5.2.9/11, which says " If the
prvalue of type “pointer to cv1 B” points
to a B that is actually a subobject of an object of type D, the resulting
pointer points to the enclosing object of type D. Otherwise, the behavior
is undefined."

We've managed to prove that the prvalue of type "pointer to
_Rb_tree_node_base" is not, in fact, a subobject of type
"_Rb_tree_node<_Tp>", because there's not enough room in the allocated
storage for an object of that type at that address. So we've determined
that the behavior is undefined.

This is a bug in libstdc++. The fix is to use reinterpret_cast instead of
static_cast in the definition of 'end'.«

And indeed this fixes the issue for me.
Comment 1 Jonathan Wakely 2014-04-02 11:02:55 UTC
The analysis looks right. The static_cast is definitely undefined. You can't dereference the end() iterator so we don't need a valid pointer, so the reinterpret_cast is OK.

I'd like to be able to reproduce it though, the toy examples I've tried don't get the ubsan error.
Comment 2 Marek Polacek 2014-04-02 11:12:03 UTC
Our ubsan does not yet detect the "downcast of address..." error.  Maybe in next stage1.
Comment 3 Jonathan Wakely 2014-04-02 11:14:09 UTC
Thanks Marek, in that case this is less high priority IMHO
Comment 4 Jonathan Wakely 2014-04-15 10:52:38 UTC
Author: redi
Date: Tue Apr 15 10:52:06 2014
New Revision: 209414

URL: http://gcc.gnu.org/viewcvs?rev=209414&root=gcc&view=rev
	PR libstdc++/60734
	* include/bits/stl_tree.h (_Rb_tree::_M_end): Fix invalid cast.

Comment 5 Jonathan Wakely 2014-04-15 11:19:18 UTC
Fixed on trunk so far
Comment 6 Jonathan Wakely 2014-06-03 17:26:39 UTC
Author: redi
Date: Tue Jun  3 17:26:05 2014
New Revision: 211190

URL: http://gcc.gnu.org/viewcvs?rev=211190&root=gcc&view=rev
Backport from mainline
2014-04-15  Jonathan Wakely  <jwakely@redhat.com>

	PR libstdc++/60734
	* include/bits/stl_tree.h (_Rb_tree::_M_end): Fix invalid cast.

Comment 7 Jonathan Wakely 2014-06-03 17:59:22 UTC
Author: redi
Date: Tue Jun  3 17:58:51 2014
New Revision: 211197

URL: http://gcc.gnu.org/viewcvs?rev=211197&root=gcc&view=rev
Backport from mainline
2014-04-15  Jonathan Wakely  <jwakely@redhat.com>

	PR libstdc++/60734
	* include/bits/stl_tree.h (_Rb_tree::_M_end): Fix invalid cast.

Comment 8 Jonathan Wakely 2014-06-03 18:00:32 UTC
Fixed for 4.8.4 and 4.9.1
Comment 9 Bhargava Shastry 2015-08-27 12:13:33 UTC
Hi. I just noticed that there are two more instances of the undefined downcast via static_cast that are not fixed by the said patch.

### Lines borrowed from bits/stl_tree.h [gcc 5.1.0]

883.      iterator
884.      end() _GLIBCXX_NOEXCEPT
885.      { return iterator(static_cast<_Link_type>(&this->_M_impl._M_header)); }
887.      const_iterator
888.      end() const _GLIBCXX_NOEXCEPT
889.      {
890.        return const_iterator(static_cast<_Const_Link_type>
891.                              (&this->_M_impl._M_header));
892.      }

The undefined casts happen on line 885 and 890--891.
Comment 10 Jonathan Wakely 2015-08-27 12:30:06 UTC
Yes, there are a few remaining. I changed the code on trunk to avoid all those casts completely.
Comment 11 Jonathan Wakely 2015-08-27 12:33:18 UTC
In fact I also already fixed it on the gcc-5 branch, see r223811. The 5.1.0 sources are not current.
Comment 12 Bhargava Shastry 2015-08-27 12:50:14 UTC
Also, I noticed a couple of potentially suspicious casts not fixed upstream. They are in _S_right [1] and elsewhere.

The problem I see is this:
a. _M_right is a pointer to an object of type _Rb_tree_node_base
b. _Link_type is a pointer to an object of type _Rb_tree_node<_Val> that is derived from _Rb_tree_node_base
b. _M_right points to an object of type _Rb_tree_node_base in init() [2] and reset() [3]

a) and b) together imply that it is possible that _M_right points to an object of type _Rb_tree_node_base when cast to _Link_type in [1]. Is this a matter for concern?

[1]: https://gcc.gnu.org/viewcvs/gcc/branches/gcc-5-branch/libstdc%2B%2B-v3/include/bits/stl_tree.h?view=markup&pathrev=223811#l685
[2]: https://gcc.gnu.org/viewcvs/gcc/branches/gcc-5-branch/libstdc%2B%2B-v3/include/bits/stl_tree.h?view=markup&pathrev=223811#l614
[3]: https://gcc.gnu.org/viewcvs/gcc/branches/gcc-5-branch/libstdc%2B%2B-v3/include/bits/stl_tree.h?view=markup&pathrev=223811#l604
Comment 13 Jonathan Wakely 2015-08-27 13:25:22 UTC
(In reply to Bhargava Shastry from comment #12)
> a) and b) together imply that it is possible that _M_right points to an
> object of type _Rb_tree_node_base when cast to _Link_type in [1]. Is this a
> matter for concern?

Not especially, because we never dereference it except when it really does point to the derived type.

We might be able to replace those with reinterpret_cast, or just return the base pointer and delay the cast until needed.