Bug 24975 - Aliasing problems inside libstdc++
Summary: Aliasing problems inside libstdc++
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.1.0
: P3 normal
Target Milestone: 4.1.0
Assignee: Paolo Carlini
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2005-11-21 15:43 UTC by Richard Biener
Modified: 2005-11-27 02:10 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-11-21 16:05:14


Attachments
patch to enable aliasing warnings for C++ (1.28 KB, text/plain)
2005-11-21 15:44 UTC, Richard Biener
Details
Draft for the set/multiset issue (1.94 KB, patch)
2005-11-21 17:55 UTC, Paolo Carlini
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Biener 2005-11-21 15:43:11 UTC
We currently miscompile DLV with -fstrict-aliasing, and the only aliasing issues
that are visible are inside libstdc++:

/usr/include/c++/4.1.0/bits/basic_string.h:180: warning: dereferencing type-punned pointer will break strict-aliasing rules
/usr/include/c++/4.1.0/bits/stl_set.h:348: warning: dereferencing type-punned pointer will break strict-aliasing rules
/usr/include/c++/4.1.0/bits/stl_set.h:409: warning: dereferencing type-punned pointer will break strict-aliasing rules

this is with a patch to enable aliasing warnings for C++
Comment 1 Richard Biener 2005-11-21 15:44:09 UTC
Created attachment 10312 [details]
patch to enable aliasing warnings for C++

Apply to see the warnings (even during a libstdc++ build).
Comment 2 Andrew Pinski 2005-11-21 15:51:54 UTC
(In reply to comment #1)
> Created an attachment (id=10312) [edit]
> patch to enable aliasing warnings for C++

Note I think this patch is slightly incorrect as the function build_reinterpret_cast_1 is called during parsing so you get the warning for template cases where actually the warning would be incorrect.
An example is:
int b;
template <typename a> a *f(void) {  return (a*)(&b); }

This should not warn if a is int.
Comment 3 Chris Jefferson 2005-11-21 16:03:21 UTC
While it does look like there might be an error in this warning code, I also have a feeling we are doing something bad here. I looked at the examples in stl_set.h and we are doing reference casting from a _Rb_tree_const_iterator to a _Rb_tree_iterator. Now these both have identical elements and no virtual functions, which is why I suspect this code works. However I sure this isn't valid C++?
Comment 4 Paolo Carlini 2005-11-21 16:05:14 UTC
Thanks. I'm looking into the issue. The part involving set seems to me much more
critical and hopefully solvable without breaking the ABI, because we are not
exporting anything from the library involving set/multiset. I suppose you can
confirm that DLV heavily uses set?!?
Comment 5 Paolo Carlini 2005-11-21 16:09:01 UTC
One more bit of info (meant for Chris too): I think the cast has been added when
set was changed to have both iterator and const_iterator as const iterator types.
Since we are not exporting anything, I suppose we can fix the code at the level
of the underlying _Rb_tree functions...
Comment 6 Paolo Carlini 2005-11-21 16:28:48 UTC
Actually, to be 100% safe wrt binary compatibility (addresses, etc.) we can
also overload for _Rb_tree::const_iterator the involved _Rb_tree functions,
I'm preparing a patch.
Comment 7 Paolo Carlini 2005-11-21 17:55:36 UTC
Created attachment 10314 [details]
Draft for the set/multiset issue

Richard, can you see if the miscompilation goes away with the attached? In a sense is natural (the entire library is evolving to separate overloads for iterator and const_iterator), on the other hand adds redundancy to the code.

Anyway, the patch is very safe from the point of view of binary compatibility and probably, short term at least, we want something similar, assuming it solves the nasty issue with DLV.
Comment 8 Paolo Carlini 2005-11-21 19:49:15 UTC
Another thing, about the basic_string warning: in order to figure out whether that cast is really creating problems, you can add --enable-fully-dynamic-string: with it the problematic code is not used at all.

(as I already told you, I really hope we can leave basic_string alone, because it's really, really difficult to fix it for similar issues without breaking binary compatibility :(
Comment 9 Richard Biener 2005-11-21 22:35:05 UTC
Doing both fixed DLV! (now trying without that dynamic string thingie)
Comment 10 Richard Biener 2005-11-21 22:59:22 UTC
The patch_draft_24975 alone fixed it.  Looks like libstdc++ needs some auditing for aliasing issues...

Thanks Paolo!
Comment 11 Paolo Carlini 2005-11-21 23:02:51 UTC
(In reply to comment #10)
> The patch_draft_24975 alone fixed it.  Looks like libstdc++ needs some
> auditing for aliasing issues...

Indeed it does, but I expect that nothing is in a shape worse than basic_string from the point of view of dangerous casts and such...

> Thanks Paolo!

You are welcome. Therefore I'm going to see whether I can somewhat simplify / clean-up the draft, otherwise, I will just go ahead with it...
Comment 12 Paolo Carlini 2005-11-21 23:07:44 UTC
(In reply to comment #10)
> The patch_draft_24975 alone fixed it.  Looks like libstdc++ needs some auditing
> for aliasing issues...

By the way, for your curiosity, I digged a bit and the set issue was already there in v2! Really we inherited that dangerous cast from the original HP/SGI! Of course, at the time alias analysis was so weak that nobody could notice...
Comment 13 paolo@gcc.gnu.org 2005-11-22 14:53:09 UTC
Subject: Bug 24975

Author: paolo
Date: Tue Nov 22 14:53:03 2005
New Revision: 107362

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=107362
Log:
2005-11-22  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/24975
	* include/bits/stl_set.h (insert(iterator, const value_type&),
	erase(iterator), erase(iterator, iterator)): Don't break aliasing
	rules casting to _Rep_iterator&, forward to _Rb_tree facilities.
	* include/bits/stl_multiset.h (insert(iterator, const value_type&),
	erase(iterator), erase(iterator, iterator)): Likewise.
	* include/bits/stl_tree.h (_Rb_tree<>::_M_insert(_Const_Base_ptr,
	_Const_Base_ptr, const value_type&), insert_unique(const_iterator,
	const value_type&), insert_equal(const_iterator, const value_type&),
	erase(const_iterator), erase(const_iterator, const_iterator)): New,
	_Rb_tree<>::const_iterator counterparts of existing facilities.
	

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/stl_multiset.h
    trunk/libstdc++-v3/include/bits/stl_set.h
    trunk/libstdc++-v3/include/bits/stl_tree.h

Comment 14 paolo@gcc.gnu.org 2005-11-22 14:55:13 UTC
Subject: Bug 24975

Author: paolo
Date: Tue Nov 22 14:55:09 2005
New Revision: 107363

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=107363
Log:
2005-11-22  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/24975
	* include/bits/stl_set.h (insert(iterator, const value_type&),
	erase(iterator), erase(iterator, iterator)): Don't break aliasing
	rules casting to _Rep_iterator&, forward to _Rb_tree facilities.
	* include/bits/stl_multiset.h (insert(iterator, const value_type&),
	erase(iterator), erase(iterator, iterator)): Likewise.
	* include/bits/stl_tree.h (_Rb_tree<>::_M_insert(_Const_Base_ptr,
	_Const_Base_ptr, const value_type&), insert_unique(const_iterator,
	const value_type&), insert_equal(const_iterator, const value_type&),
	erase(const_iterator), erase(const_iterator, const_iterator)): New,
	_Rb_tree<>::const_iterator counterparts of existing facilities.
	

Modified:
    branches/gcc-4_1-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_1-branch/libstdc++-v3/include/bits/stl_multiset.h
    branches/gcc-4_1-branch/libstdc++-v3/include/bits/stl_set.h
    branches/gcc-4_1-branch/libstdc++-v3/include/bits/stl_tree.h

Comment 15 Richard Biener 2005-11-22 15:09:50 UTC
So that leaves us with basic_string and it's problems.  What do you suggest to vendors needing ABI compatibility?
Comment 16 Paolo Carlini 2005-11-22 15:17:43 UTC
(In reply to comment #15)
> So that leaves us with basic_string and it's problems.  What do you suggest to
> vendors needing ABI compatibility?

I'm still trying to figure out whether we can do something not breaking the ABI. Honestly, I think we should first figure out whether this specific cast really causes problems or not. If it does, then we should raise the priority of the issue to the highest level and probably it warrants an ABI breakage (for 4.2). If it doesn't and/or we are not ready to break the ABI, probably tweaking the compiler to spill warnings to the face of the user is not an appropriate action.
Comment 17 Paolo Carlini 2005-11-22 16:14:04 UTC
(In reply to comment #15)
> So that leaves us with basic_string and it's problems.  What do you suggest to
> vendors needing ABI compatibility?

Richard, something we *can* safely apply (safely from the ABI point of view) is this hack (vaguely inspired by HP/SGI hacks...). I don't think it can make the underlying issue *worse*, at least. What do you think?

Index: include/bits/basic_string.h
===================================================================
--- include/bits/basic_string.h (revision 107363)
+++ include/bits/basic_string.h (working copy)
@@ -177,7 +177,10 @@

         static _Rep&
         _S_empty_rep()
-        { return *reinterpret_cast<_Rep*>(&_S_empty_rep_storage); }
+        {
+         void* __p = reinterpret_cast<void*>(&_S_empty_rep_storage);
+         return *reinterpret_cast<_Rep*>(__p);
+       }

         bool
        _M_is_leaked() const
Comment 18 Richard Biener 2005-11-22 16:40:53 UTC
Well, yes.  That would at least silence the warning (which would be annoying, if we apply the C++ warning patch).
Comment 19 Paolo Carlini 2005-11-22 16:59:06 UTC
(In reply to comment #18)
> Well, yes.  That would at least silence the warning (which would be annoying,
> if we apply the C++ warning patch).

It would be more than annoying, it would be awful, because would be also emitted to end users of the library, not only at library build time.

So... As I see the issue, either we come up with an actual miscompilation caused by the punning, or we have to apply the hack, for this library ABI. I'll try to think a bit more about the issue, then, in a few days we should make a decision...
Comment 20 Paolo Carlini 2005-11-22 18:00:01 UTC
An observation: while I don't know all the details of current GCC alias analysis, I'm fairly optimistic that this specific issue with basic_string is not serious. The reason is that _S_empty_rep_storage, of which we are taking the address, actually **never** changes after static zero inizialization. Therefore, all the type based optimizations reasoning that a given pointer will not modify _S_empty_rep_storage simply because the type is different, tuen out to be always trivially correct. If we want to be additionally safe we can slighlty change _M_mutate to never write into it (currently, in some corner cases, it will write zeros into it)
Comment 21 paolo@gcc.gnu.org 2005-11-24 01:29:55 UTC
Subject: Bug 24975

Author: paolo
Date: Thu Nov 24 01:29:51 2005
New Revision: 107447

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=107447
Log:
2005-11-23  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/24975 (basic_string)
	* include/bits/basic_string.h (_Rep::_S_empty_rep): Avoid
	strict-aliasing warnings.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/basic_string.h

Comment 22 Paolo Carlini 2005-11-27 02:10:20 UTC
I think we can close the PR now: the miscompilation is fixed and some tricks in
basic_string are well known (and not present in ext/vstring.h and basic_string
in v7-branch).
Comment 23 Richard Biener 2005-12-08 11:24:17 UTC
Subject: Bug 24975

Author: rguenth
Date: Thu Dec  8 11:24:07 2005
New Revision: 108226

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108226
Log:
2005-12-08  Richard Guenther  <rguenther@suse.de>

        Backport from mainline
        2005-11-28  Richard Guenther  <rguenther@suse.de>

	* c-common.c (strict_aliasing_warning): Handle all
	component-ref like accesses.

	* gcc.dg/alias-9.c: New testcase.
	* g++.dg/warn/Wstrict-aliasing-7.C: Likewise.

        2005-11-24  Richard Guenther  <rguenther@suse.de>
	        Dirk Mueller <dmueller@suse.de>

	PR c++/14024
	* c-common.h (strict_aliasing_warning): Declare.
	* c-common.c (strict_aliasing_warning): New function,
	split out from ...
	* c-typeck.c (build_c_cast): ... here.

	* typeck.c (build_reinterpret_cast_1): Use
	strict_aliasing_warning.

	* g++.dg/warn/Wstrict-aliasing-1.C: New testcase.
	* g++.dg/warn/Wstrict-aliasing-2.C: Likewise.
	* g++.dg/warn/Wstrict-aliasing-3.C: Likewise.
	* g++.dg/warn/Wstrict-aliasing-4.C: Likewise.
	* g++.dg/warn/Wstrict-aliasing-5.C: Likewise.
	* g++.dg/warn/Wstrict-aliasing-6.C: Likewise.

        2005-11-23  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/24975 (basic_string)
	* include/bits/basic_string.h (_Rep::_S_empty_rep): Avoid
	strict-aliasing warnings.

Added:
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/warn/Wstrict-aliasing-1.C
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/warn/Wstrict-aliasing-2.C
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/warn/Wstrict-aliasing-3.C
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/warn/Wstrict-aliasing-4.C
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/warn/Wstrict-aliasing-5.C
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/warn/Wstrict-aliasing-6.C
    branches/gcc-4_1-branch/gcc/testsuite/g++.dg/warn/Wstrict-aliasing-7.C
    branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/alias-9.c
Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/c-common.c
    branches/gcc-4_1-branch/gcc/c-common.h
    branches/gcc-4_1-branch/gcc/c-typeck.c
    branches/gcc-4_1-branch/gcc/cp/ChangeLog
    branches/gcc-4_1-branch/gcc/cp/typeck.c
    branches/gcc-4_1-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_1-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_1-branch/libstdc++-v3/include/bits/basic_string.h