jh@evans:/abuild/jh/mozilla-central/build6/gfx/harfbuzz/src> /abuild/jh/trunk-install/bin/gcc -O2 -fwhopr -r -nostdlib hb-blob.i -c jh@evans:/abuild/jh/mozilla-central/build6/gfx/harfbuzz/src> /abuild/jh/trunk-install/bin/g++ -O2 -fwhopr -r -nostdlib *.ii -c jh@evans:/abuild/jh/mozilla-central/build6/gfx/harfbuzz/src> /abuild/jh/trunk-install/bin/g++ hb-blob.o -O2 -fwhopr -r -nostdlib hb-font.o ../../../../gfx/harfbuzz/src/hb-blob-private.h:53:60: warning: type of ‘_hb_blob_nil’ does not match original declaration [enabled by default] ../../../../gfx/harfbuzz/src/hb-blob.c:45:11: note: previously declared here
Created attachment 21143 [details] first part of testcase
Created attachment 21144 [details] second part of testcase
Created attachment 21145 [details] reduced testcase
Created attachment 21146 [details] reduced testcase
The C frontend for typedef struct { hb_atomic_int_t ref_count; } hb_reference_count_t; has an unnamed (NULL TYPE_NAME) main variant of hb_reference_count_t while the C++ frontend does not have a main variant different from the type with the typedef name. That causes us to hash both types differently (once with tag NULL and once with tag hb_reference_count_t). Now, our current behavior is to allow C-only merging of 'struct Foo *x' vs 'Foo_t *x' or 'stuct Foo x' vs. 'Foo_t x' in a structure. We honor a missing name as ok for merging but still hash in stuff (mostly to avoid excessive collisions for pointer types).
Testcase to illustrate C / C++ differences: typedef struct { } T1; typedef T1 T2; void foo (T1 a, T2 b) { } TYPE_MAIN_VARIANT of the type of a is T1 for C++, struct {} for C. TYPE_MAIN_VARIANT of the type of b is T1 for C++, struct {} for C. typedef struct X { } T1; typedef T1 T2; void foo (T1 a, T2 b) { } TYPE_MAIN_VARIANT of the type of a is struct X {} for C++, struct X {} for C. TYPE_MAIN_VARIANT of the type of b is struct X {} for C++, struct X {} for C.
The hard part with changing the C frontend to behave like the C++ frontend is to preserve the debug info for the typedef. The C frontend consistently uses DECL_ORIGINAL_TYPE to refer to what a TYPE_DECL was typedeffed to, but the C++ FE again does not use this for typedef struct { } T1; (but it does for typedef struct X { } T1;).
The only difference I see for typedef struct { } T1; vs typedef struct X { } T1; when looking at its main variant (or recursively DECL_ORIGINAL_TYPE) is <type_decl 0x7ffff5b2be60 X type <record_type 0x7ffff5b31dc8 X type_5 QI size <integer_cst 0x7ffff7ed24b0 constant 8> unit size <integer_cst 0x7ffff7ed24d8 constant 1> align 8 symtab 0 alias set -1 canonical type 0x7ffff5b31dc8 fields <type_decl 0x7ffff5b2bf18 X type <record_type 0x7ffff5b31e70 X> nonlocal decl_4 VOID file t.i line 1 col 18 align 1 context <record_type 0x7ffff5b31dc8 X> result <record_type 0x7ffff5b31dc8 X> > full-name "struct X" X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown chain <type_decl 0x7ffff5b2be60 X>> public decl_2 VOID file t.i line 1 col 16 align 8 chain <function_decl 0x7ffff5b23e00 __cxa_call_unexpected>> <type_decl 0x7ffff5b43000 T1 type <record_type 0x7ffff5b31dc8 T1 type_5 QI size <integer_cst 0x7ffff7ed24b0 constant 8> unit size <integer_cst 0x7ffff7ed24d8 constant 1> align 8 symtab 0 alias set -1 canonical type 0x7ffff5b31dc8 fields <type_decl 0x7ffff5b2bf18 ._0 type <record_type 0x7ffff5b31e70 T1> nonlocal decl_4 VOID file t.i line 1 col 16 align 1 context <record_type 0x7ffff5b31dc8 T1> result <record_type 0x7ffff5b31dc8 T1> > full-name "struct T1" X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown chain <type_decl 0x7ffff5b2be60 ._0>> VOID file t.i line 1 col 20 align 1 chain <type_decl 0x7ffff5b2be60 ._0>> which is the difference in alignment, TREE_PUBLIC and decl_lang_flag_2 set (DECL_IMPLICIT_TYPEDEF_P).
The following seems to work at least for the testcase: Index: gcc/tree.c =================================================================== --- gcc/tree.c (revision 161994) +++ gcc/tree.c (working copy) @@ -4298,6 +4298,18 @@ free_lang_data_in_type (tree type) { tree prev, member; + if (strcmp (lang_hooks.name, "GNU C++") == 0) + { + /* If this isn't an implicit typedef then this final + non-typedef name is not a struct tag. Clear its name. + See PR44871 for the glory details. */ + if (TYPE_NAME (type) + && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL + && !DECL_ORIGINAL_TYPE (TYPE_NAME (type)) + /* DECL_IMPLICIT_TYPEDEF_P */ + && !DECL_LANG_FLAG_2 (TYPE_NAME (type))) + TYPE_NAME (type) = NULL_TREE; + } /* Note that TYPE_FIELDS can be shared across distinct TREE_TYPEs. Therefore, if the first field of TYPE_FIELDS is to be removed, we cannot set its TREE_CHAIN to NULL. it will wreck debuginfo for T1 of course.
reconfirmed.
Richard, any hope for a clean fix or shall we go with the hack above for next release?
I'd like to hear opinions from C and C++ FE people as to why the current state illustrated in comment #6 makes sense and if the behavior can be commonized between C and C++. I might be that the current difference in behavior is mandated by some standards. We can now also fix this in LTO by making sure we at least get the same TYPE_CANONICAL. I will try to look at how difficult that would be.
Testcase: > cat gcc/testsuite/g++.dg/lto/20101126-1_0.C typedef struct { int i; } T1; typedef T1 T2; extern T1 a; extern T2 b; int main() { return a.i + b.i; } > cat gcc/testsuite/g++.dg/lto/20101126-1_1.c typedef struct { int i; } T1; typedef T1 T2; T1 a; T2 b; /space/rguenther/src/svn/trunk/gcc/testsuite/g++.dg/lto/20101126-1_0.C:3:11: warning: type of 'a' does not match original declaration [enabled by default]^M /space/rguenther/src/svn/trunk/gcc/testsuite/g++.dg/lto/20101126-1_1.c:3:4: note: previously declared here^M /space/rguenther/src/svn/trunk/gcc/testsuite/g++.dg/lto/20101126-1_0.C:4:11: warning: type of 'b' does not match original declaration [enabled by default]^M /space/rguenther/src/svn/trunk/gcc/testsuite/g++.dg/lto/20101126-1_1.c:4:4: note: previously declared here^M (gdb) call debug_tree (prevailing->decl) <var_decl 0x7ffff5d071e0 a type <record_type 0x7ffff5d083f0 T1 SI size <integer_cst 0x7ffff7ed36e0 constant 32> unit size <integer_cst 0x7ffff7ed33e8 constant 4> align 32 symtab 0 alias set -1 structural equality fields <field_decl 0x7ffff7edbda8 i type <integer_type 0x7ffff7ee3498> nonlocal SI file t4.ii line 1 col 22 size <integer_cst 0x7ffff7ed36e0 32> unit size <integer_cst 0x7ffff7ed33e8 4> align 32 offset_align 128 offset <integer_cst 0x7ffff7ed3410 constant 0> bit offset <integer_cst 0x7ffff7ed3b18 constant 0> context <record_type 0x7ffff5d083f0 T1>>> public static SI file t4.ii line 3 col 4 size <integer_cst 0x7ffff7ed36e0 32> unit size <integer_cst 0x7ffff7ed33e8 4> align 32 context <translation_unit_decl 0x7ffff7edde60 D.2010>> (gdb) call debug_tree (decl) <var_decl 0x7ffff5d07000 a type <record_type 0x7ffff5d080a8 SI size <integer_cst 0x7ffff7ed36e0 constant 32> unit size <integer_cst 0x7ffff7ed33e8 constant 4> align 32 symtab 0 alias set -1 structural equality fields <field_decl 0x7ffff7edbd10 i type <integer_type 0x7ffff7ee3498> SI file t3.i line 1 col 22 size <integer_cst 0x7ffff7ed36e0 32> unit size <integer_cst 0x7ffff7ed33e8 4> align 32 offset_align 128 offset <integer_cst 0x7ffff7ed3410 constant 0> bit offset <integer_cst 0x7ffff7ed3b18 constant 0> context <record_type 0x7ffff5d080a8>>> used public external common SI file t3.i line 3 col 11 size <integer_cst 0x7ffff7ed36e0 32> unit size <integer_cst 0x7ffff7ed33e8 4> align 32 context <translation_unit_decl 0x7ffff7eddb80 D.2001>> if we want to have the same canonical type for the above we can simply disregard all names, which makes sense for canonical type merging, but that makes the hash functions very weak again (we share the type hashes for type and canonical type merging). It also can run foul of too much merging when facing incomplete types which can happen in various places.
On Fri, 26 Nov 2010, rguenth at gcc dot gnu.org wrote: > I'd like to hear opinions from C and C++ FE people as to why the current > state illustrated in comment #6 makes sense and if the behavior can be > commonized between C and C++. C++ has special rules about using typedefs for anonymous structure types for linkage purposes. C has no such rules. Thus, for C you can have typedef struct { int a; } T1; void f(T1); in one translation unit and typedef struct { int a; } T2; void f(T2); in another and they refer to the same function (cross-translation-unit struct compatibility rules apply) - for C++ they do not (and are mangled differently).
Author: rguenth Date: Thu Dec 2 12:24:46 2010 New Revision: 167367 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167367 Log: 2010-12-02 Richard Guenther <rguenther@suse.de> PR lto/44871 * gimple.c (canonical_type_hash_cache): New hashtable. (gimple_type_hash): Make a wrapper around ... (gimple_type_hash_1): ... this. Take gtc_mode argument. (gimple_canonical_type_hash): Likewise. (gtc_visit): Take a gtc_mode argument. (gimple_types_compatible_p_1): Likewise. Do not compare struct tag names or field names when computing canonical types. (gimple_types_compatible_p): Adjust. (visit): Take a gtc_mode argument. (iterative_hash_gimple_type): Likewise. Do not hash struct tag names or field names when computing hashes of canonical types. (gimple_register_canonical_type): Use gimple_canonical_type_hash for the hash. (print_gimple_types_stats): Dump stats of canonical_type_hash_cache. (free_gimple_type_tables): Free canonical_type_hash_cache. * g++.dg/lto/20101126-1_0.C: New testcase. * g++.dg/lto/20101126-1_1.c: Likewise. Added: trunk/gcc/testsuite/g++.dg/lto/20101126-1_0.C trunk/gcc/testsuite/g++.dg/lto/20101126-1_1.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimple.c trunk/gcc/testsuite/ChangeLog
Fixed.