[PATCH] Re: Canonical type nodes, or, comptypes considered harmful
Doug Gregor
doug.gregor@gmail.com
Mon Nov 20 10:18:00 GMT 2006
On 09 Nov 2006 23:09:07 -0800, Ian Lance Taylor <iant@google.com> wrote:
> I meant something very simple: for every type, there is a
> TYPE_CANONICAL field. This is how you tell whether two types are
> equivalent:
> TYPE_CANONICAL (a) == TYPE_CANONICAL (b)
> That is what I mean when I saw one memory dereference and one pointer
> comparison.
I've gone through and implemented this in the C++ front end. The
general idea is to add two new fields to every _TYPE in GCC:
TYPE_CANONICAL(t) refers to the "canonical" type associated with the
type t, which might be t itself. The "canonical" type is defined by
the language's type equivalence rules, so, e.g., "int" and "typedef
int foo" would both refer to the canonical type "int".
TYPE_STRUCTURAL_EQUALITY(t) is a flag that is set when the type t
needs to use structural equality checks. This happens when we can't
trust TYPE_CANONICAL to tell us for certain when two types are
equivalent, e.g., because we weren't able to maintain canonical types.
The basic approach to the attached patch is simple: find every place
in the C++ front end (and the parts of the C front end that it relies
on) where one duplicates a type node. For each of those places,
propagate canonical type and structural equality information.
Hwo do we find these places? Grep finds most of them (calls to
build_variant_type_copy are the main culprits). To find the rest, the
patch has a canonical types verification mode (lamely enabled/disabled
by a #define in cp/typeck.c) where it performs the structural check
and then checks if using TYPE_CANONICAL would come to the same
conclusion. If not, it errors out. By running the test suites under
this mode, I was able to find the places where we duplicate type nodes
and fix the TYPE_CANONICAL field. No types have actually been
eliminated, so error messages haven't changed, and we probably haven't
saved any memory... but type comparison is very fast, and the cost of
maintaining these new fields doesn't seem to be too ridiculous.
Some brief performance numbers when running "make check-g++" on GCC
with --enable-checking=tree:
+ Without my patch: 5m57.451s
+ Canonical types verification mode: 5m54.356s
+ Canonical types (no verification): 5m36.943s
I can't quite explain why the verification mode is 3s faster than the
unpatched GCC, but it's less than a percent and I'm on a noisy system.
Still, canonical types give us an 18-second improvement (about 5%),
which is promising. Note that I haven't canonicalized function types
yet (they're always structural), which might help improve performance
further; see the latter part of this message for more information
about "troublemaker" type nodes in GCC.
Some brief performance numbers when running "make check" on
libstdc++-v3 testsuite, again with --enable-checking=tree:
+ Without my patch: 17m41.156s
+ Canonical types verification mode: 17m26.116s
+ Canonical types (no verification): 17m12.235s
Only about a 3% speedup here. If I try out some more template-heavy
code (e.g., some of the tests in the Boost Graph Library), I've seen
some speedups up in the 9.6% range. I imagine I could do better with
more sophisticated template libraries.
The attached patch introduces no new regressions on i686-pc-linux-gnu,
in either the canonical types verification mode or the "full-speed"
mode. ChangeLogs for the C and C++ parts follow, although only the C++
front end uses the canonical types right now.
I won't even ask about committing this patch, because it is a straw
man. Despite my testing, this is a risky patch... if we missed a case
where we need to canonicalize types, we will break code until we fix
that case. Most of them are easy, some are not quite as easy. At a
minimum, I'd want to bootstrap/test the C, C++, Objective C++, and
Java front ends, try some template-heavy code in the C++ compiler
(e.g., Boost), and test more platforms before even considering this
patch. That said, I think this is a promising direction. We don't seem
to lose out on performance, and we can see some wins... but it's a big
change.
So, are we interested in pursing this further? If so, several other
questions pop up:
1) What about the C, Objective-C, Objective-C++ front ends?
2) Should the canonical types verification mode be a configure-time
switch (--enable-checking=canonical-types)? Or a compiler flag
(-fcheck-canonical-types)? The latter might be a greater help to us,
because we can ask bug submitters to add the flag and provide the
resulting output.
3) Anyone interested in helping?
Cheers,
Doug Gregor
Open Systems Lab @ Indiana University
FUNCTION_TYPE, METHOD_TYPE:
- Default arguments should not be a part of the type
- Throw specifiers should not be part of the type
- Many attributes should not be part of the type
- We can still canonicalize these.
TYPENAME_TYPE:
- This node poses some interesting problems, because we can't
result TYPENAME_TYPEs that occur in the return type or type
of out-of-line member definitions until we've seen that they
are, in fact, member definitions. Worse, we might end up
comparing TYPENAME_TYPEs in different contexts, where the
currently open classes differ. Thus, the "canonical" type
for a TYPENAME_TYPE is dependent on the context, so we must
always perform structural comparisons.
TEMPLATE_TEMPLATE_PARM:
- When we reduce the level of a template template parameter
node, we don't also reduce the level of its template
parameters. This makes it impossible to match the new,
level-reduced template template parameter to its canonical
template template parameter type, because the canonical
template template parameter will have its template
parameters at a different level. So, when we reduce the
level of a TEMPLATE_TEMPLATE_PARM, we mark is as requiring
structural equality.
UNBOUND_CLASS_TEMPLATE:
- These nodes always require structural equality checks,
because they are not hashed. We could add a hash table for
these nodes.
INTEGER_TYPE:
- We force this node to use structural equality when building
the index type for an array type with value-dependent bounds
inside a template, because we don't have an effective way to
hash on expressions. See compute_array_index_type.
ARRAY_TYPE:
- We try to layout array types when we build them, even if
they refer to incomplete types. In this case, we compute the
alignments incorrectly, creating duplicate ARRAY_TYPE nodes
that are not easily placed together again. Ideally, we would
have some kind of "no computed yet" alignment value, that
gets filled in when necessary.
2006-11-20 Douglas Gregor <doug.gregor@gmail.com>
* builtins.c (std_gimplify_va_arg_expr): Keep track of the
canonical type when building a variant with a different alignment.
* c-common.c (c_common_nodes_and_builtins): Since variants of
void_type_node get built before it is given a name, we need to
give those variants the name, too.
(handle_packed_attribute): When building the type variant, set the
canonical type appropriately.
(handle_unused_attribute): When building the type variant, set the
canonical type appropriately.
(handle_aligned_attribute): Ditto.
(handle_deprecated_attribute): Ditto.
(complete_array_type): We need to work with the canonical main
type of the array, from which we will build the qualified version.
* stor-layout.c (layout_type): When we don't know the
alignment of a type for which we're building an array, we end up
guessing wrong, so make the type require structural equality.
* tree.c (make_node_stat): When we build a new type, it is its
own canonical type.
(build_type_attribute_qual_variant): When building an attribute
variant, its canonical type is the non-attribute variant.
(build_qualified_type): Ditto.
(build_distinct_type_copy): When building a distinct type from
another type, the new type is its own canonical type.
(build_pointer_type_for_mode): When building a pointer type, also
build a canonical type pointer.
(build_reference_type_for_mode): When building a reference type,
also build a canonical type reference.
(build_index_type): When we can't hash an index type (e.g.,
because its maximum value is negative), the index type requires
structural equality tests.
(build_array_type): Build the canonical form of an array type.
(build_function_type): Function types require structural equality,
because they contain default arguments, attributes, etc.
(build_method_type_directly): Ditto for method types.
(build_offset_type): Build the canonical offset type.
(build_complex_type): Build the canonical vector type.
* tree.h (TYPE_CANONICAL): New.
(TYPE_STRUCTURAL_EQUALITY): New.
(struct tree_type): Added structural_equality, unused_bits,
canonical fields.
2006-11-20 Douglas Gregor <doug.gregor@gmail.com>
* typeck.c (structural_comptypes): Was "comptypes".
(comptypes): Use canonical type information to perform fast type
comparison. When VERIFY_CANONICAL_TYPES, verify that the canonical
type comparison returns the same results as we would see from the
current, structural check. Support COMPARE_STRUCTURAL when we need
structural checks.
* decl.c (typename_compare): Fix comment.
(build_typename_type): TYPENAME_TYPE nodes require structural
equality checks, because they resolve different based on the
current class type.
(make_unbound_class_template): UNBOUND_CLASS_TEMPLATE nodes
require structural equality checks (for now).
(build_ptrmemfunc_type): Build the canonical pointer to member
function type.
(compute_array_index_type): Whenever we build a new index type
to represent the size of an array in a template, we need to mark
this index type as requiring structural equality. This goes for
arrays with value-dependent sizes with the current ABI, or all
arrays with ABI-1.
* tree.c (cplus_array_hash): New.
(struct cplus_array_info): New.
(cplus_array_compare): New.
(cplus_array_htab): New.
(build_cplus_array_type_1): Use a hash table to cache the array
types we build. Build the canonical array type for each array
type.
(cp_build_qualified_type_real): When building a cv-qualified array
type, use the hash table of array types and build canonical array
types as necessary.
(bind_template_template_parm): BOUND_TEMPLATE_TEMPLATE_PARM nodes
use structural equality (for now).
* cp-tree.h (COMPARE_STRUCTURAL): New.
* pt.c (canonical_template_parms): New.
(canonical_type_parameter): New.
(process_template_parm): Find the canonical type parameter.
(lookup_template_class): When we have named the primary template
type, set the canonical type for our template class to the primary
template type. If any of the template arguments need structural
equality checks, the template class needs structural equality
checks.
(tsubst): When reducing the level of a template template
parameter, we require structural equality tests for the resulting
parameter because its template parameters have not had their types
canonicalized. When reducing a template type parameter, find the
canonical reduced type parameter.
(any_template_arguments_need_structural_equality_p): New.
* name-lookup.c (pushdecl_maybe_friend): When creating a new type
for a TYPE_DECL, make its canonical type the original type.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: canonical-types.patch
Type: text/x-patch
Size: 30493 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20061120/a0a0a870/attachment.bin>
More information about the Gcc-patches
mailing list