This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Re: [PATCH] Canonical types (1/3)

From: "Doug Gregor" <doug dot gregor at gmail dot com>
To: "Mark Mitchell" <mark at codesourcery dot com>
Cc: "GCC Patches" <gcc-patches at gcc dot gnu dot org>
Date: Tue, 5 Dec 2006 10:55:47 -0500
Subject: Re: Re: [PATCH] Canonical types (1/3)
References: <24b520d20611280755x5117377fh7e4e1344575fb17d@mail.gmail.com> <4574C3CA.6080509@codesourcery.com>

On 12/4/06, Mark Mitchell <mark@codesourcery.com> wrote:

The new macros in tree.h (TYPE_CANONICAL and TYPE_STRUCTURAL_EQUALITY)
need documentation, at least in tree.h, and, ideally, in the ill-named
c-tree.texi as well.

Will do.

I want to make sure I understand this idiom, in
build_pointer_type_for_mode, and elsewhere:

+  if (TYPE_CANONICAL (to_type) != to_type)
+    TYPE_CANONICAL (t) =
+      build_pointer_type_for_mode (TYPE_CANONICAL (to_type),
+                                  mode, can_alias_all);

If there was already a pointer type to the canonical type of to_type,
then the call build_pointer_type_for_mode will return it.  If there
wasn't, then we will build a new canonical type for that pointer type.
We can't use the pointer type we're building now (i.e., "T") as the
canonical pointer type because we have would have no way to find it in
future, when creating another pointer type for the canonical version of
to_type.


Yes. Note that the canonical type is always the most basic language
type, e.g., "int" rather than a typedef of int, "float**" rather than
a typedef of "float**" or a pointer to a typedef of "float*".

So, we are actually creating more type nodes in this case.  That seems
unfortunate, though I fully understand we're intentionally trading space
for speed just by adding the new type fields.  A more dramatic version
of your change would be to put the new pointer type on the
TYPE_POINTER_TO list for the canonical to_type, make it the canonical
pointer type, and then have the build_pointer_type_for_mode always go to
the canonical to_type to search TYPE_POINTER_TO, considering types to be
an exact match only if they had more fields in common (like, TYPE_NAME
and TYPE_CONTEXT, say).  Anyhow, your approach is fine, at least for now.


Yes, that's possible for pointers. We don't always have a place in the
node where we store such variants, e.g., we don't have a
"TYPE_ARRAYS_OF" list.

We do create more type nodes. In the -fmem-report I'd posted earlier,
the size of the type hash went from 20644 to 20715 with my patch.

Now, we *could* use canonical types to decrease the size of the tree
overall, by only building the canonical version of the types, ignoring
typedef names. We wouldn't do it by default (because it would change
error messages), but we could have a -fskip-typedef-names option that
would avoid building new types for, e.g., foo_t* when foo_t is just a
typedef of "int" (we would just re-use the int* type node). So,
-fskip-typedef-names would decrease the memory footprint and possible
improve compilation performance, at the cost of less readable error
messages. But users don't make mistakes in their code anyway, right?
:)

+ TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (to_type);
Does it ever make sense to have both TYPE_CANONICAL and
TYPE_STRUCTURAL_EQUALITY set?

No, it does not make sense for both to be set.

If we have to do the structural equality
test, then it seems to me that the canonical type isn't useful, and we
might as well not construct it.


Right. Also, we could use TYPE_CANONICAL == NULL_TREE to indicate when
the type requires structural equality, removing the 1-bit
TYPE_STRUCTURAL_EQUALITY flag (and its associated padding). That would
help reduce the impact of this patch on memory usage.

> +      type = build_variant_type_copy (orig_type);
>        TYPE_ALIGN (type) = boundary;
> +      TYPE_CANONICAL (type) = TYPE_CANONICAL (orig_type);

Eek.  So, despite having different alignments, we consider these types
"the same"?  If that's what we already do, then it's OK to preserve that
behavior, but it sure seems worrisome.


Yes, we consider types with different alignments the same (from the
language point of view). It's a little surprising what we do now in
comptypes. We basically check qualifiers and whether the type is a
Java type. If those match, and the TYPE_MAIN_VARIANT is the same for
the two types, we conclude that the types are the same... which means
that we ignore attributes, alignment, sizes, and a host of other
things. Now, this makes some sense, because type attributes and
alignment don't really exist in the C++ type system, so two types (in
the C++ sense) can't differ by their attributes.

+  else if (strict == COMPARE_STRUCTURAL)
+    return structural_comptypes (t1, t2, COMPARE_STRICT);
Why do we ever want the explicit COMPARE_STRUCTURAL?


It comes in useful when you've created a new type node but you need to
search through a list somewhere to find its canonical type. For
example, the new canonical_type_parameter routine (in pt.c) finds the
canonical type for a template type parameter by searching through a
list of canonical type parameters. We can't do a comparison based on
canonical types, because we're trying to find the canonical type. So,
COMPARE_STRUCTURAL lets us perform structural comparisons at the top
level to do this search. We would need the same thing if we were to
search through the TYPE_POINTER_TO list for a canonical type, as you
mentioned before.

+static hashval_t
+cplus_array_hash (const void* k)
+{
+  hashval_t hash;
+  tree t = (tree) k;
+
+  hash = (htab_hash_pointer (TREE_TYPE (t))
+         ^ htab_hash_pointer (TYPE_DOMAIN (t)));
+
+  return hash;
+}

Since this hash function is dependent on pointer values, we'll get
different results on different hosts.  I was worried that will lead to
differences in generated debug information, perhaps due to different
TYPE_UIDs -- but it looks like there is only ever one matching entry in
the table, so we never have to worry about the compiler "randomly"
choosing between two equally good choices?


Right, there's only ever one matching entry in the table, so the
compiler never needs to choose. This is the same kind of hashing we do
for types in several other places in the C and C++ front ends.

Have you tested with flag_check_canonical_types on, and verified that
you get no warnings?


Yes. I bootstrapped with C, C++, Objective-C, Objective-C++, and Java,
then ran "make check" for the first four languages, along with "make
check" for libstdc++. No warnings anywhere. We check canonical types
by default when --enable-checking, so if we've made a mistake, we'll
know sooner rather than later. All this was on i686-pc-linux-gnu.

 (I agree that a --param for that would be better;
if a user ever has to turn this on, we broke the compiler.)

Okay, I'll switch to the --param mechanism.

 Thanks!
 Doug

Follow-Ups:
- Re: [PATCH] Canonical types (1/3)
  - From: Mark Mitchell

References:
- Re: [PATCH] Canonical types (1/3)
  - From: Mark Mitchell

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]