This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Canonical type nodes, or, comptypes considered harmful

From: "Doug Gregor" <doug dot gregor at gmail dot com>
To: "Mike Stump" <mrs at apple dot com>
Cc: "Richard Kenner" <kenner at vlsi1 dot ultra dot nyu dot edu>, gdr at integrable-solutions dot net, gcc at gcc dot gnu dot org
Date: Fri, 10 Nov 2006 10:11:50 -0500
Subject: Re: Canonical type nodes, or, comptypes considered harmful
References: <1A16D953-52B9-45E4-9FFE-741C5A9EF2AF@cs.indiana.edu> <10611071958.AA29579@vlsi1.ultra.nyu.edu> <m3bqnigbo6.fsf@zeus.integrable-solutions.net> <10611072213.AA02304@vlsi1.ultra.nyu.edu> <5EF3ABF2-FEB2-4D61-A322-D4214146944B@apple.com> <24b520d20611071913m2ab2011aiacd929bfd258d481@mail.gmail.com> <139C26C4-86BF-4E10-82B2-78B34EBFFEF7@apple.com> <24b520d20611080559m3d0bd5cep1e270c12bf0f4917@mail.gmail.com> <D646350F-FF55-4252-BC87-CC6555F26C44@apple.com>

On 11/9/06, Mike Stump <mrs@apple.com> wrote:

On Nov 8, 2006, at 5:59 AM, Doug Gregor wrote:
> However, this approach could have some odd side effects when there are
> multiple mappings within one context. For instance, we could have
> something like:
>
>  typedef int foo_t;
>  typedef int bar_t;
>
>  foo_t* x = strlen("oops");

x is a decl, the decl has a type, the context of that instance of the
type is x.

map(int,x) == foo_t.

It is this, because we know that foo_x was used to create x and we
set map(int,x) equal to foo_t as it is created.


Ah, I understand now. When you wrote "context", I was thinking of some
coarse-grained approach to context, e.g., a scope or a block. With
variable-level granularity, your idea certainly works.

> The error message that pops out would likely reference "bar_t *"

map(int,x) doesn't yield bar_t.

Right, got it.

> We can't literally combine T and U into a single canonical
> type node, because they start out as different types.

?


To get a truly canonical type node, whenever we create a new type that
may be equivalent to an existing type, we need to find that existing
type node at the time that we create the new type, e.g.,

typedef int foo_t;

When we create the decl for "foo_t", it's TREE_TYPE will be "int" (the
canonical type node) and with its context we know the name the user
wrote ("foo_t"). When we create "foo_t*", the idea is the same: find
the canonical type node (int*) and its context will till us the actual
type written ("foo_t *"). All the time, we're finding the canonical
type node for a particular type ("foo_t *") before we go and create a
new type node.

With concepts, there are cases where we end up creating two different
type nodes that we later find out should have been the same type node.
Here's an example:

 template<typename T, typename U>
 where LessThanComparable<T> && SameType<T, U>
 const T& weird_min(const T& t, const U& u) {
   return t < u? t : u;
 }

When we parse the template header, we create two different type nodes
for T and U. Then we start parsing the where clause, and create a type
node for LessThanComparable<T>. Now we hit the SameType requirement,
which says that T and U are the same type. It's a little late to make
T and U actually have the same type node (unless we want to re-parse
the template or re-write the AST).

> Granted, we could layer a union-find implementation (that better
> supports
> concepts) on top of this approach.

Ah, but once you break the fundamental quality that different
addresses implies different types, you limit things to structural
equality and that is slow.


Not necessarily. If you have an efficient way to map from a type to
its canonical type node, then you pay for that mapping but not for a
structural equality check. In a union-find data structure, the mapping
amounts to a bit of pointer chasing... but in most cases it's only one
pointer hop. Actually, without concepts we can guarantee that it's
only one pointer hop... with concepts, we need to keep a little more
information around in the AST and we sometimes have more than one
pointer hop to find the answer.

I already use a union-find data structure inside ConceptGCC, because I
don't have the option to map to a canonical type when type nodes are
initially created. But, since there are no canonical type nodes in GCC
now, I have to use a hash table that hashes based on structural
properties to keep track of the canonical type nodes. *Any* system
that gives us canonical type nodes in GCC would be a huge benefit for
ConceptGCC.

 Cheers,
 Doug

Follow-Ups:
- Re: Canonical type nodes, or, comptypes considered harmful
  - From: Gabriel Dos Reis

References:
- Canonical type nodes, or, comptypes considered harmful
  - From: Douglas Gregor
- Re: Canonical type nodes, or, comptypes considered harmful
  - From: Richard Kenner
- Re: Canonical type nodes, or, comptypes considered harmful
  - From: Gabriel Dos Reis
- Re: Canonical type nodes, or, comptypes considered harmful
  - From: Richard Kenner
- Re: Canonical type nodes, or, comptypes considered harmful
  - From: Mike Stump
- Re: Re: Canonical type nodes, or, comptypes considered harmful
  - From: Doug Gregor
- Re: Canonical type nodes, or, comptypes considered harmful
  - From: Mike Stump
- Re: Re: Canonical type nodes, or, comptypes considered harmful
  - From: Doug Gregor
- Re: Canonical type nodes, or, comptypes considered harmful
  - From: Mike Stump

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]