This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Martin dot Uecker at med dot uni-goettingen dot de
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Sandiford <richard dot sandiford at arm dot com>, Jason Merrill <jason at redhat dot com>, nd <nd at arm dot com>, Nathan Sidwell <nathan at acm dot org>, "Joseph S. Myers" <joseph at codesourcery dot com>
- Date: Tue, 16 Oct 2018 09:52:11 +0200
- Subject: Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"
- References: <87d0sbjn97.fsf@arm.com> <1539628827.4845.3.camel@med.uni-goettingen.de>
On Mon, Oct 15, 2018 at 8:40 PM Uecker, Martin
<Martin.Uecker@med.uni-goettingen.de> wrote:
>
>
> Hi Richard,
>
> as Joseph pointed out, there are some related discussions
> on the WG14 reflector. How a about moving the discussion
> there?
>
> I find your approach very interesting and that it already
> comes with an implementation is of course very useful
>
> But I don't really understand the reasons why this is not based
> on (2). These types are not "sizeless" at all, their size
> just isn't known at compile time. So to me this seems to me
> a misnomer.
>
> In fact, to me these types *do* in fact seem very similar
> to VLAs as VLAs are also complete types which also do no
> have a known size at compile time.
>
> That arrays decay to pointers doesn't mean that we
> couldn't have similar vectors types which don't decay.
> This is hardly a fundamental problem.
>
> I also don't understand the problem about the array
> size. If I understand this correctly, the size is somehow
> known at run-time and implicitly passed along with the
> values. So these new types do not need to have a
> size expression (as in your proposal).
>
> Assignment, the possibility to return the type from
> functions, and something like __sizeless_structs would
> make sense for VLAs too.
>
> So creating a new category "variable-length types" for
> both VLAs and variably-length vector types seems do make
> much more sense to me. As I see it, this would be mainly
> a change in terminology and not so much of the underlying
> approach.
I agree - those types very much feel like VLAs.
I think there's also the existing vector extension of GCC to
factor in which introduces (non-VLA) vector types to the
language and allows arithmetic on them as well as
indexing and passing and returing them by values.
Variable-size vectors should play well in that context.
Richard.
>
> Best,
> Martin
>
> Am Montag, den 15.10.2018, 15:30 +0100 schrieb Richard Sandiford:
> > The C standard says:
> >
> > At various points within a translation unit an object type may be
> > "incomplete" (lacking sufficient information to determine the size of
> > objects of that type) or "complete" (having sufficient information).
> >
> > For AArch64 SVE, we'd like to split this into two concepts:
> >
> > * has the type been fully defined?
> > * would fully-defining the type determine its size?
> >
> > This is because we'd like to be able to represent SVE vectors as C and C++
> > types. Since SVE is a "vector-length agnostic" architecture, the size
> > of the vectors is determined by the runtime environment rather than the
> > programmer or compiler. In that sense, defining an SVE vector type does
> > not determine its size. It's nevertheless possible to use SVE vector types
> > in meaningful ways, such as having automatic vector variables and passing
> > vectors between functions.
> >
> > The main questions in the RFC are:
> >
> > 1) is splitting the definition like this OK in principle?
> > 2) are the specific rules described below OK?
> > 3) coding-wise, how should the split be represented in GCC?
> >
> > Terminology
> > -----------
> >
> > Going back to the second bullet above:
> >
> > * would fully-defining the type determine its size?
> >
> > the rest of the RFC calls a type "sized" if fully defining it would
> > determine its size. The type is "sizeless" otherwise.
> >
> > Contents
> > --------
> >
> > The RFC is organised as follows. I've erred on the side of including
> > detail rather than leaving it out, but each section is meant to be
> > self-contained and skippable:
> >
> > - An earlier RFC
> > - Quick overview of SVE
> > - Why we need SVE types in C and C++
> > - How we ended up with this definition
> > - The SVE types in more detail
> > - Outline of the type system changes
> > - Sizeless structures (and testing on non-SVE targets)
> > - Other variable-length vector architectures
> > - Edits to the C standard
> > - Base changes
> > - Updates for consistency
> > - Sizeless structures
> > - Edits to the C++ standard
> > - GCC implementation questions
> >
> > I'll follow up with patches that implement the split.
> >
> >
> >
> > An earlier RFC
> > ==============
> >
> > For the record (in case this sounds familiar) I sent an RFC about the
> > sizeless type extension a while ago:
> >
> > https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html
> >
> > The rules haven't changed since then, but this version includes more
> > information and includes support for sizeless structures.
> >
> >
> > Quick overview of SVE
> > =====================
> >
> > SVE is a vector extension to AArch64. A detailed description is
> > available here:
> >
> > https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf
> >
> > but the only feature that really matters for this RFC is that SVE has no
> > fixed or preferred vector length. Implementations can instead choose
> > from a range of possible vector lengths, with 128 bits being the minimum
> > and 2048 bits being the maximum. Priveleged software can further
> > constrain the vector length within the range offered by the implementation;
> > e.g. linux currently provides per-thread control of the vector length.
> >
> >
> > Why we need SVE types in C and C++
> > ==================================
> >
> > SVE was designed to be an easy target for autovectorising normal scalar
> > code. There are also various language extensions that support explicit
> > data parallelism or that make explicit vector chunking easier to do in
> > an architecture-neutral way (e.g. C++ P0214). This means that many users
> > won't need to do anything SVE-specific.
> >
> > Even so, there's always going to be a place for writing SVE-specific
> > optimisations, with full access to the underlying ISA. As for other
> > vector architectures, we'd like users to be able to write such routines
> > in C and C++ rather than force them to go all the way to assembly.
> >
> > We'd also like C and C++ functions to be able to take SVE vector
> > parameters and return SVE vector results, which is particularly useful
> > when implementing things like vector math routines. In this case in
> > particular, the types need to map directly to something that fits in
> > an SVE register, so that passing and returning vectors has minimal
> > overhead.
> >
> >
> > How we ended up with this definition
> > ====================================
> >
> > Requirements
> > ------------
> >
> > We need the SVE vector types to define and use SVE intrinsic functions
> > and to write SVE vector library routines. The key requirements when
> > defining the types were:
> >
> > * They must be available in both C and C++ (because we want to be able
> > add SVE optimisations to C-only codebases).
> >
> > * They must fit in an SVE vector register (so there can be no on-the-side
> > information).
> >
> > * It must be possible to define automatic variables with these types.
> >
> > * It must be possible to pass and return objects of these types
> > (since that's what intrinsics and vector library routines need to do).
> >
> > * It must be possible to use the types in _Generic associations
> > (so that _Generic can be used to provide tgmath.h-style overloads).
> >
> > * It must be possible to use pointers or references to the types
> > (for passing or returning by pointer or reference, and because not
> > allowing references would be semantically difficult in C++).
> >
> > Ideally, there'd also be a way of grouping SVE vectors together into tuples,
> > since the ISA has instructions like LD2 that return multiple vectors.
> > It would be good if users could also define their own tuple types, on top
> > of the ones needed by the intrinsics, although that's more "nice to have".
> >
> > Possible approaches
> > -------------------
> >
> > The main complication is that the size of an SVE vector is not a
> > compile-time constant. It seems that any approach to handling this
> > would fall into one of three categories:
> >
> > (1) Limit the types in such a way that there is no concept of size.
> >
> > (2) Define the size of the types to be variable.
> >
> > (3) Define the size of the types to be constant, either with the
> > constant being large enough for all possible vector lengths or
> > with the types pointing to separate memory (as for C++ classes
> > like std::string).
> >
> > Why (2) seemed like a bad idea
> > ------------------------------
> >
> > (2) seemed initially appealing since C already has the concept of
> > variable-length arrays. However, variable-length built-in types
> > would work in a significantly different way. Arrays often decay to
> > pointers (which of course are fixed-length types), whereas vector
> > types never would. Unlike arrays, it should be possible to pass
> > variable-length vectors to functions, return them from functions,
> > and assign them by value.
> >
> > One particular difficulty is that the semantics of variable-length arrays
> > rely on having a point at which the array size is evaluated. It would
> > be difficult to extend this approach to declarations of functions that
> > pass or return variable-length types.
> >
> > As well as the extension itself being relatively complex (especially
> > for C++), it might be difficult to define it in a way that interacts
> > naturally with other (unseen) extensions, even those that are aware of
> > variable-length arrays. Also, AIUI, variable-length arrays were added
> > to an early draft of C++14, but were later removed as too controversial
> > and didn't make it into the final standard. C++17 still requires sizeof
> > to be constant and C11 makes variable-length arrays optional.
> >
> > (2) therefore felt like a complicated dead-end.
> >
> > Why (3) seemed like a bad idea
> > ------------------------------
> >
> > (3) can be divided into two:
> >
> > (3a) The vector types have a constant size and are large enough for all
> > possible vector lengths.
> >
> > The main problem with this is that the maximum size of an SVE
> > vector (2048 bits) is much larger than the minimum size (128 bits).
> > Using a fixed size of 2048 bits would be extremely inefficient for
> > smaller vector lengths, and of course the whole point of using
> > vectors is to make things *more* efficient.
> >
> > Also, we would need to define the types such that only the bytes
> > associated with the actual vector length are significant. This would
> > make it possible to pass or return the types in registers and treat
> > them as register values when copying. This perhaps has some similarity
> > with overaligned structures such as:
> >
> > struct s { _Alignas(16) int i; };
> >
> > except that the amount of padding would only be known at runtime.
> >
> > There's also a significant conceptual problem: encoding a fixed size
> > goes against a guiding principle of SVE, in which there is no preferred
> > vector length. There's nothing particularly magical about the current
> > limit of 2048 bits and it would be better to avoid an ABI break if the
> > maximum ever did increase in future.
> >
> > (3b) The vector types have a constant size and refer to separate storage
> > (as for std::string etc.)
> >
> > This would be difficult to do without C++-style constructor, destructor,
> > copy and move semantics, so wouldn't work well in C. And in C++ it would
> > be less efficient than the proposed approach, since presumably an Allocator
> > would be needed to allocate the separate storage. It would also require
> > a complicated ABI mapping to ensure that the vectors can still be passed
> > and returned in registers.
> >
> > Chosen approach
> > ---------------
> >
> > We therefore took approach (1) and classified C and C++ types as "sized"
> > (having a measurable size when fully-defined) and "sizeless" (never
> > having a measurable size). Sizeless types have no defined size,
> > alignment or layout at the language level, with those things becoming
> > purely an ABI-level detail.
> >
> > We then treated all sizeless types as permanently incomplete.
> > On its own, this would put them in a similar situation to "void"
> > (although they wouldn't be exactly the same, since there are some
> > specific rules for "void" that don't apply to incomplete types in
> > general). We then relaxed specific rules until the types were actually
> > useful.
> >
> > Things in favour of (1)
> > -----------------------
> >
> > The reasons above were mostly negative, arriving at (1) by elimination.
> > A more positive justification of this approach is that it seems
> > to meet the requirements in the most efficient way possible. The
> > vectors can use their natural (native) representation, and the type
> > system prevents uses that would make that representation problematic.
> >
> > Also, the approach of starting with very restricted types and then
> > specifically allowing certain things should be more future-proof
> > and interact better with other language extensions. By default,
> > any language extension would treat the new types like other incomplete
> > types and choose conservatively-correct behaviour. It would then be
> > possible to relax the language extension if this default behaviour
> > turns out to be too restrictive.
> >
> > (That said, treating the types as permanently incomplete still won't
> > avoid all clashes with other extensions. For example, we need to
> > allow objects of automatic storage duration to have certain forms of
> > incomplete type, whereas an extension might implicitly assume that all
> > such objects must already have complete type. The approach should still
> > avoid the worst effects though.)
> >
> >
> > The SVE types in more detail
> > ============================
> >
> > Arm has published an SVE "ACLE" that specifies the SVE types and intrinsic
> > functions in detail. For reference this is available without registration
> > at:
> >
> > https://static.docs.arm.com/100987/0000/acle_sve_100987_0000_00_en.pdf
> >
> > but I'll try to keep this self-contained.
> >
> > The ACLE defines a vector type sv<base>_t for each supported element type
> > <base>_t, so that the complete set is:
> >
> > svint8_t svint16_t svint32_t svint64_t
> > svuint8_t svuint16_t svuint32_t svuint64_t
> > svfloat16_t svfloat32_t svfloat64_t
> >
> > The types in each column have the same number of lanes and have twice
> > as many lanes as those in the column to the right. Every vector has
> > the same number of bytes in total, with the number of bytes being
> > determined at runtime.
> >
> > The ACLE also defines a single predicate type:
> >
> > svbool_t
> >
> > that has the same number of lanes as svint8_t and svuint8_t.
> >
> > All these types are opaque builtin types and are only expected to
> > be used with the associated ACLE intrinsics. There are intrinsics for
> > creating vectors from scalars, loading from scalars, storing to scalars,
> > reinterpreting one type as another, etc.
> >
> > The idea is that the vector types would only be used for short-term
> > register-sized working data. Longer-term data would typically be stored
> > out to arrays.
> >
> > For example, the vector function underlying:
> >
> > #pragma omp declare simd
> > double sin(double);
> >
> > would be:
> >
> > svfloat64_t mangled_sin(svfloat64_t, svbool_t);
> >
> > (The svbool_t is because SVE functions should be predicated by default,
> > to avoid the need for a scalar epilogue loop.)
> >
> > The ACLE also defines x2, x3 and x4 tuple types for each vector type;
> > for example, svint8x3_t is a tuple of 3 svint8_ts. The tuples are
> > structure-like types with fields v0, v1, v2 and v3, up to the number
> > required.
> >
> >
> > Outline of the type system changes
> > ==================================
> >
> > Going back to the summary at the start of the RFC, C classifies types as
> > "complete" (the size of objects can be calculated) or "incomplete" (the
> > size of objects can't be calculated). There's very little you can do
> > with a type until it becomes complete.
> >
> > The approach we took was to treat all the SVE types as permanently
> > incomplete. We then went through the standard relaxing specific
> > rules until the types were actually useful.
> >
> > The first step was to classify types as:
> >
> > * "indefinite" (lacking sufficient information to create an object of
> > that type) or "definite" (having sufficient information)
> >
> > * "sized" (will have a known size when definite) or "sizeless" (will
> > never have a known size)
> >
> > * "incomplete" (lacking sufficient information to determine the size of
> > objects of that type) or "complete" (having sufficient information)
> >
> > where the wording for the final bullet is unchanged from the standard.
> > Thus a "definite type" is one that has been fully-defined rather than
> > simply declared, and "complete" is now equivalent to "sized and definite".
> > All standard types are "sized" (even "void", although it's always
> > indefinite and incomplete).
> >
> > We then needed to make some rules use the distinction between "indefinite"
> > and "definite" rather than "incomplete" and "complete". The specific
> > things we wanted to allow were:
> >
> > * defining automatic variables with sizeless definite type
> > * defining functions whose parameters have sizeless definite type
> > * defining functions that return a sizeless definite type
> > * using sizeless definite types in _Generic associations
> > * dereferencing pointers to sizeless definite types
> >
> > Specific things we wanted to remain invalid -- by inheriting the rules from
> > incomplete types -- were:
> >
> > * creating or accessing arrays that have sizeless element types
> > * doing pointer arithmetic on pointers to sizeless types
> > * using sizeof and _Alignof with a sizeless type (or object of sizeless type)
> > * defining (sized) unions or structures with sizeless members
> >
> > It also seemed worth adding an extra restriction:
> >
> > * variables with sizeless type must not have static or thread-local
> > storage duration
> >
> > In practice it's impossible to define such variables with incomplete type,
> > but having an explicit rule means that things like:
> >
> > extern svint8_t foo; // An SVE vector of int8_t elements.
> >
> > are outright invalid rather than simply useless (because no other
> > translation unit could ever define foo). Similarly, without an
> > explicit rule:
> >
> > svint8_t foo; // An SVE vector of int8_t elements.
> >
> > would be a valid tentative definition at the point it occurs and only
> > become invalid at the end of the translation unit, because svint8_t is
> > never completed.
> >
> > This restriction isn't critical but it gives better diagnostics.
> >
> >
> > Sizeless structures (and testing on non-SVE targets)
> > ====================================================
> >
> > We're planning to build all SVE intrinsic types directly into GCC
> > (patches already written). SVE therefore doesn't strictly need a syntax
> > for creating new sizeless types in C and C++. However, having a way of
> > creating new structure-like "sizeless" types would be useful for three
> > reasons:
> >
> > - Functions could return arbitrary data by value. The SVE ABI allows
> > a function to return up to 8 vectors and 4 predicates in registers,
> > which is far more flexible than the intrinsic types.
> >
> > - We could use these sizeless structure types to test the functionality
> > on all targets.
> >
> > - A lot of the C++ frontend is concerned with classes, and having
> > a way of creating sizeless classes would help make the C++ changes
> > more consistent.
> >
> > The patches therefore add a new "__sizeless_struct" keyword to denote
> > structures that are sizeless rather than sized. Unlike normal
> > structures, these structures can have members of sizeless type in
> > addition to members of sized type. On the other hand, they have all
> > the same limitations as other sizeless types (described in earlier
> > sections).
> >
> > E.g., a sizeless structure definition might look like:
> >
> > __sizeless_struct data {
> > double *array;
> > svuint64_t indices; // An SVE vector of uint64_t elements.
> > svbool_t active; // An SVE predicate.
> > };
> >
> > Adding a new keyword seemed better than using an attribute because it
> > means that the sized vs. sizeless distinction is fixed by the declaration.
> > E.g.:
> >
> > struct data; // Is it sized or sizeless?
> > extern struct data global_data; // OK if sized, not if sizeless.
> > struct __attribute__((sizeless)) data {
> > double *array;
> > svuint64_t indices; // An SVE vector of uint64_t elements.
> > svbool_t active; // An SVE predicate.
> > };
> >
> > would lead to the declaration of "global_data" sneaking through
> > despite being invalid when "data" is sizeless.
> >
> > The tests in the patches all use these __sizeless_structs; they contain
> > nothing SVE- or AArch64-specific.
> >
> >
> > Other variable-length vector architectures
> > ==========================================
> >
> > The proposed RISC-V vector extension also has variable-length vectors.
> > When this language change was discussed on the clang developers' list,
> > Bruce Hoult (from SiFive, but speaking personally) replied with:
> >
> > http://lists.llvm.org/pipermail/cfe-dev/2018-May/057943.html
> >
> > That message covers some of the background about the vector extension.
> > On the language changes, Bruce said:
> >
> > > However, even though the length is variable, the concept of a
> > > "register-sized" C and C++ vector type makes just as much sense for SVE
> > > as it does for other vector architectures. Vector library functions
> > > take such register-sized vectors as input and return them as results.
> > > Intrinsic functions are also just as useful as they are for other vector
> > > architectures, and they too take register-sized vectors as input and
> > > return them as results.
> >
> > Intrinsic functions are absolutely required, and are I think the main
> > reason for such a low-level register-sized vector type to exist.
> >
> > [ Bruce went on to say:
> >
> > I'm not sure whether user-written functions operating on register-sized
> > vectors are useful enough to support. User-written functions would normally
> > take and return a higher-level vector type, and would implement the desired
> > functionality in terms of calls to other user-written functions (operating
> > on the high level vector as a whole) and/or explicit loops iterating
> > through the high level vector type using intrinsic functions on the
> > register-sized vector type proposed here.
> >
> > But this use case is very important for SVE, since it will allow us
> > to implement vector math routines in a way that works with the OpenMP
> > "declare simd" construct. There was also talk on gcc@ recently about
> > supporting this style of interface for RISC-V. ]
> >
> > [...]
> >
> > > All these types are opaque builtin types and are only intended to be
> > > used with the associated ACLE intrinsics. There are intrinsics for
> > > creating vectors from scalars, loading from scalars, storing to scalars,
> > > reinterpreting one type as another, etc.
> > >
> > > The idea is that the vector types would only be used for short-term
> > > register-sized working data. Longer-term data would typically be stored
> > > out to arrays.
> >
> > I agree with this.
> >
> > [...]
> >
> > > The approach we took was to treat all the SVE types as permanently
> > > incomplete.
> >
> > This seems reasonable.
> >
> > So it looks like this extension would be useful for at least one
> > architecture besides SVE.
> >
> >
> > Edits to the C standard
> > =======================
> >
> > This section specifies the behaviour for sizeless types as an edit to N1570.
> > There are three stages:
> >
> > - base changes, which add enough support for built-in sizeless
> > vector types
> >
> > - updates for consistency, which change some of the wording without
> > changing the meaning
> >
> > - support for sizeless structures
> >
> > In each case, -strikethrough- indicates deleted text and *bold*
> > includes additional text.
> >
> >
> > Base changes
> > ------------
> >
> > These changes are enough to support sizeless built-in vector types.
> >
> > 6.2.5 Types
> > -----------
> >
> > 1. The meaning of a value stored in an object or returned by a
> > function is determined by the type of the expression used to access
> > it. … Types are partitioned into object types (types that
> > describe objects) and function types (types that describe
> > functions). -At various points within a translation unit an object
> > type may be incomplete (lacking sufficient information to determine
> > the size of objects of that type) or complete (having sufficient
> > information).37)- *Object types are further partitioned into sized and
> > sizeless; all basic and derived types defined in this standard are
> > sized, but an implementation may provide additional sizeless types.*
> >
> > 1A. *At various points within a translation unit an object type may
> > be indefinite (lacking sufficient information to construct an object
> > of that type) or definite (having sufficient information).37) An
> > object type is said to be complete if it is both sized and definite;
> > all other object types are said to be incomplete. Complete types
> > have sufficient information to determine the size of an object of
> > that type while incomplete types do not.*
> >
> > 1B. *Arrays, structures, unions and enumerated types are always
> > sized, so for them the term incomplete is equivalent to (and used
> > interchangeably with) the term indefinite.*
> >
> > …
> >
> > 19. The void type comprises an empty set of values; it is -an
> > incomplete- *a sized indefinite* object type that cannot be completed
> > *(made definite)*.
> >
> > …
> >
> > 37) A type may be -incomplete- *indefinite* or -complete- *definite*
> > throughout an entire translation unit, or it may change states at
> > different points within a translation unit.
> >
> > …
> >
> > 6.3.2.1 Lvalues, arrays, and function designators
> > -------------------------------------------------
> >
> > 1. An lvalue is an expression (with an object type other than void)
> > that potentially designates an object;64) … A modifiable lvalue is
> > an lvalue that does not have array type, does not have an
> > -incomplete- *indefinite* type, does not have a const-qualified
> > type, …
> >
> > 2. Except when it is the operand of the sizeof operator, the
> > _Alignof operator, the unary & operator, the ++ operator, the --
> > operator, or the left operand of the . operator or an assignment
> > operator, an lvalue that does not have array type is converted to
> > the value stored in the designated object (and is no longer an
> > lvalue); this is called lvalue conversion. … If the lvalue has an
> > -incomplete- *indefinite* type and does not have array type, the
> > behavior is undefined. …
> >
> > …
> >
> > 6.5.1.1 Generic selection
> > -------------------------
> >
> > …
> >
> > Constraints
> >
> > 2. A generic selection shall have no more than one default generic
> > association. The type name in a generic association shall specify a
> > -complete- *definite* object type other than a variably modified
> > type. …
> >
> > …
> >
> > 6.5.2.2 Function calls
> > ----------------------
> >
> > Constraints
> >
> > 1. The expression that denotes the called function92) shall have
> > type pointer to function returning void or returning a -complete-
> > *definite* object type other than an array type.
> >
> > …
> >
> > Semantics
> >
> > …
> >
> > 4. An argument may be an expression of any -complete- *definite* object
> > type. …
> >
> > …
> >
> > 6.5.2.5 Compound literals
> > -------------------------
> >
> > Constraints
> >
> > 1. The type name shall specify a -complete- *definite* object type or an
> > array of unknown size, but not a variable length array type.
> >
> > …
> >
> > 6.7 Declarations
> > ----------------
> >
> > Constraints
> >
> > …
> >
> > 4A. *If an identifier for an object does not have automatic storage
> > duration, its type must be sized rather than sizeless.*
> >
> > Semantics
> >
> > …
> >
> > 7. If an identifier for an object is declared with no linkage, the
> > type for the object shall be -complete- *definite* by the end of its
> > declarator, or by the end of its init-declarator if it has an
> > initializer; in the case of function parameters (including in
> > prototypes), it is the adjusted type (see 6.7.6.3) that is required
> > to be -complete- *definite*.
> >
> > …
> >
> > 6.7.6.3 Function declarators (including prototypes)
> > ---------------------------------------------------
> >
> > Constraints
> >
> > …
> >
> > 4. After adjustment, the parameters in a parameter type list in a
> > function declarator that is part of a definition of that function
> > shall not have -incomplete- *indefinite* type.
> >
> > …
> >
> > 6.7.9 Initialization
> > --------------------
> >
> > Constraints
> >
> > …
> >
> > 3. The type of the entity to be initialized shall be an array of
> > unknown size or a -complete- *definite* object type that is not a
> > variable length array type.
> >
> > …
> >
> > 6.9.1 Function definitions
> > --------------------------
> >
> > Constraints
> >
> > …
> >
> > 3. The return type of a function shall be void or a -complete-
> > *definite* object type other than array type.
> >
> > …
> >
> > Semantics
> >
> > …
> >
> > 7. The declarator in a function definition specifies the name of the
> > function being defined and the identifiers of its parameters. …
> > [T]he type of each parameter is adjusted as described in
> > 6.7.6.3 for a parameter type list; the resulting type shall be a
> > -complete- *definite* object type.
> >
> > …
> >
> > J.2 Undefined behavior
> > ----------------------
> >
> > …
> > * A non-array lvalue with -an incomplete- *an indefinite* type is used
> > in a context that requires the value of the designated object
> > (6.3.2.1).
> > …
> > * An identifier for an object is declared with no linkage and the
> > type of the object is -incomplete- *indefinite* after its
> > declarator, or after its init-declarator if it has an
> > initializer (6.7).
> > …
> > * An adjusted parameter type in a function definition is not a
> > -complete- *definite* object type (6.9.1).
> > …
> >
> > Updates for consistency
> > -----------------------
> >
> > These changes are a prerequisite for sizeless structures. They have no
> > effect otherwise, but might be preferred anyway because they make the
> > terminology more consistent. They apply on top of the previous edits.
> >
> > 6.2.5 Types
> > -----------
> >
> > …
> >
> > 22. An array type of unknown size is an -incomplete- *indefinite*
> > type. It is -completed- *made definite*, for an identifier of that type,
> > by specifying the size in a later declaration (with internal or
> > external linkage). A structure or union type of unknown content (as
> > described in 6.7.2.3) is an -incomplete- *indefinite* type. It is
> > -completed- *made definite*, for all declarations of that type, by
> > declaring the same structure or union tag with its defining content
> > later in the same scope.
> >
> > …
> >
> > 6.2.7 Compatible type and composite type
> > ----------------------------------------
> >
> > 1. Two types have compatible type if their types are the same. …
> > Moreover, two structure, union, or enumerated types declared in
> > separate translation units are compatible if their tags and members
> > satisfy the following requirements: If one is declared with a tag,
> > the other shall be declared with the same tag. If both are
> > -completed- *made definite* anywhere within their respective
> > translation units, then the following additional requirements apply: …
> >
> > …
> >
> > 6.7.2.1 Structure and union specifiers
> > --------------------------------------
> >
> > …
> >
> > Semantics
> >
> > …
> >
> > 8. The presence of a struct-declaration-list in a
> > struct-or-union-specifier declares a new type, within a translation
> > unit. The struct-declaration-list is a sequence of declarations for
> > the members of the structure or union. If the struct-declaration-list
> > does not contain any named members, either directly or via an anonymous
> > structure or anonymous union, the behavior is undefined. The type is
> > -incomplete- *indefinite* until immediately after the } that terminates
> > the list, and -complete- *definite* thereafter.
> >
> > …
> >
> > 6.7.2.2 Enumeration specifiers
> > ------------------------------
> >
> > …
> >
> > Semantics
> >
> > …
> >
> > 4. … The enumerated type is -incomplete- *indefinite* until
> > immediately after the } that terminates the list of enumerator
> > declarations, and -complete- *definite* thereafter.
> >
> > …
> >
> > 6.7.2.3 Tags
> > ------------
> >
> > …
> >
> > Semantics
> >
> > 4. All declarations of structure, union, or enumerated types that
> > have the same scope and use the same tag declare the same
> > type. Irrespective of whether there is a tag or what other
> > declarations of the type are in the same translation unit, the type
> > is -incomplete- *indefinite* 129) until immediately after the closing
> > brace of the list defining the content, and -complete- *definite*
> > thereafter.
> >
> > …
> >
> > 8. If a type specifier of the form
> >
> > struct-or-union identifier
> >
> > occurs other than as part of one of the above forms, and no other
> > declaration of the identifier as a tag is visible, then it declares
> > an -incomplete- *indefinite* structure or union type, and declares the
> > identifier as the tag of that type.131)
> >
> > …
> >
> > 129) An -incomplete- *indefinite* type may only by used when -the
> > size of an object- *the ability to create an object* of that type
> > is not needed. It is not needed, for example, when a typedef name
> > is declared to be a specifier for a structure or union, or when a
> > pointer to or a function returning a structure or union is being
> > declared. (See -incomplete- *indefinite* types in 6.2.5.) The
> > specification has to be -complete- *definite* before such a function
> > is called or defined.
> >
> > 6.7.6.3 Function declarators (including prototypes)
> > ---------------------------------------------------
> >
> > …
> >
> > Semantics
> >
> > …
> >
> > 12. If the function declarator is not part of a definition of that
> > function, parameters may have -incomplete- *indefinite* type and may use
> > the [*] notation in their sequences of declarator specifiers to
> > specify variable length array types.
> >
> > …
> >
> > J.2 Undefined behavior
> > ----------------------
> >
> > …
> > * When the -complete- *definite* type is needed, an -incomplete-
> > *indefinite* structure or union type is not completed in the same
> > scope by another declaration of the tag that defines the content
> > (6.7.2.3).
> > …
> >
> > Sizeless structures
> > -------------------
> >
> > These additional changes to N1570 add the concept of a sizeless structure.
> > Again they apply on top of the edits above:
> >
> > 6.2.3 Name spaces of identifiers
> > --------------------------------
> >
> > 1. If more than one declaration of a particular identifier is
> > visible at any point in a translation unit, the syntactic context
> > disambiguates uses that refer to different entities. Thus, there
> > are separate name spaces for various categories of identifiers, as
> > follows:
> >
> > …
> >
> > * the tags of *sized* structures, *sizeless structures,* unions, and
> > enumerations (disambiguated by following any32) of the keywords
> > struct, *__sizeless_struct,* union, or enum);
> >
> > …
> >
> > 6.2.5 Types
> > -----------
> >
> > 1. … Types are partitioned into object types (types that describe
> > objects) and function types (types that describe functions).
> > Object types are further partitioned into sized and sizeless;
> > -all basic and derived types defined in this standard are
> > sized, but an implementation may provide additional sizeless types.-
> > *the only sizeless types defined by this standard are __sizeless_structs,
> > but an implementation may provide additional sizeless types.*
> >
> > …
> >
> > 1B. Arrays, -structures,- unions and enumerated types are always
> > sized, so for them the term incomplete is equivalent to (and used
> > interchangeably with) the term indefinite.
> >
> > …
> >
> > 20. Any number of derived types can be constructed from the object
> > and function types, as follows: …
> >
> > * A *sized* structure type describes a sequentially allocated
> > nonempty set of sized member objects (and, in certain
> > circumstances, an incomplete array), each of which has an
> > optionally specified name and possibly distinct type.
> >
> > * *A sizeless structure type describes a set of non-overlapping
> > member objects whose types may be sizeless and whose relative
> > positions are unspecified. It is also unspecified whether the
> > structure occupies a single contiguous piece of storage or
> > whether it requires several disjoint pieces.*
> >
> > …
> >
> > *20A. The term structure type refers collectively to sized structure
> > types and sizeless structure types.*
> >
> > …
> >
> > 6.4.1 Keywords
> > --------------
> >
> > Syntax
> >
> > 1. *(Add __sizeless_struct to the list and update the copy in A.1.2)*
> >
> > …
> >
> > 6.5.8 Relational operators
> > --------------------------
> >
> > …
> >
> > Semantics
> >
> > …
> >
> > 5. When two pointers are compared, the result depends on the
> > relative locations in the address space of the objects pointed to.
> > … If the objects pointed to are members of the same aggregate object,
> > pointers to *sized* structure members declared later compare greater
> > than pointers to members declared earlier in the structure, and
> > pointers to array elements with larger subscript values compare
> > greater than pointers to elements of the same array with lower
> > subscript values. …
> >
> > …
> >
> > 6.7.2.1 Structure and union specifiers
> > --------------------------------------
> >
> > Syntax
> >
> > struct-or-union-specifier:
> > struct-or-union identifieropt { struct-declaration-list }
> > struct-or-union identifier
> >
> > struct-or-union:
> > struct
> > *__sizeless_struct*
> > union
> >
> > …
> >
> > 3. A *sized* structure or union shall not contain a member with
> > incomplete or function type …, except that the last member of a
> > structure with more than one named member may have incomplete array
> > type; such a structure (and any union containing, possibly
> > recursively, a member that is such a structure) shall not be a
> > member of a structure or an element of an array. *Simlarly, a
> > sizeless structure shall not contain a member with indefinite or
> > function type; the exception for incomplete array types does not
> > apply.*
> >
> > …
> >
> > Semantics
> >
> > 6. As discussed in 6.2.5, a *sized* structure is a type consisting
> > of a sequence of members, whose storage is allocated in an ordered
> > sequence; *a sizeless structure is a type consisting of
> > non-overlapping members whose relative position is unspecified,*
> > and a union is a type consisting of a sequence of members whose
> > storage overlap.
> >
> > 7. Structure and union specifiers have the same form. The keywords
> > struct, *__sizeless_struct* and union indicate that the type being
> > specified is, respectively, a *sized* structure type, *a sizeless
> > structure type,* or a union type.
> >
> > …[8 is as above]…
> >
> > 9. A member of a structure or union may have any complete object
> > type other than a variably modified type.123) *A member of a sizeless
> > structure may also have a sizeless definite type.* In addition, a
> > member *of a structure or union* may be declared to consist of a
> > specified number of bits (including a sign bit, if any). Such a
> > member is called a bit-field;124) its width is preceded by a colon.
> >
> > …
> >
> > 15. Within a *sized* structure object, the non-bit-field members and
> > the units in which bit-fields reside have addresses that increase in
> > the order in which they are declared. A pointer to a *sized* structure
> > object, suitably converted, points to its initial member (or if that
> > member is a bit-field, then to the unit in which it resides), and
> > vice versa. There may be unnamed padding within a *sized* structure
> > object, but not at its beginning.
> >
> > 15A. *The representation of a sizeless structure object is
> > unspecified. It is possible to form pointers to the structure
> > itself and to its individual members, but the relationship between
> > their addresses is unspecified. The structure may occupy a single
> > piece of contiguous storage or it may occupy several disjoint
> > pieces.*
> >
> > …
> >
> > 18 As a special case, the last element of a *sized* structure with
> > more than one named member may have an incomplete array type; this
> > is called a flexible array member. …
> >
> > …
> >
> > 6.7.2.3 Tags
> > ------------
> >
> > Constraints
> >
> > …
> >
> > 2. Where two declarations that use the same tag declare the same
> > type, they shall both use the same choice of struct, *__sizeless_struct,*
> > union, or enum.
> >
> > …
> >
> >
> > Edits to the C++ standard
> > =========================
> >
> > We have a similar set of changes to the C++ standard, but this RFC is
> > long enough already, so I've not included them here. I also didn't find
> > them to be particularly useful when writing the C++ patches, since most
> > of the changes were obvious given a few basic rules. Those rules are:
> >
> > - type traits can be used with sizeless types (unlike incomplete types)
> >
> > - sizeless structures cannot have base classes or be used as base classes
> >
> > - sizeless structures cannot have virtual members
> >
> > - pointers to member variables are invalid for sizeless structures
> > (although taking the address of a member of a specific sizeless object
> > is fine, as for C)
> >
> > - sizeless types are not literal types
> >
> > - sizeless types cannot be created by operator new (as for incomplete types)
> >
> > - sizeless types cannot be deleted (so, unlike for incomplete types,
> > this is an error rather than a warning)
> >
> > - sizeless types cannot be thrown or caught (as for incomplete types)
> >
> > - sizeless types cannot be used with typeid() (as for incomplete types)
> >
> >
> > GCC implementation questions
> > ============================
> >
> > The GCC patches are pretty simple in principle. The language changes
> > involve going through the standard replacing "complete" with "definite"
> > and most of the GCC patches go through the frontend code making the
> > same kind of change.
> >
> > New type flag for sizeless types
> > --------------------------------
> >
> > The patches add a new flag TYPE_SIZELESS_P to represent the negative of:
> >
> > * would fully-defining the type determine its size?
> >
> > from the summary above. Negative names are usually a bad thing,
> > but the natural default is for the flag to be off.
> >
> > There are currently 17 bits free in tree_type_common, so the patches
> > steal one of those. Is that OK?
> >
> > The effect on COMPLETE_TYPE_P
> > -----------------------------
> >
> > The current definition of COMPLETE_TYPE_P is:
> >
> > /* Nonzero if this type is a complete type. */
> > #define COMPLETE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)
> >
> > Although the SVE types don't have a measurable size at the language
> > level, they still have a TYPE_SIZE and TYPE_SIZE_UNIT, with the sizes
> > using placeholders for the runtime vector size. So after the split
> > described in the summary, TYPE_SIZE (NODE) != NULL_TREE means
> > "the type is fully defined" rather than "the type is complete".
> > With TYPE_SIZELESS_P, the definition of "complete type" would be:
> >
> > #define COMPLETE_TYPE_P(NODE) \
> > (TYPE_SIZE (NODE) != NULL_TREE && !TYPE_SIZELESS_P (NODE))
> >
> > i.e. the type is fully-defined, and fully-defining it determines
> > its size at the language level.
> >
> > Uses of COMPLETE_TYPE_P outside the frontends
> > ---------------------------------------------
> >
> > The main complication is that the concept of "complete type" is exposed
> > outside the frontends, with COMPLETE_TYPE_P being defined in tree.h.
> >
> > I tried to audit all uses outside the frontends and it looks like
> > they're all testing whether "the type is fully defined" and don't
> > care about the distinction between sized and sizeless. This means
> > that the current definition (rather than the new definition)
> > should be correct in all cases.
> >
> > In some cases the tests are simple null checks, like:
> >
> > /* Try to approach equal type sizes. */
> > if (!COMPLETE_TYPE_P (type_a)
> > || !COMPLETE_TYPE_P (type_b)
> > || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
> > || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
> > break;
> >
> > IMO it's more obvious to test TYPE_SIZE_UNIT directly for null here.
> > Having a wrapper doesn't add much.
> >
> > In places like:
> >
> > if (!COMPLETE_TYPE_P (t))
> > layout_type (t);
> >
> > and:
> >
> > if (COMPLETE_TYPE_P (t) && TYPE_CANONICAL (t)
> > && TYPE_MODE (t) != TYPE_MODE (TYPE_CANONICAL (t)))
> > ...
> >
> > it's testing whether the type has been laid out already.
> >
> > So the patches do two things:
> >
> > * Expand the definition of the current COMPLETE_TYPE_P macro outside
> > the frontends if the macro is simply protecting against a null
> > dereference.
> >
> > * Make COMPLETE_TYPE_P local to the frontends and rename all uses
> > outside the frontends.
> >
> > As far as the second point goes, I wasn't sure what new name to use
> > outside the front ends. Possibilities include:
> >
> > - DEFINITE_TYPE_P
> > - USABLE_TYPE_P
> > - VALID_VAR_TYPE_P
> > - TYPE_LAID_OUT_P
> > - TYPE_DEFINED_P
> > - TYPE_FULLY_DEFINED_P
> > - TYPE_READY_P
> > ...other suggestions welcome...
> >
> > I went for DEFINITE_TYPE_P because that's what the SVE specification
> > uses, but something more neutral like TYPE_DEFINED_P might be better.
> >
> > Frontend changes
> > ----------------
> >
> > The frontend patches change COMPLETE_TYPE_P to DEFINITE_TYPE_P where
> > necessary. I've tried where possible to accompany each individual
> > change with a test.
> >
> > This worked fairly naturally (IMO) for C, and most of the changes could
> > be tied directly to the language edits above.
> >
> > For C++ it was more difficult (not surprisingly). There are a lot of
> > tests for COMPLETE_TYPE_P that are obviously testing whether a class
> > has been fully defined, and are more concerned with name lookup than
> > TYPE_SIZE. The same goes for COMPLETE_OR_OPEN_TYPE_P and whether the
> > definition has been started. So while the C changes were relatively
> > small and self-contained, the C++ changes replace many more uses of
> > COMPLETE_TYPE_P than they keep. This makes me wonder whether it's a
> > good idea to keep COMPLETE_TYPE_P at all, or whether it would be better
> > to replace the remaining uses with something more explicit like:
> >
> > TYPE_SIZE_KNOWN_P
> > TYPE_SIZE_DEFINED_P
> > TYPE_SIZE_MEASURABLE_P
> > TYPE_SIZE_COMPLETE_P
> > ...suggestions again welcome...
> >
> > Thanks,
> > Richard