[PATCH 02/14] Add D frontend (GDC) implementation.

Richard Sandiford richard.sandiford@arm.com
Tue Oct 16 11:39:00 GMT 2018


Iain Buclaw <ibuclaw@gdcproject.org> writes:
> On 18 September 2018 at 02:33, Iain Buclaw <ibuclaw@gdcproject.org> wrote:
>> This patch adds the D front-end implementation, the only part of the
>> compiler that interacts with GCC directly, and being the parts that I
>> maintain, is something that I can talk about more directly.
>>
>> For the actual code generation pass, that converts the front-end AST
>> to GCC trees, most parts use a separate Visitor interfaces to do a
>> certain kind of lowering, for instance, types.cc builds *_TYPE trees
>> from AST Type's.  The Visitor class is part of the DMD front-end, and
>> is defined in dfrontend/visitor.h.
>>
>> There are also a few interfaces which have their headers in the DMD
>> frontend, but are implemented here because they do something that
>> requires knowledge of the GCC backend (d-target.cc), does something
>> that may not be portable, or differ between D compilers
>> (d-frontend.cc) or are a thin wrapper around something that is managed
>> by GCC (d-diagnostic.cc).
>>
>> Many high level operations result in generation of calls to D runtime
>> library functions (runtime.def), all with require some kind of runtime
>> type information (typeinfo.cc).  The compiler also generates functions
>> for registering/deregistering compiled modules with the D runtime
>> library (modules.cc).
>>
>> As well as the D language having it's own built-in functions
>> (intrinsics.cc), we also expose GCC builtins to D code via a
>> `gcc.builtins' module (d-builtins.cc), and give special treatment to a
>> number of UDAs that could be applied to functions (d-attribs.cc).
>>
>>
>> That is roughly the high level jist of how things are currently organized.
>>
>> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>>
>
> https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00587.html
>
> During the last round, the last comment given was that the reviewer
> wasn't going to dive deep into this code, as it's essentially
> converting between the different representations and is code that I'd
> be maintaining.
>
> As this is code that other gcc maintainers will be potentially looking
> after as well, I'd like any glaring problems to be dealt with
> immediately.

I won't claim that I dived deep into the code either, since with a patch
of this size that would take weeks.  But FWIW I read it through trying
to scan for anything that stood out.

I think the patch is OK besides the below.  The comments are in patch
order and so mix very trivial stuff with things that seem more fundamental.
I think the only real blocker is the point below about translations.

> +/* Define attributes that are mutually exclusive with one another.  */
> +static const struct attribute_spec::exclusions attr_noreturn_exclusions[] =
> +{
> +  ATTR_EXCL ("noreturn", true, true, true),

Why does noreturn exclude itself?  Probably worth a comment if deliberate.

> +tree
> +build_attributes (Expressions *eattrs)
> +{
> [...]
> +      tree list = build_tree_list (get_identifier (name), args);
> +      attribs =  chainon (attribs, list);

Too many spaces before "chainon".

> +static tree
> +handle_const_attribute (tree *node, tree ARG_UNUSED (name),
> +			tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			bool * ARG_UNUSED (no_add_attrs))
> +{
> +  tree type = TREE_TYPE (*node);
> +
> +  /* See FIXME comment on noreturn in c_common_attribute_table.  */

Seems strange to be referencing c-family code here, especially since
there's no corresponding comment for handle_noreturn_attribute and
also no FIXME comment in the D version of the table.  Think we should
just drop the comment above.

It's unfortunate that there's so much cut-&-paste with c-attribs.c,
but there again, the current split between target-independent code
(attribs.*) and frontend code (c-attribs.*) isn't very clear-cut,
so I don't think this should hold up acceptance.

I'm going to review it as new code though, so:

> +/* Handle a "malloc" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +tree
> +handle_malloc_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL
> +      && POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (*node))))
> +    DECL_IS_MALLOC (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "pure" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_pure_attribute (tree *node, tree ARG_UNUSED (name),
> +		       tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +		       bool * ARG_UNUSED (no_add_attrs))
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL)
> +    DECL_PURE_P (*node) = 1;
> +  else
> +    gcc_unreachable ();
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "no vops" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_novops_attribute (tree *node, tree ARG_UNUSED (name),
> +			 tree ARG_UNUSED (args), int ARG_UNUSED (flags),
> +			 bool * ARG_UNUSED (no_add_attrs))
> +{
> +  gcc_assert (TREE_CODE (*node) == FUNCTION_DECL);
> +  DECL_IS_NOVOPS (*node) = 1;
> +  return NULL_TREE;
> +}

Think the first two should follow the example of novops and use
gcc_assert rather than gcc_unreaachable.  Same for some later functions.

> +/* Build D frontend type from tree TYPE type given.  This will set the backend
> +   type symbol directly for complex types to save build_ctype() the work.
> +   For other types, it is not useful or will causes errors, such as casting

s/causes/cause/ or s/will //

> +  /* For now, we need to tell the D frontend what platform is being targeted.
> +     This should be removed once the frontend has been fixed.  */
> +  if (strcmp (ident, "linux") == 0)
> +    global.params.isLinux = true;
> +  else if (strcmp (ident, "OSX") == 0)
> +    global.params.isOSX = true;
> +  else if (strcmp (ident, "Windows") == 0)
> +    global.params.isWindows = true;
> +  else if (strcmp (ident, "FreeBSD") == 0)
> +    global.params.isFreeBSD = true;
> +  else if (strcmp (ident, "OpenBSD") == 0)
> +    global.params.isOpenBSD = true;
> +  else if (strcmp (ident, "Solaris") == 0)
> +    global.params.isSolaris = true;
> +  else if (strcmp (ident, "X86_64") == 0)
> +    global.params.is64bit = true;

Probably worth adding a comment to say why only X86_64 causes is64bit
to be set, and to say that it's OK to silently ignore other strings
(assuming it is).

> +      tf->purity = DECL_PURE_P (decl) ?   PUREstrong :
> +	TREE_READONLY (decl) ? PUREconst :
> +	DECL_IS_NOVOPS (decl) ? PUREweak :
> +	!DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak :
> +	PUREimpure;
> +      tf->trust = !DECL_ASSEMBLER_NAME_SET_P (decl) ? TRUSTsafe :
> +	TREE_NOTHROW (decl) ? TRUSTtrusted :
> +	TRUSTsystem;

Formatting nit, think this should be:

      tf->purity = (DECL_PURE_P (decl) ? PUREstrong
		    : TREE_READONLY (decl) ? PUREconst
		    : DECL_IS_NOVOPS (decl) ? PUREweak
		    : !DECL_ASSEMBLER_NAME_SET_P (decl) ? PUREweak
		    : PUREimpure);

etc.

> +/* Search for any `extern(C)' functions that match any known GCC library builtin
> +   function in D and override it's internal backend symbol.  */

s/it's/its/

> +/* Used to help initialize the builtin-types.def table.  When a type of
> +   the correct size doesn't exist, use error_mark_node instead of NULL.
> +   The later results in segfaults even when a decl using the type doesn't
> +   get invoked.  */

s/later/latter/.  (The typo is in the code you copied this from, but might
as will fix it here.)

> +/* Build nodes that would have be created by the C front-end; necessary
> +   for including builtin-types.def and ultimately builtins.def.  */

"would be" or "would have been"

> +/* Build nodes that are used by the D front-end.
> +   These are distinct from C types.  */
> +
> +static void
> +d_build_d_type_nodes (void)
> +{
> +  /* Integral types.  */
> +  byte_type_node = make_signed_type (8);
> +  ubyte_type_node = make_unsigned_type (8);
> +
> +  short_type_node = make_signed_type (16);
> +  ushort_type_node = make_unsigned_type (16);
> +
> +  int_type_node = make_signed_type (32);
> +  uint_type_node = make_unsigned_type (32);
> +
> +  long_type_node = make_signed_type (64);
> +  ulong_type_node = make_unsigned_type (64);

It's a bit confusing for the D type to be long_type_node but the C/ABI type
to be long_integer_type_node.  The D type is surely an integer too. :-)
With this coming among the handling of built-in functions, it initially
looked related, and I was wondering how it was safe on ILP32 systems
before realising the difference.

Maybe prefixing them all with "d_" would be too ugly, but it would at
least be good to clarify the comment to say that these are "distinct
type nodes" (rather than just distinct definitions, as I'd initially
assumed) and that they're not used outside the frontend, or by the C
imports.

> +/* Return the GCC location for the D frontend location LOC.   */
> +
> +location_t
> +get_linemap (const Loc& loc)

FWIW, I think GCC style is to add a space before "&" rather than
after it, as for pointers.  But since you use this style consistently
(and I'm guessing would instinctively use it in later work), it's
probably better to leave it as-is.  The go interface similarly
has local conventions regarding things like spaces before "(".

> +tree
> +d_array_value (tree type, tree len, tree data)
> +{
> +  /* TODO: Assert type is a D array.  */

Just curious, what makes asserting difficult enough to be a TODO?

> +/* Returns the .funcptr component from the D delegate EXP.  */
> +
> +tree
> +delegate_method (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array length
> +     field (assumed to be the second field).  */
> +  tree method_field = TREE_CHAIN (TYPE_FIELDS (TREE_TYPE (exp)));
> +  return component_ref (exp, method_field);
> +}
> +
> +/* Returns the .object component from the delegate EXP.  */
> +
> +tree
> +delegate_object (tree exp)
> +{
> +  /* Get the backend type for the array and pick out the array data
> +     pointer field (assumed to be the first field).  */
> +  tree obj_field = TYPE_FIELDS (TREE_TYPE (exp));
> +  return component_ref (exp, obj_field);
> +}

Looks like the body comments might be a cut-&-pasto?  They reference the
right index, but don't match the function comments.

> +/* Return TRUE if EXP is a valid lvalue.  Lvalues references cannot be
> +   made into temporaries, otherwise any assignments will be lost.  */

s/Lvalues references/Lvalue references/?

> +	      /* If the index is NULL, then just assign it to the next field.
> +		 This is expect of layout_typeinfo(), which generates a flat
> +		 list of values that we must shape into the record type.  */
> +	      if (index == field || index == NULL_TREE)
> +		{
> +		  init->ordered_remove (idx);
> +		  if (!finished)
> +		    is_initialized = true;
> +		  break;
> +		}

"This is expect of" looks like a typo.

The function is quadratic in the initialiser size.  It would be
good to avoid that if possible.  That said, it looks like
build_class_instance is also quadratic, so maybe just changing
this on its own wouldn't help much.

> +	  for (size_t i = 0; fd->parameters && i < fd->parameters->dim; i++)
> +	    {
> +	      VarDeclaration *v = (*fd->parameters)[i];
> +	      /* Remove if already in closureVars so can push to front.  */
> +	      for (size_t j = i; j < fd->closureVars.dim; j++)
> +		{
> +		  Dsymbol *s = fd->closureVars[j];
> +		  if (s == v)
> +		    {
> +		      fd->closureVars.remove (j);
> +		      break;
> +		    }
> +		}
> +	      fd->closureVars.insert (i, v);
> +	    }

Looks like this also is quadratic.

> +      /* For nested functions, default to creating a frame.  Even if there is no
> +	 fields to populate the frame, create it anyway, as this will be used as
> +	 the record type instead of `void*` for the this parameter.  */

s/there is no fields/there are no fields/

> +      /* If we are taking the address of a decl that can never be null,
> +	 then the return result is always true.  */
> +      if (DECL_P (TREE_OPERAND (expr, 0))
> +	  && (TREE_CODE (TREE_OPERAND (expr, 0)) == PARM_DECL
> +	      || TREE_CODE (TREE_OPERAND (expr, 0)) == LABEL_DECL
> +	      || !DECL_WEAK (TREE_OPERAND (expr, 0))))
> +	return boolean_true_node;

Does something warn about this?

> +tree
> +convert (tree type, tree expr)
> ...
> +    case COMPLEX_TYPE:
> +      if (TREE_CODE (etype) == REAL_TYPE && TYPE_IMAGINARY_FLOAT (etype))
> +	ret = build2 (COMPLEX_EXPR, type,
> +		      build_zero_cst (TREE_TYPE (type)),
> +		      convert (TREE_TYPE (type), expr));
> +      else
> +	ret = convert_to_complex (type, e);
> +      goto maybe_fold;
> ...
> +    maybe_fold:
> +      if (!TREE_CONSTANT (ret))
> +	ret = fold (ret);
> +      return ret;

etc.  Any reason not to use fold_build* here and in subroutines (for all
cases, not just COMPLEX_TYPE), rather than fold at the end?  Or is the
!TREE_CONSTANT deliberately trying to avoid folding constants that
fold_build* would normally fold?   build* routines set TREE_CONSTANT
for operations with constant operands as well as constant literals,
so is the code trying to avoid folding those?

The routines use fold_build* in some places and plain build* with
a goto in others, but it wasn't obvious why.  Seems like it deserves
a comment.

> +	  /* Strings are treated as dynamic arrays D2.  */

Should that be by/in D2?

> +  return result ? result :
> +    convert (build_ctype (totype), exp);

Odd formatting, fits easily on a line.  If you want to split it,
the ":" should be on the second line rather than the first.

> +/* Apply semantics of assignment to a values of type TOTYPE to EXPR
> +   (e.g., pointer = array -> pointer = &array[0])

Type ("to a values")

> +   Return a TREE representation of EXPR implictly converted to TOTYPE
> +   for use in assignment expressions MODIFY_EXPR, INIT_EXPR.  */

"implicitly"

> +/* Helper routine for all error routines.  Reports a diagnostic specified by
> +   KIND at the explicit location LOC, where the message FORMAT has not yet
> +   been translated by the gcc diagnostic routines.  */
> +
> +static void ATTRIBUTE_GCC_DIAG(3,0)
> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
> +				va_list ap, diagnostic_t kind, bool verbatim)
> +{
> +  va_list argp;
> +  va_copy (argp, ap);
> +
> +  if (loc.filename || !verbatim)
> +    {
> +      rich_location rich_loc (line_table, get_linemap (loc));
> +      diagnostic_info diagnostic;
> +      char *xformat = expand_format (format);
> +
> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);

How does this work with translation?  xgettext will only see the original
format string, not the result of expand_format.  Do you have some scripting
to do the same format mangling when collecting the translation strings?
Same concern:

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +verror (const Loc& loc, const char *format, va_list ap,
> +	const char *p1, const char *p2, const char *)
> +{
> +  if (!global.gag || global.params.showGaggedErrors)
> +    {
> +      char *xformat;
> +
> +      /* Build string and emit.  */
> +      if (p2 != NULL)
> +	xformat = xasprintf ("%s %s %s", p1, p2, format);
> +      else if (p1 != NULL)
> +	xformat = xasprintf ("%s %s", p1, format);
> +      else
> +	xformat = xasprintf ("%s", format);

...with the concatenation here.

It looks at first glance like the callers to d_diagnostic_report_diagnostic
should be doing the conversion themselves via _(format), before doing
any concatenation.  Then d_diagnostic_report_diagnostic can use
expand_format on the translated string and use
diagnostic_set_info_translated.  But I'm not an expert on this stuff.

Also, the function should document what p1 and p2 are.  Do they
need to be translated?

Does xgettext get to see all frontend error messages?  Haven't got
to the later patches yet.

Please add an overview comment somewhere in the diagnostic routines
explaining how translation works wrt errors that come from the D
frontend.

> +void ATTRIBUTE_GCC_DIAG(2,0)
> +vmessage(const Loc& loc, const char *format, va_list ap)

Missing space before "(".

+/* Start gagging. Return the current number of gagged errors.  */

Should be two spaces before "Return"

> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongLE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value;
> +    p[1] = (unsigned) value >> 8;
> +    p[2] = (unsigned) value >> 16;
> +    p[3] = (unsigned) value >> 24;
> +}
> ...
> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
> +
> +void
> +Port::writelongBE (unsigned value, void *buffer)
> +{
> +    unsigned char *p = (unsigned char*) buffer;
> +
> +    p[0] = (unsigned) value >> 24;
> +    p[1] = (unsigned) value >> 16;
> +    p[2] = (unsigned) value >> 8;
> +    p[3] = (unsigned) value;
> +}

Overindented bodies.  Missing space before "*" in "(unsigned char*)"
in all these functions.

Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
Is it also used in ways that require the target BITS_PER_UNIT to be 8
as well?  That could realistically be a different value (and we've had
ports like that in the past).

> +/* Return a hash value for real_t value R.  */
> +
> +size_t
> +CTFloat::hash (real_t r)
> +{
> +    return real_hash (&r.rv ());
> +}

Overindented body.

> +/* Write out all dependencies of a given MODULE to the specified BUFFER.
> +   COLMAX is the number of columns to word-wrap at (0 means don't wrap).  */
> +
> +static void
> +deps_write (Module *module, OutBuffer *buffer, unsigned colmax = 72)
> +{
> ...
> +  /* Write out all make dependencies.  */
> +  while (modlist.dim > 0)
> +  {

Misindented while body.

> +      else if (empty_modify_p (TREE_TYPE (op0), op1))
> +	{
> +	  /* Remove any copies of empty aggregates.  Also drop volatile
> +	     loads on the RHS to avoid infinite recursion from
> +	     gimplify_expr trying to load the value.  */
> +	  gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p, post_p,
> +			 is_gimple_lvalue, fb_lvalue);
> +	  if (TREE_SIDE_EFFECTS (op1))
> +	    {
> +	      if (TREE_THIS_VOLATILE (op1)
> +		  && (REFERENCE_CLASS_P (op1) || DECL_P (op1)))
> +		op1 = build_fold_addr_expr (op1);
> +
> +	      gimplify_and_add (op1, pre_p);
> +	    }
> +	  *expr_p = TREE_OPERAND (*expr_p, 0);
> +	  ret = GS_OK;
> +	}

It seems odd to gimplify the address of a volatile decl.  When would
that have meaningful side-effects?

What goes wrong if you don't have this block and let the gimplifier
handle empty modify_exprs normally?  The comments explain clearly what
the code is trying to do, but it isn't obvious why this needs to be
treated as a special case.

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.

s/corresponding the/corresponding to/

> +/* Implements the lang_hooks.get_alias_set routine for language D.
> +   Get the alias set corresponding the type or expression T.
> +   Return -1 if we don't do anything special.  */
> +
> +static alias_set_type
> +d_get_alias_set (tree t)
> +{
> +  /* Permit type-punning when accessing a union, provided the access
> +     is directly through the union.  */
> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
> +    {
> +      if (TREE_CODE (u) == COMPONENT_REF
> +	  && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
> +	return 0;
> +    }
> +
> +  /* That's all the expressions we handle.  */
> +  if (!TYPE_P (t))
> +    return get_alias_set (TREE_TYPE (t));
> +
> +  /* For now in D, assume everything aliases everything else,
> +     until we define some solid rules.  */
> +  return 0;
> +}

Any reason for not returning 0 unconditionally?  Won't the queries
deferred to get_alias_set either return 0 directly or come back here
and get 0?

Might be worth adding a comment referencing build_vconvert, which stood
out as a source of what in C would be alias violations.

> +static tree
> +d_eh_personality (void)
> +{
> +  if (!d_eh_personality_decl)
> +    {
> +      d_eh_personality_decl
> +	= build_personality_function ("gdc");
> +    }

Redundant braces, canonical formatting would be:

  if (!d_eh_personality_decl)
    d_eh_personality_decl = build_personality_function ("gdc");

> +/* this bit is set if they did `-lstdc++'.  */

s/this/This/

> +/* What do with libgphobos:
> +   -1 means we should not link in libgphobos
> +   0  means we should link in libgphobos if it is needed
> +   1  means libgphobos is needed and should be linked in.
> +   2  means libgphobos is needed and should be linked statically.
> +   3  means libgphobos is needed and should be linked dynamically. */
> +static int library = 0;

Might be worth using a more specific variable name, since the code
also deals with libgcc and libstdc++.  Maybe add named constants
for the values above?

> +void
> +lang_specific_driver (cl_decoded_option **in_decoded_options,
> +		      unsigned int *in_decoded_options_count,
> +		      int *in_added_libraries)
> +{
> ...
> +  /* If true, the user gave `-g'.  Used by -debuglib */
> ...
> +  /* What default library to use instead of phobos */
> ...
> +  /* What debug library to use instead of phobos */

Comments should end with ".  */"

> +	case OPT_SPECIAL_input_file:
> +	    {

Over-indented block.

> +	      /* Record that this is a D source file.  */
> +	      int len = strlen (arg);
> +	      if (len <= 2 || strcmp (arg + len - 2, ".d") == 0)
> +		{
> +		  if (first_d_file == NULL)
> +		    first_d_file = arg;
> +
> +		  args[i] |= DSOURCE;
> +		}
> +
> +	      /* If we don't know that this is a interface file, we might
> +		 need to link against libphobos library.  */
> +	      if (library == 0)
> +		{
> +		  if (len <= 3 || strcmp (arg + len - 3, ".di") != 0)
> +		    library = 1;
> +		}
> +
> +	      /* If this is a C++ source file, we'll need to link
> +		 against libstdc++ library.  */
> +	      if ((len <= 3 || strcmp (arg + len - 3, ".cc") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".cpp") == 0)
> +		  || (len <= 4 || strcmp (arg + len - 4, ".c++") == 0))
> +		need_stdcxx = true;
> +
> +	      break;

The len handling looks a bit odd here, since something like "a.d" would
be treated as a c++ file too.  Might be better to use:

    dot = strrchr (arg, '.');

    if (dot && strcmp (dot, ".cc") == 0)

etc.

> +/* Called before linking.  Returns 0 on success and -1 on failure.  */
> +
> +int lang_specific_pre_link (void)

int
lang_specific_pre_link (void)

> +  /* If there is no hardware support, check if we an safely emulate it.  */

s/we an/we can/

> +/* For a vendor-specific type, return a string containing the C++ mangling.
> +   In all other cases, return NULL.  */
> +
> +const char *
> +Target::cppTypeMangle (Type *type)
> +{
> +    if (type->isTypeBasic () || type->ty == Tvector || type->ty == Tstruct)
> +    {
> +        tree ctype = build_ctype (type);
> +        return targetm.mangle_type (ctype);
> +    }
> +
> +    return NULL;
> +}

Misindented if and function body.

> +/* Return the type that will really be used for passing the given parameter
> +   ARG to an extern(C++) function.  */
> +
> +Type *
> +Target::cppParameterType (Parameter *arg)
> +{
> +    Type *t = arg->type->merge2 ();
> +    if (arg->storageClass & (STCout | STCref))
> +      t = t->referenceTo ();
> +    else if (arg->storageClass & STClazy)
> +      {
> +	/* Mangle as delegate.  */
> +	Type *td = TypeFunction::create (NULL, t, 0, LINKd);
> +	td = TypeDelegate::create (td);
> +	t = t->merge2 ();
> +      }
> +
> +    /* Could be a va_list, which we mangle as a pointer.  */
> +    if (t->ty == Tsarray && Type::tvalist->ty == Tsarray)
> +      {
> +	Type *tb = t->toBasetype ()->mutableOf ();
> +	if (tb == Type::tvalist)
> +	  {
> +	    tb = t->nextOf ()->pointerTo ();
> +	    t = tb->castMod (t->mod);
> +	  }
> +      }
> +
> +    return t;
> +}

Function body indented too far.

> +/* Used to represent a D inline assembly statement, an intermediate
> +   representation of an ASM_EXPR.  Reserved for future use and
> +   implementation yet to be defined.  */
> +DEFTREECODE (IASM_EXPR, "iasm_expr", tcc_statement, 5)

Do you need to reserve it?  The codes are local to the frontend
and you can add new ones whenever you want, so it would be good
to leave this out if possible.

> +/* Frame information for a function declaration.  */
> +
> +struct GTY(()) tree_frame_info
> +{
> +    struct tree_common common;
> +    tree frame_type;
> +};

Over-indented body.

> +/* Gate for when the D frontend make an early call into the codegen pass, such

s/make/makes/

> +    /* Anonymous structs/unions only exist as part of others,
> +       do not output forward referenced structs's.  */

s/'s//, unless I've misunderstood what the comment is saying.

> +    /* Ensure all semantic passes have ran.  */

"passes have run" or "passes ran".

> +      /* The real function type may differ from it's declaration.  */

s/it's/its/

> +      /* Initialiser must never be bigger than symbol size.  */

"Initializer" (alas)

> +/* Thunk code is based on g++ */

Comment should end with ".  */"

> +/* Get offset of base class's vtbl[] initialiser from start of csym.  */
> +
> +unsigned
> +base_vtable_offset (ClassDeclaration *cd, BaseClass *bc)
> +{
> ...
> +  return ~0;
> +}

Would be good to document the ~0 return, and probably also assert
that it isn't ~0:

> +	    /* The vtable of the interface length and ptr.  */
> +	    size_t voffset = base_vtable_offset (cd, b);
> +	    value = build_offset (csym, size_int (voffset));

...here.

> +  /* Build an expression of code CODE, data type TYPE, and operands ARG0 and
> +     ARG1.  Perform relevant conversions needs for correct code operations.  */

"relevant conversions needs" looks like a typo.

> +    if (POINTER_TYPE_P (t0) && POINTER_TYPE_P (t1))
> +      {
> +	/* Need to convert pointers to integers because tree-vrp asserts
> +	   against (ptr MINUS ptr).  */
> +	tree ptrtype = lang_hooks.types.type_for_mode (ptr_mode,
> +						       TYPE_UNSIGNED (type));
> +	arg0 = d_convert (ptrtype, arg0);
> +	arg1 = d_convert (ptrtype, arg1);
> +
> +	ret = fold_build2 (code, ptrtype, arg0, arg1);

Should probably now use POINTER_DIFF_EXPR for this.

> +    /* The LHS expression could be an assignment, to which it's operation gets
> +       lost during gimplification.  */
> +    if (TREE_CODE (lhs) == MODIFY_EXPR)
> +      {
> +	lexpr = compound_expr (lexpr, lhs);
> +	lhs = TREE_OPERAND (lhs, 0);
> +      }

s/it's/its/.  But the code looks a bit odd -- won't you end up with
double evaluation of the lhs of the MODIFY_EXPR?  Or is the creator of
the MODIFY_EXPR already guaranteed to have wrapped the lhs in a SAVE_EXPR
where necessary?  Probably worth a comment if so.

> +  /* Build a power expression, which raises its left operand to the power of
> +     its right operand.   */
> +
> +  void visit (PowExp *e)
> +  {
> ...
> +    /* If type is int, implicitly convert to double.  This allows backend
> +       to fold the call into a constant return value.  */
> +    if (e->type->isintegral ())
> +      powtype = double_type_node;

Seems dangerous if pow is defined as an integer operation.
E.g. ((1 << 62) | 1) ** 1 should be (1 << 62) | 1, which isn't
representable in double.

> +  /* Build a assignment operator expression.  The right operand is implicitly
> +     converted to the type of the left operand, and assigned to it.  */

"an assignment operator"

> +  /* Convert for initialising the DECL_RESULT.  */

"initializing"

> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
> +
> +static void
> +clear_intrinsic_flag (tree callexp)
> +{
> +  tree decl = CALL_EXPR_FN (callexp);
> +
> +  if (TREE_CODE (decl) == ADDR_EXPR)
> +    decl = TREE_OPERAND (decl, 0);
> +
> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
> +
> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
> +}

It seems wrong that a particular call to a function would change the
function's properties for all calls, and I don't think there's any
expectation in the midend that this kind of change might happen.

Why's it needed?  Whatever the reason, I think it needs to be done in a
different way.

> +static tree
> +expand_intrinsic_bsf (tree callexp)
> ...
> +  built_in_function code = (argsize <= INT_TYPE_SIZE) ? BUILT_IN_CTZ :
> +    (argsize <= LONG_TYPE_SIZE) ? BUILT_IN_CTZL :
> +    (argsize <= LONG_LONG_TYPE_SIZE) ? BUILT_IN_CTZLL : END_BUILTINS;

":" should be at the start of the line rather than the end.  Same for
later functions.

> +static tree
> +expand_intrinsic_bswap (tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  int argsize = TYPE_PRECISION (TREE_TYPE (arg));
> +
> +  /* Which variant of __builtin_bswap* should we call?  */
> +  built_in_function code = (argsize == 32) ? BUILT_IN_BSWAP32 :
> +    (argsize == 64) ? BUILT_IN_BSWAP64 : END_BUILTINS;
> +
> +  /* Fallback on runtime implementation, which shouldn't happen as the
> +     argument for bswap() is either 32-bit or 64-bit.  */
> +  if (code == END_BUILTINS)
> +    {
> +      clear_intrinsic_flag (callexp);
> +      return callexp;
> +    }

The earlier intrinsics had this too, but I assumed there it was
because the code was mapping D types to target-dependent ABI types,
and you couldn't guarantee that long long was a 64-bit type.
In the above function there's no doubt that 32 and 64 bits are
the sizes available, so shouldn't this simply assert that
code != END_BUILTINS?

> +static tree
> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
> +{
> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
> +
> +  /* arg is an integral type - use double precision.  */
> +  if (INTEGRAL_TYPE_P (type))
> +    arg = fold_convert (double_type_node, arg);
> +
> +  /* Which variant of __builtin_sqrt* should we call?  */
> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
> +
> +  gcc_assert (code != END_BUILTINS);
> +  return call_builtin_fn (callexp, code, 1, arg);
> +}

Converting integers to double precision isn't right for SQRTF and SQRTL.
But how do such calls reach here?  It looks like the deco strings in the
function entries provide types for the 3 sqrt intrinsics, is that right?
Does that translate to a well-typed FUNCTION_DECL, or do the function
types use some opaque types instead?  If the intrinsic function itself
is well-typed then an incoming CALLEXP with integer arguments would be
invalid.

> +static tree
> +expand_intrinsic_copysign (tree callexp)
> +{
> ...
> +  /* Which variant of __builtin_copysign* should we call?  */
> +  built_in_function code = (type == float_type_node) ? BUILT_IN_COPYSIGNF :
> +    (type == double_type_node) ? BUILT_IN_COPYSIGN :
> +    (type == long_double_type_node) ? BUILT_IN_COPYSIGNL : END_BUILTINS;

You can use mathfn_built_in for this.

> +  /* Assign returned result to overflow parameter, however if overflow is
> +     already true, maintain it's value.  */

s/it's/its/

> +/* DEF_D_INTRINSIC (CODE, ALIAS, NAME, MODULE, DECO, CTFE)
> +   CODE	    The enum code used to refer this intrinsic.
> +   ALIAS    The enum code used to refer the function DECL_FUNCTION_CODE, if

s/refer/refer to/ in both cases.  Same in runtime.texi.

> +/* This is the contribution to the `default_compilers' array in gcc.c
> +   for the D language.  */
> +
> +{".d", "@d", 0, 1, 0 },
> +{".dd", "@d", 0, 1, 0 },
> +{".di", "@d", 0, 1, 0 },
> +{"@d",
> +  "%{!E:cc1d %i %(cc1_options) %I %{nostdinc*} %{i*} %{I*} %{J*} \
> +    %{H} %{Hd*} %{Hf*} %{MD:-MD %b.deps} %{MMD:-MMD %b.deps} \
> +    %{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*} \
> +    %{X:-Xf %b.json} %{Xf*} \
> +    %{v} %{!fsyntax-only:%(invoke_as)}}", 0, 1, 0 },

Guess this has been said before, but cc1d seems a strange name,
since "cc1" is "C Compiler pass 1".

> +fdeps
> +D
> +This switch is deprecated; do not use.
> +
> +fdeps=
> +D Joined RejectNegative
> +This switch is deprecated; do not use.
> ...
> +fintfc
> +D Alias(H)
> +; Deprecated in favour of -H
> +
> +fintfc-dir=
> +D Joined RejectNegative Alias(Hd)
> +; Deprecated in favour of -Hd
> +
> +fintfc-file=
> +D Joined RejectNegative Alias(Hf)
> +; Deprecated in favour of -Hf
> ...
> +ftransition=safe
> +D Alias(ftransition=dip1000)
> +; Deprecated in favor of -ftransition=dip1000

Any chance of just removing them?  Would be better not to add new code
(from GCC's POV) that's already deprecated.

> +ftransition=all
> +D RejectNegative
> +List information on all language changes

All docstrings should end in ".".  This should be tested by adding
the D frontend to:

foreach cls { "ada" "c" "c++" "fortran" "go" \
                    "optimizers" "param" "target" "warnings" } {

in gcc.misc-tests/help.exp

> +struct longdouble
> +{
> +public:
> +   /* Return the hidden real_value from the longdouble type.  */
> +  const real_value& rv (void) const
> +  { return *(const real_value *) this; }

Overindented comment.

> +/* Use ldouble() to explicitely create a longdouble value.  */

Typo: "explicitly"

> +/* D generates module information to inform the runtime library which modules
> +   needs some kind of special handling.  All `static this()', `static ~this()',

s/needs/need/

> +/* Record information about module initialization, termination,
> +   unit testing, and thread local storage in the compilation.  */
> +
> +struct module_info
> +{
> +  vec<tree, va_gc> *ctors;
> +  vec<tree, va_gc> *dtors;
> +  vec<tree, va_gc> *ctorgates;
> +
> +  vec<tree, va_gc> *sharedctors;
> +  vec<tree, va_gc> *shareddtors;
> +  vec<tree, va_gc> *sharedctorgates;
> +
> +  vec<tree, va_gc> *unitTests;
> +};
> ...
> +/* Static constructors and destructors (not D `static this').  */
> +
> +static vec<tree, va_gc> *static_ctor_list;
> +static vec<tree, va_gc> *static_dtor_list;

Looks odd to be using GC data structures without marking them.  Think it
deserves a comment if correct.

> +/* Generate a call to LIBCALL, returning the result as TYPE.  NARGS is the
> +   number of call arguments, the expressions of wwhich are provided in `...'.

s/wwhich/which/

> +/* Finish the statement tree rooted at T.  */
> +
> +tree
> +pop_stmt_list (void)
> +{
> +  tree t = d_function_chain->stmt_list->pop ();
> +
> +  /* If the statement list is completely empty, just return it.  This is just
> +     as good small as build_empty_stmt, with the advantage that statement
> +     lists are merged when they appended to one another.  So using the

Typo: "good small"
s/they appended/they are appended/

> +	/* Convert for initialising the DECL_RESULT.  */

"initializing"

> +  /* D Inline Assembler is not implemented, as would require a writing
> +     an assembly parser for each supported target.  Instead we leverage
> +     GCC extended assembler using the GccAsmStatement class.  */

s/as would require a writing/as it would require writing/

> +  /* Write out the interfacing vtable[] of base class BCD that will be accessed
> +     from the overriding class CD.  If both are the same class, then this will
> +     be it's own vtable.  INDEX is the offset in the interfaces array of the

s/it's/its/

> +    /* Internal reference should be public, but not visible outside the
> +       compilation unit, as itself is referencing public COMDATs.  */

Comment is hard to parse.

> +/* Returns true if the TypeInfo for type should be placed in
> +   the runtime library.  */
> +
> +static bool
> +builtin_typeinfo_p (Type *type)

"TypeInfo for TYPE"

> +  /* Let C++ do the RTTI generation, and just reference the symbol as
> +     extern, the knowing the underlying type is not required.  */

s/the knowing/knowing/?

> +	  if (!tinfo_types[tk])
> +	    {
> +	      make_internal_typeinfo (tk, ident, ptr_type_node,
> +				      array_type_node, NULL);
> +	    }

Redundant braces.

> +/* Implements a visitor interface to check whether a type is speculative.
> +   TypeInfo_Struct would refer the members of the struct it is representing

"refer to" or "reference"

> +Type *
> +get_object_type (void)
> +{
> +  if (ClassDeclaration::object)
> +    return ClassDeclaration::object->type;
> +
> +  ::error ("missing or corrupt object.d");

Why "::"?  Is it needed to avoid shadowing issues?

> +	  /* Insert the field declaration at it's given offset.  */

"its"

> +      /* Strip const modifiers from type before building.  This is done
> +	 to ensure that backend treats i.e: const (T) as a variant of T,
> +	 and not as two distinct types.  */

s/i.e/e.g./

Probably worth running tabify on the sources.  I noticed a couple
of indents by 8 spaces instead of tabs, but it didn't seem worth
listing them individually.  You can use contrib/check_GNU_style.{sh,py}
to find that kind of thing (although it generates a lot of false
positives for other coding style violations, so take with a pinch
of salt).

Also, s/layed out/laid out/ throughout.

Thanks,
Richard



More information about the Gcc-patches mailing list