[PATCH 02/14] Add D frontend (GDC) implementation.

Iain Buclaw ibuclaw@gdcproject.org
Mon Oct 22 01:13:00 GMT 2018


On Sat, 20 Oct 2018 at 11:03, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> > On 14 October 2018 at 17:29, Richard Sandiford
> > <rdsandiford@googlemail.com> wrote:
> >> [Sorry if this turns out to do be a dup]
> >>
> >> Iain Buclaw <ibuclaw@gdcproject.org> writes:
> >>> +/* Build nodes that are used by the D front-end.
> >>> +   These are distinct from C types.  */
> >>> +
> >>> +static void
> >>> +d_build_d_type_nodes (void)
> >>> +{
> >>> +  /* Integral types.  */
> >>> +  byte_type_node = make_signed_type (8);
> >>> +  ubyte_type_node = make_unsigned_type (8);
> >>> +
> >>> +  short_type_node = make_signed_type (16);
> >>> +  ushort_type_node = make_unsigned_type (16);
> >>> +
> >>> +  int_type_node = make_signed_type (32);
> >>> +  uint_type_node = make_unsigned_type (32);
> >>> +
> >>> +  long_type_node = make_signed_type (64);
> >>> +  ulong_type_node = make_unsigned_type (64);
> >>
> >> It's a bit confusing for the D type to be long_type_node but the C/ABI type
> >> to be long_integer_type_node.  The D type is surely an integer too. :-)
> >> With this coming among the handling of built-in functions, it initially
> >> looked related, and I was wondering how it was safe on ILP32 systems
> >> before realising the difference.
> >>
> >> Maybe prefixing them all with "d_" would be too ugly, but it would at
> >> least be good to clarify the comment to say that these are "distinct
> >> type nodes" (rather than just distinct definitions, as I'd initially
> >> assumed) and that they're not used outside the frontend, or by the C
> >> imports.
> >>
> >
> > If prefixing with "d_", perhaps dropping the "_node" would make them
> > sufficiently not ugly (d_long_type, d_uint_type, d_byte_type).
>
> Sounds good to me FWIW.
>
> >>> +/* Helper routine for all error routines.  Reports a diagnostic specified by
> >>> +   KIND at the explicit location LOC, where the message FORMAT has not yet
> >>> +   been translated by the gcc diagnostic routines.  */
> >>> +
> >>> +static void ATTRIBUTE_GCC_DIAG(3,0)
> >>> +d_diagnostic_report_diagnostic (const Loc& loc, int opt, const char *format,
> >>> +                             va_list ap, diagnostic_t kind, bool verbatim)
> >>> +{
> >>> +  va_list argp;
> >>> +  va_copy (argp, ap);
> >>> +
> >>> +  if (loc.filename || !verbatim)
> >>> +    {
> >>> +      rich_location rich_loc (line_table, get_linemap (loc));
> >>> +      diagnostic_info diagnostic;
> >>> +      char *xformat = expand_format (format);
> >>> +
> >>> +      diagnostic_set_info (&diagnostic, xformat, &argp, &rich_loc, kind);
> >>
> >> How does this work with translation?  xgettext will only see the original
> >> format string, not the result of expand_format.  Do you have some scripting
> >> to do the same format mangling when collecting the translation strings?
> >> Same concern:
> >>
> >
> > These diagnostic routines handle errors coming from the dmd front-end,
> > which are not translated - all sources are listed under po/EXCLUDES in
> > another patch.
>
> OK.  In that case I think you want to use diagnostic_set_info_translated
> instead of diagnostic_set_info, so that we don't try to translate things
> that aren't meant to be translated.  Also it would be good to reword the
> comment above the function, since "where the message FORMAT has not yet
> been translated by the gcc diagnostic routines" made it sound like these
> messages were supposed to be translated at some point, which is where the
> confusion started. :-)
>
> >>> +/* Write a little-endian 32-bit VALUE to BUFFER.  */
> >>> +
> >>> +void
> >>> +Port::writelongLE (unsigned value, void *buffer)
> >>> +{
> >>> +    unsigned char *p = (unsigned char*) buffer;
> >>> +
> >>> +    p[0] = (unsigned) value;
> >>> +    p[1] = (unsigned) value >> 8;
> >>> +    p[2] = (unsigned) value >> 16;
> >>> +    p[3] = (unsigned) value >> 24;
> >>> +}
> >>> ...
> >>> +/* Write a big-endian 32-bit VALUE to BUFFER.  */
> >>> +
> >>> +void
> >>> +Port::writelongBE (unsigned value, void *buffer)
> >>> +{
> >>> +    unsigned char *p = (unsigned char*) buffer;
> >>> +
> >>> +    p[0] = (unsigned) value >> 24;
> >>> +    p[1] = (unsigned) value >> 16;
> >>> +    p[2] = (unsigned) value >> 8;
> >>> +    p[3] = (unsigned) value;
> >>> +}
> >>
> >> Overindented bodies.  Missing space before "*" in "(unsigned char*)"
> >> in all these functions.
> >>
> >> Obviously this stuff assumes host CHAR_BIT == 8, but let's be realistic :-)
> >> Is it also used in ways that require the target BITS_PER_UNIT to be 8
> >> as well?  That could realistically be a different value (and we've had
> >> ports like that in the past).
> >>
> >
> > These read(long|word)(BE|LE) functions should only ever be used when
> > reading the BOM of a UTF-16 or UTF-32 file.
> >
> > I've done a grep, and the write(long|word)(BE|LE) are no longer used
> > by the dmd frontend, so there's little point keeping them around.
> >
> > If there's any utility in libiberty or another location then I'd be
> > more than happy to delegate this action to that here.
>
> If it's for UTF-16 and UTF-32 then I think it's fine as-is (if you keep it).
> Was just asking in case.
>
> >>> +/* Implements the lang_hooks.get_alias_set routine for language D.
> >>> +   Get the alias set corresponding the type or expression T.
> >>> +   Return -1 if we don't do anything special.  */
> >>> +
> >>> +static alias_set_type
> >>> +d_get_alias_set (tree t)
> >>> +{
> >>> +  /* Permit type-punning when accessing a union, provided the access
> >>> +     is directly through the union.  */
> >>> +  for (tree u = t; handled_component_p (u); u = TREE_OPERAND (u, 0))
> >>> +    {
> >>> +      if (TREE_CODE (u) == COMPONENT_REF
> >>> +       && TREE_CODE (TREE_TYPE (TREE_OPERAND (u, 0))) == UNION_TYPE)
> >>> +     return 0;
> >>> +    }
> >>> +
> >>> +  /* That's all the expressions we handle.  */
> >>> +  if (!TYPE_P (t))
> >>> +    return get_alias_set (TREE_TYPE (t));
> >>> +
> >>> +  /* For now in D, assume everything aliases everything else,
> >>> +     until we define some solid rules.  */
> >>> +  return 0;
> >>> +}
> >>
> >> Any reason for not returning 0 unconditionally?  Won't the queries
> >> deferred to get_alias_set either return 0 directly or come back here
> >> and get 0?
> >>
> >> Might be worth adding a comment referencing build_vconvert, which stood
> >> out as a source of what in C would be alias violations.
> >>
> >
> > All previous attempts at doing more with this has caused silent
> > corruption at runtime, the existence of build_vconvert likely doesn't
> > help either.
> >
> > Although it isn't in the spec, D should be "strict aliasing".  But
> > there's a lot of production code that looks like `intvar = *(int
> > *)&floatvar;` that is currently expected to just work.
> >
> > By rule of thumb in D, the C behaviour should be followed if it looks
> > like C code.  The only places where we deviate are made to be a
> > compiler error.
>
> But my point was more that the function seems to have the effect of
> returning 0 all the time, so wouldn't it be better to get rid of the
> code before the final "return 0"?  Or is there a case I missed?
>
> >>> +/* Clear the DECL_BUILT_IN_CLASS flag on the function in CALLEXP.  */
> >>> +
> >>> +static void
> >>> +clear_intrinsic_flag (tree callexp)
> >>> +{
> >>> +  tree decl = CALL_EXPR_FN (callexp);
> >>> +
> >>> +  if (TREE_CODE (decl) == ADDR_EXPR)
> >>> +    decl = TREE_OPERAND (decl, 0);
> >>> +
> >>> +  gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
> >>> +
> >>> +  DECL_INTRINSIC_CODE (decl) = INTRINSIC_NONE;
> >>> +  DECL_BUILT_IN_CLASS (decl) = NOT_BUILT_IN;
> >>> +  DECL_FUNCTION_CODE (decl) = BUILT_IN_NONE;
> >>> +}
> >>
> >> It seems wrong that a particular call to a function would change the
> >> function's properties for all calls, and I don't think there's any
> >> expectation in the midend that this kind of change might happen.
> >>
> >> Why's it needed?  Whatever the reason, I think it needs to be done in a
> >> different way.
> >>
> >
> > There are some D library math functions that are only treated as
> > built-in during compile-time only.  When it comes to run-time code
> > generation, we want to call the D library functions, not the C library
> > (or built-ins).
>
> But e.g. we can treat printf as a built-in and still call the real printf
> without doing this.
>
> >>> +static tree
> >>> +expand_intrinsic_sqrt (intrinsic_code intrinsic, tree callexp)
> >>> +{
> >>> +  tree arg = CALL_EXPR_ARG (callexp, 0);
> >>> +  tree type = TYPE_MAIN_VARIANT (TREE_TYPE (arg));
> >>> +
> >>> +  /* arg is an integral type - use double precision.  */
> >>> +  if (INTEGRAL_TYPE_P (type))
> >>> +    arg = fold_convert (double_type_node, arg);
> >>> +
> >>> +  /* Which variant of __builtin_sqrt* should we call?  */
> >>> +  built_in_function code = (intrinsic == INTRINSIC_SQRT) ? BUILT_IN_SQRT :
> >>> +    (intrinsic == INTRINSIC_SQRTF) ? BUILT_IN_SQRTF :
> >>> +    (intrinsic == INTRINSIC_SQRTL) ? BUILT_IN_SQRTL : END_BUILTINS;
> >>> +
> >>> +  gcc_assert (code != END_BUILTINS);
> >>> +  return call_builtin_fn (callexp, code, 1, arg);
> >>> +}
> >>
> >> Converting integers to double precision isn't right for SQRTF and SQRTL.
> >> But how do such calls reach here?  It looks like the deco strings in the
> >> function entries provide types for the 3 sqrt intrinsics, is that right?
> >> Does that translate to a well-typed FUNCTION_DECL, or do the function
> >> types use some opaque types instead?  If the intrinsic function itself
> >> is well-typed then an incoming CALLEXP with integer arguments would be
> >> invalid.
> >>
> >
> > As with pow(), I'll have to check by running this through the
> > testsuite to see what fails and why.
> >
> > From memory, the reason is likely some requirement of CTFE, where this
> > call *must* be const folded at compile-time, which may not happen if
> > the types are wrong.
>
> The reason for picking this one in particular is that the INTRINSIC_SQRT{,F,L}
> FUNCTION_DECLs looked like they would have correct types, so any CALL_EXPR
> tree with mismatched types would be invalid in terms of GCC internals.
> If that happens, it sounds like there's a bug earlier on.
>

I'm just going to post the diff since the original here, just to show
what's been done since review comments.

I think I've covered all that's been addressed, except for the couple
of notes about the quadratic parts (though I think one of them is
actually O(N^2)).  I've raised bug reports on improving them later.

I've also rebased them against trunk, so there's a couple new things
present that are just to support build.

Regards
--
Iain

---
gcc/d/:
2018-10-22  Iain Buclaw  <ibuclaw@gdcproject.org>

    * Make-lang.in (selftest-d): New.

2018-10-22  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-spec.cc (lang_specific_driver): Always link against phobos if any
    input file is given.

2018-10-21  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_get_alias_set): Always return zero.

2018-10-21  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (maybe_set_intrinsic): Don't set built-in flag on
    unsupported pow() overloads.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * expr.cc (ExprVisitor::binop_assignment): Call stabilize_reference on
    LHS construct if it has side effects.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (clear_intrinsic_flag): Remove function.
    (maybe_expand_intrinsic): Remove clear_intrinsic_flag call.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (expand_intrinsic_copysign): Use mathfn_built_in to
    determine correct built-in to call.
    (expand_intrinsic_pow): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (expand_intrinsic_sqrt): Remove implicit int to double
    conversion.
    (expand_intrinsic_pow): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (get_frame_for_symbol): Use error_at.
    (build_frame_type): Likewise.
    (get_framedecl): Likewise.
    * d-lang.cc (d_parse_file): Likewise.
    * decl.cc (DeclVisitor::visit(StructDeclaration)): Likewise.
    (DeclVisitor::finish_vtable): Likewise.
    (DeclVisitor::visit(ClassDeclaration)): Likewise.
    (DeclVisitor::visit(InterfaceDeclaration)): Likewise.
    (DeclVisitor::visit(EnumDeclaration)): Likewise.
    (DeclVisitor::visit(VarDeclaration)): Likewise.
    * toir.cc (IRVisitor::check_goto): Likewise.
    (IRVisitor::check_previous_goto): Likewise.
    (IRVisitor::visit(ThrowStatement)): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (get_array_length): Use quoted format flag in message.
    (d_build_call): Likewise.
    * d-lang.cc (d_handle_option): Likewise.
    * decl.cc (DeclVisitor::finish_vtable): Likewise.
    * expr.cc (ExprVisitor::visit(ArrayLengthExp)): Likewise.
    (ExprVisitor::visit(DeleteExp)): Likewise.
    (ExprVisitor::visit(RemoveExp)): Likewise.
    (ExprVisitor::visit(RemoveExp)): Likewise.
    (ExprVisitor::visit(CallExp)): Likewise.
    (ExprVisitor::visit(DotVarExp)): Likewise.
    (ExprVisitor::visit(VarExp)): Likewise.
    (ExprVisitor::visit(ScopeExp)): Likewise.
    (ExprVisitor::visit(TypeExp)): Likewise.
    (build_expr): Likewise.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-diagnostic.cc (d_diagnostic_report_diagnostic): Skip translation
    by instead calling diagnostic_set_info_translated.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-tree.h (bool_type_node): Rename to d_bool_type.
    (byte_type_node): Rename to d_byte_type.
    (ubyte_type_node): Rename to d_ubyte_type.
    (short_type_node): Rename to d_short_type.
    (ushort_type_node): Rename to d_ushort_type.
    (int_type_node): Rename to d_int_type.
    (uint_type_node): Rename to d_uint_type.
    (long_type_node): Rename to d_long_type.
    (ulong_type_node): Rename to d_ulong_type.
    (cent_type_node): Rename to d_cent_type.
    (ucent_type_node): Rename to d_ucent_type.

2018-10-20  Iain Buclaw  <ibuclaw@gdcproject.org>

    * expr.cc (ExprVisitor::visit(PowExp)): Remove function.

2018-10-19  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-attribs.c: Rename to d-attribs.cc.
    * d-spec.c: Rename to d-spec.cc.

2018-10-19  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_gimplify_expr): Don't handle TREE_THIS_VOLATILE.

2018-10-19  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-diagnostic.cc (vwarning): Update to use Diagnostic enum.
    (vdeprecation): Likewise.
    (vdeprecationSupplemental): Likewise.
    * d-lang.cc (d_init_options): Explicitly set warnings and deprecations
    as DIAGNOSTICoff.
    (d_handle_option): Update to use Diagnostic enum.
    (d_post_options): Likewise.

2018-10-18  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-diagnostic.cc (expand_format): Rename to expand_d_format.
    Updated all callers.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (get_linemap): Rename function to make_location_t.
    Updated all callers.
    * d-tree.h (get_linemap): Rename declaration to make_location_t.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * expr.cc (ExprVisitor::binary_op): Use POINTER_DIFF_EXPR.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * intrinsics.cc (expand_intrinsic_bsf): Assert that built-in function
    code is not END_BUILTINS.
    (expand_intrinsic_bsr): Likewise.
    (expand_intrinsic_bswap): Likewise.
    (expand_intrinsic_popcnt): Likewise.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * config-lang.in (gtfiles): Add modules.cc.
    * modules.cc: Include gt-d-modules.h.
    (module_info): Mark with GTY.
    (static_ctor_list): Likewise.
    (static_dtor_list): Likewise.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-spec.c (lang_specific_driver): Use strrchr and strcmp to check
    input file suffix.

2018-10-17  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-spec.c (phobos_action): New enum.
    (library): Rename to phobos_library.
    (lang_specific_driver): Update to use phobos_library.
    (lang_specific_pre_link): Likewise.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-frontend.cc (Port::writelongLE): Remove function.
    (Port::writelongBE): Remove function.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-convert.cc (convert): Remove goto maybe_fold.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (warn_for_null_address): New function.
    (build_boolop): Warn about comparing address of decl to null.
    * d-convert.cc (decl_with_nonnull_addr_p): New function.
    (d_truthvalue_conversion): Warn about evaluating address as boolean.
    * d-tree.h (decl_with_nonnull_addr_p): Add declaration.
    * lang.opt (Waddress): New option.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-codegen.cc (d_array_length): Assert that argument type is a
    dynamic array.
    (d_array_ptr): Likewise.
    (d_array_value): Likewise.
    (delegate_method): Assert that argument type is a delegate.
    (delegate_object): Likewise.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-attribs.c (handle_malloc_attribute): Use gcc_assert instead of
    gcc_unreachable.
    (handle_pure_attribute): Likewise.
    (handle_nothrow_attribute): Likewise.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * Make-lang.in: Rename compiler proper to d21.
    * config-lang.in (compilers): Rename compiler to d21.
    * d-spec.c (lang_specific_driver): Update comments.
    * lang-specs.h: Rename compiler to d21.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * lang.opt: Add missing periods to the ends of sentences.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_handle_option): Remove handling of -fdeps.
    (d_parse_file): Don't generate module dependencies.
    * lang.opt (fdeps, fdeps=): Remove options.
    (fintfc, fintfc-dir=, fintfc-file=): Remove options.
    (ftransition=safe): Remove option.

2018-10-16  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-lang.cc (d_init_ts): Remove handling of IASM_EXPR.
    (d_gimplify_expr): Likewise.
    * d-tree.def (IASM_EXPR): Remove tree code.

2018-10-15  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-attrib.c (attr_noreturn_exclusions): Attribute not mutually
    exclusive with self.
    * typeinfo.cc (TypeInfoVisitor::layout_interfaces): Assert that
    base class vtable is found in interface.

2018-10-08  Iain Buclaw  <ibuclaw@gdcproject.org>

    * decl.cc (DeclVisitor): Add using Visitor::visit.
    * expr.cc (ExprVisitor): Likewise.
    * imports.cc (ImportVisitor): Likewise.
    * toir.cc (IRVisitor): Likewise.
    * typeinfo.cc (TypeInfoVisitor): Likewise.
    (TypeInfoDeclVisitor): Likewise.
    (SpeculativeTypeVisitor): Likewise.
    * types.cc (TypeVisitor): Likewise.

2018-10-01  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-frontend.cc: Include compiler.h, errors.h, expression.h.
    (genCmain): Rename function to Compiler::genCmain.
    (Compiler::paintAsType): New function.
    (Compiler::loadModule): New function.
    (getTypeInfoType): Call error function directly.
    * d-lang.cc (deps_write): Use hash_set for dependency tracking.
    (d_parse_file): Call Compiler::loadModule.
    * d-target.cc: Remove include identifier.h, module.h.
    (Target::paintAsType): Remove function.
    (Target::loadModule): Remove function.
    (Target::getTargetInfo): New function.

2018-10-01  Eugene Wissner  <belka@caraus.de>

    * decl.cc (finish_thunk): Adjust call to cgraph_node::create_thunk.

2018-09-25  Eugene Wissner  <belka@caraus.de>

    * d-codegen.cc (d_assert_call): Don't make STRING_CSTs larger than they
    are.
    * expr.cc (ExprVisitor::visit(StringExp)): Likewise.
    * typeinfo.cc (TypeInfoVisitor::layout_string): Likewise.

2018-09-24  Iain Buclaw  <ibuclaw@gdcproject.org>

    * d-builtins.cc: Include expression.h, identifier.h.
    * d-codegen.cc: Include identifier.h.
    * d-convert.cc: Include declaration.h.
    * d-frontend.cc: Include identifier.h.
    * d-lang.cc: Include declaration.h, expression.h, identifier.h.
    (d_parse_file): Call moduleToBuffer to get string dump of contents.
    * d-target.cc: Include declaration.h, expression.h, identifier.h.
    * expr.cc: Include identifier.h.
    * imports.cc: Include identifier.h.
    * intrinsics.cc: Include identifier.h.
    * modules.cc: Include identifier.h.
    * toir.cc: Include expression.h, identifier.h.
    * typeinfo.cc: Include expression.h, identifier.h.
    * types.cc: Include expression.h, identifier.h.
---
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 02-v4v5-d-frontend-gdc.patch.xz
Type: application/octet-stream
Size: 45444 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20181022/5f718dcc/attachment.obj>


More information about the Gcc-patches mailing list