Differences between revisions 5 and 6
Revision 5 as of 2012-11-24 23:26:08
Size: 9762
Editor: DiegoNovillo
Comment:
Revision 6 as of 2013-01-08 16:08:04
Size: 9786
Editor: DiegoNovillo
Comment:
Deletions are marked like this. Additions are marked like this.
Line 5: Line 5:
<<TableOfContents(3)>>

Simplifying the generation of GIMPLE IL

Original proposal and discussion: http://gcc.gnu.org/ml/gcc/2012-11/msg00223.html

Problem

Generating GIMPLE and tree expressions require lots of detail, which is hard to remember and easy to get wrong. There is some amount of boilerplate code that can, in most cases, be reduced and managed automatically.

We will add a set of helper classes to be used as local variables to manage the details of handling the existing types. That is, a layer over gimple_build_*. We intend to provide helpers for those facilities that are both commonly used and have room for significant simplification.

Generating an Expression

Suppose one wants to generate the expression (shadow != 0) & (((base_addr & 7) + offset) >= shadow), where offset is a value and the other identifiers are variables. The current code to generate this expression is as follows:

/* t = shadow != 0 */
g = gimple_build_assign_with_ops (NE_EXPR,
            make_ssa_name (boolean_type_node, NULL),
            shadow,
            build_int_cst (shadow_type, 0));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);
t = gimple_assign_lhs (g);

/* a = base_addr & 7 */
g = gimple_build_assign_with_ops (BIT_AND_EXPR,
            make_ssa_name (uintptr_type, NULL),
            base_addr,
            build_int_cst (uintptr_type, 7));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* b = (shadow_type)a */
g = gimple_build_assign_with_ops (NOP_EXPR,
            make_ssa_name (shadow_type, NULL),
            gimple_assign_lhs (g),
            NULL_TREE);
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* c = b + offset */
g = gimple_build_assign_with_ops (PLUS_EXPR,
            make_ssa_name (shadow_type, NULL),
            gimple_assign_lhs (g),
            build_int_cst (shadow_type, offset));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* d = c >= shadow */
g = gimple_build_assign_with_ops (GE_EXPR,
            make_ssa_name (boolean_type_node, NULL),
            gimple_assign_lhs (g),
            shadow);
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);

/* r = t & d */
g = gimple_build_assign_with_ops (BIT_AND_EXPR,
            make_ssa_name (boolean_type_node, NULL),
            t,
            gimple_assign_lhs (g));
gimple_set_location (g, location);
gsi_insert_after (&gsi, g, GSI_NEW_STMT);
r = gimple_assign_lhs (g);

We propose a simplified form using a new build helper class gimple_stmt and an extension to gimple_seq that would allow the above code to be written as follows:

gimple_seq q;
gimple_stmt t = q.add_ssa_stmt (NE_EXPR, shadow, 0);
gimple_stmt a = q.add_ssa_stmt (BIT_AND_EXPR, base_addr, 7);
gimple_stmt b = q.add_ssa_stmt (NOP_EXPR, shadow_type, a);
gimple_stmt c = q.add_ssa_stmt (PLUS_EXPR, b, offset);
gimple_stmt d = q.add_ssa_stmt (GE_EXPR, c, shadow);
gimple_stmt e = q.add_ssa_stmt (BIT_AND_EXPR, t, d);
q.set_location (location);
r = e.lhs ();

There are a few important things to note about this example.

  • We extend the type gimple_seq to know how to sequence statements automatically and can build expressions out of types.

  • Every statement created produces an SSA name, if generated with gimple_seq::add_ssa_stmt. Referencing a gimple_stmt instance in an argument to gimple_seq::add_ssa_stmt retrieves the SSA name generated by that statement. Likewise, if the statement was generated with gimple_seq::gimple_stmt, the new statement will produce a GIMPLE temporary as its LHS.

  • The statement result type is that of the arguments.
  • The type of integral constant arguments is that of the other argument. (Which implies that only one argument can be constant.)
  • The gimple_seq::add_ssa_stmt method handles linking the statement into the sequence.

  • The gimple_seq::set_location method iterates over all statements.

There will be another class of builders for generating GIMPLE in normal form (gimple_seq::gimple_stmt). We expect that this will mostly affect all transformations that need to generate new expressions and statements, for example:

  • The vectorizer passes that generate SIMD expressions and statements with semantics equivalent to the scalar instructions that are being vectorized.
  • Expression lowering passes, e.g. the pass_convert_switch pass that lowers GIMPLE_SWITCH statements to sequences of branch-and-compare, and the pass_lower_complex* passes that lower complex expressions to scalar expressions.

  • Instrumentation passes, e.g. the counters instrumentation generated by pass_profile, and the pass_asan and pass_tsan sanitizer instrumentation passes.

  • Complex code transformations that rewrite into and out of different intermediate representations, e.g. GRAPHITE.

We also expect to reduce calls to tree expression builders by allowing the use of numeric and string constants to be converted to the appropriate tree *_CST node. This will only work when the type of the constant can be deduced from the other argument in some expressions, of course.

In addition to the new builder methods, we will provide access to all the gimple_* functions from the gimple_seq class. This will facilitate turning gimple into a proper class hierarchy.

Generating statements that affect control flow

Statements that affect the CFG (e.g., conditional branches) require accompanying code that creates new basic blocks, edges, etc. We will provide helpers that will automatically take care of the boilerplate code.

For example, if we wanted to generate the following code:

if (cond) {
  s1;
  s2;
} else {
  r1;
  r2;
}

Currently, the user needs to create the basic blocks needed to hold the predicate and the two branches of the above conditional. We could shorten that with:

gimple_seq then_body(s1);
then_body.add_ssa_stmt(s2);

gimple_seq else_body(r1);
else_body.add_ssa_stmt(r2);
gimple_stmt if_then_else(cond, then_body, else_body);

Inserting if_then_else in a basic block or an edge would then split the container and create the corresponding connections. The interface would also allow the user to specify branch probabilities.

Generating a Type

Consider the generation of the following type:

struct __asan_global {
  const_ptr_type_node __beg;
  inttype __size;
  inttype __size_with_redzone;
  const_ptr_type_node __name;
  inttype __has_dynamic_init;
};

The current code to generate it is as follows:

tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
tree ret = make_node (RECORD_TYPE);
TYPE_NAME (ret) = get_identifier ("__asan_global");
tree beg = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                       get_identifier ("__beg"), const_ptr_type_node);
DECL_CONTEXT (beg) = ret;
TYPE_FIELDS (ret) = beg;
tree size = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                        get_identifier ("__size"), inttype);
DECL_CONTEXT (size) = ret;
DECL_CHAIN (beg) = size;
tree red = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                       get_identifier ("__size_with_redzone"), inttype);
DECL_CONTEXT (red) = ret;
DECL_CHAIN (size) = red;
tree name = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                        get_identifier ("__name"), const_ptr_type_node);
DECL_CONTEXT (name) = ret;
DECL_CHAIN (red) = name;
tree init = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
                        get_identifier ("__has_dynamic_init"), inttype);
DECL_CONTEXT (init) = ret;
DECL_CHAIN (name) = init;
layout_type (ret);

We propose a form as follows:

tree inttype = build_nonstandard_integer_type (POINTER_SIZE, 1);
record_builder rec ("__asan_global");
rec.field ("__beg", const_ptr_type_node);
rec.field ("__size", inttype);
rec.field ("__size_with_redzone", inttype);
rec.field ("__name", const_ptr_type_node);
rec.field ("__has_dynamic_init", inttype);
rec.finish ();
tree ret = rec.as_tree ();

There are a few important things to note about this example.

  • The record_builder::field method will add context and chain links.

  • The record_builder::field method is overloaded on both strings and identifiers.

  • The record_builder::finish method lays out the type. Optionally this method could take flags to direct the lay-out of the fields, e.g. to sort the fields by alignment requirements.

We expect that this interface can be used by transformations that need to generate new types. These transformations currently include at least:

  • The instrumentation passes mentioned above (ASAN/TSAN/coverage/profiling).
  • Lowering of setjmp/longjmp exception handing in gcc/except.c:init_eh().

  • Lowering of nested functions in gcc/tree-nested.c.

  • Lowering of TLS variables for emulated TLS (e.g. gcc/tree-emutls.c:default_emutls_var_fields() and other implementations of the targetm.emutls.var_fields hook).

Proposal

Create a set of IL builder classes that provide a simplified IL building interface. Essentially, these builder classes will abstract most of the bookkeeping code required by the current interfaces.

These classes will not replace the existing interfaces. We do not expect that all the IL generation done in current transformations will be able to use the simplified interfaces. The goal is to simplify most of them, however.

None: cxx-conversion/gimple-generation (last edited 2013-01-08 16:08:04 by DiegoNovillo)