This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: struct-reorg optimization


"Diego Novillo" <dnovillo@google.com> wrote on 06/09/2007 23:43:00:

> On 8/31/07, Olga Golovanevsky <OLGA@il.ibm.com> wrote:
>
> >    Contributed by Olga Golovanevsky <olga@il.ibm.com>
> >    (initial version of this code was developed
> >                                   by Caroline Tice and Mostafa Hagog)
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Spacing.

done.

>
> > and str_t_1 as it was shown above, then array A will be replaced
> > by two arrays as foolows:
>
> s/foolows/follows/

done.

>
> > The field access of fields a of ellement i of array A: A[i].a will be
> > replace by access to field a of ellement i of array A_0: A_0[i].a.
>
> s/ellement/element/
> s/replace/replaced/
> s/by access/by an access/
>

done.

>
> > The decision how to peel structure gains to utilize spatial locality.
>
> "The goal of structure peeling is to improve spatial locality."

done.

>
> > For example, if one of fields of structure has intensive use in the
loop:
>
> s/if one of/if one of the/
> s/of structure/of a structure/
> s/has intensive use in the loop/is accessed frequently in the loop:/
>

done.

> >    the allocation of field a of str_t in the memory in contiguous
> manner will
>
> s/in the memory in contiguous manner/contiguously in memory/

done.

>
> >    increase probablity of relevant data to be fatched into cache.
>
> "increase the chances of fetching the field from cache."

done.

>
> >    got peeled out of the structure:
>
> s/got/get/

done.

>
>
> >    freq(f) > C * max_field_freq_in_struct
> >
> >    where max_field_freq_in_struct is maximum field frequency in
structure.
>
> s/is maximum/is the maximum/
> s/in structure/in the structure/

done.

>
> >    The C is a constant defining which portion of
max_field_freq_in_struct
> >    the filds should have in order to be peeled.
>
> s/The C/C/
> s/the filds/the fields/
>

done.

> >    If profiling information is provided, its is used to calculate the
>
> s/its is/it is/

done.

>
> >    frequency of field accesses. Otherwise, structure is fully peeled.
>
> /structure is/the structure is/

done.

>
> Oh?  Don't we use estimated execution frequencies to compute this?

We would want to do it. As I have learnt form Jan, the estimated execution
frequencies cannot be used at ipa level right now.

>
> >
> >    The ipa-type-escape analysis are used to provide safety of the
peeling.
>
> "IPA type-escape analysis is used to determine when it is safe to peel
> a structure."
>

done.

>
> >    The optimization is activated by flag -fipa-struct-reorg.
> > */
>
> Closing comment on previous line.

done.

>
>
> > /* Ratio of the structure count to the hottest
> >    structure count to consider the structure cold.  */
> > #define COLD_STRUCTURE_RATIO 10
>
> Better convert into a --param.

done.

>
>
> > struct new_var_data {
> > [ ... ]
> >   struct new_var_data *next;
> > };
>
> Why not use a VEC and avoid having to manage the single-linked list
> yourself?  Similarly with structures malloc_call, malloc_struct and
> struct_list.

done.

>
> > static new_var is_in_new_vars_list (tree, new_var);
> > static void add_struct_to_data_struct_list (tree);
> > static void add_call_to_malloc_list (tree, tree, d_str);
> > static inline void update_fields_mapping (struct field_cluster *, tree,
> >                  struct data_field_entry *, int);
> > static void create_new_var (tree, new_var *);
> > static tree get_final_malloc_stmt (tree malloc_stmt);
> > static tree_list add_tree_to_tree_list (tree, tree_list);
> > static void free_tree_list (tree_list);
> > static malloc_d get_malloc_list_for_func (tree);
> > static struct field_access_site *
> > add_field_acc_to_acc_sites (struct field_access_site *,
> >              struct field_access_site *);
> > static struct access_site * add_access_to_acc_sites (tree, tree,
> >                        struct access_site *);
> > static bool is_result_of_mult (tree, tree *, tree);
> > static void remove_struct_from_data_struct_list (list);
> > static tree gen_size (tree, tree, tree *);
> > static tree gen_cast_stmt (tree, tree, tree, tree *);
> > static void create_new_stmts_for_general_acc (struct access_site *,
d_str);
> > static void create_new_stmts_for_cond_expr (tree);
> > static bool is_equal_types (tree, tree);
> > static tree * find_pos_in_stmt (tree, tree);
> > static void free_data_structure_list (void);
>
> I'd rather see the functions defined in such a way that you don't need
> forward declarations.  This forces the main functions to be grouped at
> the bottom of the file, making it easier to find them later.  It also
> simplifies changing function signatures in the future.
>

done, except mutually calling functions.

> Not a requirement, though.  That's mostly personal preference.
>
>
>
> >   if (TREE_CODE (var) == PARM_DECL)
> >       return DECL_ARG_TYPE (var);
> >
> >   else
> >     return TREE_TYPE (var);
> > }
>
> Spacing.

done.

>
> > /* Check whether the type of VAR is potential candidate for peeling.
> >    Returns true if yes, false otherwise.  If yes, TYPE_P contains
> >    candidate type.  */
> > static bool
> > is_candidate_for_data_struct_list (tree var, tree *type_p,
> >                bitmap non_suitable_types)
>
> No documentation for NON_SUITABLE_TYPES.

done.

>
> > static const char *
> > get_type_name (tree type_node)
> < [ ... ]
> >   if (TREE_CODE (TYPE_NAME (type_node)) == IDENTIFIER_NODE)
> >     return (char *) IDENTIFIER_POINTER (TYPE_NAME (type_node));
>
> s/char *//

done.

>
> >   else if (TREE_CODE (TYPE_NAME (type_node)) == TYPE_DECL
> >       && DECL_NAME (TYPE_NAME (type_node)))
> >     return (char *) IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME
(type_node)));
>
> > /* Print structure type , its name, if it exists, and body.  */
> > static void
> > print_struct_type (tree type, char *indent)
> > {
>
> s/type , /TYPE, /

done.

>
> No documentation for INDENT.
>

done.

> > /* Given structure type SRT_TYPE and field FIELD,
> >    this funciton is looking for a field with the same name
>
> s/funciton/function/

done.

>
> >       char * str_field_name;
> >       char * field_name;
>
> const char *

done.

>
> >       if (!strcmp (str_field_name, field_name))
> >    {
>
> Why?  Wouldn't is_equal_types() be enough here?

It is for fields. They can be of the same type (for example, int),
but have different names ('a' and 'b').

>
> > /* This function is a hack to overcome the types problem.
> >    When number of compilation units are compiled together
>
> s/number of/several/

done.

>
> >    under -combine flag, the TYPE_MAIN_VARIANT cannot
>
> s/under -combine flag/with -combine/

done.

>
> >    be trusted. First we comparing names,  but they do not
> >    always appear, like for example, in array_refs.
>
> I'm not sure what this means.  Could you re-phrase?
>

done.

> >    Only then we are going inside structures.
> > */
>
> Closing comment on previous line.

done.

>
> > static inline struct field_access_site *
> > make_field_acc_node (void)
> > {
> >     return (struct field_access_site *)
> >       xcalloc (1, sizeof (struct field_access_site));
> > }
>
> Indentation.

done.

>
> >       *walk_subtrees=1;
> > [ ... ]
> >       *walk_subtrees=0;
>
> Spacing between '='.  Happens elsewhere too.

done.

>
> >       }
> >       break;
> >     case VAR_DECL:
> >     case SSA_NAME:
>
> Spacing after 'break;'.

done.

>
> >       /* In asm stmt we cannot always track the agruments,
> >        so we just give up.  */
> >       if (TREE_CODE (stmt) == ASM_EXPR)
> >       free_data_structure_list ();
>
> s/agruments/arguments/
>
> If we are giving up, why does the loop keep going?

done.

>
> > static tree
> > build_basic_struct (tree fields, char * name, tree orig_struct)
> > {
> >
> >   tree attributes = NULL_TREE;
>
> No spacing after '{'.

done.

>
> > gen_cluster_name (tree decl, int clust_num, int str_num)
> > {
> >   char * old_name = (char *) get_type_name (decl);
>
> Better declare old_name as 'const char *'.
>
>
> > /* Generate new structure name.  */
> > static inline char *
> > gen_cluster_name (tree decl, int clust_num, int str_num)
> > {
> >   char * old_name = (char *) get_type_name (decl);
> >   char * new_name;
> >
> >   if (!old_name)
> >     {
> >       old_name = (char *) xmalloc ((10) * sizeof (char));
> >       sprintf (old_name, "struct_%d", str_num);
> >     }
> >
> >   new_name = (char *) xmalloc ((strlen (old_name) + 11) * sizeof
(char));
> >   sprintf (new_name, "%s_sub_%d", old_name, clust_num);
> >
> >   /* Check that this name was not used before.  */
> >   while (maybe_get_identifier (new_name))
> >     {
> >       int len = strlen (new_name) + 3;
> >       char * tmp_name = (char *) xmalloc (len * sizeof (char));
> >       sprintf (tmp_name, "%s.0", new_name);
> >       new_name = tmp_name;
> >     }
> >
> >   return new_name;
> > }
>
> No.  Better use alloca and make sure there are no possibility of
> buffer overflow.  This function leaks memory as well.  See
> create_omp_child_function_name and create_tmp_var_name.
>
> Similarly, in gen_var_name.  Why not use create_tmp_var_name?
>

Both gen_cluster_name and gen_var_name are re-written to be similar to
create_omp_child_function_name.

> > /* Update field_mapping of fields in CLUSTER with NEW_TYPE.  */
> > static inline void
> > update_fields_mapping (struct field_cluster *cluster, tree new_type,
> >                      struct data_field_entry * fields, int num_fields)
>
> No documentation for FIELDS and NUM_FIELDS.

done.

>
> > static struct field_access_site *
> > is_in_field_acc_list (tree stmt, struct field_access_site *list_p)
> > {
> >   struct field_access_site *curr;
> >
> >   for (curr = list_p; curr; curr = curr->next)
> >     if (curr->stmt == stmt)
> >       return curr;
>
> Would it make sense to keep a pointer-map or a hashtable to avoid
> these linear searches?

done.

>
> > static struct access_site *
> > is_in_acc_list (tree stmt, struct access_site *list_p)
> > {
> >   struct access_site *curr;
> >
> >   for (curr = list_p; curr; curr = curr->next)
> >     if (curr->stmt == stmt)
> >       return curr;
>
> Likewise here.

done.

>
> > struct type_wrapper
> > {
> >   /* 0 stand for pointer wrapper,
> >      and 1 for array wrapper.  */
> >   bool wrap;
> >   tree domain; /* Relevant for arrays as domain or index.  */
> >   struct type_wrapper * next;
> > };
>
> Similar comment as the other structures.  It seems easier to put these
> inside VECs instead of having to manage the single-linked lists on
> your own.

done.

>
> > static void
> > create_new_local_vars (void)
> > {
>
> This function looks like it's doing too much work.  It's going over
> the same DECLs over and over again.  Why not just traverse
> referenced_vars?
>

done.

>
> >    is converted into followinf set of stmts:
>
> s/into followinf/into the following/

done.

>
>
> > static void
> > add_bb_info (basic_block bb, tree stmt_list)
> > {
> >   if (TREE_CODE (stmt_list) == STATEMENT_LIST)
> >     {
> >       tree_stmt_iterator tsi;
> >       for (tsi = tsi_start (stmt_list); !tsi_end_p (tsi);
> >       tsi_next (&tsi))
>
> tsi_next() fits in the previous line.

done.

>
> > /* This function returns a tree representing
> >    the number of structure instances allocated by MALLOC_STMT.  */
> > gen_num_of_structs_in_malloc (tree stmt, tree str_decl,
> >                tree *new_stmts_p)
>
> new_stmts_p fits in the previous line.

done.

>
> Missing documentation for NEW_STMTS_P and STR_DECL.  I suppose that
> MALLOC_STMT is STMT?

done.

>
>
> >   /* Get malloc agrument.  */
>
> s/agrument/argument/

done.

>
>
> >   /* Generate argument to malloc as multiplication of num and size
> of new_type.  */
>
> Line's too long.

done.

>
> > /* This function build an edge between BB and E->dest and updates
>
> s/build/builds/

done.

>
>
> > /* If MALLOC_STMT is D.2225_7 = malloc (D.2224_6);
> >    it can be casting stmt as for example:
> >    p_8 = (struct str_t *) D.2225_7;
> >    which is returned by this function.  */
>
> It's not clear what this means.  Could you rephrase?  Perhaps
> something like "Return the first statement that uses the result of the
> call to MALLOC_STMT."

done.

>
> >   return build3 (COMPONENT_REF, TREE_TYPE (field),
> >        base, field, NULL_TREE);
>
> No need to wrap this line.

done.

>
> >   struct type_wrapper * wrapper = NULL;
> >   struct type_wrapper * wr = NULL;
>
> No space after '*'.

done.

>
> >   if (TREE_CODE (acc->stmt) == GIMPLE_MODIFY_STMT)
> >     {
> >
> >       lhs = GIMPLE_STMT_OPERAND (acc->stmt, 0);
>
> No space after '{'

done.

>
> > /* In this function we create new stmts that has the same
> >    form as ORIG_STMT, but of the type NET_TYPE. The stmts
>
> s/stmts/statements/
> s/that has/that have/
> s/of the type/of type/


done.

>
> >    treated by this function are simple assignments,
> >    like assignments:  p.8_7 = p; or stmts with rhs of code
> >    PLUS_EXPR or MINUS_EXPR.  */
> > static tree
> > create_base_plus_offset (tree orig_stmt, tree new_type,
> >           tree offset)
> > {
> > [ ... ]
> >    gcc_assert ( tmp0 || tmp1);
>
> No space after '('.

done.

>
> >    if (!new_op0)
> >      new_op0 = offset;
> >    if (!new_op1)
> >      new_op1 = offset;
> >
> >    new_rhs = build2 (TREE_CODE (rhs), TREE_TYPE (new_op0),
> >            new_op0, new_op1);
>
> If RHS is a POINTER_PLUS_EXPR and new_op0 is OFFSET, you'll likely
> create an invalid expression here.

true. If RHS is a POINTER_PLUS_EXPR, then new_op0 is found by
find_new_var_of_type, and won't be equal to OFFSET.

>
>
> >   new_stmt = build2 (GIMPLE_MODIFY_STMT, TREE_TYPE (new_lhs),
> >            new_lhs, new_rhs);
>
> Statements have no type.  That should be void_type_node.  Similarly in
> other places where assignments are created.

changed to use build_gimple_modify_stmt ().

>
> > static tree
> > gen_cast_stmt (tree before_cast, tree new_type, tree
> old_cast_stmt, tree *res_p)
>
> This line is too long.

done.

>
>
> > static void
> > create_new_field_access (struct field_access_site *acc, struct
> data_field_entry field)
>
> Likewise.

done.

>
>
>
> > /* This function creates new non-field accesses for
> >    each new type in STR.  */
> > static void
> > create_new_stmts_for_general_acc (struct access_site * acc, d_str str)
> > {
>
> Missing documentation for ACC.

done.

>
> > /* This function creates new condition stmt and
> >    redirect edges to correspondent BBs.
> >    NEW_VAR position in stmt is POS.  */
> > static void
> > create_new_stmts_for_cond_expr_1 (tree new_var, tree cond_stmt, bool
pos)
> > {
>
> Missing documentation for POS and COND_STMT.

done.

>
> s/correspondent/corresponding/

done.

>
> >   new_cond = copy_node (COND_EXPR_COND (cond_stmt));
> >
>
> Better call unshare_expr() here.  Similarly in other calls to
> copy_node that copy a statement.

done.

>
> >   new_stmt = build3 (COND_EXPR,TREE_TYPE (cond_stmt),
> >            new_cond, NULL_TREE, NULL_TREE);
>
> Space after first comma.

done.

>
> >   if (!list_p)
> >     return NULL;
> >   else
> >     {
> >       node = list_p;
> >       while (node)
> >    {
> >      if (node->orig_var == decl)
> >        return node;
> >      node = node->next;
> >    }
> >     }
> >   return NULL;
>
> As before.  Wouldn't it be better to implement this on a pointer-map
> or hash table?

done.

>
> > /* Stage 3.  */
> > static void
> > do_reorg (void)
>
> Stage 3?  What does that mean?

a comment is added.

>
>
> >   /* Do reorg for each funciton separately.  */
>
> s/funciton/function/

done.

>
>
> Several functions have documentation missing for arguments.  Also,
> there should be a blank line between function comments and the start
> of the function.

done.

>
> All in all, the patch seems fine.  The major things I noticed were
> the, perhaps unnecessary, use of manually-maintained single-linked
> lists and the O(N) searching.  Those could be solved with VECs and
> pointer-maps/hashtables.

no link lists any more, only vectors and hashtables.

Bootstrapped on ppc970.

Ok for mainline?

Olga

ChangeLog

2007-10-21  Olga Golovanevsky  <olga@il.ibm.com>

      * ipa-struct-reorg.c, ipa-struct-reorg.h: New files.
      * tree-pass.h: Add pass_ipa_struct_reorg.
      * common.opt: Add ipa-struct-reorg flag.
      * Makefile.in: Add ipa-strcut-reorg.o compilation.
      * passes.c: Add pass pass_ipa_struct_reorg.
      * params.h:  Add STRUCT_REORG_COLD_STRUCT_RATIO.
      * params.def: Add PARAM_STRUCT_REORG_COLD_STRUCT_RATIO.

(See attached file: ipa-struct-reorg.txt)(See attached file:
ipa-struct-reorg.h)(See attached file: ipa-struct-reorg.c)

Attachment: ipa-struct-reorg.txt
Description: Text document

Attachment: ipa-struct-reorg.h
Description: Binary data

Attachment: ipa-struct-reorg.c
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]