This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: struct-reorg optimization
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Olga Golovanevsky <OLGA at il dot ibm dot com>
- Cc: Daniel Berlin <dberlin at dberlin dot org>, Diego Novillo <dnovillo at google dot com>, Kenneth Zadeck <zadeck at naturalbridge dot com>, Jan Hubicka <jh at suse dot cz>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 20 Jul 2007 15:48:01 +0200
- Subject: Re: struct-reorg optimization
- References: <OF7F6DF635.65D3E497-ONC2257314.0031B431-C2257314.0031D9BD@il.ibm.com>
Olga,
I made first pass through the patch. It seems sane overall, I will get
into more detail on the new pass itself in followup.
Can you please split ipa-type-escape changes into separate patch? I
think those can go in quite easilly independently.
Since you started on teaching ipa-type-escape about SSA world, it would
be nice to move it away from walk_tree into use of SSA operand
infrastructure like our regular IPA passes does. But that would be
independent patch anyway.
Honza
>
> This patch implements structure peeling, that is one of struct-reorg
> optimizations developed on struct-reorg-branch. A structure type can be
> peeled into separate fields, or groups of them, according to profile
> information, or deliberately. The gain is to utilize spatial locality,
> when, for example, one of the fields is sequentially accessed through
> array elements. Then separating this field from others allows to make
> separation in array allocation, and thus release cache from irrelevant
> data fetched with other fields.
>
> The full description of this optimization is at the beginning of
> struct-reorg.c (attached below).
>
> The optimization is global, both types of accesses - through pointers
> and array refs - are handled. When allocation is made through malloc,
> it is replaced by allocations of peeled structures. Only build-in malloc
> is supported right now, but it can be easily extended.
>
> In this patch, the decision how to peel is made by frequency based model,
> where frequency of field accesses can be calculated through profiling or
> statically predicted as with -fbranch-probabilitied flag. When there is
> no such info, structure gets completely peeled. (The struct-reorg-branch
> contains much more accurate model which is based on the calculation of
> distance between accesses, but it's also heavier.)
> The -fipa-struct-reorg flag activates this optimization.
>
> The patch makes strong use of ipa-type-escape analysis, that provide
> safety of this optimization. Also changes required for ipa-type-escape
> to support POITER_PLUS_EXPR are part of this patch.
>
> With this patch:
> - two structures f1_neuron and xyz got peeled in 179. art ,
> +44% w/o profiling and +33% with profiling
>
> - the structures NEXT_MOVE got peeled, the structure CHESS_POSITION
> can be peeled, but interfere with another optimization that
> generated BIT_FIELD_REFs in 186.crafty; many other structures escape,
> no influence on performance
>
> - the structure MMNODE got peeled; many others escape,
> no influence on performance
>
> Bootstraped and tested (except fortran) on ppc .
>
> :ADDPATCH ipa ssa:
>
> Ok for mainline?
>
> 2007-07-09 Olga Golovanevsky <olga@il.ibm.com>
>
> * struct-reorg.c, struct-reorg.h: New files.
> * ipa-type-escape.h: Expose function
> is_array_access_through_pointer_and_index.
> * ipa-type-escape.c
> (is_array_access_through_pointer_and_index):
> Add three new parameters. Add support of
> POINTER_PLUS_EXPR code.
> * tree.h: Add pass_ipa_struct_reorg.
> * common.opt: Add ipa-struct-reorg flag.
> * Makefile.in: Add strcut-reorg.o compilation.
> * passes.c: Add pass pass_ipa_struct_reorg.
>
> (See attached file: struct-reorg.txt)(See attached file: struct-reorg.c)
> (See attached file: struct-reorg.h)
> Index: tree-pass.h
> ===================================================================
> --- tree-pass.h (revision 126190)
> +++ tree-pass.h (working copy)
> @@ -331,6 +331,7 @@
> extern struct tree_opt_pass pass_ipa_pure_const;
> extern struct tree_opt_pass pass_ipa_type_escape;
> extern struct tree_opt_pass pass_ipa_pta;
> +extern struct tree_opt_pass pass_ipa_struct_reorg;
> extern struct tree_opt_pass pass_early_local_passes;
> extern struct tree_opt_pass pass_ipa_increase_alignment;
> extern struct tree_opt_pass pass_ipa_function_and_variable_visibility;
> Index: ipa-type-escape.c
> ===================================================================
> --- ipa-type-escape.c (revision 126190)
> +++ ipa-type-escape.c (working copy)
> @@ -926,56 +926,90 @@
>
> */
>
> -static bool
> -is_array_access_through_pointer_and_index (tree op0, tree op1)
> +bool
> +is_array_access_through_pointer_and_index (enum tree_code code, tree op0, tree op1,
> + tree * base, tree * offset,
> + tree * offset_cast_stmt)
> {
> - tree base, offset, offset_cast_stmt;
> tree before_cast, before_cast_def_stmt;
> cast_t op0_cast, op1_cast;
>
> + *base = NULL;
> + *offset = NULL;
> + *offset_cast_stmt = NULL;
> +
> /* Check 1. */
>
> - /* Init data for walk_use_def_chains function. */
> - op0_cast.type = op1_cast.type = 0;
> - op0_cast.stmt = op1_cast.stmt = NULL;
> + if (code == POINTER_PLUS_EXPR)
> + {
> + tree op0type = TYPE_MAIN_VARIANT (TREE_TYPE (op0));
> + tree op1type = TYPE_MAIN_VARIANT (TREE_TYPE (op1));
>
> - visited_stmts = pointer_set_create ();
> - walk_use_def_chains (op0, is_cast_from_non_pointer,(void *)(&op0_cast), false);
> - pointer_set_destroy (visited_stmts);
> + /* One of op0 and op1 is of pointer type and the other is numerical. */
> + if (POINTER_TYPE_P (op0type)
> + && NUMERICAL_TYPE_CHECK (op1type))
> + {
> + *base = op0;
> + *offset = op1;
> + }
> + else if (POINTER_TYPE_P (op1type)
> + && NUMERICAL_TYPE_CHECK (op0type))
> + {
> + *base = op1;
> + *offset = op0;
> + }
> + else
> + return false;
> + }
> + else
> + {
> + /* Init data for walk_use_def_chains function. */
> + op0_cast.type = op1_cast.type = 0;
> + op0_cast.stmt = op1_cast.stmt = NULL;
>
> - visited_stmts = pointer_set_create ();
> - walk_use_def_chains (op1, is_cast_from_non_pointer,(void *)(&op1_cast), false);
> - pointer_set_destroy (visited_stmts);
> + visited_stmts = pointer_set_create ();
> + walk_use_def_chains (op0, is_cast_from_non_pointer,(void *)(&op0_cast), false);
> + pointer_set_destroy (visited_stmts);
>
> - if (op0_cast.type == 1 && op1_cast.type == 0)
> - {
> - base = op1;
> - offset = op0;
> - offset_cast_stmt = op0_cast.stmt;
> + visited_stmts = pointer_set_create ();
> + walk_use_def_chains (op1, is_cast_from_non_pointer,(void *)(&op1_cast), false);
> + pointer_set_destroy (visited_stmts);
> +
> + if (op0_cast.type == 1 && op1_cast.type == 0)
> + {
> + *base = op1;
> + *offset = op0;
> + *offset_cast_stmt = op0_cast.stmt;
> + }
> + else if (op0_cast.type == 0 && op1_cast.type == 1)
> + {
> + *base = op0;
> + *offset = op1;
> + *offset_cast_stmt = op1_cast.stmt;
> + }
> + else
> + return false;
> }
> - else if (op0_cast.type == 0 && op1_cast.type == 1)
> - {
> - base = op0;
> - offset = op1;
> - offset_cast_stmt = op1_cast.stmt;
> - }
> - else
> - return false;
>
> /* Check 2.
> offset_cast_stmt is of the form:
> D.1606_7 = (struct str_t *) D.1605_6; */
>
> - before_cast = SINGLE_SSA_TREE_OPERAND (offset_cast_stmt, SSA_OP_USE);
> - if (!before_cast)
> - return false;
> + if (*offset_cast_stmt)
> + {
> + before_cast = SINGLE_SSA_TREE_OPERAND (*offset_cast_stmt, SSA_OP_USE);
> + if (!before_cast)
> + return false;
>
> - if (SSA_NAME_IS_DEFAULT_DEF(before_cast))
> - return false;
> + if (SSA_NAME_IS_DEFAULT_DEF(before_cast))
> + return false;
>
> - before_cast_def_stmt = SSA_NAME_DEF_STMT (before_cast);
> - if (!before_cast_def_stmt)
> - return false;
> + before_cast_def_stmt = SSA_NAME_DEF_STMT (before_cast);
> + if (!before_cast_def_stmt)
> + return false;
> + }
> + else
> + before_cast_def_stmt = SSA_NAME_DEF_STMT (*offset);
>
> /* before_cast_def_stmt should be of the form:
> D.1605_6 = i.1_5 * 16; */
> @@ -1449,7 +1483,6 @@
> okay_pointer_operation (enum tree_code code, tree op0, tree op1)
> {
> tree op0type = TYPE_MAIN_VARIANT (TREE_TYPE (op0));
> - tree op1type = TYPE_MAIN_VARIANT (TREE_TYPE (op1));
>
> switch (code)
> {
> @@ -1459,11 +1492,15 @@
> break;
> case MINUS_EXPR:
> case PLUS_EXPR:
> + case POINTER_PLUS_EXPR:
> {
> - if (POINTER_TYPE_P (op1type)
> + tree base, offset, offset_cast_stmt;
> +
> + if (POINTER_TYPE_P (op0type)
> && TREE_CODE (op0) == SSA_NAME
> && TREE_CODE (op1) == SSA_NAME
> - && is_array_access_through_pointer_and_index (op0, op1))
> + && is_array_access_through_pointer_and_index (code, op0, op1, &base,
> + &offset, &offset_cast_stmt))
> return true;
> else
> {
> @@ -1541,7 +1578,7 @@
> of circumstances and if the moon is in the correct
> place could be safe, but it is hard to see how this
> is worth the effort. */
> -
> +
> if (type0 && POINTER_TYPE_P (type0)
> && !okay_pointer_operation (TREE_CODE (rhs), op0, op1))
> mark_interesting_type (type0, FULL_ESCAPE);
> Index: ipa-type-escape.h
> ===================================================================
> --- ipa-type-escape.h (revision 126190)
> +++ ipa-type-escape.h (working copy)
> @@ -27,6 +27,7 @@
> bool ipa_type_escape_field_does_not_clobber_p (tree record_type, tree field_type);
> int ipa_type_escape_star_count_of_interesting_type (tree type);
> int ipa_type_escape_star_count_of_interesting_or_array_type (tree type);
> +bool is_array_access_through_pointer_and_index (enum tree_code, tree, tree, tree *, tree *, tree *);
>
>
> #endif /* GCC_IPA_TYPE_ESCAPE_H */
> Index: common.opt
> ===================================================================
> --- common.opt (revision 126190)
> +++ common.opt (working copy)
> @@ -601,6 +601,11 @@
> Perform matrix layout flattening and transposing based
> on profiling information.
>
> +fipa-struct-reorg
> +Common Report Var(flag_ipa_struct_reorg)
> +Perform structure layout optimizations based
> +on profiling information.
> +
> fivopts
> Common Report Var(flag_ivopts) Init(1) Optimization
> Optimize induction variables on trees
> Index: Makefile.in
> ===================================================================
> --- Makefile.in (revision 126190)
> +++ Makefile.in (working copy)
> @@ -1185,6 +1185,7 @@
> ipa-utils.o \
> ipa.o \
> matrix-reorg.o \
> + struct-reorg.o \
> tree-inline.o \
> tree-nomudflap.o \
> varpool.o
> @@ -2449,6 +2450,10 @@
> pointer-set.h $(GGC_H) $(IPA_TYPE_ESCAPE_H) $(IPA_UTILS_H) $(C_COMMON_H) \
> $(TREE_GIMPLE_H) $(CGRAPH_H) output.h $(FLAGS_H) tree-pass.h \
> $(DIAGNOSTIC_H) $(FUNCTION_H)
> +struct-reorg.o: struct-reorg.c struct-reorg.h $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> + $(TM_H) $(RTL_H) $(TREE_H) $(FLAGS_H) output.h $(REGS_H) $(EXPR_H) $(FUNCTION_H) \
> + toplev.h $(GGC_H) $(TARGET_H) langhooks.h $(COVERAGE_H) libfuncs.h \
> + gt-coverage.h $(HASHTAB_H) $(IPA_TYPE_ESCAPE_H)
>
> coverage.o : coverage.c $(GCOV_IO_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h \
> $(TM_H) $(RTL_H) $(TREE_H) $(FLAGS_H) output.h $(REGS_H) $(EXPR_H) \
> @@ -3011,7 +3016,7 @@
> $(srcdir)/reload.h \
> $(srcdir)/alias.c $(srcdir)/bitmap.c $(srcdir)/cselib.c $(srcdir)/cgraph.c \
> $(srcdir)/ipa-prop.c $(srcdir)/ipa-cp.c $(srcdir)/ipa-inline.c $(srcdir)/matrix-reorg.c \
> - $(srcdir)/dbxout.c $(srcdir)/dwarf2out.c $(srcdir)/dwarf2asm.c \
> + $(srcdir)/dbxout.c $(srcdir)/struct-reorg.c $(srcdir)/dwarf2out.c $(srcdir)/dwarf2asm.c \
> $(srcdir)/dojump.c \
> $(srcdir)/emit-rtl.c $(srcdir)/except.c $(srcdir)/explow.c $(srcdir)/expr.c \
> $(srcdir)/function.c $(srcdir)/except.h \
> Index: passes.c
> ===================================================================
> --- passes.c (revision 126190)
> +++ passes.c (working copy)
> @@ -542,6 +542,7 @@
> NEXT_PASS (pass_ipa_pure_const);
> NEXT_PASS (pass_ipa_type_escape);
> NEXT_PASS (pass_ipa_pta);
> + NEXT_PASS (pass_ipa_struct_reorg);
> *p = NULL;
>
> /* These passes are run after IPA passes on every function that is being