[tree-ssa] New type-based aliasing (was: More CFG improvements)
Diego Novillo
dnovillo@redhat.com
Thu Feb 6 13:49:00 GMT 2003
On Tue, 04 Feb 2003, Jeff Law wrote:
> >OK. I'll be doing a couple of tests with the alias sets idea.
> >If it works, I'll post a detailed summary.
> Please do. Though I'm 99.9% certain the alias sets idea won't help
> 20001226-1.c -- everything in that testcase is in the same alias set.
>
Precisely. It's because variables tend to cluster in the same
alias set that we can do this so quickly.
Instead of comparing every variable against every variable, we
compare their alias sets. The new algorithm builds an array of
alias sets where each entry represents a group of variables that
have conflicting types. It works in two phases:
(1) Register alias sets for every INDIRECT_REF. If an
INDIRECT_REF is found to conflict with more than one alias
set (I and J), the entries for I and J are merged into one.
This pass is O(I x S). Where I is the number of INDIRECT_REF
variables in the program and S the number of non-conflicting
types. In the GCC sources, S is a very small number (1-3).
(2) Associate addressables and INDIRECT_REFs to one of the sets
built in (1). Variables are added to the first entry in the
alias sets array that conflicts with it.
This pass is O((A + I) x S) where A is the number of
addressable variables. I and S are as above.
So, we replace an almost quadratic algorithm with a an almost
linear one. In a full bootstrap of the compiler, the most time
we spend doing alias analysis is in java/parse.c (0.98 seconds).
So, the alias analysis pass is fast, very fast. The information
it gives back stinks (did I mention fast?), but type-based
aliasing is supposed to stink. We could probably make it less
compulsive in aggregating alias sets, but I'd rather spend our
energy on getting Daniel's PTA working.
Your favorite testcase can now be enabled again. These are the
timings for 20001226-1.c:
-----------------------------------------------------------------------------
Execution times (seconds)
garbage collection : 0.44 ( 1%) usr 0.02 ( 2%) sys 0.46 ( 1%) wall
cfg construction : 0.31 ( 1%) usr 0.00 ( 0%) sys 0.31 ( 1%) wall
cfg cleanup : 4.52 (11%) usr 0.00 ( 0%) sys 4.51 (11%) wall
trivially dead code : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
life analysis : 1.11 ( 3%) usr 0.07 ( 8%) sys 1.15 ( 3%) wall
life info update : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall
preprocessing : 0.21 ( 1%) usr 0.09 (11%) sys 0.26 ( 1%) wall
lexical analysis : 0.19 ( 0%) usr 0.17 (20%) sys 0.47 ( 1%) wall
parser : 0.64 ( 2%) usr 0.11 (13%) sys 0.70 ( 2%) wall
integration : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall
tree gimplify : 0.67 ( 2%) usr 0.04 ( 5%) sys 0.70 ( 2%) wall
tree CFG construction : 0.06 ( 0%) usr 0.02 ( 2%) sys 0.08 ( 0%) wall
tree alias analysis : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall
tree PHI insertion : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall
tree SSA rewrite : 0.11 ( 0%) usr 0.01 ( 1%) sys 0.11 ( 0%) wall
tree SSA other : 0.21 ( 1%) usr 0.07 ( 8%) sys 0.28 ( 1%) wall
tree CCP : 0.44 ( 1%) usr 0.02 ( 2%) sys 0.45 ( 1%) wall
tree DCE : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall
dominance frontiers : 0.16 ( 0%) usr 0.12 (14%) sys 0.29 ( 1%) wall
expand : 0.27 ( 1%) usr 0.02 ( 2%) sys 0.29 ( 1%) wall
jump : 0.13 ( 0%) usr 0.01 ( 1%) sys 0.13 ( 0%) wall
CSE : 2.64 ( 6%) usr 0.00 ( 0%) sys 2.65 ( 6%) wall
loop analysis : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall
branch prediction : 4.80 (12%) usr 0.02 ( 2%) sys 4.82 (11%) wall
flow analysis : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall
combiner : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall
if-conversion : 9.74 (24%) usr 0.00 ( 0%) sys 9.74 (23%) wall
mode switching : 0.15 ( 0%) usr 0.01 ( 1%) sys 0.16 ( 0%) wall
local alloc : 0.17 ( 0%) usr 0.01 ( 1%) sys 0.17 ( 0%) wall
global alloc : 7.64 (18%) usr 0.02 ( 2%) sys 7.66 (18%) wall
reload CSE regs : 0.23 ( 1%) usr 0.00 ( 0%) sys 0.23 ( 1%) wall
flow 2 : 0.14 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall
if-conversion 2 : 4.87 (12%) usr 0.00 ( 0%) sys 4.87 (12%) wall
rename registers : 0.13 ( 0%) usr 0.01 ( 1%) sys 0.15 ( 0%) wall
shorten branches : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall
final : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall
rest of compilation : 0.24 ( 1%) usr 0.00 ( 0%) sys 0.23 ( 1%) wall
TOTAL : 41.35 0.84 42.20
-----------------------------------------------------------------------------
The patch also "fixes" the problem we've been having with VLAs.
It's a kludge, but it's so small and easy to rip out that we
might as well have it. Before computing aliases, we walk_tree
all the lexical blocks in the function looking for VAR_DECLs
inside an array declaration and mark those variables. DCE will
then consider stores to those variables as inherently useful.
Bootstrapped and tested x86.
* tree-dfa.c (struct alias_tags, alias_tags, num_alias_tags):
Remove. Update all users.
(struct alias_set_d): New.
(alias_sets): New file local variable.
(compute_alias_sets): New function.
(compute_may_aliases): Call it when not doing points-to analysis.
(register_alias_set): New function.
(find_alias_for): New function.
(may_alias_p): Declare static.
Don't assume that VAR may not be aliased if it's a non-addressable
_DECL.
If VAR and PTR are aggregate types, check if they can have a field
that points to the other one.
(find_may_aliases_for): Move handling of global memory aliasing ...
(add_may_alias): ... here.
Also accept the base symbols for the variable and its alias.
(register_new_alias): Remove. Update all users.
(find_alias_tag): Remove. Update all users.
(find_vars_r): Update VAR after re-writing *TP when sharing
INDIRECT_REF nodes.
* tree-flow.h (may_alias_p): Remove declaration.
* tree-ssa.c (rewrite_into_ssa): Include referenced variables in
default debug dumps.
Support for VLAs.
* tree-dfa.c (find_vla_decls): New function.
(compute_may_aliases): Call it.
(find_vla_decls_r): New function.
(dump_variable): Show whether the variable is used in a VLA
declaration.
* tree-flow-inline.h (is_vla_decl): New function.
(set_vla_decl): New function.
* tree-flow.h (struct var_ann_d): Add bitfield 'is_vla_decl'.
* tree-ssa-dce.c (need_to_preserve_store): Return true if SYM is
used inside a VLA declaration.
testsuite/ChangeLog.tree-ssa:
* gcc.c-torture/compile/20001226-1.c: Remove deliberate syntax
error.
More information about the Gcc-patches
mailing list