Differences between revisions 36 and 37
Revision 36 as of 2009-03-28 10:41:34
Size: 8792
Revision 37 as of 2009-04-03 21:48:33
Size: 7103
Deletions are marked like this. Additions are marked like this.
Line 2: Line 2:

The branch is merged as of SVN revision 145494.
Line 101: Line 103:
== Pre-Merge TODO ==
Line 106: Line 106:
There are a number of tasks left to do, either on the branch or (prefered)
on trunk after merging the branch
There are a number of tasks left to do.
Line 109: Line 108:
 * push/pop_stmt_changes can be cleaned up, simplified, or even removed.  * push/pop_stmt_changes can be removed.
Line 143: Line 142:

== Testsuite status on x86_64, i?86 ==

FAILs that appear also on trunk and XPASSes are omitted.

=== gcc tests ===

FAIL: gcc.dg/uninit-B.c uninit i warning (test for warnings, line 12)
FAIL: gcc.dg/uninit-pr19430.c (test for warnings, line 32)
FAIL: gcc.dg/uninit-pr19430.c uninitialized (test for warnings, line 41)
Uninitialized memory warnings will no longer work, or rather, the current
implementation will be even more fragile. Action: uninit-B.c to be XFAILed again,
the rest to be XFAILed.
FAIL: gcc.dg/tree-ssa/ssa-fre-10.c scan-tree-dump pre "Insertions: 2"
Weird testcase, it is not clear what it was supposed to test. Action: Move to the compile torture.
Actually PPRE would perform insertion here. See PR38401 for a patch unleashing it (at -O3). There
is also missing DSE in this testcase (because DSE doesn't look through PHIs) and missing "FRE" (the
loads are all from uninitialized memory which we could detect and substitute an uninitialized register
instead). A really weird testcase.
FAIL: gcc.dg/tree-ssa/ssa-pre-24.c scan-tree-dump-not pre "prephitmp"
A new testcase on the branch. We shouldn't insert a prephitmp here.
The issue is that PHI-translating VUSEs should return the same VUSE if the expression does not
die on the other edges of the PHI (thus is loop invariant for example). In general PHI-translation
shouldn't change the expression if its value doesn't change. This may conflict with the purpose
of the VUSE being a stmt location, so further phi-translating a "non"-translated VUSE may confuse
the machinery.

Alias Improvements Branch

The branch is merged as of SVN revision 145494.

(see http://gcc.gnu.org/ml/gcc/2009-01/msg00286.html)

The primary objective of the branch is to transition away from the use of the virtual operand use-def chains as data-flow representation for memory optimizations. Memory optimizers need to transition to the use of the alias-oracle which is a query based system. For the foreseeable future they can still rely on the safety-net the virtual operand use-def chain provides.

Overview of infrastructure changes done on the branch:

  • There is a single memory symbol (.MEM) per function used to build the virtual operand use-def chain. In trunk terms you can view this as equivalent to having a single memory partition containing all symbols. Due to the lack of precision computing and keeping this virtual operand use-def chain is trivial, all code supporting more complex operations has been removed. In particular it can be assumed that at most a single VUSE and VDEF is available per statement. Most code has been simplified according to that assumption.
  • This single-symbol virtual operand use-def chain safety-net is also available during early optimizations (in fact, always if the IL is in SSA form). This removes the need for special-casing memory statements during that compilation phase and enables to use common infrastructure during early (and IPA) optimization phases. It also allows to schedule regular memory optimization passes during early optimization.
  • The virtual operand use-def chain is kept up-to-date by the operand-scanner on the branch. There is no need to manually mark any symbols for renaming (instead this is now useless and costly). As there cannot be magically more, less or different memory symbols required (there's only one) optimization passes usually know if they are removing a load or a store and so can act properly (or rely on the operand scanner which updates virtual SSA form manually in the simple cases of removing virtual operands and otherwise marks .MEM for renaming).
  • The at most two virtual operands (a VUSE and a VDEF) are now real operands (there is a vuse and a vdef member in the gimple structure for memory statements). This means they can share the infrastructure present for regular operands and immediate uses. All infrastructure dealing with the old virtual operands has been removed. The operand structures are linked in the normal use and defs lists, the special lists for virtual uses and defs has been removed. This means that operand iterators have been simplified a lot. The possibility to iterate over only virtual operands has been removed (there is only at most a single one, directly accessible via gimple_v{use,def}_op). Most bulk changes in optimizers had to be done because of this simplification.
  • Global data that was not kept in sync and is either unused on the branch or very hard to keep up-to-date has been removed. This includes the bitmap of addressable variables, the escape type per variable and pointer and the bitmaps of loaded and stored symbols on statements.
  • Points-to information has been abstracted and is directly used for pointer information as well as call-clobber related information. All existing call-clobber related code has been removed. Low-level operations on points-to information are implemented in tree-ssa-structalias.c.
  • Numerous fixes to the points-to analysis code have been done.
  • The alias-oracle present on the trunk has been extended to also use points-to alias information and it has been split up to be more easily available in different contexts. The implementation resides in tree-ssa-alias.c, interfaces are exported from tree-ssa-alias.h.
  • The FRE and PRE passes using the common SCCVN infrastructure have been improved by improving the memory statement walking infrastructure of the alias-oracle and its precision.
  • The tree DSE pass has been improved to compensate the loss of DSE on aliased symbols done by the tree DCE pass on the trunk.
  • The tree DCE pass has been improved to do DSE of non-aliased symbols and thus fully replaces the SDSE (simple DSE) early optimization pass.

The alias-oracle and its interface is contained in the tree-ssa-alias.[ch] files. Common with the trunk is the refs_may_alias_p type of query which has been accompanied by statement based clobber / use queries.

Preliminary performance measurements show comparable performance for SPEC 2000, Polyhedron and our usual set of C++ benchmarks. SPEC 2006 measurements are on the way, all on x86_64. Daily testing can be monitored beyond http://gcc.opensuse.org/, look for "alias-improvements" branch annotation and compare with the other runs on the frescobaldi machine.

In general the trade-off with the new scheme is that walking statements via the virtual use-def links will be slower because possibly (a lot) more statements are visited. This is offsetted by leaner operand iterators, cheaper operand scanning and updating and less memory use for maintaining the virtual operands. And most important, a lot less code to maintain.

Followup Improvements

There are a number of tasks left to do.

  • push/pop_stmt_changes can be removed.
  • The inliner can rewrite the virtual operands itself just like regular operands. The only difficulties are that new pure/const functions are exposed after IPA-pure-const and that new PHI-nodes need to be inserted to merge VDEFs in the inlined function to the VDEF of

    the call statement. Partial patch available.

  • The single pre-function gimple_vop (cfun) DECL can be made per CU. The disadvantage is that cross-function inconsistencies are harder to detect.
  • Most passes that mark symbols for renaming should not need to. This is mostly fixed.

  • PTA should keep both an unpruned points-to set and a TBAA-pruned points-to set. This allows per alias-oracle query decisions on whether TBAA should be applied or not. This also allows (a fixed - see its comments) merge_alias_info to do its job using the unpruned sets.
  • into-SSA could go into virtual SSA form directly, without doing an ssa update

    to rename the virtual operands after into-SSA. into-SSA expert needed.

  • In PTA we can generate a CALLUSED variable and its constraints for each pure call (just like we do with consttmp for const calls).
  • Simple mod/ref analysis can be done in PTA handle_rhs_call for builtins. call-used queries have to be extended to builtin function calls then (or finally a stmt local call-used points-to solution reference can be introduced).
  • PRE should compute a per-BB maximal set, unioning from the exit block upward the CFG. This should make the validity checks for the VUSEs unnecessary and speed up phi-translating the maximal set.
  • Add params for the magic number(s) in the DCE/DSE oracle query limits.
  • Add oracle query limits to SCCVN.

None: Alias_Improvements (last edited 2010-02-14 11:03:09 by RichardGuenther)