[trunk]: patch to properly build the conflict graph

Kenneth Zadeck zadeck@naturalbridge.com
Thu Sep 13 14:19:00 GMT 2007

This patch replaces the code in global.c that builds the interference
graph and the reload insn chain.  The new technique differs from the old
in several ways:

1) The new technique uses the DF_LIVE problem, which is the intersection
of the forwards live information and the backwards live information.
This is the dataflow problem that the rest of the back end of the
compiler uses.  The existing implementation is based on a custom problem
(UREC) that was "converted" to use the df framework from
global.c:make_accurate_live_analysis, but was still a bastard problem.
A significant amount of this patch is the removal of the UREC dataflow

2) The new technique properly tracks subreg usage, at least locally.
When a use of a subreg is seen, it only makes the part of the subreg
that it accesses live, not the whole register.  When a set is seen for
that same part of the register, the set is allowed to kill the entire
register.  This part seems to account for most of the improvement
seen.  This tracking is at the resolution of bytes and only for SUBREG
operations, it is assumed that the entire register is accessed in the
case of a ZERO_EXTRACT and its friends.

3) The new formulation uses a backwards scan rather than a forwards
scan.  This is the way that every other compiler in the world builds
the interference graph, and for good reason.  There is no use of
REG_DEAD or REG_UNUSED notes when building the graph.  The
combination of a backwards scan at tracking the subregs actually
discovers fewer interferences:

consider the case where a multiword reg gets mapped into a series of
hard regs but only one of those hardregs is actually used.  In a
forwards scan, any register live will interfere with ANY of those hard
regs.  In a backwards scan, the uses cause regs to go live, so only the
hard regs actually used will create interferences.

There have been proposals to generate SUBREG_DEAD and
SUBREG_UNUSED notes but the proper thing to do is a backwards
scan.  It may be that the current reload/global stack is not up to taking
advantage of this yet, but vlad has plans for replacing the ra, and
hopefully the new one will be equipped to use this.

Additionally this patch makes the DF_*_TOP sets unnecessary.  These
sets represented the dataflow information at the top of the block but
after consideration of the artificial uses and defs at the top of the
block.  These take both time and space to represent, and are
unnecessary if backward scans are used.

4) Early clobbers are handled more precisely.  In the UREC problem, used
in the existing implementation, as well as
global.c:make_accurate_live_analysis before the df commit, an early
clobber forces any input reg that could be allocated to the
early_clobbering output reg to be live, starting at the beginning of the
basic block.  This is insane, especially when optimizations have been
performed to try to make large basic blocks.

The new implementation implements the exact definition of early
clobber:  any input reg that dies in the insn and could be allocated to
the same register as the early clobber output is made to interfere with
that output.

5) The interference graph builder and the insn chain builder do not
scan insns, they use the df_scan information.

6) There is nothing like the call record_conflicts in the new code.
Chaitin showed that this was useless 20 years ago.

7) The code that set the preferences has been moved into global itself
since this will most likely die in vlad's new implementation.

I put the new conflict builder in a new file ra-conflicts.c since this
code is likely to survive vlad's new ra; it seemed like this would make
that integration easier.

This code has been bootstrapped and regression tested on ppc-64, ppc-32
x86-64, x86-32, ia-64, and the spu.  sparc testing is in progress.

I apologize for the large patch.  The problem with messing around with
the register allocators/reload is that there is sometimes no place to
stop.   Just changing the interference graph builder to do a backwards
scan, admittedly a more precise scan, slowed things down because it
meant that we had to add the calculation of the DF_*_TOP sets to the
DF_LIVE problem which is used through out the back end.  To get rid of
that, we changed the build_insn_chain to also use a backwards scan and
then we changed local to just use more conservative information.

We are seeing small gains on spec benchmarks for the x86-64.  We hope
to see the same kinds of gains for the ppc-64 which is in progress.
However, a lot of the gain from this patch is lost on the current
register allocator.  I expect that this patch will enhance vlads
better allocator more the current global allocator.

This patch slows down the compilation of the gcc .i files at -O2 by
less than .05% on the x86-64.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: nourec10.diff
Type: text/x-patch
Size: 139836 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20070913/44dcc3f8/attachment.bin>

More information about the Gcc-patches mailing list