This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [trans-mem] rewrite transaction lowering
- From: Albert Cohen <Albert dot Cohen at inria dot fr>
- To: Richard Henderson <rth at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Fri, 24 Oct 2008 00:07:07 +0200
- Subject: Re: [trans-mem] rewrite transaction lowering
- References: <48F8F7D7.9020505@redhat.com>
Richard Henderson wrote:
The lowering of transactions is now much more to my liking. It's a 5
step process, described in detail in a block comment at the beginning of
gtm-low.c. The important functional changes are (1) we properly commit
the transaction when we leave the region via an exception, and (2) we
accurately describe the CFG, and the abnormal edges created by the
longjmp used to restart or abort the transaction.
Thanks a lot Richard for getting so quickly into the GTM code. Martin
and I are thrilled to see that some of our long-standing issues have
been solved so nicely.
Still to do is rtl expansion of the TM_LOAD/STORE nodes, rewriting
creation of the transactional clones as a real IPA pass, and duplicating the
transaction region without annotations as a fast-path when the
transaction gets (re-)started in sequential mode.
Indeed.
We also expect a significant performance boost from moving to a real IPA pass.
Not being able to mark the read/write barriers with the proper VUSE and VDEF
is known to seriously inhibit optimizations that would normally apply to
the non-instrumented variant of a function (or outside a tm_atomic block).
We also expect many redundancies to disappear once the instrumentation
(not only the checkpointing) operates on SSA form, after PRE has removed
redundant load/store instructions.
It may also help a lot to remove barriers made redundant by read/write
locks already acquired by earlier barriers (assuming a write lock subsumes
a read lock, many such redundancies occur; see Tabatabai et al.'s CGO'07
paper).
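
To make the redundancy argument concrete, here is a minimal sketch of the
idea on a straight-line sequence of barriers inside one transaction. It is
not GTM code: the types, the MAX_LOCS bound and the function name
mark_needed_barriers are all hypothetical, and it assumes, as stated above,
that a write lock subsumes a read lock on the same location.

```c
#include <stddef.h>

/* Hypothetical model, not taken from GTM: each barrier touches one of
   MAX_LOCS tracked locations.  */
#define MAX_LOCS 16

enum barrier_kind { BARRIER_READ, BARRIER_WRITE };

/* Ordered so that a stronger lock compares greater than a weaker one.  */
enum lock_state { LOCK_NONE, LOCK_READ, LOCK_WRITE };

struct barrier {
    enum barrier_kind kind;
    unsigned loc;               /* index of the location, < MAX_LOCS */
};

/* Scan a straight-line sequence of barriers and set needed[i] to 1 if
   barrier i must be kept, or 0 if a lock acquired by an earlier barrier
   already covers it.  Returns the number of barriers kept.  */
size_t
mark_needed_barriers(const struct barrier *seq, size_t n, int *needed)
{
    enum lock_state held[MAX_LOCS];
    size_t kept = 0;

    for (size_t i = 0; i < MAX_LOCS; i++)
        held[i] = LOCK_NONE;

    for (size_t i = 0; i < n; i++) {
        enum lock_state want =
            seq[i].kind == BARRIER_WRITE ? LOCK_WRITE : LOCK_READ;

        if (held[seq[i].loc] >= want) {
            needed[i] = 0;      /* already covered: redundant */
        } else {
            needed[i] = 1;      /* acquires or upgrades the lock */
            held[seq[i].loc] = want;
            kept++;
        }
    }
    return kept;
}
```

A real pass would of course have to work on the CFG and reason about
aliasing rather than on location indices; the sketch only shows why a read
barrier following a write barrier on the same location can go away.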
We have seen these problems occurring in real-world parallelization
experiments, when extending the par-loops pass with automatic TM
insertion for generalized reduction support. Andrea Marongiu from U.
Bologna is investigating this issue, and any feedback or idea or
benchmark motivating this approach would be very welcome.
What I generate is no longer compatible with TinySTM. It's closer to
what I believe the final ABI should look like. At some point I'll
either import some existing GPL STM library (TinySTM, unless someone has
a better suggestion) or start a new one from scratch. We'll see...
I just noticed one stupid thing that reminds me of some license problem
we had with Graphite... TinySTM is GPLv2, not GPLv2+ :-( I guess the
problem could be trivially solved by updating the license in this case,
as the copyright holders are well identified. Yet TinySTM is not a very
big piece of code, and it would indeed make sense to derive a
GCC-specific STM from it. I don't see any urgency here, however.
Also, one emphasis of our GTM project is to help TM research (Martin
Schindewolf is funded by the HiPEAC European research network,
http://www.hipeac.net). This means being able to retarget GTM to various
TM runtimes, including hybrid ones (e.g., for Sun's HW, or for
simulators). This emphasis should not inhibit a quick path to
production-level support for TM in GCC, but interchangeability of the
runtime is clearly important, as the existing runtimes are far from
mature and many improvements could arise from third-party research and
developments.
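
As a rough illustration of what an interchangeable runtime could mean for
the generated code, here is a minimal sketch in which the instrumented code
only talks to a table of entry points. Everything in it is an assumption
made for the example: struct tm_runtime, serial_runtime and tm_increment
are hypothetical names, not the GTM ABI, and the "serial" back end simply
accesses memory directly and never aborts.

```c
#include <stdint.h>

/* Hypothetical runtime interface; none of these names come from GTM.  */
struct tm_runtime {
    const char *name;
    void (*begin)(void);
    uintptr_t (*load)(const uintptr_t *addr);
    void (*store)(uintptr_t *addr, uintptr_t val);
    int (*commit)(void);        /* nonzero on success, 0 to retry */
};

/* Trivial "serial" back end: no conflict detection, commit always
   succeeds.  It only tracks the nesting depth.  */
static unsigned serial_depth;

static void serial_begin(void) { serial_depth++; }
static uintptr_t serial_load(const uintptr_t *addr) { return *addr; }
static void serial_store(uintptr_t *addr, uintptr_t val) { *addr = val; }
static int serial_commit(void) { serial_depth--; return 1; }

const struct tm_runtime serial_runtime = {
    "serial", serial_begin, serial_load, serial_store, serial_commit
};

/* What instrumented code for "counter++" inside an atomic block might
   look like when it is written against the table instead of a fixed
   library.  The loop retries until the runtime reports a commit.  */
void
tm_increment(const struct tm_runtime *rt, uintptr_t *counter)
{
    do {
        rt->begin();
        rt->store(counter, rt->load(counter) + 1);
    } while (!rt->commit());
}
```

Swapping in another STM, a hybrid runtime or a simulator would then amount
to providing another table, e.g. tm_increment(&serial_runtime, &counter)
versus the same call with a different runtime pointer. An indirect call per
barrier has a cost, so a production path might prefer link-time selection.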
Albert