This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [trans-mem][rfc] Improvements to uninstrumented code paths
- From: Torvald Riegel <triegel at redhat dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Aldy Hernandez <aldyh at redhat dot com>
- Date: Thu, 08 Nov 2012 13:33:09 +0100
- Subject: Re: [trans-mem][rfc] Improvements to uninstrumented code paths
- References: <509AE85C.2000107@redhat.com>
On Wed, 2012-11-07 at 15:01 -0800, Richard Henderson wrote:
> I wrote the second of these patches first, and I'm uncertain about the
> desirability of the first of the patches.
>
> While working on the uninstrumented code path bulk patch, I noticed that
> nested transactions within the copy of the outermost transaction
I assume "within" refers to within the same static scope of the
function?
> were
> not being processed for an uninstrumented code path, and so were only
> receiving an instrumented path. This is clearly less than ideal when
> considering HTM.
Yes, it would be better if nested transactions would have an
uninstrumented code path too.
> Now, it seemed to me that if we're already in an uninstrumented code
> path, we're extremely likely to want to stay in one. This is certainly
> true for HTM, as well as when we've selected the serialirr method. I
> can't think off hand of any other reason we'd be on the uninstrumented
> code path.
>
> Therefore the second patch arranges for all nested transactions in the
> uninstrumented path to _only_ have uninstrumented paths themselves.
This is only possible if the nested transaction cannot abort. If it
potentially can, then we must have an instrumented code path for it.
> While reviewing the results of this second patch in detail, I noticed
> that nested transactions on the instrumented code paths were not
> generating both instrumented and uninstrumented code paths. My first
> reaction was that this was a bug, and so I wrote the first patch to
> fix it.
>
> But as I was reviewing the patch to write the changelog, I began to
> wonder whether the same logic concerning the original instrumented/
> uninstrumented choice applies as well to the instrumented path.
>
> Is it ever likely that we'd choose an uninstrumented path for a
> nested transaction, given that we're already executing the instrumented
> path for an outer transaction?
It could be a performance benefit in cases like
__transaction_relaxed {
// something
if (foo) {
unsafe_action();
transaction_relaxed {
// lots of code
}
}
}
Here, we go serial at unsafe_action and can then cause less overall
slowdown when running "lots of code" as uninstrumented (but it's just
avoiding the per-memory-access function calls). If the txn can acquire
the serial lock and commit afterwards, then it doesn't have to restart
at the outermost txn.
I can't say whether such cases would appear often enough in practice to
offset the code size increase (and thus potential slowdowns) caused by
having uninstrumented code paths available too.
> It now seems to me that the only time we'd switch from instrumented
> to uninstrumented code path would be during a transaction restart,
> after having selected to retry with a serialirr method.
>
> Which means that I should apply the second patch only,
>
> Thoughts?
Another thing to consider is whether flat nesting could be used: for
nested transactions that have hasNoAbort set, we can elide txn
begin/commit calls and just embed the nested txn body in the enclosing
txn.
This would be a benefit on uninstrumented/uninstrumented code path pairs
because the nested txn can't do anything but keep running. For
instrumented/instrumented (with hasNoAbort set), we could also flatten
but there is a performance trade-off: the TM runtime lib could do closed
nesting on data conflicts, so that it tries to roll back only the inner
txn on a conflict. The potential benefit of such a scheme depends a lot
on the workload, and libitm currently also flattens txns that do not
abort. OTOH, the overhead of the begin/commit calls for nested txns
should also be relatively small compared to overall STM/instrumented
overheads.
So, to summarize:
- If the txn could abort, then it must have instrumented and could have
uninstrumented.
- If the nested txn does not abort, then:
- For uninstrumented / uninstrumented nesting, flat nesting makes
sense (and thus we don't need instrumented for the nested).
- For instrumented / instrumented, we could flatten, but unsure
whether that significantly improves performance.
- Nested uninstrumented could be useful with enclosing nested, but
perhaps this doesn't happen often enough to offset the code size
increase.
Thoughts?
Torvald