This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: RFC: S/390 Transactional memory support - save/restore of FPRs
- From: Torvald Riegel <triegel at redhat dot com>
- To: Andreas Krebbel <krebbel at linux dot vnet dot ibm dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Tue, 21 May 2013 16:28:08 +0200
- Subject: Re: RFC: S/390 Transactional memory support - save/restore of FPRs
- References: <20130521124056 dot GA16148 at bart>
On Tue, 2013-05-21 at 14:40 +0200, Andreas Krebbel wrote:
> Hi,
>
> I'm currently implementing support for hardware transactional memory
> in the S/390 backend and ran into a problem with saving and restoring
> the floating point registers.
>
> On S/390 the tbegin instruction starts a transaction. If a subsequent
> memory access collides with another, the transaction is aborted. The
> execution then continues *after* the tbegin instruction. All memory
> writes after the tbegin are rolled back, the general purpose registers
> selected in the tbegin operand are restored, and the condition code is
> set in order to indicate that an abort occurred. The code is then
> supposed to check the condition code and either jump back to the
> transaction if it is a temporary failure or provide an alternate
> implementation using e.g. a lock.
>
> Unfortunately our tbegin instruction does not save the floating point
> registers, leaving it to the compiler to make sure the old values get
> restored. This is necessary if the abort code relies on these
> values and the transaction body modifies them.
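The control flow described in the quoted text (check the condition code, retry on a temporary failure, otherwise fall back to a lock) could be sketched roughly as below. This is only a portable simulation under stated assumptions: `sim_tbegin`, the `SIM_*` condition codes, `struct sim_htm`, and `MAX_RETRIES` are hypothetical stand-ins, not the S/390 instruction or any GCC builtin, and the "lock" is just a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical condition codes standing in for what tbegin would report. */
enum sim_cc { SIM_STARTED, SIM_TRANSIENT, SIM_PERSISTENT };

#define MAX_RETRIES 3   /* assumed retry budget, not from the original mail */

/* Hypothetical stand-in for the hardware: replays a scripted outcome list. */
struct sim_htm {
    const enum sim_cc *script;
    size_t len, pos;
};

static enum sim_cc sim_tbegin(struct sim_htm *h)
{
    assert(h->pos < h->len);
    return h->script[h->pos++];
}

/* Increment *counter, preferring a (simulated) transaction.
   Returns 1 if the transactional path ran, 0 if the lock fallback ran. */
static int increment(struct sim_htm *h, int *counter, int *lock_taken)
{
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
        enum sim_cc cc = sim_tbegin(h);
        if (cc == SIM_STARTED) {
            ++*counter;          /* transaction body; a real tend would follow */
            return 1;
        }
        if (cc == SIM_PERSISTENT)
            break;               /* retrying cannot help, go to the fallback */
        /* SIM_TRANSIENT: jump back and try the transaction again */
    }
    ++*lock_taken;               /* stands in for acquiring a real lock */
    ++*counter;
    return 0;
}
```

In real code the abort path is also where the FPR problem shows up: anything the fallback reads from an FPR must still hold its pre-transaction value.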
You could also start with supporting s390 HTM through the transactional
language constructs we already support (__transaction_atomic etc.) and
libitm. The advantage would be that you can reuse quite a few bits of
existing machinery (e.g., different fallbacks when the HTM can't execute
a certain transaction, some analyses on the compilation side); however,
this doesn't give programmers as much control as using the HTM
directly, and it requires a function call on begin and commit when using
the current libitm ABI.
(I know that this is kind of a side note, because you seem to be looking
for a way to expose this at the granularity of HTM begin/commit builtins
(e.g., to base lock elision implementations on top of it); but I think
that in the long run txnal language constructs are easier for many
users.)
> With my current approach I try to place FPR clobbers to trigger GCC
> generating the right save/restore operations. This has some
> drawbacks:
>
> - Bundling the clobbers with the tbegin causes FPRs to be restored
> even in the good path (the transaction never aborts).
>
> - Placing the clobbers on the abort path kind of works. However, it is
> not really correct. GCC could decide to wrap the save/restore
> operations just around the clobbers, which would be wrong. A solution
> to that (that's what I'm currently working on) might be to:
>
> - Bundle the tbegin with the conditional jump to the abort code in
> order to prevent GCC from saving the FPRs right after the tbegin.
>
> - Direct an abnormal edge to the abort code to tell GCC that the
> FPRs are actually clobbered from somewhere outside (as with EH).
>
> Does this sound reasonable?
>
> The point is that not all the execution paths through tbegin
> actually clobber FPRs. It is only true for the paths which lead to
> the abort code in the end. So another solution might be to
> implement support for conditional clobbers. Clobbers wrapped into a
> cond_exec perhaps. I'm not sure how difficult this would be to
> implement and whether it would be worth it?!
>
>
>
> This also has implications for the ABI and the prologue/epilogue
> generation. Consider a function with just a tbegin:
> int foo () { return __builtin_tbegin (); }
>
> foo needs to save and restore *all* the call-saved FPRs since the
> transaction body continuing in the caller of foo might modify a
> call-saved FPR and trigger an abort. If foo did not save and
> restore the FPRs, it could end up clobbering call-saved FPRs, violating
> the ABI.
>
> (Note: Be aware that since transactions roll back all memory
> operations this also applies to stack manipulations. So with a
> function like foo above it will happen that during an abort you return
> to a callee which has already returned. The stack frame of foo will be
> restored by the transaction. So, unlike with setjmp/longjmp, jumping to
> a callee is supposed to work reliably even if the stack content of the
> callee has been clobbered in between.)
>
> The additional prologue/epilogue FPR backups for TXs can only be
> avoided if the transaction is fully contained in the function body
> (and does not use the FPRs). I call these non-escaping transactions.
That's what __transaction_atomic etc. give you. I believe we already
check whether we need to save/restore vector registers, but I guess
we're not checking for FPRs.
> I've implemented a check which deals with the most common situations
> using the post-dominance tree. If all the tbegin BBs are
> post-dominated by a tend BB I redo the df_regs_ever_live computation
> from scratch after reload has removed the clobbers. But this
> unfortunately doesn't help with TX instructions being used as part of
> a library, as with libitm.
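The post-dominance check mentioned in the quoted text could look roughly like the sketch below. It is a standalone illustration, not GCC's internal dominance machinery: `struct cfg`, `compute_pdom`, and `transactions_do_not_escape` are hypothetical names, the CFG is a toy bitmask graph with at most 32 blocks, the last block is assumed to be the unique exit, and every block is assumed to reach it.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCKS 32

/* Toy CFG: succ[b] is a bitmask of b's successors; block n-1 is the exit. */
struct cfg {
    int n;
    uint32_t succ[MAX_BLOCKS];
};

/* pdom[b] = bitmask of the blocks that post-dominate b (b itself included),
   computed by the usual iterative intersection over successors. */
static void compute_pdom(const struct cfg *g, uint32_t pdom[])
{
    assert(g->n >= 1 && g->n <= MAX_BLOCKS);
    uint32_t all = (g->n == 32) ? UINT32_MAX : ((UINT32_C(1) << g->n) - 1);
    int exit_bb = g->n - 1;

    for (int b = 0; b < g->n; b++)
        pdom[b] = all;
    pdom[exit_bb] = UINT32_C(1) << exit_bb;

    for (int changed = 1; changed; ) {
        changed = 0;
        for (int b = g->n - 2; b >= 0; b--) {
            uint32_t meet = all;
            for (int s = 0; s < g->n; s++)
                if (g->succ[b] & (UINT32_C(1) << s))
                    meet &= pdom[s];
            uint32_t next = meet | (UINT32_C(1) << b);
            if (next != pdom[b]) {
                pdom[b] = next;
                changed = 1;
            }
        }
    }
}

/* Non-zero if every tbegin block is post-dominated by some tend block,
   i.e. no path leaves the function with a transaction still open. */
static int transactions_do_not_escape(const struct cfg *g,
                                      uint32_t tbegin_bbs, uint32_t tend_bbs)
{
    uint32_t pdom[MAX_BLOCKS];
    compute_pdom(g, pdom);
    for (int b = 0; b < g->n; b++)
        if ((tbegin_bbs & (UINT32_C(1) << b)) && !(pdom[b] & tend_bbs))
            return 0;
    return 1;
}
```

A function like foo above has a tbegin block and no tend block at all, so under this check it would be classified as escaping and would keep the full FPR save/restore.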
In libitm, it's probably easier to write custom assembly code for
_ITM_beginTransaction that explicitly saves/restores, through a partial
SW setjmp, all additional bits not restored by the HTM. This
approach at least worked well for AMD's ASF, which didn't even restore
all normal registers.
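To illustrate the idea of a partial software setjmp, here is a minimal simulation in plain C. It is not libitm code and not real assembly: `struct sim_cpu`, `struct fpr_snapshot`, `save_fprs`, `restore_fprs`, and `sim_hw_abort` are hypothetical names, and the register file is just a pair of arrays. The only point is the division of labour: the simulated hardware rolls back GPRs, and software snapshots and restores whatever the hardware leaves alone.

```c
#include <assert.h>
#include <string.h>

#define SIM_NUM_GPRS 16   /* sizes of the simulated register file (assumption) */
#define SIM_NUM_FPRS 16

struct sim_cpu {
    long   gpr[SIM_NUM_GPRS];
    double fpr[SIM_NUM_FPRS];
};

/* The "partial setjmp" buffer: only the state the HTM does not restore. */
struct fpr_snapshot {
    double fpr[SIM_NUM_FPRS];
};

/* Software step before the transaction begins. */
static void save_fprs(const struct sim_cpu *cpu, struct fpr_snapshot *snap)
{
    assert(cpu && snap);
    memcpy(snap->fpr, cpu->fpr, sizeof snap->fpr);
}

/* Software step on the abort path. */
static void restore_fprs(struct sim_cpu *cpu, const struct fpr_snapshot *snap)
{
    assert(cpu && snap);
    memcpy(cpu->fpr, snap->fpr, sizeof cpu->fpr);
}

/* Simulated hardware abort: GPRs are rolled back to the checkpoint,
   FPRs keep whatever the transaction body left in them. */
static void sim_hw_abort(struct sim_cpu *cpu, const struct sim_cpu *checkpoint)
{
    assert(cpu && checkpoint);
    memcpy(cpu->gpr, checkpoint->gpr, sizeof cpu->gpr);
}
```

Since the snapshot is taken inside the begin routine itself, callers would not need any FPR clobbers of their own, at the price of paying for the save on every begin.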
Torvald