Bug 47530 - [trans-mem] tail call optimization problem with _ITM_commitTransaction
Summary: [trans-mem] tail call optimization problem with _ITM_commitTransaction
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.8.0
: P3 normal
Target Milestone: ---
Assignee: Aldy Hernandez
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-01-29 08:12 UTC by Patrick Marlier
Modified: 2012-06-04 16:56 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-02-09 17:36:18


Attachments
Preprocessed RBTree microbenchmark from RSTM that triggers this bug. (76.31 KB, application/octet-stream)
2012-05-02 16:50 UTC, Luke Dalessandro
Details
reduced testcase for luke's sample (207 bytes, text/plain)
2012-06-04 15:24 UTC, Aldy Hernandez
Details
propsed patch being tested (782 bytes, patch)
2012-06-04 15:51 UTC, Aldy Hernandez
Details | Diff
testcase (218 bytes, text/x-csrc)
2012-06-04 15:58 UTC, Patrick Marlier
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Patrick Marlier 2011-01-29 08:12:50 UTC
According to:
http://gcc.gnu.org/ml/gcc/2011-01/msg00314.html
http://gcc.gnu.org/ml/gcc/2011-01/msg00279.html

The tail call optimization should not be possible on _ITM_commitTransaction because this function can go (longjmp) to _ITM_beginTransaction and thus the stack will be corrupted.

Here ASM:
_function:
        pushq   %r12
        pushq   %rbp
        pushq   %rbx
        subq    $16, %rsp
...
        call    _ITM_beginTransaction
.L8:
...
        call    _ITM_WU8
        addq    $16, %rsp
        popq    %rbx
        popq    %rbp
        popq    %r12
        jmp     _ITM_commitTransaction

In this example, if _ITM_commitTransaction rolls back to L8, the function epilogue will restore wrong values to registers (modified by the function _ITM_commitTransaction) when leaving the _function.

Moreover I think that the function _ITM_beginTransaction should have the flag ECS_RETURNS_TWICE because it is how it behaves (ie the transaction can abort and longjmp behind the _ITM_beginTransaction call).

Draft patch proposed in http://gcc.gnu.org/ml/gcc/2011-01/msg00279.html

Patrick Marlier.
Comment 1 Richard Henderson 2011-02-09 17:36:18 UTC
Mine.

We aren't representing these functions as setjmp because we have more
detailed information about the control flow, and it's (eventually)
explicitly represented.  For instance, if a function has two transactions
we know that the commit for the second transaction cannot branch to the
restart of the first transaction.

The problem here is that these extra edges are not inserted until after
the tail-call discovery pass is run.  Thankfully, no actual transform 
is actually performed at that spot.  All we have to do later is clear
the bit that the tail-call discovery pass set.
Comment 2 Richard Henderson 2011-02-09 20:24:02 UTC
Author: rth
Date: Wed Feb  9 20:23:56 2011
New Revision: 169984

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169984
Log:
PR 47530
        * trans-mem.c (expand_block_edges): Reset tail-call bit.

Added:
    branches/transactional-memory/gcc/testsuite/g++.dg/tm/pr47530.C
Modified:
    branches/transactional-memory/gcc/ChangeLog.tm
    branches/transactional-memory/gcc/trans-mem.c
Comment 3 Richard Henderson 2011-02-09 20:24:26 UTC
Fixed.
Comment 4 Patrick Marlier 2011-02-10 09:39:05 UTC
I hope next time to come up with a 'perfect' patch!
Thanks for the explanation and for the fix. :)

Patrick.
Comment 5 Luke Dalessandro 2012-05-02 16:50:41 UTC
Created attachment 27288 [details]
Preprocessed RBTree microbenchmark from RSTM that triggers this bug.

I'm still seeing this bug with gcc-4.8 (svn r187050). I've attached a preprocessed source file from the RSTM microbenchmark suite (for fc13, x86_64) that manifests the problem. It simply takes an integer set, implemented as an RBTree, and randomly performs insert/delete/lookup operations as transactions. The "p" command line parameter sets the number of threads to use. The rest of the command line parameters aren't relevant.

Unoptimized compilation produces expected output:

luke> g++ -fgnu-tm -o TreeBench TreeBench.i -lrt
luke> ./TreeBench -p2
Verification: Passed
csv, ALG=gcc-tm, B=RBTree, R=34, d=1, p=2, X=0, m=256, S=1, O=1, txns=1231838, time=1000001329, throughput=1231836

Optimized compilation shows stack corruption ("time" output is computed using a stack location that is trashed when _ITM_commitTransaction longjmps from the tail call).

luke> g++ -O3 -fgnu-tm -o TreeBench TreeBench.i -lrt
luke> ./TreeBench -p2
Verification: Passed
csv, ALG=gcc-tm, B=RBTree, R=34, d=1, p=2, X=0, m=256, S=1, O=1, txns=1301244673, time=1335976459721524108, throughput=0

Tail (sibling?) calls to _ITM_commitTransaction are identified by 'jmp' instructions.

luke> g++ -O3 -fgnu-tm -o /dev/stdout -S TreeBench.i | grep _ITM_commitTransaction
	call	_ITM_commitTransaction
	jmp	_ITM_commitTransaction
	jmp	_ITM_commitTransaction


I expect that this is also related to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51771--and in fact may be the same bug, but it's manifesting as the specific tail-call problem that was noted here as fixed. The returns_twice "hack" that suppresses pr51771 in gcc-4.7 also suppresses this bug (pr47530) in gcc-4.7, so it's gcc-4.8-only at this point.

I will reopen pr47530, but if someone can confirm it as really being pr51771 I'm happy to have it closed again.
Comment 6 Luke Dalessandro 2012-05-02 16:52:42 UTC
(In reply to comment #5)
> Created attachment 27288 [details]
> Preprocessed RBTree microbenchmark from RSTM that triggers this bug.
> 
> [snip]
> 
> I will reopen pr47530, but if someone can confirm it as really being pr51771
> I'm happy to have it closed again.

Of course, what I mean is that this can be re-opened for 4.8, if appropriate. For now, gcc-4.7 is fine.
Comment 7 Aldy Hernandez 2012-06-04 15:21:42 UTC
Reopened with Luke's testcase, which triggers on both 4.7 and 4.8.  Luke, in the future, it would really help if you could narrow down the testcase to the smallest reproducible testcase.  This helps us immensely and expedites any possible fix.
Comment 8 Aldy Hernandez 2012-06-04 15:24:38 UTC
Created attachment 27555 [details]
reduced testcase for luke's sample
Comment 9 Patrick Marlier 2012-06-04 15:26:37 UTC
Aldy,

I have a testcase and a patch for this. I will submit it soon.

Patrick
Comment 10 Aldy Hernandez 2012-06-04 15:51:22 UTC
Created attachment 27556 [details]
propsed patch being tested

Whoops, sorry Patrick.  I already have a patch I am testing :).  If you already have yours tested, feel free to submit it, otherwise I'll post shortly.
Comment 11 Patrick Marlier 2012-06-04 15:58:35 UTC
Created attachment 27557 [details]
testcase

(In reply to comment #10)
> Created attachment 27556 [details]
> propsed patch being tested
> 
> Whoops, sorry Patrick.  I already have a patch I am testing :).  If you already
> have yours tested, feel free to submit it, otherwise I'll post shortly.

Go ahead and submit (I will not able to do it before few hours). My patch was quite similar except for the testcase (using the tree dump but untested). Testcase attached if you want to include it into the testsuite.
Comment 12 Aldy Hernandez 2012-06-04 16:51:33 UTC
Author: aldyh
Date: Mon Jun  4 16:51:24 2012
New Revision: 188190

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188190
Log:
        PR middle-end/47530
        * trans-mem.c (expand_block_edges): Do not skip the first
        statement when resetting the BB.


Added:
    trunk/gcc/testsuite/g++.dg/tm/pr47530-2.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/trans-mem.c
Comment 13 Aldy Hernandez 2012-06-04 16:52:55 UTC
Author: aldyh
Date: Mon Jun  4 16:52:47 2012
New Revision: 188191

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188191
Log:
	PR middle-end/47530
	* trans-mem.c (expand_block_edges): Do not skip the first
	statement when resetting the BB.


Added:
    branches/gcc-4_7-branch/gcc/testsuite/g++.dg/tm/pr47530-2.C
Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/trans-mem.c
Comment 14 Aldy Hernandez 2012-06-04 16:56:21 UTC
fixed on 4.7 and trunk.  closing PR.