This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH v4][C][ADA] use function descriptors instead of trampolines in C
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: "Uecker, Martin" <Martin dot Uecker at med dot uni-goettingen dot de>, Jakub Jelinek <jakub at redhat dot com>
- Cc: nd <nd at arm dot com>, "paulkoning at comcast dot net" <paulkoning at comcast dot net>, "law at redhat dot com" <law at redhat dot com>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>, "msebor at gmail dot com" <msebor at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "ebotcazou at adacore dot com" <ebotcazou at adacore dot com>, "joseph at codesourcery dot com" <joseph at codesourcery dot com>
- Date: Wed, 19 Dec 2018 21:28:45 +0000
- Subject: Re: [PATCH v4][C][ADA] use function descriptors instead of trampolines in C
- References: <email@example.com> <firstname.lastname@example.org> <5896AE4C-D296-4FAF-A809-7BACA532BBF5@comcast.net> <20181218153209.GP23305@tucnak> <email@example.com> <20181218162440.GQ23305@tucnak> <firstname.lastname@example.org> <email@example.com> <20181218164212.GR23305@tucnak> <firstname.lastname@example.org>,<20181219200801.GG23305@tucnak>
Jakub Jelinek wrote:
> On Wed, Dec 19, 2018 at 07:53:48PM +0000, Uecker, Martin wrote:
>> What do you think about making the trampoline a single call
>> instruction and have a large memory region which is the same
>> page mapped many times?

This sounds like a good idea, but given that a function descriptor is 8-16 bytes,
the trampoline doesn't need to be a single instruction. You can even go for larger
sizes, since all that affects is the minimum alignment of function descriptors.

>> The trampoline handler would pop the instruction pointer and use
>> this as an index into the real stack to read the static chain and
>> function pointer.
> While you save a few bytes per trampoline that way, it is heavily call-ret
> stack unfriendly, so it will not be very fast.

A repeated page adjacent to the stack is a good idea since it avoids adding
runtime support to push/pop nested function addresses. Such runtime support
would be inefficient and likely very tricky to get right with setjmp and
exception handling (or it would leak memory).

Since the trampoline can use several instructions, we could for example load the
static chain register with the PC. On ISAs that don't support PC-relative
addressing you could do a call/ret sequence to get the PC and then tailcall the
helper to keep the return stack intact.

If computing the difference between the stack and the trampoline region takes
just a few instructions (e.g. via thread-local storage), then it could even be
inlined.
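
To make the address arithmetic concrete, here is a minimal C sketch of what the
helper might compute. Everything in it is an assumption for illustration and not
taken from the patch: the 16-byte SLOT_SIZE, the struct descriptor layout, and
the names descriptor_for_pc, call_via_trampoline and add_captured are all
hypothetical. It only models the lookup (PC in the trampoline region, plus a
fixed delta, gives the descriptor on the stack); a real trampoline would do this
in a few instructions of target assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slot size: one trampoline slot per possible descriptor
   position, so this is also the minimum descriptor alignment. */
#define SLOT_SIZE 16u

/* Hypothetical descriptor stored on the real stack. */
struct descriptor {
    void *static_chain;               /* frame of the enclosing function */
    int (*fn)(void *chain, int arg);  /* nested function, chain passed explicitly */
};

/* Map a PC captured somewhere inside a trampoline slot to the descriptor
   at the same offset in the stack region.
   delta = stack_base - trampoline_base (modulo 2^N), assumed to be cheap
   to obtain, e.g. from thread-local storage. */
static struct descriptor *descriptor_for_pc(uintptr_t pc, uintptr_t delta)
{
    /* Both regions are assumed to be SLOT_SIZE-aligned. */
    assert((delta & (SLOT_SIZE - 1)) == 0);
    uintptr_t slot = pc & ~(uintptr_t)(SLOT_SIZE - 1);  /* start of the slot */
    return (struct descriptor *)(slot + delta);
}

/* Model of the helper: load static chain and target, then call the target. */
static int call_via_trampoline(uintptr_t pc, uintptr_t delta, int arg)
{
    struct descriptor *d = descriptor_for_pc(pc, delta);
    return d->fn(d->static_chain, arg);
}

/* Stand-in for a nested function that reads one captured variable. */
static int add_captured(void *chain, int arg)
{
    return *(int *)chain + arg;
}
```

Rounding the PC down to the slot boundary is what lets the trampoline span
several instructions: any PC inside the slot identifies the same descriptor.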