This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Porting libsanitizer to aarch64
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: Jakub Jelinek <jakub at redhat dot com>, Konstantin Serebryany <konstantin dot s dot serebryany at gmail dot com>, Christophe Lyon <christophe dot lyon at linaro dot org>, GCC Development <gcc at gcc dot gnu dot org>
- Date: Thu, 23 May 2013 17:28:29 +0100
- Subject: Re: Porting libsanitizer to aarch64
- References: <CAKdteOa-UDeo5zDwCeYSydu0K-WqmTjPgj3sYUpKrc0YPoncCg at mail dot gmail dot com> <20130521154426 dot GA1377 at tucnak dot redhat dot com> <CAGQ9bdyXCWDt0FF4+F5_4LbW7XcZACczKfHhr28nnwk96rf5Mw at mail dot gmail dot com> <20130522074341 dot GC1377 at tucnak dot redhat dot com> <519D2839 dot 5070602 at redhat dot com>
On 22/05/13 21:19, Richard Henderson wrote:
> On 05/22/2013 12:43 AM, Jakub Jelinek wrote:
>> Changing frame grows upward into frame grows downward shouldn't be that
>> hard, see e.g. rs6000 port, where
>> #define FRAME_GROWS_DOWNWARD (flag_stack_protect != 0 || flag_asan != 0)
>> and grep the port where it uses FRAME_GROWS_DOWNWARD.
>> Basically you need to tweak initial elimination offset computation for it,
>> and that might be it, or perhaps one or two extra spots.
>
> FWIW, I would actually recommend against conditionalizing FRAME_GROWS_DOWNWARD
> for a new port. Just make it _always_ grow down and save yourself the
> additional code bloat in the backend.

Doing that would add significantly to the cost of setting up the frame.

  FRAME_GROWS_DOWNWARD
    Define this macro to nonzero value if the addresses of local variable
    slots are at negative offsets from the frame pointer.

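To make the quoted definition concrete, here is a toy sketch of what the macro changes from the point of view of a local variable slot. It is not GCC code: `local_slot_offset`, `FRAME_RECORD_SIZE` and `SLOT_SIZE` are hypothetical names, and the upward-growing case assumes a 16-byte frame record (saved FP and LR) sitting at the frame pointer with the locals above it.

```c
#include <assert.h>

/* Hypothetical layout constants, chosen only for illustration. */
enum { FRAME_RECORD_SIZE = 16, SLOT_SIZE = 8 };

/* Offset of local slot number `slot` from the frame pointer.
   frame_grows_downward != 0: slots sit at negative offsets (-8, -16, ...).
   frame_grows_downward == 0: slots sit above the assumed frame record
   at positive offsets (16, 24, ...). */
static long local_slot_offset(int frame_grows_downward, int slot)
{
    assert(slot >= 0);
    if (frame_grows_downward)
        return -(long)SLOT_SIZE * (slot + 1);
    return FRAME_RECORD_SIZE + (long)SLOT_SIZE * slot;
}
```

Under these assumptions, slot 0 is at offset -8 when the frame grows downward and at offset 16 when it grows upward.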
The optimal frame establishment sequence (for small frames, less than
128 bytes) is generated by doing

	stp	fp, lr, [sp, #-frame-size]!
	mov	fp, sp

Job done, along with stack allocation. Larger frames can be generated
with an initial subtraction from SP to allocate the entire frame. Total
cost of maintaining the frame structure comes out to about 1.5
instructions in the prologue and 0.5 instructions in the epilogue (50%
of functions end up with one additional store and one additional load
instruction).
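One possible reading of the 1.5 / 0.5 figures, sketched in C. This is an interpretation, not something the message spells out: the `mov fp, sp` costs one prologue instruction in every function, and adding FP to a register save set that is stored in pairs costs one extra store (and one extra load in the epilogue) only when it leaves a register unpaired. The helper names are hypothetical, and the assumption that every save-set size is equally likely is mine.

```c
#include <assert.h>

/* Instructions needed to save (or restore) nregs registers when they are
   paired into stp/ldp, with a lone str/ldr for any register left over. */
static int save_insns(int nregs)
{
    assert(nregs >= 0);
    return (nregs + 1) / 2;
}

/* Extra save instructions caused by adding FP to a save set that
   already holds nregs_without_fp registers (LR included). */
static int extra_insns_for_fp(int nregs_without_fp)
{
    return save_insns(nregs_without_fp + 1) - save_insns(nregs_without_fp);
}

/* Average prologue overhead over save sets of size 1..max_regs, ASSUMING
   each size is equally likely: one "mov fp, sp" plus the occasional store. */
static double average_prologue_cost(int max_regs)
{
    assert(max_regs > 0);
    int total = 0;
    for (int n = 1; n <= max_regs; n++)
        total += 1 + extra_insns_for_fp(n);
    return (double)total / max_regs;
}

/* Average epilogue overhead under the same assumption: only the
   occasional extra load, since nothing mirrors the "mov fp, sp". */
static double average_epilogue_cost(int max_regs)
{
    assert(max_regs > 0);
    int total = 0;
    for (int n = 1; n <= max_regs; n++)
        total += extra_insns_for_fp(n);
    return (double)total / max_regs;
}
```

With an even `max_regs` such as 10, half of the save-set sizes need the extra instruction, so `average_prologue_cost(10)` comes out at 1.5 and `average_epilogue_cost(10)` at 0.5.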
Any other sequence requires modifying the stack pointer, then saving the
registers (potentially having to generate a further temporary value for
the large offset from SP to them), then adding another value back into FP.
Furthermore, load/store immediate operations have a significantly larger
range of positive offsets from the base than of negative ones, so you
really hurt performance by having the frame record above the local
variables in large functions.
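The asymmetry can be sketched with a small C check of which base-plus-immediate offsets a single A64 integer load or store can encode. It assumes the two usual immediate forms: the scaled unsigned 12-bit offset (LDR/STR) and the unscaled signed 9-bit offset (LDUR/STUR). The function name `fits_single_load_store` is hypothetical, and the sketch ignores pair, pre/post-index and register-offset forms.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if `offset` from the base register can be encoded in one
   integer load/store of `size` bytes (1, 2, 4 or 8). */
static bool fits_single_load_store(long offset, int size)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);

    /* Scaled form: unsigned 12-bit immediate, multiplied by the access
       size, so only non-negative multiples of `size` up to 4095 * size. */
    if (offset >= 0 && offset % size == 0 && offset / size <= 4095)
        return true;

    /* Unscaled form: signed 9-bit byte offset, -256..255. */
    return offset >= -256 && offset <= 255;
}
```

For 8-byte accesses this gives a reach of 32760 bytes above the base but only 256 bytes below it, so locals placed at large negative offsets from FP need extra address arithmetic.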
R.