This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RE: [patch, arm] align saved FP regs on stack


Hi, Sandra.

FWIW, I tried this patch on A15 Juno with Coremark, and the difference, if
any, between specifying this option and not specifying it was below 1%.

Cheers,

-- 
Evandro Menezes                              Austin, TX

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Sandra Loosemore
> Sent: Friday, November 14, 2014 18:47
> To: GCC Patches
> Cc: Chris Jones; Joshua Conner
> Subject: [patch, arm] align saved FP regs on stack
> 
> On ARM targets, the stack is aligned to an 8-byte boundary, but when
> saving/restoring the VFP coprocessor registers in the function
> prologue/epilogue, it is possible for the 8-byte values to end up at
> locations that are 4-byte aligned but not 8-byte aligned.  This can
> result in a performance penalty on micro-architectures that are
> optimized for well-aligned data, especially when such a misalignment
> may result in cache line splits within a single access.  This patch
> detects when at least one coprocessor register value needs to be saved
> and adds some additional padding to the stack at that point if
> necessary to align it to an 8-byte boundary.  I've re-used the existing
> logic to try pushing a 4-byte scratch register and only fall back to an
> explicit stack adjustment if that fails.
> 
> NVIDIA found that an earlier version of this patch (benchmarked with
> SPECint2k and SPECfp2k on an older version of GCC) gave measurable
> improvements on their Tegra K1 64-bit processor, aka "Denver".  We aren't
> sure what other ARM processors might benefit from the extra alignment, so
> we've given it its own command-line option instead of tying it to -mtune.
> 
> I did some hand-testing of this patch on small test cases to verify
> that the expected alignment was happening, but it seemed to me that the
> expected assembly-language patterns were likely too fragile to be
> hard-wired into a test case.  I also ran regression tests both with and
> without the switch set so it doesn't break other things.  OK to commit?
> 
> -Sandra
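
For readers following the thread, the padding arithmetic described in the quoted message can be sketched in C roughly as below. This is only an illustration and is not code from the patch: the helper names vfp_save_padding and vfp_save_base are hypothetical, and the sketch assumes a downward-growing stack whose pointer is already 4-byte aligned. It computes how much padding is needed and does not model how the patch obtains it (pushing a 4-byte scratch register, or falling back to an explicit stack adjustment).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): number of padding bytes to
   subtract from a 4-byte-aligned stack pointer so that the following
   8-byte VFP register saves land on an 8-byte boundary.  The stack is
   assumed to grow downward. */
static unsigned vfp_save_padding(uintptr_t sp, unsigned num_vfp_regs)
{
    assert((sp & 3u) == 0);       /* assumed: sp is at least 4-byte aligned */
    if (num_vfp_regs == 0)
        return 0;                 /* nothing to save, so no padding */
    return (unsigned)(sp & 7u);   /* 0 if already aligned, otherwise 4 */
}

/* Hypothetical helper: address of the lowest saved 8-byte register once
   the padding and the register saves have been applied. */
static uintptr_t vfp_save_base(uintptr_t sp, unsigned num_vfp_regs)
{
    uintptr_t padded = sp - vfp_save_padding(sp, num_vfp_regs);
    return padded - (uintptr_t)8 * num_vfp_regs;
}
```

With a stack pointer of 0x1004 and two registers to save, the sketch asks for 4 bytes of padding, so both 8-byte values end up on 8-byte boundaries. With 0x1008 no padding is needed.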


