[Bug target/77308] surprisingly large stack usage for sha512 on arm
bernd.edlinger at hotmail dot de
gcc-bugzilla@gcc.gnu.org
Wed Nov 2 09:25:00 GMT 2016
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #48 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
(In reply to wilco from comment #22)
>
> Anyway, there is another bug: on AArch64 we correctly recognize there are 8
> 1-byte loads, shifts and orrs which can be replaced by a single 8-byte load
> and a byte reverse. Although it is recognized on ARM and works correctly if
> it is a little endian load, it doesn't perform the optimization if a byte
> reverse is needed. As a result there are lots of 64-bit shifts and orrs
> which create huge register pressure if not expanded early.
Hmm...
I think the test case does something invalid here:
const SHA_LONG64 *W = in;
T1 = X[0] = PULL64(W[0]);
in is not aligned, but it is cast to an 8-byte aligned type.
If, with your proposed patch, the bswap pass assumes
it is OK to merge 4 byte accesses into an aligned word access,
it will likely break openssl on -mno-unaligned-access targets.
Even on our cortex-a9 the O/S will trap on unaligned accesses.
I have checked that openssl still works on arm-none-eabi
with my patch, but I am not sure about your patch.
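
To illustrate the alignment concern, here is a minimal sketch of a big-endian
64-bit load that never dereferences a wider pointer type. The helper name
load_be64 is hypothetical and is not taken from openssl or from either patch;
it only shows the byte-wise pattern that is safe for any alignment of the input.

```c
#include <stdint.h>

/* Hypothetical helper (not from openssl): assemble a big-endian 64-bit
 * value from 8 individual byte loads.  Every access is through an
 * unsigned char pointer, so no alignment of p is assumed. */
static uint64_t load_be64(const unsigned char *p)
{
    return ((uint64_t)p[0] << 56) | ((uint64_t)p[1] << 48) |
           ((uint64_t)p[2] << 40) | ((uint64_t)p[3] << 32) |
           ((uint64_t)p[4] << 24) | ((uint64_t)p[5] << 16) |
           ((uint64_t)p[6] << 8)  |  (uint64_t)p[7];
}
```

Because the source only promises byte alignment, a compiler that folds these
loads into one 8-byte load plus a byte reverse has to emit an access that is
valid for an unaligned address. Going through a cast to a 64-bit pointer type
instead tells the compiler that the address is 8-byte aligned.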