[Bug target/77308] surprisingly large stack usage for sha512 on arm
ktkachov at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Aug 23 12:51:00 GMT 2016
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
ktkachov at gcc dot gnu.org changed:
What             |Removed     |Added
----------------------------------------------------------------------------
Status           |UNCONFIRMED |NEW
Last reconfirmed |            |2016-08-23
CC               |            |ktkachov at gcc dot gnu.org
Ever confirmed   |0           |1
--- Comment #9 from ktkachov at gcc dot gnu.org ---
Note that the -mfpu option plays a role here as well.
When I compile with -O3 -S -mfloat-abi=hard -march=armv7-a -mthumb
-mtune=cortex-a8 -mfpu=neon
I get:
sha512_block_data_order:
        @ args = 0, pretend = 0, frame = 2384
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r4, r5, r6, r7, r8, r9, r10, fp, lr}
        subw    sp, sp, #2388
        subs    r4, r2, #1
If you leave out -mfpu, however, you get the default, which is probably 'vfp'
unless GCC was configured with an explicit --with-fpu. That default is usually
not a good fit for recent targets.
With -mfpu=vfp the result is much worse:
sha512_block_data_order:
        @ args = 0, pretend = 0, frame = 3568
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r4, r5, r6, r7, r8, r9, r10, fp, lr}
        subw    sp, sp, #3572
        subs    r4, r2, #1
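
To make the comparison between the two listings above concrete, here is a small sketch that pulls the "frame = N" annotation out of GCC's ARM assembly output and reports the difference. The helper name frame_size, the pattern FRAME_RE, and the two sample strings are made up for illustration; the strings only repeat the annotation lines quoted above and are not a full compiler output.

```python
import re

# Matches GCC's ARM prologue annotation, e.g.
#   "@ args = 0, pretend = 0, frame = 2384"
# and captures the frame size in bytes.
FRAME_RE = re.compile(
    r"@\s*args\s*=\s*\d+,\s*pretend\s*=\s*\d+,\s*frame\s*=\s*(\d+)"
)


def frame_size(asm_text):
    """Return the first 'frame = N' value in the assembly text, or None."""
    match = FRAME_RE.search(asm_text)
    return int(match.group(1)) if match else None


# Annotation lines copied from the -mfpu=neon listing (illustrative excerpt).
NEON_ASM = """\
sha512_block_data_order:
@ args = 0, pretend = 0, frame = 2384
@ frame_needed = 0, uses_anonymous_args = 0
"""

# Annotation lines copied from the -mfpu=vfp listing (illustrative excerpt).
VFP_ASM = """\
sha512_block_data_order:
@ args = 0, pretend = 0, frame = 3568
@ frame_needed = 0, uses_anonymous_args = 0
"""

neon = frame_size(NEON_ASM)
vfp = frame_size(VFP_ASM)
extra = vfp - neon

print(f"neon: {neon} bytes, vfp: {vfp} bytes, difference: {extra} bytes")
```

For the two listings here this reports a difference of 1184 bytes, i.e. the -mfpu=vfp frame is roughly half again as large as the -mfpu=neon one.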
That said, I bet there's still room for improvement.