[Bug target/77308] surprisingly large stack usage for sha512 on arm

ktkachov at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Aug 23 12:51:00 GMT 2016


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-08-23
                 CC|                            |ktkachov at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #9 from ktkachov at gcc dot gnu.org ---
Note that the -mfpu option plays a role here as well.

When I compile with -O3 -S -mfloat-abi=hard -march=armv7-a -mthumb
-mtune=cortex-a8 -mfpu=neon

I get:
sha512_block_data_order:
        @ args = 0, pretend = 0, frame = 2384
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r4, r5, r6, r7, r8, r9, r10, fp, lr}
        subw    sp, sp, #2388
        subs    r4, r2, #1


whereas if you leave out -mfpu you get the default, which is probably 'vfp'
unless you configured gcc with an explicit --with-fpu. That default is usually
not a good fit for recent targets.
With -mfpu=vfp I get the terrible:
sha512_block_data_order:
        @ args = 0, pretend = 0, frame = 3568
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r4, r5, r6, r7, r8, r9, r10, fp, lr}
        subw    sp, sp, #3572
        subs    r4, r2, #1

That said, I bet there's still room for improvement.
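
To compare builds quickly, the frame size can be read from the
"@ args = ..., pretend = ..., frame = N" comment that gcc writes into the -S
output. Below is a minimal Python sketch, assuming that comment format stays as
shown in the two listings above. The helper name frame_size and the embedded
assembly snippets are illustrative only and are not part of gcc.

```python
import re

# Matches the frame-size comment seen in the ARM assembly listings above,
# e.g. "@ args = 0, pretend = 0, frame = 2384".
# Assumption: the comment keeps exactly this layout.
FRAME_RE = re.compile(r"@\s*args = \d+, pretend = \d+, frame = (\d+)")


def frame_size(asm_text):
    """Return the first frame size (in bytes) found in asm_text, or None."""
    match = FRAME_RE.search(asm_text)
    return int(match.group(1)) if match else None


# Snippets copied from the two listings in this comment.
neon_asm = """sha512_block_data_order:
        @ args = 0, pretend = 0, frame = 2384
        @ frame_needed = 0, uses_anonymous_args = 0
"""
vfp_asm = """sha512_block_data_order:
        @ args = 0, pretend = 0, frame = 3568
        @ frame_needed = 0, uses_anonymous_args = 0
"""

neon = frame_size(neon_asm)
vfp = frame_size(vfp_asm)
extra = vfp - neon
print(f"neon: {neon} bytes, vfp: {vfp} bytes, extra with vfp: {extra} bytes")
```

With the numbers above, the vfp build needs 1184 more bytes of frame than the
neon build (3568 vs 2384).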

