[Bug target/77308] surprisingly large stack usage for sha512 on arm

wdijkstr at arm dot com gcc-bugzilla@gcc.gnu.org
Thu Oct 20 23:20:00 GMT 2016


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308

--- Comment #12 from Wilco <wdijkstr at arm dot com> ---
It looks like we need a different approach. I've seen the extra SETs use up
more registers in some cases, and in other cases they are optimized away
early on.

Doing shift expansion at the same time as all other DI mode operations should
result in the same stack size as -mfpu=neon. However, that's still well behind
Thumb-1, and I would expect ARM/Thumb-2 to beat Thumb-1 easily with 6 extra
registers.
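
For anyone unfamiliar with what "shift expansion" means here: a DI mode
(64-bit) shift has to be lowered into operations on two SI mode (32-bit)
halves at some point. The following is only a rough C sketch of that
decomposition for a logical right shift; the names u64_pair and
lshr64_by_parts are made up for illustration and this is not the code GCC
generates or uses internally.

```c
#include <stdint.h>

/* Hypothetical illustration type: a 64-bit value held as two 32-bit halves,
   the way a DI mode value occupies a pair of SI mode registers. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Logical right shift of a 64-bit value by n (0 <= n <= 63), written purely
   in terms of 32-bit operations. */
static u64_pair lshr64_by_parts(u64_pair x, unsigned n)
{
    u64_pair r;

    if (n == 0) {
        /* Avoid a 32-bit shift by 32 below, which is undefined in C. */
        r = x;
    } else if (n < 32) {
        /* Low half takes its top bits from the bottom of the high half. */
        r.lo = (x.lo >> n) | (x.hi << (32 - n));
        r.hi = x.hi >> n;
    } else {
        /* Everything in the low half is shifted out. */
        r.lo = x.hi >> (n - 32);
        r.hi = 0;
    }
    return r;
}
```

Each 64-bit shift becomes several SI operations with their own temporaries,
which is why the point at which this lowering happens relative to the other
DI mode operations can affect register pressure.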

The spill code for Thumb-2 seems incorrect:

(insn 11576 8090 9941 5 (set (reg:SI 3 r3 [11890])
        (plus:SI (reg/f:SI 13 sp)
            (const_int 480 [0x1e0]))) sha512.c:147 4 {*arm_addsi3}
     (nil))
(insn 9941 11576 2978 5 (set (reg:DI 2 r2 [4210])
        (mem/c:DI (reg:SI 3 r3 [11890]) [5 %sfpD.4158+-3112 S8 A64])) sha512.c:147 170 {*arm_movdi}
     (nil))

LDRD has an immediate offset range of +/-1020 on Thumb-2, and the offset here
is only 480, so I would expect this to be a single instruction.
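
As a quick sanity check of that range, here is a minimal sketch in C. It
assumes the Thumb-2 LDRD immediate form, where the offset is an 8-bit value
scaled by 4 that is either added or subtracted. The function name
thumb2_ldrd_offset_ok is hypothetical and is not a GCC internal.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if 'offset' can be encoded directly in
   a Thumb-2 LDRD (immediate) instruction, i.e. it is a multiple of 4 in the
   range -1020..1020. */
static bool thumb2_ldrd_offset_ok(long offset)
{
    return offset >= -1020 && offset <= 1020 && (offset % 4) == 0;
}
```

With the sp+480 address from the dump above, thumb2_ldrd_offset_ok(480)
returns true, so the separate ADD into r3 should not be needed for
addressing reasons alone.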

