This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/77308] surprisingly large stack usage for sha512 on arm
- From: "wdijkstr at arm dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 25 Oct 2016 18:41:19 +0000
- Subject: [Bug target/77308] surprisingly large stack usage for sha512 on arm
- Auto-submitted: auto-generated
- References: <bug-77308-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #14 from Wilco <wdijkstr at arm dot com> ---
(In reply to Bernd Edlinger from comment #13)
> I am still trying to understand why thumb1 seems to outperform thumb2.
>
> Obviously thumb1 does not have the shiftdi3 pattern,
> but even if I remove these from thumb2, the result is still
> not on par with thumb1. Apparently other patterns still produce DI
> values that are not enabled with thumb1, they are
> xordi3 and anddi3, these are often used. Then there is
> adddi3 that is enabled in thumb1 and thumb2, I also disabled
> this one, and now the sha512 gets down to an incredible
> 1152-byte frame (-Os -march=armv7 -mthumb -mfloat-abi=soft):
>
> I know this is a hack, but 1K stack is what we should expect...
>
> --- arm.md 2016-10-25 19:54:16.425736721 +0200
> +++ arm.md.orig 2016-10-17 19:46:59.000000000 +0200
> @@ -448,7 +448,7 @@
> (plus:DI (match_operand:DI 1 "s_register_operand" "")
> (match_operand:DI 2 "arm_adddi_operand" "")))
> (clobber (reg:CC CC_REGNUM))])]
> - "TARGET_EITHER && !TARGET_THUMB2"
> + "TARGET_EITHER"
So you're actually turning these instructions off for Thumb-2? What does it
do instead then? Does the number of instructions go down?
I noticed that, with or without -mfpu=neon, the code generated with -marm is
significantly smaller than with -mthumb. Most of the extra instructions appear
to be moves, which means something is wrong (I would expect Thumb-2 to do
better, as it supports LDRD with larger offsets than ARM).
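
For anyone who wants to compare the two modes on something smaller, here is a
minimal sketch of the kind of 64-bit operations involved. It is not the
testcase attached to the bug; the function names (rotr64, big_sigma0,
big_sigma1, ch, maj, round_t1) are only illustrative, and the formulas are the
standard SHA-512 ones. The rotates turn into 64-bit shifts, and the rest into
64-bit XOR, AND and add, i.e. the DI operations discussed above.

```c
#include <stdint.h>

/* Rotate a 64-bit value right by n bits; n must be in 1..63. */
static inline uint64_t rotr64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* SHA-512 "big sigma" functions: three 64-bit rotates XORed together. */
uint64_t big_sigma0(uint64_t x)
{
    return rotr64(x, 28) ^ rotr64(x, 34) ^ rotr64(x, 39);
}

uint64_t big_sigma1(uint64_t x)
{
    return rotr64(x, 14) ^ rotr64(x, 18) ^ rotr64(x, 41);
}

/* Choose: for each bit, pick y where x is 1 and z where x is 0. */
uint64_t ch(uint64_t x, uint64_t y, uint64_t z)
{
    return (x & y) ^ (~x & z);
}

/* Majority: each result bit is the majority vote of the three inputs. */
uint64_t maj(uint64_t x, uint64_t y, uint64_t z)
{
    return (x & y) ^ (x & z) ^ (y & z);
}

/* First temporary of one round: h + Sigma1(e) + Ch(e,f,g) + k + w.
   All additions are modulo 2^64 (unsigned wraparound). */
uint64_t round_t1(uint64_t h, uint64_t e, uint64_t f, uint64_t g,
                  uint64_t k, uint64_t w)
{
    return h + big_sigma1(e) + ch(e, f, g) + k + w;
}
```

Building this once with -marm and once with -mthumb (same -Os and -march
options as above) and adding -fstack-usage should show the per-function frame
sizes, assuming the small functions behave like the full sha512 loop, which I
have not checked.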