This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/77308] surprisingly large stack usage for sha512 on arm


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308

--- Comment #32 from Wilco <wdijkstr at arm dot com> ---
(In reply to Bernd Edlinger from comment #31)
> Sure, combine can't help, especially because it runs before split1.
> 
> But I wondered why this peephole2 is not enabled:
> 
> (define_peephole2 ; ldrd
>   [(set (match_operand:SI 0 "arm_general_register_operand" "")
>         (match_operand:SI 2 "memory_operand" ""))
>    (set (match_operand:SI 1 "arm_general_register_operand" "")
>         (match_operand:SI 3 "memory_operand" ""))]
>   "TARGET_LDRD
>      && current_tune->prefer_ldrd_strd
>      && !optimize_function_for_size_p (cfun)"
>   [(const_int 0)]
> 
> 
> I have -march=armv7-a / -mcpu=cortex-a9 and thus for me
> current_tune->prefer_ldrd_strd is FALSE.
> 
> Furthermore, if I want to use -Os the third condition is FALSE too.
> But surely one ldrd is shorter than two ldr?
> 
> That seems wrong...

Indeed, on a target that supports LDRD you want to use LDRD whenever it is
legal. LDM should only be tried on Thumb-1. Emitting LDRD from a peephole when
the offset is in range will never increase code size, so it should always be
enabled.
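
To make the "offset in range" and code-size points concrete, here is a minimal
sketch in C. The helper names ldrd_offset_in_range and bytes_saved_by_ldrd are
hypothetical and are not taken from the GCC sources. The ranges assume the
immediate-offset encodings of LDRD (an unscaled 8-bit immediate in ARM state,
an 8-bit immediate scaled by 4 in Thumb-2), and the sketch ignores the
register-pairing rules that ARM state also imposes.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC's implementation: can one LDRD address
   [base, #offset]?  Assumes the immediate-offset encodings of LDRD.  */
static bool ldrd_offset_in_range(int offset, bool thumb2)
{
    if (thumb2)
        /* Thumb-2: 8-bit immediate scaled by 4, added or subtracted.  */
        return offset >= -1020 && offset <= 1020 && offset % 4 == 0;

    /* ARM state: unscaled 8-bit immediate, added or subtracted.  */
    return offset >= -255 && offset <= 255;
}

/* Bytes saved by replacing two LDRs with one LDRD.  ldr_bytes is the
   size of each replaced LDR: 4 in ARM state, 2 or 4 in Thumb-2.  */
static int bytes_saved_by_ldrd(int ldr_bytes)
{
    const int ldrd_bytes = 4;  /* LDRD only has 32-bit encodings.  */
    return 2 * ldr_bytes - ldrd_bytes;
}
```

Under these assumptions the saving is 4 bytes against two 32-bit LDRs and 0
bytes against two 16-bit Thumb LDRs, so the result is never negative.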
