[Bug target/70048] [6 Regression][AArch64] Inefficient local array addressing

rth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Mar 7 19:17:00 GMT 2016


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048

--- Comment #9 from Richard Henderson <rth at gcc dot gnu.org> ---
While I fully believe in CSE'ing "base + reg*scale" when talking about
non-stack-based pointers, when it comes to stack-based data access I'm
less certain about the proper approach.
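
To make the two cases concrete, here is a small illustration.  It is
not the testcase from this bug; the function names and the array size
are made up for the example.  In the first function the base is an
ordinary pointer held in a register.  In the second the base is a
local array, so its address is a frame register plus an offset that
is only settled by register elimination.

```c
/* Hypothetical illustration only; not the testcase from the PR.  */

/* Non-stack base: p arrives in a register, and both accesses share
   the "p + i*4" part of the address.  */
int sum_pair_ptr(const int *p, long i)
{
    return p[i] + p[i + 1];
}

/* Stack base: buf lives in the frame, so each address is
   "frame register + offset of buf + i*4 (+ 4)".  The volatile
   qualifier only keeps the accesses from being optimized away.  */
int sum_pair_local(long i)
{
    volatile int buf[64];

    for (int k = 0; k < 64; k++)
        buf[k] = k;
    return buf[i] + buf[i + 1];
}
```

Comparing the code generated for the two functions (for example with
-O2 -S) is one way to see how much the stack case differs.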

All things work out "best" when there's no (or little) offset applied
during register elimination.  When this can be true, all of the rtl
optimizations see the final address and can do the right thing.

This isn't easy to do for AArch64, however.  So we need to accept that
some concessions need to be made so that it's not too difficult to
turn reg + scale + c1 + c2 into a final address without extra steps.

We already special case the eliminable frame registers in
aarch64_classify_address to allow arbitrary offset, and we're prepared
to split to a proper offset during RA.  It wouldn't be out of the
question to allow "reg + scale + c" as well.  We can probably come up
with some good heuristics for splitting into a number of cases based
on the generalized "((reg + hi_c) + scale) + lo_c".
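
As a rough sketch of what one such split could look like (this is not
code from GCC; split_offset and its parameters are names invented
here), the constant can be divided so that lo_c fits the scaled
unsigned 12-bit immediate of an AArch64 load/store and hi_c fits a
single ADD immediate with LSL #12.  It assumes a non-negative offset
that is a multiple of the access size; a real heuristic would have to
cover more cases than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC.  Split C into HI_C + LO_C for
   an access of SIZE bytes (a power of two), where
     - LO_C is a multiple of SIZE in [0, 4095 * SIZE], i.e. it fits the
       scaled unsigned 12-bit load/store immediate, and
     - HI_C is a multiple of 4096 no larger than 0xfff000, i.e. it fits
       one ADD immediate shifted left by 12.
   Returns false when this simple scheme does not apply.  */
static bool split_offset(int64_t c, int64_t size,
                         int64_t *hi_c, int64_t *lo_c)
{
    if (c < 0 || size <= 0 || (size & (size - 1)) != 0 || c % size != 0)
        return false;

    int64_t span = 4096 * size;   /* One past the largest scaled reach.  */

    *lo_c = c % span;             /* Aligned to SIZE because C and SPAN are.  */
    *hi_c = c - *lo_c;            /* Multiple of SPAN, hence of 4096.  */
    return *hi_c <= 0xfff000;
}
```

For a 4-byte access at offset 0x12344 this gives hi_c = 0x10000 and
lo_c = 0x2344, which corresponds to the "((reg + hi_c) + scale) + lo_c"
shape with the low part folded into the memory access itself.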

But the patch we take for stage4 must be less than the full solution.

