This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
- From: "ramana at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 24 Jun 2015 09:06:45 +0000
- Subject: [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
- Auto-submitted: auto-generated
- References: <bug-65375-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #11 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to Jim Wilson from comment #10)
> Improved, but not completely resolved. We still get unnecessary orr
> instructions, same as in comment 2. This is partly an issue with the
> register allocator not handling partially overlapping register reads/writes
> very well. We already have a few other bugs for that. This is also partly
> an issue with how the aarch64 builtins work, via
> __builtin_aarch64_[gs]et_qregoiv4sf which create the partially overlapping
> register reads/writes. The ARM builtins don't work this way, they use a
> union for type punning, and hence don't have the same problem.
Both the ARM and the AArch64 ports have issues with partially overlapping
register reads / writes, especially with the vzip / vuzp style intrinsics in
the AArch32 world, and even with the larger vld3/4 intrinsics in both the ARM
and AArch64 states. It would be nice to finally fix that.
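To make the two styles from comment #10 concrete, here is a rough, portable
sketch. It is only a model, not the code in either port's arm_neon.h: vec4,
vec4x2, wide_oi, set_qreg, get_qreg, pack_by_halves and pack_by_union are all
hypothetical names standing in for a Q register, float32x4x2_t, the opaque
register-pair type, and the get/set builtins. The first packing routine
writes the wide value one half at a time, which is the partially overlapping
access pattern; the second puns through a union and moves the whole value at
once.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins, not real GCC types:
   vec4    models one 128-bit Q register (4 floats),
   vec4x2  models the user-visible pair type (like float32x4x2_t),
   wide_oi models the opaque 256-bit register-pair value. */
typedef struct { float lane[4]; } vec4;
typedef struct { vec4 val[2]; } vec4x2;
typedef struct { float raw[8]; } wide_oi;

/* Style A: overwrite one half of the wide value, leaving the other half
   untouched. Each call is a partial write of a larger object. */
static wide_oi set_qreg(wide_oi o, vec4 v, int idx)
{
    assert(idx == 0 || idx == 1);
    memcpy(&o.raw[4 * idx], v.lane, sizeof v.lane);
    return o;
}

/* Style A, read side: extract one half of the wide value. */
static vec4 get_qreg(wide_oi o, int idx)
{
    vec4 v;
    assert(idx == 0 || idx == 1);
    memcpy(v.lane, &o.raw[4 * idx], sizeof v.lane);
    return v;
}

/* Build the wide value with two partial writes. */
static wide_oi pack_by_halves(vec4x2 in)
{
    wide_oi o = {{0}};
    o = set_qreg(o, in.val[0], 0);
    o = set_qreg(o, in.val[1], 1);
    return o;
}

/* Style B: reinterpret the whole pair in one step through a union.
   Both members are 32 bytes with no padding, so the bits carry over. */
static wide_oi pack_by_union(vec4x2 in)
{
    union { vec4x2 i; wide_oi o; } u;
    u.i = in;
    return u.o;
}
```

Both routines produce the same bits. The difference is only in what the
compiler sees: two dependent partial updates in the first case, a single
whole-value copy in the second.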
If the partial-overlap problem is the only issue left in this ticket, maybe we
should just park this example in the ticket for that (IIRC PR43725) and close
this one out?
regards
Ramana