This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/79581] VFP4 slower than VFP3 in C-ray
- From: "ktkachov at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 20 Feb 2017 09:51:24 +0000
- Subject: [Bug target/79581] VFP4 slower than VFP3 in C-ray
- Auto-submitted: auto-generated
- References: <bug-79581-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79581
--- Comment #3 from ktkachov at gcc dot gnu.org ---
I can't reproduce the difference on my machine.
Judging by your -mcpu option, is this on a Cortex-A5?
As far as codegen goes, the major difference I can see is that the vfpv4 version
generates vfma instructions instead of vmla ones.
Also, there are cases where the vfpv3 version generates multiple vmls
instructions, whereas the vfpv4 one generates an explicit vneg followed by
vfma instructions.