[Bug target/79581] VFP4 slower than VFP3 in C-ray
ktkachov at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon Feb 20 09:51:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79581
--- Comment #3 from ktkachov at gcc dot gnu.org ---
I can't reproduce the difference on my machine.
Judging by your -mcpu option, is this on a Cortex-A5?
As far as codegen goes, the major difference I can see is that the vfpv4 version
generates vfma instructions instead of vmla ones.
There are also cases where the vfpv3 version generates multiple vmls
instructions, whereas the vfpv4 version generates an explicit vneg followed by
vfma instructions.
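
To illustrate the kind of source pattern involved, here is a minimal sketch. It
is not taken from C-ray, and the function names are hypothetical. Compiling it
with -O2 -S and -mfpu=vfpv3, then again with -mfpu=vfpv4, should make it
possible to compare the multiply-accumulate instructions side by side. Whether
these exact functions reproduce the vneg + vfma sequence is an assumption; that
depends on the GCC version and on the surrounding code.

```c
/* Hypothetical kernels for comparing VFPv3 and VFPv4 codegen.
   Not taken from C-ray. */

/* acc + a*b: candidate for vmla (vfpv3) or vfma (vfpv4). */
float mul_add(float acc, float a, float b)
{
    return acc + a * b;
}

/* acc - a*b: candidate for vmls (vfpv3) or a fused equivalent (vfpv4). */
float mul_sub(float acc, float a, float b)
{
    return acc - a * b;
}

/* Two multiply-subtracts sharing the multiplicand a. This is a guess at
   the shape of code where a compiler might negate a once and then use
   fused multiply-adds; it may not trigger that transformation. */
void mul_sub2(float *x, float *y, float a, float b, float c)
{
    *x -= a * b;
    *y -= a * c;
}
```

GCC contracts a*b + c into a fused operation by default in GNU C mode
(-ffp-contract=fast), so building the vfpv4 variant with -ffp-contract=off is
one way to check whether the fused instructions account for the difference.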