[Bug middle-end/70773] Profiled sudoku solver slower due to lack of sdiv/udiv
wilco at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Apr 19 12:56:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70773
wilco at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
                 CC|                            |wilco at gcc dot gnu.org
--- Comment #10 from wilco at gcc dot gnu.org ---
I can't reproduce any of this. GCC 6 and GCC 7 always use smull for the
divisions on ARM, even with -fprofile-use. I could only make GCC emit a library
call by using -Os on a CPU that doesn't have a divide instruction, but that is
expected and correct.
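
For illustration only, here is a minimal C sketch of what "using smull for a
division" amounts to: a signed division by a constant rewritten as a widening
multiply by a scaled reciprocal plus a shift. The divisor 9, the function name
div9_by_multiply and the constant 0x38E38E39 (ceil(2^33 / 9)) are assumptions
chosen for the example; they are not taken from the testcase in this bug, and
the exact sequence GCC emits depends on the divisor and the target.

```c
#include <stdint.h>

/* Hypothetical helper: computes n / 9 without a divide instruction.
 * The 32x32->64 signed multiply is the kind of operation that maps to
 * smull on 32-bit ARM. */
static int32_t div9_by_multiply(int32_t n)
{
    /* 0x38E38E39 == ceil(2^33 / 9), so (n * M) >> 33 approximates n / 9. */
    int64_t product = (int64_t)n * 0x38E38E39;

    /* Assumes >> on a negative signed value is an arithmetic shift, which
     * is implementation-defined in C but is what GCC does. The result is
     * the quotient rounded toward minus infinity. */
    int32_t q = (int32_t)(product >> 33);

    /* C division truncates toward zero, so add 1 for negative inputs. */
    q += (n < 0);
    return q;
}
```

Under these assumptions div9_by_multiply(n) should give the same result as
n / 9 for every int32_t value, which is why the compiler can make this
substitution when the divisor is a compile-time constant.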
On AArch64 I get a speedup of more than 20% with -fprofile-use vs plain -O3, so
it works as expected. With -mcpu=cortex-a53 there are more uses of sdiv, but the
profiled version is still faster.
So without more details I don't see any issue here.