[Bug middle-end/79712] New: Clang smarter about unrolling in fhourstones benchmark
tulipawn at gmail dot com
gcc-bugzilla@gcc.gnu.org
Sat Feb 25 18:47:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
Bug ID: 79712
Summary: Clang smarter about unrolling in fhourstones benchmark
Product: gcc
Version: 7.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: tulipawn at gmail dot com
Target Milestone: ---
Created attachment 40829
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40829&action=edit
preprocessed source
Clang appears to do a better job of loop unrolling in the fhourstones
benchmark:
$ gcc -Wextra -Wall -Ofast -mcpu=cortex-a53 -march=armv8-a+crc -ftree-vectorize SearchGame.i
(for the "with unrolling" result, add: -funroll-loops
-fvariable-expansion-in-unroller -ftree-loop-ivcanon -fivopts)
$ ./a.out < inputs
- clang 3.8 result: 3358 kpos/s
- gcc result: 3220 kpos/s
- gcc result with unrolling: 3473 kpos/s
It would be nice if gcc could achieve similar performance to clang's -O3 out of
the box.
BTW, running the benchmark on a 32-bit target requires changing the %lu
conversions to %llu at line 200 in the C source.
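
A possible way to avoid editing the format strings per target is the PRIu64
macro from <inttypes.h>. This is only a sketch: it assumes the counters being
printed are 64-bit unsigned values (the report does not show their declared
type), and the helper name format_positions is hypothetical, not taken from
the fhourstones source.

```c
#include <inttypes.h>  /* PRIu64 */
#include <stdint.h>    /* uint64_t */
#include <stdio.h>     /* snprintf, size_t */

/* Hypothetical helper (not part of fhourstones): formats a 64-bit counter
   with the same source text on both 32-bit and 64-bit targets.
   Assumption: the value is held in a uint64_t. */
int format_positions(char *buf, size_t len, uint64_t positions)
{
    /* PRIu64 expands to the correct length modifier for uint64_t,
       e.g. "lu" on LP64 targets and "llu" on typical 32-bit targets. */
    return snprintf(buf, len, "%" PRIu64 " pos", positions);
}
```

With this approach the same printf-style call compiles without format
warnings under -Wall on either word size, so no per-target edit is needed.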