[Bug middle-end/79712] New: Clang smarter about unrolling in fhourstones benchmark

tulipawn at gmail dot com gcc-bugzilla@gcc.gnu.org
Sat Feb 25 18:47:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712

            Bug ID: 79712
           Summary: Clang smarter about unrolling in fhourstones benchmark
           Product: gcc
           Version: 7.0.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tulipawn at gmail dot com
  Target Milestone: ---

Created attachment 40829
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40829&action=edit
preprocessed source

Clang appears to do a better job of unrolling loops in the fhourstones
benchmark:

$ gcc -Wextra -Wall -Ofast -mcpu=cortex-a53 -march=armv8-a+crc \
      -ftree-vectorize SearchGame.i
  (for the run with unrolling, add: -funroll-loops
   -fvariable-expansion-in-unroller -ftree-loop-ivcanon -fivopts)
$ ./a.out < inputs

- clang 3.8 result: 3358 kpos/s
- gcc result: 3220 kpos/s
- gcc result with the unrolling flags in parentheses above: 3473 kpos/s

It would be nice if gcc could match the performance of clang's -O3 out of the
box, without the extra unrolling flags.

BTW, running the benchmark on 32-bit requires changing the %lu's to %llu's at
line 200 in the C source.
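
For illustration, a minimal sketch of a variant that avoids choosing between
%lu and %llu per target. It assumes the values printed at that line are 64-bit
unsigned integers, which this report does not state explicitly. The helper
name format_stats and the message text are hypothetical and are not taken from
the fhourstones source.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of fhourstones: formats a 64-bit position
   count and an elapsed time in milliseconds. PRIu64 expands to the correct
   conversion specifier for uint64_t on both 32-bit and 64-bit targets, so
   the format string does not need to be edited per platform. */
int format_stats(char *buf, size_t len, uint64_t nodes, uint64_t msecs)
{
    return snprintf(buf, len, "%" PRIu64 " pos / %" PRIu64 " msec",
                    nodes, msecs);
}
```

Calling format_stats(buf, sizeof buf, nodes, msecs) and printing buf with
puts() would then behave the same on 32-bit and 64-bit builds.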

