This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/53533] [4.8/4.9/5/6 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark
- From: "maltsevm at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 04 May 2015 14:59:59 +0000
- Subject: [Bug rtl-optimization/53533] [4.8/4.9/5/6 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark
- Auto-submitted: auto-generated
- References: <bug-53533-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533
--- Comment #29 from Mikhail Maltsev <maltsevm at gmail dot com> ---
Results for attached testcase:
Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (Haswell)
g++ -O3 -march=native -mtune=native
10000 iterations
Clang 3.7
Total absolute time for int32_t for loop unrolling: 0.99 sec
Total absolute time for int32_t do loop unrolling: 1.00 sec
Total absolute time for double for loop unrolling: 1.37 sec
Total absolute time for double do loop unrolling: 1.37 sec
GCC 4.7.4
Total absolute time for int32_t for loop unrolling: 5.88 sec
Total absolute time for int32_t do loop unrolling: 7.57 sec
Total absolute time for double for loop unrolling: 2.29 sec
Total absolute time for double do loop unrolling: 2.45 sec
GCC 4.8.4
Total absolute time for int32_t for loop unrolling: 3.12 sec
Total absolute time for int32_t do loop unrolling: 3.29 sec
Total absolute time for double for loop unrolling: 1.13 sec
Total absolute time for double do loop unrolling: 1.14 sec
GCC 4.9.2
Total absolute time for int32_t for loop unrolling: 3.02 sec
Total absolute time for int32_t do loop unrolling: 3.29 sec
Total absolute time for double for loop unrolling: 1.10 sec
Total absolute time for double do loop unrolling: 1.13 sec
GCC 6
Total absolute time for int32_t for loop unrolling: 5.95 sec
Total absolute time for int32_t do loop unrolling: 6.95 sec
Total absolute time for double for loop unrolling: 2.39 sec
Total absolute time for double do loop unrolling: 2.39 sec
g++ -DINLINE_MANUALLY -O3 -march=native -mtune=native
50000 iterations
Clang 3.7
Total absolute time for int32_t for loop unrolling: 2.43 sec
Total absolute time for int32_t do loop unrolling: 2.32 sec
Total absolute time for double for loop unrolling: 6.38 sec
Total absolute time for double do loop unrolling: 6.38 sec
GCC 4.9.2
Total absolute time for int32_t for loop unrolling: 10.17 sec
Total absolute time for int32_t do loop unrolling: 10.16 sec
Total absolute time for double for loop unrolling: 3.89 sec
Total absolute time for double do loop unrolling: 3.90 sec
GCC 6
Total absolute time for int32_t for loop unrolling: 10.10 sec
Total absolute time for int32_t do loop unrolling: 10.12 sec
Total absolute time for double for loop unrolling: 3.90 sec
Total absolute time for double do loop unrolling: 3.89 sec
g++ -DINLINE_MANUALLY -Ofast -march=native -mtune=native
GCC 6
Total absolute time for int32_t for loop unrolling: 10.11 sec
Total absolute time for int32_t do loop unrolling: 10.11 sec
Total absolute time for double for loop unrolling: 1.14 sec
Total absolute time for double do loop unrolling: 1.15 sec
So, IMHO there is no regression here (at least w.r.t. vectorization). Floating
point loop gets constant-folded, if reassociation is allowed. Also, GCC6 is
able to infer that "for" and "while" tests are semantically equivalent and
unifies them.