This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug rtl-optimization/53533] [4.8/4.9/5/6 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #29 from Mikhail Maltsev <maltsevm at gmail dot com> ---
Results for attached testcase:

Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (Haswell)
g++ -O3 -march=native -mtune=native
10000 iterations

Clang 3.7
Total absolute time for int32_t for loop unrolling: 0.99 sec
Total absolute time for int32_t do loop unrolling: 1.00 sec
Total absolute time for double for loop unrolling: 1.37 sec
Total absolute time for double do loop unrolling: 1.37 sec

GCC 4.7.4
Total absolute time for int32_t for loop unrolling: 5.88 sec
Total absolute time for int32_t do loop unrolling: 7.57 sec
Total absolute time for double for loop unrolling: 2.29 sec
Total absolute time for double do loop unrolling: 2.45 sec

GCC 4.8.4
Total absolute time for int32_t for loop unrolling: 3.12 sec
Total absolute time for int32_t do loop unrolling: 3.29 sec
Total absolute time for double for loop unrolling: 1.13 sec
Total absolute time for double do loop unrolling: 1.14 sec

GCC 4.9.2
Total absolute time for int32_t for loop unrolling: 3.02 sec
Total absolute time for int32_t do loop unrolling: 3.29 sec
Total absolute time for double for loop unrolling: 1.10 sec
Total absolute time for double do loop unrolling: 1.13 sec

GCC 6
Total absolute time for int32_t for loop unrolling: 5.95 sec
Total absolute time for int32_t do loop unrolling: 6.95 sec
Total absolute time for double for loop unrolling: 2.39 sec
Total absolute time for double do loop unrolling: 2.39 sec

g++ -DINLINE_MANUALLY -O3 -march=native -mtune=native
50000 iterations

Clang 3.7
Total absolute time for int32_t for loop unrolling: 2.43 sec
Total absolute time for int32_t do loop unrolling: 2.32 sec
Total absolute time for double for loop unrolling: 6.38 sec
Total absolute time for double do loop unrolling: 6.38 sec

GCC 4.9.2
Total absolute time for int32_t for loop unrolling: 10.17 sec
Total absolute time for int32_t do loop unrolling: 10.16 sec
Total absolute time for double for loop unrolling: 3.89 sec
Total absolute time for double do loop unrolling: 3.90 sec

GCC 6
Total absolute time for int32_t for loop unrolling: 10.10 sec
Total absolute time for int32_t do loop unrolling: 10.12 sec
Total absolute time for double for loop unrolling: 3.90 sec
Total absolute time for double do loop unrolling: 3.89 sec

g++ -DINLINE_MANUALLY -Ofast -march=native -mtune=native
GCC 6
Total absolute time for int32_t for loop unrolling: 10.11 sec
Total absolute time for int32_t do loop unrolling: 10.11 sec
Total absolute time for double for loop unrolling: 1.14 sec
Total absolute time for double do loop unrolling: 1.15 sec

So, IMHO there is no regression here (at least w.r.t. vectorization). The
floating-point loop gets constant-folded if reassociation is allowed (compare
the -Ofast run above with the -O3 one). Also, GCC 6 is able to infer that the
"for" and "while" tests are semantically equivalent and unifies them.
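
To illustrate both points, here is a minimal sketch. It is not the attached testcase: the function names (sum_for, sum_do, sum_reassociated) and the loop body are hypothetical and chosen only to show the idea. The first two functions are a "for" and a "do" loop that compute the same thing, so a compiler is free to treat them as one. The third is the kind of closed form a compiler could reduce the loop to once it is allowed to reassociate floating-point additions; under strict IEEE semantics it may not do that, because the rounded results can differ.

```cpp
#include <cstddef>

// "for" variant: adds x to the accumulator n times, in order.
// Without reassociation the compiler must keep this sequence of
// additions, since floating-point addition is not associative.
double sum_for(double x, std::size_t n) {
    double result = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        result += x;
    }
    return result;
}

// "do" variant: the same computation written as a do-while loop.
// The early return covers n == 0, so both variants agree for all n.
double sum_do(double x, std::size_t n) {
    double result = 0.0;
    if (n == 0) {
        return result;
    }
    std::size_t i = 0;
    do {
        result += x;
        ++i;
    } while (i < n);
    return result;
}

// Hypothetical closed form: what the loop may collapse to when
// reassociation is permitted. This is an illustration of the
// transformation, not GCC's actual output.
double sum_reassociated(double x, std::size_t n) {
    return static_cast<double>(n) * x;
}
```

With x = 0.5 and n = 8 all three functions agree exactly, because every intermediate value is representable. With x = 0.1 and n = 10 the two loops still agree with each other, but the closed form rounds differently, which is why the folding only becomes legal with options such as -Ofast.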

