This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors
- From: "hubicka at ucw dot cz" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 28 Nov 2017 18:14:15 +0000
- Subject: [Bug target/81616] Update -mtune=generic for the current Intel and AMD processors
- Auto-submitted: auto-generated
- References: <bug-81616-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616
--- Comment #26 from Jan Hubicka <hubicka at ucw dot cz> ---
On your matrix benchmarks I get:
Vector inside of loop cost: 44
Vector prologue cost: 12
Vector epilogue cost: 0
Scalar iteration cost: 40
Scalar outside cost: 0
Vector outside cost: 12
prologue iterations: 0
epilogue iterations: 0
Calculated minimum iters for profitability: 1
mult.c:15:7: note: Runtime profitability threshold = 4
mult.c:15:7: note: Static estimate profitability threshold = 4
Vector inside of loop cost: 2428
Vector prologue cost: 4
Vector epilogue cost: 0
Scalar iteration cost: 2428
Scalar outside cost: 0
Vector outside cost: 4
prologue iterations: 0
epilogue iterations: 0
Calculated minimum iters for profitability: 1
mult.c:30:7: note: Runtime profitability threshold = 4
mult.c:30:7: note: Static estimate profitability threshold = 4
for 128-bit vectorization, and for 256-bit:
Vector inside of loop cost: 88
Vector prologue cost: 24
Vector epilogue cost: 0
Scalar iteration cost: 40
Scalar outside cost: 0
Vector outside cost: 24
prologue iterations: 0
epilogue iterations: 0
Calculated minimum iters for profitability: 1
mult.c:15:7: note: Runtime profitability threshold = 8
mult.c:15:7: note: Static estimate profitability threshold = 8
Vector inside of loop cost: 6472
Vector prologue cost: 8
Vector epilogue cost: 0
Scalar iteration cost: 2428
Scalar outside cost: 0
Vector outside cost: 8
prologue iterations: 0
epilogue iterations: 0
Calculated minimum iters for profitability: 1
mult.c:30:7: note: Runtime profitability threshold = 8
mult.c:30:7: note: Static estimate profitability threshold = 8
So if the vectorizer knew to prefer bigger vector sizes when the cost is about
double, it would vectorize the first loop with 256-bit vectors, as expected.
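
The comparison behind that remark can be sketched as a small calculation. This is only an illustration, not how GCC's vectorizer is implemented. The costs are copied from the dumps above. The vectorization factors (4 elements for 128-bit, 8 for 256-bit) are an assumption inferred from the reported thresholds of 4 and 8, and `pick_width` with its `slack` parameter is a hypothetical tie-break rule. Prologue costs are ignored.

```python
# Vector inside-of-loop costs copied from the dumps above, keyed by vector width in bits.
DUMPS = {
    "mult.c:15": {128: 44, 256: 88},
    "mult.c:30": {128: 2428, 256: 6472},
}

# ASSUMPTION: vectorization factors are not printed in the dumps; 4 and 8 are
# inferred from the profitability thresholds (4 for 128-bit, 8 for 256-bit).
ASSUMED_VF = {128: 4, 256: 8}


def cost_per_element(bits, inside_cost):
    """Vector loop body cost divided by the elements handled per iteration."""
    return inside_cost / ASSUMED_VF[bits]


def pick_width(costs, slack=0.0):
    """Hypothetical rule: choose the widest vector size whose per-element cost
    is within `slack` (relative) of the cheapest one. Not GCC's actual logic."""
    per_elem = {bits: cost_per_element(bits, c) for bits, c in costs.items()}
    best = min(per_elem.values())
    candidates = [bits for bits, c in per_elem.items() if c <= best * (1.0 + slack)]
    return max(candidates)


if __name__ == "__main__":
    for loop, costs in DUMPS.items():
        for bits, inside in sorted(costs.items()):
            print(loop, bits, cost_per_element(bits, inside))
        print(loop, "-> preferred width:", pick_width(costs))
```

Under these assumptions the first loop (mult.c:15) costs the same per element at both widths (44/4 and 88/8), so a rule that breaks ties toward the wider vector picks 256-bit. For the second loop (mult.c:30) the 256-bit cost is well over double (6472 against 2428), so 128-bit stays cheaper per element and is kept.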