This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644
- From: "amker at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 07 Feb 2018 15:57:28 +0000
- Subject: [Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644
- Auto-submitted: auto-generated
- References: <bug-84037-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037
--- Comment #23 from amker at gcc dot gnu.org ---
(In reply to Richard Biener from comment #21)
> So after r257453 we improve the situation pre-IVOPTs to just
> 6 IVs (duplicated but trivially equivalent) plus one counting IV. But then
> when SLP is enabled IVOPTs comes along and adds another 4 IVs which makes us
> spill... (for AVX256, so you need -march=core-avx2 for example).
>
> Bin, any chance you can take a look? In the IVO dump I see
>
> target_avail_regs 15
> target_clobbered_regs 9
> target_reg_cost 4
> target_spill_cost 8
> regs_used 3
> ^^^
>
> and regs_used looks awfully low to me. The loop has even more IVs initially
> plus variable steps for those IVs which means we need two regs per IV.
>
> There doesn't seem to be a way to force IVOPTs to use the minimal set of IVs?
> Or just use the original set, removing the obvious redundancies? There is
> a microarchitectural issue left with the vectorization but the spilling
> obscures the look quite a bit :/
Sure, I will have a look based on your commit. Thanks.
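
To make the concern about regs_used concrete, here is a rough toy model of how a register-pressure estimate can flip from "cheap" to "spill" depending on the baseline it starts from. It is not the IVOPTs code: the function names, the two-registers-per-variable-step-IV rule as coded, and the simple threshold are assumptions for illustration only. Only the numbers (target_avail_regs 15, target_reg_cost 4, target_spill_cost 8, regs_used 3) come from the dump quoted above, and target_clobbered_regs is ignored here.

```python
def regs_for_ivs(n_ivs, variable_step):
    """Registers needed for n_ivs induction variables (toy rule).

    Assumption: an IV with a step that is not a compile-time constant
    needs one register for the IV and one for the step.
    """
    return n_ivs * (2 if variable_step else 1)


def pressure_cost(n_new, regs_used, avail_regs=15, reg_cost=4, spill_cost=8):
    """Toy cost of adding n_new registers on top of regs_used.

    Assumption: while everything fits in avail_regs each new register
    costs reg_cost; once the total exceeds avail_regs each new register
    costs spill_cost. Defaults are the values from the quoted IVO dump.
    """
    needed = n_new + regs_used
    if needed <= avail_regs:
        return reg_cost * n_new
    return spill_cost * n_new


# 7 IVs before IVOPTs plus the 4 it adds, each counted as one register.
n_ivs = 7 + 4
print(pressure_cost(n_ivs, regs_used=3))   # 11 + 3 = 14 <= 15, cost 44
print(pressure_cost(n_ivs, regs_used=6))   # 11 + 6 = 17 > 15, cost 88
print(regs_for_ivs(7, variable_step=True)) # 14 registers for 7 variable-step IVs
```

In this sketch a baseline of 3 leaves the 11 IVs just inside the 15-register budget, while a slightly higher baseline, or counting two registers per variable-step IV, pushes the estimate into the spill range. That is the kind of underestimate the quoted comment suspects.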