[Bug tree-optimization/78348] [7 REGRESSION] 15% performance drop for coremark-pro/nnet-test after r242038
ysrumyan at gmail dot com
gcc-bugzilla@gcc.gnu.org
Tue Nov 15 13:24:00 GMT 2016
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
--- Comment #5 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
Yes, I think so.
2016-11-15 14:49 GMT+03:00 rguenth at gcc dot gnu.org
<gcc-bugzilla@gcc.gnu.org>:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
>
> Richard Biener <rguenth at gcc dot gnu.org> changed:
>
> What            |Removed     |Added
> ----------------------------------------------------------------------------
> Status          |UNCONFIRMED |NEW
> Last reconfirmed|            |2016-11-15
> Ever confirmed  |0           |1
>
> --- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
>> The issue is that memcpy must be produced instead of memmove, which does
>> not have an optimized version for AVX2 on x86 and simply uses a byte copy.
>
> I'd have expected at least an if (! overlap) memcpy () else byte-copy.
>
> Note the loop distribution code doesn't try to be clever in choosing memcpy
> over memmove (using dependence analysis). So improving loop distribution
> (adding a PKIND_MEMMOVE and conservatively using that from dependence analysis)
> is a possibility as well. But we have
>
> (compute_affine_dependence
>   stmt_a: _2 = par.0_1->x2[i_19][j_20];
>   stmt_b: par.0_1->x1[i_19][j_20] = _2;
> (analyze_overlapping_iterations
>   (chrec_a = {0, +, 1}_2)
>   (chrec_b = {0, +, 1}_2)
>   (overlap_iterations_a = [0])
>   (overlap_iterations_b = [0]))
> (analyze_overlapping_iterations
>   (chrec_a = i_19)
>   (chrec_b = i_19)
>   (overlap_iterations_a = [0])
>   (overlap_iterations_b = [0]))
> (analyze_overlapping_iterations
>   (chrec_a = 33280)
>   (chrec_b = 12800)
> (analyze_ziv_subscript
> )
>   (overlap_iterations_a = no dependence)
>   (overlap_iterations_b = no dependence))
> ) -> no dependence
>
> so I think we could use memcpy for all no-dependence cases?
>
> --
> You are receiving this mail because:
> You reported the bug.
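
For readers following the thread, here is a minimal C sketch of the two ideas in comment #4. It is not the GCC implementation. The struct layout, the array sizes and all function names are hypothetical and chosen only for illustration. The sketch shows a copy loop between two distinct array members, the memcpy it may be replaced with when the regions cannot overlap, and a runtime "if (!overlap) memcpy () else ..." guard that falls back to memmove where the comment suggests a byte copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: two distinct array members of one struct, loosely
   modelled on the x1/x2 accesses in the dump.  The sizes are made up. */
enum { ROWS = 4, COLS = 8 };

struct params {
    float x1[ROWS][COLS];
    float x2[ROWS][COLS];
};

/* Nonzero when the byte ranges [a, a+n) and [b, b+n) share a byte. */
static int ranges_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa < pb ? pb - pa < n : pa - pb < n;
}

/* Runtime guard: use memcpy when the ranges are disjoint, otherwise fall
   back to memmove, which is defined for overlapping ranges. */
static void copy_checked(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (!ranges_overlap(dst, src, n))
        memcpy(dst, src, n);
    else
        memmove(dst, src, n);
}

/* The inner loop in source form: row i of x2 is copied into row i of x1. */
static void copy_row_loop(struct params *par, int i)
{
    for (int j = 0; j < COLS; j++)
        par->x1[i][j] = par->x2[i][j];
}

/* Equivalent call when the rows are known not to overlap.  Here they belong
   to different members of the same object, so they are disjoint. */
static void copy_row_memcpy(struct params *par, int i)
{
    memcpy(par->x1[i], par->x2[i], sizeof par->x1[i]);
}
```

In this sketch the disjointness is visible from the types alone. In the compiler the same conclusion would come from the "no dependence" result of the dependence analysis rather than from a runtime check.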