This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/78348] [7 REGRESSION] 15% performance drop for coremark-pro/nnet-test after r242038

From: "ysrumyan at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Tue, 15 Nov 2016 13:21:38 +0000
Subject: [Bug tree-optimization/78348] [7 REGRESSION] 15% performance drop for coremark-pro/nnet-test after r242038
Auto-submitted: auto-generated
References: <bug-78348-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348

--- Comment #5 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
Yes, I think so.

2016-11-15 14:49 GMT+03:00 rguenth at gcc dot gnu.org
<gcc-bugzilla@gcc.gnu.org>:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78348
>
> Richard Biener <rguenth at gcc dot gnu.org> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|UNCONFIRMED                 |NEW
>    Last reconfirmed|                            |2016-11-15
>      Ever confirmed|0                           |1
>
> --- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
>> The issue is that memcpy must be produced instead of memmove, which does
>> not have an optimized version for AVX2 on x86 and simply uses a byte copy.
>
> I'd have expected at least an if (! overlap) memcpy () else byte-copy.
>
> Note the loop distribution code doesn't try to be clever in choosing memcpy
> over memmove (using dependence analysis).  So improving loop distribution
> (adding a PKIND_MEMMOVE and conservatively using that from dependence analysis)
> is a possibility as well.  But we have
>
> (compute_affine_dependence
>   stmt_a: _2 = par.0_1->x2[i_19][j_20];
>   stmt_b: par.0_1->x1[i_19][j_20] = _2;
> (analyze_overlapping_iterations
>   (chrec_a = {0, +, 1}_2)
>   (chrec_b = {0, +, 1}_2)
>   (overlap_iterations_a = [0])
>   (overlap_iterations_b = [0]))
> (analyze_overlapping_iterations
>   (chrec_a = i_19)
>   (chrec_b = i_19)
>   (overlap_iterations_a = [0])
>   (overlap_iterations_b = [0]))
> (analyze_overlapping_iterations
>   (chrec_a = 33280)
>   (chrec_b = 12800)
> (analyze_ziv_subscript
> )
>   (overlap_iterations_a = no dependence)
>   (overlap_iterations_b = no dependence))
> ) -> no dependence
>
> so I think we could use memcpy for all no dependence cases?
>
> --
> You are receiving this mail because:
> You reported the bug.
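
For readers following the thread, here is a minimal C sketch of the two ideas discussed in comment #4. It is an illustration only, not code from GCC or from the benchmark. The struct layout, the ROWS/COLS dimensions, and the names copy_rows_loop, ranges_overlap and copy_checked are all hypothetical. The first function has the loop shape seen in the dependence dump (the same subscripts on both sides, but two different fields of one object), which is why the analysis reports "no dependence" and a plain memcpy would be safe. The second part shows the kind of runtime "if (! overlap) memcpy () else byte-copy" fallback mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical dimensions: the real benchmark sizes are not given in
   this thread. */
enum { ROWS = 8, COLS = 16 };

/* Hypothetical layout: two distinct array fields of one object. */
struct params {
    double x1[ROWS][COLS];
    double x2[ROWS][COLS];
};

/* Loop shape from the dump: par->x1[i][j] = par->x2[i][j].
   Source and destination are different fields, so they never overlap
   and the whole nest is equivalent to a single memcpy. */
static void copy_rows_loop(struct params *par)
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            par->x1[i][j] = par->x2[i][j];
}

/* Returns nonzero when [a, a+n) and [b, b+n) share at least one byte. */
static int ranges_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa < pb ? pb - pa < n : pa - pb < n;
}

/* Runtime dispatch: use memcpy when the ranges are disjoint, otherwise
   fall back to a byte copy whose direction preserves the source data. */
static void copy_checked(void *dst, const void *src, size_t n)
{
    if (!ranges_overlap(dst, src, n)) {
        memcpy(dst, src, n);
        return;
    }
    unsigned char *d = dst;
    const unsigned char *s = src;
    if ((uintptr_t)d < (uintptr_t)s) {
        for (size_t k = 0; k < n; k++)   /* forward copy */
            d[k] = s[k];
    } else {
        for (size_t k = n; k-- > 0; )    /* backward copy */
            d[k] = s[k];
    }
}
```

When the compiler can prove at compile time that the ranges are disjoint, as the dump above suggests for this loop, the runtime check is unnecessary and memcpy can be emitted directly.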

References:
- [Bug tree-optimization/78348] New: [7 REGRESSION] 15% performance drop for coremark-pro/nnet-test after r242038
  - From: ysrumyan at gmail dot com
