This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH PR62011]
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Yuri Rumyantsev <ysrumyan at gmail dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>
- Date: Thu, 14 Aug 2014 08:18:49 -0700
- Subject: Re: [PATCH PR62011]
- References: <CAEoMCqRQgLeT9EpwxmX99wi=Fzc3CN2ZE4dQGATTVd8xmV1nNA at mail dot gmail dot com>
On Thu, Aug 14, 2014 at 4:50 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Hi All,
>
> Here is a fix for PR 62011: it removes the false dependency for unary
> bit-manipulation instructions on the latest BigCore chips (Sandy Bridge
> and Haswell) by emitting, in the assembly file, an instruction that
> zeroes the destination register before the BMI instruction. I checked
> that performance is restored for the popcnt, lzcnt and tzcnt instructions.
>
> Bootstrap and regression testing did not show any new failures.
>
> Is it OK for trunk?
>
> gcc/ChangeLog
> 2014-08-14 Yuri Rumyantsev <ysrumyan@gmail.com>
>
> PR target/62011
> * config/i386/i386-protos.h (ix86_avoid_false_dep_for_bm): New function
> prototype.
> * config/i386/i386.c (ix86_avoid_false_dep_for_bm): New function.
> * config/i386/i386.h (TARGET_AVOID_FALSE_DEP_FOR_BM): New macro.
> * config/i386/i386.md (ctz<mode>2, clz<mode>2_lzcnt, popcount<mode>2,
> *popcount<mode>2_cmp, *popcountsi2_cmp_zext): Output zeroing
> destination register for unary bit-manipulation instructions
> if required.
Why don't you use a splitter to generate the XOR?
> * config/i386/x86-tune.def (X86_TUNE_AVOID_FALSE_DEP_FOR_BM): New.
Is this needed for r16 and r32? The original report says that only
r64 is affected:
http://stackoverflow.com/questions/25078285/replacing-a-32-bit-loop-count-variable-with-64-bit-introduces-crazy-performance
Have you tried this on Silvermont? Does it help Silvermont?
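
For readers following the thread, here is a minimal sketch of the technique under discussion: zeroing the destination register with XOR immediately before a 64-bit POPCNT, so the result does not wait on the register's previous value. It is not the patch itself. The names popcount64_nodep and count_bits are hypothetical, the inline asm assumes GCC or Clang on x86-64 with default AT&T syntax and a CPU that supports POPCNT, and other targets fall back to __builtin_popcountll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: population count of one 64-bit word.
   On x86-64 the destination register is zeroed with XOR first,
   which is the dependency-breaking idiom the patch emits. */
static inline uint64_t popcount64_nodep(uint64_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t r;
    /* "=&r": early-clobber, because r is written (by the XOR)
       before the input operand is read (by POPCNT). */
    __asm__("xorl %k0, %k0\n\t"
            "popcntq %1, %0"
            : "=&r"(r)
            : "rm"(x)
            : "cc");
    return r;
#else
    /* Assumed fallback for other targets (GCC/Clang builtin). */
    return (uint64_t)__builtin_popcountll(x);
#endif
}

/* Hypothetical loop in the style of the linked report:
   sum the population counts of n 64-bit words. */
uint64_t count_bits(const uint64_t *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += popcount64_nodep(buf[i]);
    return total;
}
```

Whether the extra XOR helps for r16 and r32 operands, or on Silvermont, is exactly what the questions above ask; this sketch only covers the r64 case.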
--
H.J.