This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Hello!

Attached patch fixes the problem with a false data dependency on the output register for popcnt, lzcnt and tzcnt insns on Sandybridge and Haswell targets. The new insn pattern shadows the existing one, and after reload, the clearing insn is split out of the insn. This way the clearing insn can be scheduled by the postreload scheduler. The new pattern takes care to avoid live registers, so the compiler is always able to clear the output reg.

The testcase from the PR, compiled with -O3 -march=corei7, improves on Ivybridge from:

  unsigned  209717360000  3.21002 sec  16.3329 GB/s
  uint64_t  209717360000  4.06517 sec  12.8971 GB/s

to (-O3 -march=corei7 -mtune-ctrl=avoid_false_dep_for_bmi):

  unsigned  209717360000  3.14541 sec  16.6683 GB/s
  uint64_t  209717360000  2.3663 sec   22.1564 GB/s

Due to the high impact, the new tune flag is enabled by default for Intel tunes and generic:

  m_SANDYBRIDGE | m_HASWELL | m_INTEL | m_GENERIC

2014-08-16  Uros Bizjak  <ubizjak@gmail.com>

    PR target/62011
    * config/i386/x86-tune.def (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI):
    New tune flag.
    * config/i386/i386.h (TARGET_AVOID_FALSE_DEP_FOR_BMI): New define.
    * config/i386/i386.md (unspec) <UNSPEC_INSN_FALSE_DEP>: New unspec.
    (ffs<mode>2): Do not expand with tzcnt for
    TARGET_AVOID_FALSE_DEP_FOR_BMI.
    (ffssi2_no_cmove): Ditto.
    (*tzcnt<mode>_1): Disable for TARGET_AVOID_FALSE_DEP_FOR_BMI.
    (ctz<mode>2): New expander.
    (*ctz<mode>2_falsedep_1): New insn_and_split pattern.
    (*ctz<mode>2_falsedep): New insn.
    (*ctz<mode>2): Rename from ctz<mode>2.
    (clz<mode>2_lzcnt): New expander.
    (*clz<mode>2_lzcnt_falsedep_1): New insn_and_split pattern.
    (*clz<mode>2_lzcnt_falsedep): New insn.
    (*clz<mode>2): Rename from ctz<mode>2.
    (popcount<mode>2): New expander.
    (*popcount<mode>2_falsedep_1): New insn_and_split pattern.
    (*popcount<mode>2_falsedep): New insn.
    (*popcount<mode>2): Rename from ctz<mode>2.
    (*popcount<mode>2_cmp): Remove.
    (*popcountsi2_cmp_zext): Ditto.
The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} and will be committed to mainline SVN after a couple of days. The patch will also be backported to the 4.9 branch.

Uros.
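
For readers without the PR at hand, the kind of loop that is sensitive to the false output dependency can be sketched as below. This is a hypothetical kernel, not the testcase from the PR: the function name count_bits and its signature are assumptions made for illustration. It only shows a popcount reduction whose builtin GCC can expand to a popcnt insn when the target supports it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel (not the PR testcase): sum the population counts
   of N 64-bit words.  When popcnt is available for the selected target,
   GCC can expand __builtin_popcountll to a single popcnt insn.  */
uint64_t
count_bits (const uint64_t *buf, size_t n)
{
  uint64_t total = 0;

  /* An empty range may be passed as (NULL, 0).  */
  assert (buf != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    total += (uint64_t) __builtin_popcountll (buf[i]);

  return total;
}
```

Compiling such a file with -O3 -march=corei7 -S, once with and once without -mtune-ctrl=avoid_false_dep_for_bmi, and comparing the assembly around each popcnt should show whether a clearing insn for the output register is emitted; the exact output depends on the compiler version and on register allocation.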
Attachment: p.diff.txt (Description: Text document)