This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFT PATCH, i386]: Optimize zero-extensions from mask registers


Hello!

The attached patch was inspired by the assembly from the PR 72805 testcase.
Currently, the compiler generates:

test:
        vpternlogd      $0xFF, %zmm0, %zmm0, %zmm0
        vpxord  %zmm1, %zmm1, %zmm1
        vpcmpd  $1, %zmm1, %zmm0, %k1
        kmovw   %k1, %eax
        movzwl  %ax, %eax
        ret

Please note that kmovw already zero-extends from the mask register, so
the movzwl that follows it is redundant.
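
The redundancy can be illustrated with a small C model. This is only a
sketch: model_kmovw_to_r32, model_movzwl and count_mismatches are
hypothetical helper names invented for this illustration. They are not
part of GCC or of the patch, and they only model the register semantics
described above, assuming kmovw into a 32-bit register clears the upper
16 bits.

```c
#include <stdint.h>

/* Model of "kmovw %k1, %eax": the 16-bit mask is written to a 32-bit
   register with the upper 16 bits cleared (assumed semantics).  */
static uint32_t model_kmovw_to_r32(uint16_t mask)
{
    return (uint32_t)mask;
}

/* Model of "movzwl %ax, %eax": keep the low 16 bits, clear the rest.  */
static uint32_t model_movzwl(uint32_t reg)
{
    return reg & 0xFFFFu;
}

/* Count the mask values for which the extra movzwl would change the
   result of the preceding kmovw.  */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;

    for (uint32_t m = 0; m <= 0xFFFFu; m++) {
        uint32_t after_kmovw = model_kmovw_to_r32((uint16_t)m);

        if (model_movzwl(after_kmovw) != after_kmovw)
            mismatches++;
    }
    return mismatches;
}
```

Under this model the second instruction never changes the value for any
of the 65536 possible masks, which is why the zero-extension can be
folded into the kmovw itself.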

The attached patch allows the ree pass to propagate mask registers to
zext insn patterns, resulting in:

test:
        vpternlogd      $0xFF, %zmm0, %zmm0, %zmm0      # 24    movv16si_internal/2     [length = 6]
        vpxord  %zmm1, %zmm1, %zmm1     # 25    movv16si_internal/1     [length = 6]
        vpcmpd  $1, %zmm1, %zmm0, %k1   # 13    avx512f_cmpv16si3       [length = 7]
        kmovw   %k1, %eax       # 27    *zero_extendhisi2/2     [length = 4]
        ret     # 30    simple_return_internal  [length = 1]

2016-08-05  Uros Bizjak  <ubizjak@gmail.com>

    * config/i386/i386.md (*zero_extendsidi2): Add (*r,*k) alternative.
    (zero_extend<mode>di2): Ditto.
    (*zero_extend<mode>si2): Ditto.
    (*zero_extendqihi2): Ditto.

The patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

The patch is in RFT state, since I have no means to test AVX512 stuff.
Kirill, can someone from Intel please test the patch?

Uros.

Attachment: p.diff.txt
Description: Text document

