Bug 77628 - avx512: unnecessary GR extending after kmovw
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 5.3.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-17 18:59 UTC by Wojciech Mula
Modified: 2016-09-18 18:14 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Wojciech Mula 2016-09-17 18:59:55 UTC
According to the latest documentation from Intel, the kmovw instruction
zeroes the upper bits of a general-purpose destination register:

    KMOVW
    IF *destination is a memory location*
        DEST[15:0] <- SRC[15:0]
    IF *destination is a mask register or a GPR*
        DEST <- ZeroExtension(SRC[15:0])
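
The quoted semantics can be illustrated with a small C model. This is only a sketch: the function names below are hypothetical and are not part of any Intel or GCC API, and the model assumes a 32-bit GPR destination. It shows that once kmovw has zero-extended the low 16 bits, a following movzwl cannot change the register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "kmovw %k, %r32" following the pseudocode above:
   DEST <- ZeroExtension(SRC[15:0]). */
static uint32_t model_kmovw_to_gpr(uint64_t k)
{
    return (uint32_t)(k & 0xFFFFu);
}

/* Hypothetical model of "movzwl %ax, %eax": zero-extend the low 16 bits. */
static uint32_t model_movzwl(uint32_t eax)
{
    return eax & 0xFFFFu;
}

/* Checks that movzwl leaves the result of kmovw unchanged for one value. */
static void check_movzwl_is_noop(uint64_t k)
{
    uint32_t eax = model_kmovw_to_gpr(k);
    assert(model_movzwl(eax) == eax);
}
```

Calling check_movzwl_is_noop() with any mask value should pass, since bits 31:16 are already zero after the modelled kmovw.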

GCC adds a superfluous movzwl after kmovw:

Program:

    #include <stdint.h>
    #include <immintrin.h>

    uint32_t test(__m512i a, __m512i b) {
        
        uint32_t c = _mm512_cmpeq_epi32_mask(a, b);
        return c;
    }

Invocation:

    $ gcc-5 --version
    gcc-5 (Debian 5.3.1-13) 5.3.1 20160323
    $ gcc-5 -O3 -S -mavx512f report.cpp

Assembly output:

	vpcmpeqd	%zmm1, %zmm0, %k1
	kmovw	%k1, %eax
	movzwl	%ax, %eax <<<< HERE
	ret
Comment 1 Uroš Bizjak 2016-09-18 18:14:59 UTC
This is fixed for current trunk (gcc-7).