This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: loading of zeros into {x,y,z}mm registers

From: "Jan Beulich" <JBeulich at suse dot com>
To: "Kirill Yukhin" <kirill dot yukhin at gmail dot com>
Cc: <gcc at gcc dot gnu dot org>
Date: Fri, 01 Dec 2017 05:08:40 -0700
Subject: Re: loading of zeros into {x,y,z}mm registers
Authentication-results: sourceware.org; auth=none
References: <5A1EE7890200007800193391@prv-mh.provo.novell.com> <20171201054550.GA22657@titus>

>>> On 01.12.17 at 06:45, <kirill.yukhin@gmail.com> wrote:
> On 29 Nov 08:59, Jan Beulich wrote:
>> in an unrelated context I've stumbled across a change of yours
>> from Aug 2014 (revision 213847) where you "extend" the ways
>> of loading zeros into registers. I don't understand why this was
>> done, and the patch submission mail also doesn't give any reason.
>> My point is that simple VEX-encoded vxorps/vxorpd/vpxor with
>> 128-bit register operands ought to be sufficient to zero any width
>> registers, due to the zeroing of the high parts the instructions do.
>> Hence by using EVEX encoded insns it looks like all you do is grow
>> the instruction length by one or two bytes (besides making the
>> source somewhat more complicated to follow). At the very least
>> the shorter variants should be used for -Os imo.
> As far as I can recall, this was done since we cannot load zeroes
> into upper 16 MM registers, which are available in EVEX exclusively.

Ah, I did overlook this aspect indeed. I still think the smaller VEX
encoding should then be used for the low 16 registers.
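
To put numbers on the size argument, here is a minimal sketch of the encoded length of a register-to-register zeroing XOR (e.g. "vxorps %xmmN, %xmmN, %xmmN") as a function of the register number. The helper zero_idiom_len() is made up for illustration and is not GCC code; it assumes all three operands name the same register.

```c
/* Hypothetical helper, not part of GCC: encoded length in bytes of
   "vxorps %xmmN, %xmmN, %xmmN" with all operands being register N. */
static int zero_idiom_len(int regno)
{
	if (regno < 8)
		return 2 + 1 + 1;	/* 2-byte VEX prefix + opcode + ModRM */
	if (regno < 16)
		return 3 + 1 + 1;	/* 3-byte VEX prefix (VEX.B needed) + opcode + ModRM */
	return 4 + 1 + 1;		/* 4-byte EVEX prefix + opcode + ModRM */
}
```

So an EVEX encoding costs two bytes more than VEX for registers 0..7 and one byte more for registers 8..15.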

Furthermore, this

typedef double __attribute__((vector_size(16))) v2df_t;
typedef double __attribute__((vector_size(32))) v4df_t;

void test1(void) {
	register v2df_t x asm("xmm31") = {};
	asm volatile("" :: "v" (x));
}

void test2(void) {
	register v4df_t x asm("ymm31") = {};
	asm volatile("" :: "v" (x));
}

translates to "vxorpd %xmm31, %xmm31, %xmm31" for both
functions with -mavx512vl, yet afaict the instruction would #UD
without AVX-512DQ (EVEX-encoded vxorpd is an AVX-512DQ insn),
which suggests to me that the original intention wasn't fully met.
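
One possible selection rule that avoids the AVX-512DQ dependency is sketched below. This is an illustration only, not what GCC implements: the type and function names are hypothetical, and AVX is assumed to be enabled. It relies on EVEX-encoded vpxord belonging to AVX-512F (plus AVX-512VL for the 128- and 256-bit forms), and on 128-bit VEX and EVEX operations zeroing the upper part of the destination register.

```c
#include <stdbool.h>

/* Hypothetical flags, loosely mirroring -mavx512f / -mavx512vl. */
struct isa_flags {
	bool avx512f;
	bool avx512vl;
};

/* Hypothetical result names; none of these exist in GCC. */
enum zero_insn {
	ZERO_UNAVAILABLE,	/* registers 16..31 need AVX-512F */
	ZERO_VEX_XOR_128,	/* VEX vxorps/vpxor on %xmmN */
	ZERO_EVEX_PXORD_128,	/* EVEX vpxord on %xmmN: AVX-512F + AVX-512VL */
	ZERO_EVEX_PXORD_512	/* EVEX vpxord on %zmmN: AVX-512F only */
};

/* Pick a zeroing insn for vector register regno (0..31) of any width. */
static enum zero_insn pick_zero_insn(struct isa_flags isa, int regno)
{
	if (regno < 16)
		return ZERO_VEX_XOR_128;	/* shortest encoding */
	if (!isa.avx512f)
		return ZERO_UNAVAILABLE;
	return isa.avx512vl ? ZERO_EVEX_PXORD_128 : ZERO_EVEX_PXORD_512;
}
```

With only AVX-512F available, zeroing %zmm31 also clears %xmm31 and %ymm31, so the 512-bit form would cover test1() and test2() above as well.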

Jan

Follow-Ups:
- Re: loading of zeros into {x,y,z}mm registers
  - From: Jakub Jelinek

References:
- Re: loading of zeros into {x,y,z}mm registers
  - From: Kirill Yukhin
