This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: More type narrowing in match.pd


On Thu, 30 Apr 2015, Jeff Law wrote:

On 04/30/2015 01:17 AM, Marc Glisse wrote:

+/* This is another case of narrowing, specifically when there's an outer
+   BIT_AND_EXPR which masks off bits outside the type of the innermost
+   operands.   Like the previous case we have to convert the operands
+   to unsigned types to avoid introducing undefined behaviour for the
+   arithmetic operation.  */
+(for op (minus plus)

No mult? or widen_mult with a different pattern? (maybe that's already
done elsewhere)
No mult. When I worked on the pattern for 47477, supporting mult clearly regressed the generated code -- presumably because we can often widen the operands for free.

It would help with the testcase below, but I am willing to accept that for mult the cases where it hurts are more common (and guessing whether it will help or hurt may be hard), while with +- the cases that help are more common.

void f(short*a) {
  a = __builtin_assume_aligned(a,128);
  for (int i = 0; i < (1<<22); ++i) {
#ifdef EASY
    a[i] *= a[i];
#else
    int x = a[i];
    x *= x;
    a[i] = x;
#endif
  }
}

With EASY, a nice little loop:
.L2:
	movdqa	(%rdi), %xmm0
	addq	$16, %rdi
	pmullw	%xmm0, %xmm0
	movaps	%xmm0, -16(%rdi)
	cmpq	%rdi, %rax
	jne	.L2

while without EASY, we get the uglier:
.L2:
	movdqa	(%rdi), %xmm0
	addq	$16, %rdi
	movdqa	%xmm0, %xmm2
	movdqa	%xmm0, %xmm1
	pmullw	%xmm0, %xmm2
	pmulhw	%xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	punpckhwd	%xmm1, %xmm2
	punpcklwd	%xmm1, %xmm0
	movdqa	%xmm2, %xmm1
	movdqa	%xmm0, %xmm2
	punpcklwd	%xmm1, %xmm0
	punpckhwd	%xmm1, %xmm2
	movdqa	%xmm0, %xmm1
	punpcklwd	%xmm2, %xmm0
	punpckhwd	%xmm2, %xmm1
	punpcklwd	%xmm1, %xmm0
	movaps	%xmm0, -16(%rdi)
	cmpq	%rdi, %rax
	jne	.L2

A small pattern like
(simplify
 (vec_pack_trunc (widen_mult_lo @0 @1) (widen_mult_hi:c @0 @1))
 (mult @0 @1))

probably with some tweaks (convert to unsigned? only do it before vector lowering?), would fix this particular case, though not as well as narrowing before vectorization would.

--
Marc Glisse

