This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0

From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
To: Richard Biener <richard dot guenther at gmail dot com>, "Naveen dot Hurugalawadi at cavium dot com" <Naveen dot Hurugalawadi at cavium dot com>
Cc: nd <nd at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Thu, 29 Jun 2017 11:20:22 +0000
Subject: Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0

Richard Biener wrote:
> Hurugalawadi, Naveen wrote:
> > The code (m1 > m2) * d should be optimized to m1 > m2 ? d : 0.

> What's the reason for this transform?  I expect that the HW multiplier
> is quite fast given one operand is either zero or one and a multiplication
> is a gimple operation that's better handled in optimizations than
> COND_EXPRs which eventually expand to conditional code which
> would be much slower.

Even really fast multipliers have a latency of several cycles, and this is
generally fixed irrespective of the inputs. Maybe you were thinking about division?

Additionally, integer multiply typically has much lower throughput than other
ALU operations like conditional move: a modern CPU may have 4 ALUs
but only 1 multiplier, so removing redundant integer multiplies is always good.

Note that (m1 > m2) is also a conditional expression which will result in branches
for floating point expressions and on some targets even for integers. Moving
the multiply into the conditional expression generates the best code:

Integer version:
f1:
	cmp    w0, 100
	csel   w0, w1, wzr, gt
	ret
f2:
	cmp    w0, 100
	cset   w0, gt
	mul    w0, w0, w1
	ret

Float version:
f3:
	movi   v1.2s, #0
	cmp    w0, 100
	fcsel  s0, s0, s1, gt
	ret
f4:
	cmp    w0, 100
	bgt    .L8
	movi   v1.2s, #0
	fmul   s0, s0, s1  // eh???
.L8:
	ret
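
The C source for f1-f4 is not shown in this message. A minimal sketch of
functions that would plausibly compile to code of the shape above, assuming
the first argument is compared against the constant 100 as in the listings
(the bodies and the parameter names x and y are assumptions, not taken from
the patch or its testcase):

```c
/* Hypothetical sources for the listings above; names and bodies are
   assumptions inferred from the assembly, not from the original patch. */

/* Conditional form, integer: matches the cmp + csel shape of f1. */
int f1(int x, int y)
{
    return x > 100 ? y : 0;
}

/* Multiply form, integer: matches the cmp + cset + mul shape of f2. */
int f2(int x, int y)
{
    return (x > 100) * y;
}

/* Conditional form, float: matches the cmp + fcsel shape of f3. */
float f3(int x, float y)
{
    return x > 100 ? y : 0.0f;
}

/* Multiply form, float: the comparison result (0 or 1) is converted to
   float and multiplied, matching the branch + fmul shape of f4. */
float f4(int x, float y)
{
    return (x > 100) * y;
}
```

For integers the two forms always agree. The two float forms agree for
ordinary finite y, but not for every input: when x <= 100 and y is NaN or
infinite, the multiply form yields NaN under IEEE 754 while the conditional
form yields 0, which is presumably why the fmul survives in f4.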

Wilco

Follow-Ups:
- Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0
  - From: Richard Biener
- Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0
  - From: Jeff Law
