This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0


Richard Biener wrote:
> Hurugalawadi, Naveen wrote:
> > The code (m1 > m2) * d should be optimized as m1 > m2 ? d : 0.

> What's the reason for this transform?  I expect that the HW multiplier
> is quite fast given one operand is either zero or one and a multiplication
> is a gimple operation that's better handled in optimizations than
> COND_EXPRs which eventually expand to conditional code which
> would be much slower.

Even really fast multipliers have several cycles of latency, and this is
generally fixed irrespective of the inputs. Maybe you were thinking about division?

Additionally, integer multiply typically has much lower throughput than other
ALU operations like conditional move: a modern CPU may have 4 ALUs
but only 1 multiplier, so removing redundant integer multiplies is always good.

Note that (m1 > m2) is also a conditional expression, which will result in branches
for floating point expressions and on some targets even for integers. Moving
the multiply into the conditional expression generates the best code:

Integer version:
f1:
	cmp    w0, 100
	csel   w0, w1, wzr, gt
	ret
f2:
	cmp    w0, 100
	cset   w0, gt
	mul    w0, w0, w1
	ret

Float version:
f3:
	movi   v1.2s, #0
	cmp    w0, 100
	fcsel  s0, s0, s1, gt
	ret
f4:
	cmp    w0, 100
	bgt    .L8
	movi   v1.2s, #0
	fmul   s0, s0, s1  // eh???
.L8:
	ret
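
For reference, the listings above could come from C source along the following
lines. This is only a sketch: the original source is not part of this message,
so the signatures, the parameter names x and d, and the constant 100 are
assumptions inferred from the register usage and the "cmp w0, 100" in the
assembly.

```c
/* Hypothetical source, reconstructed from the assembly above. */

/* Integer, conditional form: select d or 0 (compare + csel). */
int f1(int x, int d)
{
    return x > 100 ? d : 0;
}

/* Integer, multiply form: comparison result (0 or 1) times d. */
int f2(int x, int d)
{
    return (x > 100) * d;
}

/* Float, conditional form: select d or 0.0f (compare + fcsel). */
float f3(int x, float d)
{
    return x > 100 ? d : 0.0f;
}

/* Float, multiply form: the int 0/1 is converted to float, then multiplied. */
float f4(int x, float d)
{
    return (x > 100) * d;
}
```

For integers both forms always give the same result. For floats they agree
only for finite, non-negative d: under IEEE 754, 0 * NaN and 0 * infinity are
NaN and 0 * (negative d) is -0.0, which may be why the multiply is kept in f4
when not compiling with something like -ffast-math.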

Wilco
