[PATCH][ARM] Remove support for MULS
Wilco Dijkstra
Wilco.Dijkstra@arm.com
Thu Oct 10 17:20:00 GMT 2019
Any further comments? Note GCC doesn't support S/UMULLS either since it is equally
useless. It's no surprise that Thumb-2 removed support for flag-setting 64-bit multiplies,
while AArch64 didn't add flag-setting multiplies. So there is no argument that these
instructions are in any way useful to compilers.
Wilco
Hi Richard, Kyrill,
>> I disagree. If they still trigger and generate better code than without
>> we should keep them.
>
>> What kind of code is *common* varies greatly from user to user.
Not really - doing a multiply and checking whether the result is zero is
exceedingly rare. I found only 3 cases out of 7300 mul/mla in all of
SPEC2006... Overall code size effect with -Os: 28 bytes, or 0.00045%.
So we really should not even consider wasting any more time on
maintaining such useless patterns.
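
For scale, the figures above can be turned into percentages with a few lines of C. This is only a sketch of the arithmetic: the function names are hypothetical, and the total size it derives is merely what 28 bytes and 0.00045% imply together, not a number reported in this thread.

```c
#include <stdio.h>

/* Hypothetical helper: `part` as a percentage of `whole`. */
double percent_of(double part, double whole)
{
    return 100.0 * part / whole;
}

/* Hypothetical helper: total size implied when `saved_bytes`
   amounts to `percent` percent of the whole. */
double implied_total_bytes(double saved_bytes, double percent)
{
    return saved_bytes / (percent / 100.0);
}

/* Print both figures using the numbers quoted above. */
void print_summary(void)
{
    /* 3 multiply-and-test cases out of 7300 mul/mla. */
    printf("mul-and-test share: %.3f%%\n", percent_of(3.0, 7300.0));
    /* 28 bytes being 0.00045% implies a total of roughly 6.2 MB. */
    printf("implied -Os size: %.0f bytes\n",
           implied_total_bytes(28.0, 0.00045));
}
```

Under these numbers the affected multiplies are about 0.04% of all mul/mla instructions.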
> Also, the main reason for restricting their use was that in the 'olden
> days', when we had multi-cycle implementations of the multiply
> instructions with short-circuit fast termination when the result was
> completed, the flag setting variants would never short-circuit.
That only applied to conditional multiplies IIRC; some implementations
would not early-terminate if the condition failed. Today there are serious
penalties for conditional multiplies - but that's something to address in a
different patch.
> These days we have fixed cycle counts for multiply instructions, so this
> is no-longer a penalty.
No, there is a large overhead on modern cores when you set the flags,
and there are other penalties due to the extra micro-ops.
> In the thumb2 case in particular we can often
> reduce mul-cmp (6 bytes) to muls (2 bytes), that's a 66% saving on this
> sequence and definitely worth exploiting when we can, even if it's not
> all that common.
Using muls+cbz is equally small. With my patch we generate this with -Os:

void g(void);
int f(int x)
{
  if (x * x != 0)
    g();
}

f:
        muls    r0, r0, r0
        push    {r3, lr}
        cbz     r0, .L9
        bl      g
.L9:
        pop     {r3, pc}

Wilco
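
The size argument above can be tallied as a small C sketch. The per-instruction sizes are assumptions based on the usual Thumb-2 encodings (a 32-bit non-flag-setting mul, and 16-bit muls, cmp with a small immediate, short conditional branch, and cbz); they are not stated in this thread, although they do agree with the quoted "mul-cmp (6 bytes)" and "muls (2 bytes)" figures. All names are hypothetical.

```c
/* Assumed Thumb-2 encoding sizes in bytes (not taken from the thread). */
enum {
    SZ_MUL_W = 4, /* mul without flag setting, 32-bit encoding */
    SZ_MULS  = 2, /* 16-bit muls, sets the flags */
    SZ_CMP   = 2, /* cmp rN, #0 */
    SZ_BCC   = 2, /* short conditional branch such as beq */
    SZ_CBZ   = 2  /* compare-and-branch-on-zero */
};

/* mul, then cmp, then a conditional branch on the flags. */
int size_mul_cmp_branch(void)
{
    return SZ_MUL_W + SZ_CMP + SZ_BCC;
}

/* muls used for its flag result, then a conditional branch. */
int size_muls_flag_branch(void)
{
    return SZ_MULS + SZ_BCC;
}

/* muls used only as a short encoding, then cbz on the result register. */
int size_muls_cbz(void)
{
    return SZ_MULS + SZ_CBZ;
}
```

With these assumed sizes, the flag-consuming sequence and the muls+cbz sequence both come to 4 bytes, against 8 bytes for mul, cmp and branch, which is the sense in which muls+cbz is "equally small".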