[PATCH][ARM] Remove support for MULS

Wilco Dijkstra Wilco.Dijkstra@arm.com
Thu Oct 10 17:20:00 GMT 2019


Any further comments? Note that GCC doesn't support S/UMULLS either, since they are
equally useless. It's no surprise that Thumb-2 removed support for flag-setting 64-bit
multiplies, while AArch64 didn't add flag-setting multiplies. So there is no argument
that these instructions are in any way useful to compilers.

Wilco

Hi Richard, Kyrill,

>> I disagree. If they still trigger and generate better code than without 
>> we should keep them.
> 
>> What kind of code is *common* varies greatly from user to user.

Not really - doing a multiply and checking whether the result is zero is
exceedingly rare. I found only 3 cases out of 7300 mul/mla in all of
SPEC2006... Overall codesize effect with -Os: 28 bytes or 0.00045%.
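
For illustration only, here is the shape of code being counted: a multiply
whose result is then compared with zero. This is a hypothetical sketch, not
one of the 3 SPEC2006 cases, and the function name is made up.

```c
/* Hypothetical illustration, not taken from SPEC2006: the product is
   computed and then tested against zero, which is the shape a
   flag-setting multiply pattern is meant to match.  */
int product_or_fallback(int a, int b, int fallback)
{
    int p = a * b;      /* the multiply */
    if (p == 0)         /* the compare-with-zero on its result */
        return fallback;
    return p;
}
```

Which instructions a compiler picks for this depends on the target and
options; the sketch only shows the source-level pattern.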

So we really should not even consider wasting any more time on
maintaining such useless patterns.

> Also, the main reason for restricting their use was that in the 'olden 
> days', when we had multi-cycle implementations of the multiply 
> instructions with short-circuit fast termination when the result was 
> completed, the flag setting variants would never short-circuit.

That only applied to conditional multiplies IIRC: some implementations
would not early-terminate if the condition failed. Today there are serious
penalties for conditional multiplies - but that's something to address in a
different patch.

> These days we have fixed cycle counts for multiply instructions, so this 
> is no longer a penalty.

No, there is a large overhead on modern cores when you set the flags,
and there are other penalties due to the extra micro-ops.

> In the thumb2 case in particular we can often 
> reduce mul-cmp (6 bytes) to muls (2 bytes), that's a 66% saving on this 
> sequence and definitely worth exploiting when we can, even if it's not 
> all that common.

Using muls+cbz is equally small. With my patch we generate this with -Os:

void g(void);
int f(int x)
{
  if (x * x != 0)
    g();
}

f:
        muls    r0, r0, r0
        push    {r3, lr}
        cbz     r0, .L9
        bl      g
.L9:
        pop     {r3, pc}

Wilco

