This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][ARM] optimizing _muldi3 for Thumb
2008/8/4 Paul Brook <paul@codesourcery.com>:
> No. ARMv6 is not a subset of ARMv6-M.
I guess I would like to ask the reverse: is ARMv6 considered the
superset of all ARMv6* variants? If so, I would expect an ARMv6
compiler to produce a working executable for every ARMv6* variant. In
that case, I cannot force the use of the ARM version of __muldi3,
since it will not work for ARMv6.
>> From a scheduling point of view, a MUL->MLA chain is bad.
>> That's the reason why I use two independent MULs in the ARM version.
>
> Do you have any proof of this? Most cores have a bypass for the accumulate
> operand, and will issue dependent mul/mla instructions back to back.
I checked the 1022E Technical Manual. The MUL result is available
3 cycles after issue. Maybe I did not look deep enough, but I could
not find any special bypass for a MUL-MLA pair.
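
For reference, here is a minimal C sketch of the 64x64->64 multiply
under discussion, written so that the two cross products do not depend
on each other. The name muldi3_sketch is hypothetical and this is not
the code from the patch or from libgcc. Whether a compiler emits two
MULs or a MUL followed by an MLA for it depends on the target and on
the scheduler, so treat the instruction names in the comments as
assumptions.

```c
#include <stdint.h>

/* Hypothetical illustration only, not libgcc's __muldi3. */
uint64_t muldi3_sketch(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* Full 32x32->64 product of the low halves (UMULL on ARM). */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* Cross terms: only their low 32 bits reach the 64-bit result.
       Neither product needs the other, so they can be issued as two
       independent MULs instead of a dependent MUL->MLA chain. */
    uint32_t cross1 = a_lo * b_hi;
    uint32_t cross2 = a_hi * b_lo;

    /* Fold the cross terms into the high word; wraps modulo 2^32. */
    uint32_t high = (uint32_t)(low >> 32) + cross1 + cross2;

    return ((uint64_t)high << 32) | (uint32_t)low;
}
```

The a_hi * b_hi term is dropped because it only contributes to bits 64
and above, which a 64-bit result discards.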
> Nonsense.
I rearranged my code and sent out another email.
> Not a good excuse IMHO. As Mark mentioned, the out of line version can be
> useful for size optimization.
-Doug