This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][ARM] optimizing _muldi3 for Thumb


2008/8/4 Paul Brook <paul@codesourcery.com>:

> No. ARMv6 is not a subset of ARMv6-M.

I guess I would like to ask the reverse. Is ARMv6 considered the
superset of all ARMv6*? If so, I would expect an ARMv6 compiler to
produce a workable executable for all ARMv6* variants. In that case, I
cannot force the use of the ARM version of __muldi3, since it will not
work for ARMv6.

>> From a scheduling point of view, a MUL->MLA chain is bad.
>> That's the reason why I use two independent MUL in the ARM version.
>
> Do you have any proof of this? Most cores have a bypass for the accumulate
> operand, and will issue dependent mul/mla instructions back to back.

I checked the 1022E Technical Manual. The MUL result is available
3 cycles after issue. Maybe I did not look deep enough, but I could
not find any special bypassing for a MUL-MLA pair.
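
For reference, here is a minimal C sketch of the computation under
discussion. It is not the libgcc source or the patch itself; the name
muldi3_sketch and the variable names are hypothetical. It shows that the
low 64 bits of a 64x64 multiply need one full 32x32->64 product plus two
32-bit cross products, and that the two cross products do not depend on
each other. Whether a compiler turns them into two independent MULs or a
MUL->MLA chain is up to its instruction selection and scheduling.

```c
#include <stdint.h>

/* Hypothetical helper (not the libgcc implementation): returns the low
   64 bits of a * b, computed from the 32-bit halves of the operands. */
uint64_t muldi3_sketch(uint64_t a, uint64_t b)
{
    uint32_t al = (uint32_t)a;
    uint32_t ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b;
    uint32_t bh = (uint32_t)(b >> 32);

    /* Full 32x32->64 product of the low halves. */
    uint64_t lo = (uint64_t)al * bl;

    /* Two cross products. Only their low 32 bits reach the result, and
       neither one uses the other's output, so they can be computed
       independently instead of as an accumulate chain. */
    uint32_t cross1 = al * bh;
    uint32_t cross2 = ah * bl;

    /* The sum of the cross products wraps modulo 2^32 before it is
       shifted into the high word; ah * bh would only affect bits 64
       and above, so it is not needed. */
    return lo + ((uint64_t)(uint32_t)(cross1 + cross2) << 32);
}
```

The result should match the native 64-bit product for any inputs, since
both are taken modulo 2^64.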

> Nonsense.

I rearranged my code and sent out another email.

> Not a good excuse IMHO.  As Mark mentioned, the out of line version can be
> useful for size optimization.

-Doug

