This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][ARM] optimizing _muldi3 for Thumb


On Tuesday 05 August 2008, Doug Kwan (關振德) wrote:
> 2008/8/4 Paul Brook <paul@codesourcery.com>:
> > No. ARMv6 is not a subset of ARMv6-M.
>
> I guess I would like to ask the reverse. Is ARMv6 considered the
> superset of all ARMv6*?

Targeting a superset of all instructions would be pointless - the code 

ARMv6 is the base v6 architecture, which was then extended with things like 
TrustZone, SMP and Thumb-2.

The v7 architecture introduces architecture profiles, and ARMv7 is used to 
target the common subset of these.  ARMv6-M was a later creation which is 
basically the Thumb-1 instructions from ARMv6 plus the system model and a 
handful of instructions from ARMv7-M.

> >> From a scheduling point of view, a MUL->MLA chain is bad.
> >> That's the reason why I use two independent MULs in the ARM version.
> >
> > Do you have any proof of this? Most cores have a bypass for the
> > accumulate operand, and will issue dependent mul/mla instructions back to
> > back.
>
> I checked the 1022E Technical Manual. The MUL result is available
> 3 cycles after issue.  Maybe I did not look deep enough
> but I could not find special bypassing for a MUL-MLA pair.

Section 16.3 "Interlocks":
[...] a multiply accumulate instruction can start before the accumulate 
operands are available [...].

Two of those three cycles are spent iterating in the execute stage, which 
gives plenty of time for forwarding the accumulator value.

I'm pretty sure at least the arm7, arm920, arm9e, arm10, arm11, 
cortex-[a8,r4,m3] and Marvell cores either don't have a pipelined execution 
stage, or have late accumulator forwarding.
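
For illustration only (this is not the code from the patch), the two 
schedules being compared can be sketched in C. The function names are 
hypothetical, and the instruction annotations in the comments are only what a 
compiler might plausibly emit for ARM, not a guarantee. Both variants compute 
the same 64x64->64 product; they differ only in whether the second cross 
product is accumulated onto the first (the MUL->MLA chain) or computed 
independently and added afterwards.

```c
#include <stdint.h>

/* Variant A: dependent chain. The second cross product is accumulated
 * onto the first, which is the shape a MUL followed by an MLA has. */
uint64_t muldi3_chain(uint64_t a, uint64_t b)
{
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);

    uint32_t cross = al * bh;           /* candidate MUL */
    cross = ah * bl + cross;            /* candidate MLA: accumulates the MUL result */
    uint64_t low = (uint64_t)al * bl;   /* full 32x32->64 product (candidate UMULL) */

    /* Cross terms only affect the high word; arithmetic wraps modulo 2^64. */
    return low + ((uint64_t)cross << 32);
}

/* Variant B: two independent 32-bit multiplies, combined with plain adds. */
uint64_t muldi3_independent(uint64_t a, uint64_t b)
{
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);

    uint32_t x1 = al * bh;              /* candidate MUL */
    uint32_t x2 = ah * bl;              /* candidate MUL, no dependency on x1 */
    uint64_t low = (uint64_t)al * bl;   /* full 32x32->64 product */

    uint32_t hi = (uint32_t)(low >> 32) + x1 + x2;   /* wraps modulo 2^32 */
    return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

If the core forwards the accumulate operand late, as described above, 
variant A should not stall on the MUL result and saves an add; which one is 
actually faster on a given core has to be measured.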

Paul

