This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH][ARM] optimizing _muldi3 for Thumb

From: "Doug Kwan (關振德)" <dougkwan at google dot com>
To: "Paul Brook" <paul at codesourcery dot com>
Cc: gcc-patches at gcc dot gnu dot org, "Mark Mitchell" <mark at codesourcery dot com>
Date: Mon, 4 Aug 2008 15:37:58 -0700
Subject: Re: [PATCH][ARM] optimizing _muldi3 for Thumb
References: <498552560807011502w6dd3dd62q3bd7b1cf08102387@mail.gmail.com> <488E8D3C.5020000@codesourcery.com> <498552560808012307j236465f6k4aac9764f1955d66@mail.gmail.com> <200808042254.48200.paul@codesourcery.com>

Hi

2008/8/4 Paul Brook <paul@codesourcery.com>:
> On Saturday 02 August 2008, Doug Kwan (關振德) wrote:
>>    Here is an updated patch.
>
> This is bad in several ways.
Bad feedback is better than no feedback. :)

> The condition for pure Thumb-1 code is completely wrong, you want
> __ARM_ARCH_6M__, exactly the same as all the other Thumb-1 only code.
> For most purposes ARMv6-M is Thumb-1. ARM marketing sometimes call
> it "Thumb-2" to deliberately confuse people. Please don't do this.

Question: If someone configures gcc with --with-arch=armv6, libgcc
will be built with __ARM_ARCH_6__ defined.  If he/she then uses that gcc
to compile something with -mthumb, is the resulting binary expected to
work on a Cortex-M1?  That is the reason why I test for both
__ARM_ARCH_6__ and __ARM_ARCH_6M__.
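
To make the question concrete, the kind of test I am describing looks
roughly like the sketch below.  It is not the condition from the patch:
MULDI3_VARIANT and muldi3_variant() are made-up names used only for
illustration, and the only real identifiers are the predefined macros
__thumb__, __thumb2__, __arm__, __ARM_ARCH_6__ and __ARM_ARCH_6M__.

```c
/* Illustrative only: MULDI3_VARIANT and muldi3_variant() are made-up
   names, not identifiers from the patch or from libgcc. */
#if defined(__thumb__) && !defined(__thumb2__) \
    && (defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6M__))
/* Thumb-1 code built for v6 or v6-M: the case the question is about. */
# define MULDI3_VARIANT "thumb1"
#elif defined(__thumb2__)
# define MULDI3_VARIANT "thumb2"
#elif defined(__arm__)
# define MULDI3_VARIANT "arm"
#else
/* Not an ARM target at all, e.g. when compiled on a host machine. */
# define MULDI3_VARIANT "generic"
#endif

/* Report which variant the preprocessor selected for this build. */
const char *muldi3_variant(void)
{
    return MULDI3_VARIANT;
}
```

Dropping the __ARM_ARCH_6__ half of the test gives the narrower
condition suggested in the review.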

>> +/* We cannot use the faster ARM version for THUMB libgcc on V6 and V6M
>> since  Cortex-M1 does not run ARM code. */
>
> Shows you've completely misunderstood which architecture variants have which
> features.
>
> umull is only available on v3M and later cores.

Will add another ARM version with umull then.

> The Thumb-2 code is just dumb (Yes I know it's what the compiler generates,
> but gcc is notoriously bad at doubleword arithmetic).  mla is not Thumb-2
> specific, and using it certainly doesn't require additional register pushes.
> AFAICS there's no reason to have different code for ARM and Thumb-2.

From a scheduling point of view, a MUL->MLA chain is bad.  That's the
reason why I use two independent MULs in the ARM version.  I cannot
use the ARM version for Thumb-2 as well, since it accesses ip in the
MUL instruction.
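
For anyone following along, here is a minimal C sketch of the
arithmetic involved.  It is not the assembly from the patch, and the
name muldi3_sketch is hypothetical.  It only shows that the two cross
products of a 64x64->64 multiply do not depend on each other, which is
why they can be issued as two independent multiplies rather than as a
multiply feeding a multiply-accumulate.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: a 64x64 -> 64 bit
   multiply built from 32-bit pieces. */
uint64_t muldi3_sketch(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* Full 32x32 -> 64 product of the low words (what umull computes). */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* The two cross products.  Only their low 32 bits reach the result,
       and neither one needs the other's value as an input. */
    uint32_t cross1 = a_lo * b_hi;
    uint32_t cross2 = a_hi * b_lo;

    /* a_hi * b_hi would only affect bits 64 and above, so it is dropped. */
    return low + ((uint64_t)(cross1 + cross2) << 32);
}
```

The result is the product modulo 2^64, the same value the C expression
a * b gives for unsigned 64-bit operands.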

I don't care too much about Thumb-2, since gcc currently generates
in-line 64-bit multiplication anyway, so I just took what gcc generates
for Thumb-2.  If someone really wants to call __aeabi_lmul instead,
my dumb code is still better than the horrible thing based on the C
version.  I will remove the push and pop though.

-Doug

