This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Modifying ARM code generator for elimination of 8bit writes - need help


Rask,

On Monday 05 June 2006 16:16, Rask Ingemann Lambertsen wrote:
> On Mon, Jun 05, 2006 at 01:47:10PM +0200, Wolfgang Mües wrote:
> Does GCC happen to accept "[%r, #0]" for swp?

No, but it is no problem to change that here.

> I think the comment in arm.h is wrong. The manual seems to agree with
> the code.

Just to make it easy for beginners...

> I tried 'V' instead, but it looks as if reload completely ignores the
> meaning of the constraint. There is already a comment in arm.md about
> that. It should be investigated further.

Hmmm... I have searched for 'Q' in the ARM files. It is not used in arm.md,
only for some ARM variants (Cirrus). Maybe it is only implemented for them?

> Meanwhile, I changed arm_legitimate_address_p() to enforce the
> correct address form. This hurts byte loads too, though.

I assume there is no way to tell the direction (load or store) in
arm_legitimate_address_p()? Hmmm.

> Index: gcc/config/arm/arm.opt
> ===================================================================
> --- gcc/config/arm/arm.opt	(revision 114119)
> +++ gcc/config/arm/arm.opt	(working copy)
> @@ -153,3 +153,7 @@
>  mwords-little-endian
>  Target Report RejectNegative Mask(LITTLE_WORDS)
>  Assume big endian bytes, little endian words
> +
> +mswp-byte-writes
> +Target Report Mask(SWP_BYTE_WRITES)
> +Use the swp instruction for byte writes

In my environment (gcc 4.0.2), this is different, but I was able to find
the option definitions in arm.h and implement these changes there. Easier
than expected...
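
For anyone else on 4.0.x, here is a rough stand-alone C model of how an
arm.h-style switch works (a sketch only, not the actual change). The bit
value, the struct, and apply_switch() are made up for illustration, and
target_flags is a local stand-in for GCC's global. The only convention
taken from GCC is that a negative table value clears the bits instead of
setting them.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit; a real port must pick a bit that is still free. */
#define ARM_FLAG_SWP_BYTE_WRITES (1 << 30)
#define TARGET_SWP_BYTE_WRITES (target_flags & ARM_FLAG_SWP_BYTE_WRITES)

static int target_flags; /* stand-in for GCC's global of the same name */

/* Simplified model of one TARGET_SWITCHES-style table entry. */
struct target_switch {
    const char *name;        /* option name without the leading "-m" */
    int value;               /* positive: set bits, negative: clear bits */
    const char *description;
};

static const struct target_switch switches[] = {
    {"swp-byte-writes", ARM_FLAG_SWP_BYTE_WRITES,
     "Use the swp instruction for byte writes"},
    {"no-swp-byte-writes", -ARM_FLAG_SWP_BYTE_WRITES, ""},
};

/* Apply one -m option by name; returns 1 if the name was recognised. */
static int apply_switch(const char *name)
{
    for (size_t i = 0; i < sizeof switches / sizeof switches[0]; i++) {
        if (strcmp(name, switches[i].name) == 0) {
            if (switches[i].value < 0)
                target_flags &= ~(-switches[i].value);
            else
                target_flags |= switches[i].value;
            return 1;
        }
    }
    return 0;
}
```

The machine description then only has to test TARGET_SWP_BYTE_WRITES, as
the quoted patterns do.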

(The DSLINUX team is not using gcc 4.1 because of compile problems with
the 2.6.14 kernel.)

> +   swp%?b\\t%1, %1, %0\;ldr%?b\\t%1, %0"

You should get a prize for cleverness here!
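
To spell out why the two-instruction sequence works, here is a small C
model (a sketch, not GCC code; the function names are made up for
illustration). swpb loads the old byte from memory into the destination
register and stores the low byte of the source register. With both
registers the same, the register ends up holding the old memory contents,
so the following ldrb simply reads the stored byte back.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "swpb rd, rm, [rn]": store the low byte of rm at *rn and
   return the byte that was there before, zero-extended (the new rd). */
static uint32_t swpb_model(uint32_t rm, uint8_t *rn)
{
    uint8_t old = *rn;
    *rn = (uint8_t)rm;
    return old;
}

/* Model of "swpb %1, %1, %0 ; ldrb %1, %0" with no spare register.
   Returns the final contents of register %1. */
static uint32_t store_byte_no_scratch(uint32_t reg, uint8_t *mem)
{
    reg = swpb_model(reg, mem); /* reg now holds the OLD memory byte */
    reg = *mem;                 /* ldrb: read the stored byte back   */
    return reg;
}
```

The register comes back as the zero-extended low byte of its old value,
which is all that matters for a QImode operand.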

> +; Avoid reading the stored value back if we have a spare register.
> +(define_peephole2
> +  [(match_scratch:QI 2 "r")
> +   (set (match_operand:QI 0 "memory_operand" "")
> +        (match_operand:QI 1 "register_operand" ""))]
> +  "TARGET_ARM && TARGET_SWP_BYTE_WRITES"
> +  [(parallel [
> +    (set (match_dup 0) (match_dup 1))
> +    (clobber (match_dup 2))]
> +  )]
> +)

As far as I can tell so far, this works well. But I think there are many
cases in which the source operand is not needed after the store. Is it
possible to clobber the source operand instead of using another
register?

Hmmm. Most of the code I have seen in the first tests has no problem
with this extra register; one is available.
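
To make the question concrete, here is a C sketch of the two cases (my
own model with made-up names, not part of the patch). In the first, the
old memory byte lands in a scratch register, as in the quoted peephole.
In the second, the source register is assumed to be dead after the
store, so it can take the old byte itself and no third register is
needed. I believe peep2_reg_dead_p() is the usual way to test for a dead
register in a define_peephole2 condition, but treat that as an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "swpb rd, rm, [rn]": store the low byte of rm at *addr and
   return the previous byte, zero-extended. */
static uint32_t swpb_model(uint32_t rm, uint8_t *addr)
{
    uint8_t old = *addr;
    *addr = (uint8_t)rm;
    return old;
}

/* Case 1: a scratch register receives the old byte; src survives. */
static void store_with_scratch(const uint32_t *src, uint32_t *scratch,
                               uint8_t *addr)
{
    *scratch = swpb_model(*src, addr);
}

/* Case 2 (hypothetical): src is dead after the store, so it doubles as
   its own scratch and is overwritten with the old byte. */
static void store_clobbering_dead_source(uint32_t *src, uint8_t *addr)
{
    *src = swpb_model(*src, addr);
}
```

Both cases need only the single swpb; the difference is which register
gets overwritten.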

> With -O2 -mswp-byte-writes:
>
> bytewritetest:
> 	@ args = 0, pretend = 0, frame = 0
> 	@ frame_needed = 0, uses_anonymous_args = 0
> 	str	lr, [sp, #-4]!
> 	add	r2, r0, #4
> 	add	lr, r0, #5
> 	ldrb	r3, [lr, #0]	@ zero_extendqisi2
> 	ldrb	r1, [r2, #0]	@ zero_extendqisi2
> 	eor	r2, r1, r3
> 	add	r3, r3, r1
> 	ldr	ip, [r0, #0]
> 	str	r3, [r0, #0]
> 	swpb	r3, r2, [lr, #0]
> 	str	ip, [r0, #8]
> 	ldr	pc, [sp], #4
>
>
> The register allocator chooses to use the lr register, in turn
> causing link register save elimination to fail, which doesn't help.

I can't understand this without explanation... is it bad?
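
My reading of the listing, as a sketch: lr (r14) holds the return
address, so a function that overwrites it has to save it on entry and
restore it on exit. That is the "str lr, [sp, #-4]!" and
"ldr pc, [sp], #4" pair above. A function that leaves lr alone could
return without touching the stack. The helper below is a made-up
illustration of that rule, not GCC's actual logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { LR_REGNO = 14 }; /* lr is r14 on ARM; the enum name is illustrative */

/* Hypothetical rule: lr must be saved and restored whenever the function
   body writes r14. Bit n of the mask is set if register rn is written. */
static bool must_save_lr(uint32_t written_regs_mask)
{
    return (written_regs_mask & (1u << LR_REGNO)) != 0;
}
```

In the listing the body writes r1, r2, r3, ip (r12) and lr, so the mask
has bit 14 set and the save cannot be dropped.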

Rask, thank you very much for your work.

regards
Wolfgang
-- 
We're back to the times when men were men 
and wrote their own device drivers.

(Linus Torvalds)

