This is the mail archive of the
mailing list for the GCC project.
Re: Modifying ARM code generator for elimination of 8bit writes - need help
- From: Wolfgang Mües <wolfgang at iksw-muees dot de>
- To: gcc at gcc dot gnu dot org
- Cc: Rask Ingemann Lambertsen <rask at sygehus dot dk>, gcc-patches at gcc dot gnu dot org
- Date: Tue, 6 Jun 2006 07:42:20 +0200
- Subject: Re: Modifying ARM code generator for elimination of 8bit writes - need help
- References: <email@example.com> <firstname.lastname@example.org> <20060605141647.GA28964@sygehus.dk>
On Monday 05 June 2006 16:16, Rask Ingemann Lambertsen wrote:
> On Mon, Jun 05, 2006 at 01:47:10PM +0200, Wolfgang Mües wrote:
> Does GCC happen to accept "[%r, #0]" for swp?
No, but it is no problem to change that here.
> I think the comment in arm.h is wrong. The manual seems to agree with
> the code.
Just to make it easy for beginners...
> I tried 'V' instead, but it looks as if reload completely ignores the
> meaning of the constraint. There is already a comment in arm.md about
> that. It should be investigated further.
Hmmm... I have searched for 'Q' in the arm files. It is not used in arm.md,
only for some variants of ARM (Cirrus). Maybe it is only implemented for them?
> Meanwhile, I changed arm_legitimate_address_p() to enforce the
> correct address form. This hurts byte loads too, though.
I assume there is no way to tell the direction (load or store) in
arm_legitimate_address_p()? Hmmm.
> Index: gcc/config/arm/arm.opt
> --- gcc/config/arm/arm.opt (revision 114119)
> +++ gcc/config/arm/arm.opt (working copy)
> @@ -153,3 +153,7 @@
> Target Report RejectNegative Mask(LITTLE_WORDS)
> Assume big endian bytes, little endian words
> +Target Report Mask(SWP_BYTE_WRITES)
> +Use the swp instruction for byte writes
In my environment (gcc 4.0.2), this is different. But I was able to find
the definitions in arm.h and implement these changes. Easier than
(The DSLINUX team is not using gcc 4.1 because of compile problems with
the 2.6.14 kernel).
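For readers on an older tree without arm.opt, here is a minimal sketch of what such an option boils down to: one bit in the option word, tested by a TARGET_ macro. It assumes the older arm.h style. ARM_FLAG_SWP_BYTE_WRITES, its bit position, and the helper set_swp_byte_writes() are hypothetical names chosen for illustration; only TARGET_SWP_BYTE_WRITES appears in the patch, and target_flags is modelled here as a plain global.

```c
/* Stand-in for the compiler's global option word (modelled here, not GCC's). */
int target_flags = 0;

/* Hypothetical bit position; a real port must choose one that does not
   collide with the other flag bits already defined for the target.  */
#define ARM_FLAG_SWP_BYTE_WRITES (1 << 30)

/* The macro the insn conditions in the patch test.  */
#define TARGET_SWP_BYTE_WRITES \
  ((target_flags & ARM_FLAG_SWP_BYTE_WRITES) != 0)

/* Hypothetical helper standing in for option processing:
   on != 0 models -mswp-byte-writes, on == 0 models -mno-swp-byte-writes.  */
void set_swp_byte_writes(int on)
{
    if (on)
        target_flags |= ARM_FLAG_SWP_BYTE_WRITES;
    else
        target_flags &= ~ARM_FLAG_SWP_BYTE_WRITES;
}
```

With the bit clear, a condition such as "TARGET_ARM && TARGET_SWP_BYTE_WRITES" is false and the new patterns never match, so default code generation is unchanged.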
> + swp%?b\\t%1, %1, %0\;ldr%?b\\t%1, %0"
You should get a prize for cleverness here!
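To spell out why the two-instruction template works, here is a small C model of it. This is only a sketch: the functions swpb(), ldrb() and store_byte_via_swp() are illustrative names, not anything in GCC, and they model the instructions on a single byte of ordinary memory, ignoring the atomicity of the real swp.

```c
#include <stdint.h>

/* Model of "swpb rd, rm, [rn]": read the old byte at *addr, store the
   low byte of rm there, and return the old byte zero-extended.  */
uint32_t swpb(uint32_t rm, uint8_t *addr)
{
    uint32_t old = *addr;
    *addr = (uint8_t) rm;
    return old;
}

/* Model of "ldrb rd, [rn]": zero-extending byte load.  */
uint32_t ldrb(const uint8_t *addr)
{
    return *addr;
}

/* The sequence from the patch, using the same register as source and
   destination of the swap.  Returns the final register contents.  */
uint32_t store_byte_via_swp(uint32_t reg, uint8_t *addr)
{
    reg = swpb(reg, addr); /* memory has the new byte, reg has the old one */
    reg = ldrb(addr);      /* read the stored byte back to restore reg */
    return reg;
}
```

After the swap the source register holds the old memory contents, so the byte load is needed to get the stored value back. Note that only the low 8 bits come back, zero-extended, which is enough for a QImode value.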
> +; Avoid reading the stored value back if we have a spare register.
> + [(match_scratch:QI 2 "r")
> + (set (match_operand:QI 0 "memory_operand" "")
> + (match_operand:QI 1 "register_operand" ""))]
> + "TARGET_ARM && TARGET_SWP_BYTE_WRITES"
> + [(parallel [
> + (set (match_dup 0) (match_dup 1))
> + (clobber (match_dup 2))]
> + )]
As far as I can tell now, this works well. But I think there are many
cases in which the source operand is not needed after the store. Is
there a possibility to clobber the source operand and not use another
register?
Hmmm. Most of the code I have seen in the first tests has no problem
with this extra register...it's available.
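For comparison, here is a sketch of the scratch-register variant in C, matching the "swpb r3, r2, [lr, #0]" line in the generated code quoted below. The struct regs type and store_byte_with_scratch() are hypothetical names used only for this model.

```c
#include <stdint.h>

/* Two registers of interest: the value being stored and a spare one.  */
struct regs {
    uint32_t src;
    uint32_t scratch;
};

/* Model of "swpb scratch, src, [addr]": the old memory byte lands in the
   scratch register, so the source register is left untouched and no
   load is needed afterwards.  */
void store_byte_with_scratch(struct regs *r, uint8_t *addr)
{
    uint32_t old = *addr;
    *addr = (uint8_t) r->src;
    r->scratch = old; /* only the scratch register is clobbered */
}
```

This costs one instruction instead of two, at the price of one free register at the point of the store.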
> With -O2 -mswp-byte-writes:
> @ args = 0, pretend = 0, frame = 0
> @ frame_needed = 0, uses_anonymous_args = 0
> str lr, [sp, #-4]!
> add r2, r0, #4
> add lr, r0, #5
> ldrb r3, [lr, #0] @ zero_extendqisi2
> ldrb r1, [r2, #0] @ zero_extendqisi2
> eor r2, r1, r3
> add r3, r3, r1
> ldr ip, [r0, #0]
> str r3, [r0, #0]
> swpb r3, r2, [lr, #0]
> str ip, [r0, #8]
> ldr pc, [sp], #4
> The register allocator chooses to use the lr register, in turn
> causing link register save elimination to fail, which doesn't help.
I can't understand this without explanation... is it bad?
Rask, thank you very much for your work.
We're back to the times when men were men
and wrote their own device drivers.