[PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb

Maciej W. Rozycki macro@codesourcery.com
Tue May 27 14:32:00 GMT 2014


On Tue, 13 Aug 2013, Kyrylo Tkachov wrote:

> > On 08/09/13 11:01, Julian Brown wrote:
> > > On Thu, 8 Aug 2013 15:44:17 +0100
> > > Kyrylo Tkachov <kyrylo.tkachov@arm.com> wrote:
> > >
> > >> Hi all,
> > >>
> > >> The recently added gcc.target/arm/pr58041.c test exposed a bug in the
> > >> backend. When compiling for NEON and with -mno-unaligned-access we
> > >> end up generating the vld1.64 and vst1.64 instructions instead of
> > >> doing the accesses one byte at a time like -mno-unaligned-access
> > >> expects. This patch fixes that by enabling the NEON expander and
> > >> insns that produce these instructions only when unaligned accesses
> > >> are allowed.
> > >>
> > >> Bootstrapped on arm-linux-gnueabihf. Tested arm-none-eabi on qemu.
> > >>
> > >> Ok for trunk and 4.8?
> > >
> > > I'm not sure if this is right, FWIW -- do the instructions in question
> > > trap if the CPU is set to disallow unaligned accesses? I thought that
> > > control bit only affected ARM core loads & stores, not NEON ones.
> > 
> > Thinking again - the ARM-ARM says - the alignment check is for element
> > size, so an alternative might be to use vld1.8 instead to allow for this
> > at which point we might as well do something else with the test. I note
> > that these patterns are not allowed for BYTES_BIG_ENDIAN so that might
> > be a better alternative than completely disabling it.
> 
> Looking at the section on unaligned accesses, it seems that the
> ldrb/strb-class instructions are the only ones that are unaffected by the
> SCTLR.A bit and do not produce alignment faults in any case.
> The NEON load/store instructions, including vld1.8 can still cause an
> alignment fault when SCTLR.A is set. So it seems we can only use the byte-wise
> core memory instructions for unaligned data.
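
 For illustration, the byte-wise access that -mno-unaligned-access is
expected to fall back to can be sketched in C as below.  This is only a
sketch: the helper name load_u64_bytewise is hypothetical and appears
neither in the patch nor in the test case, and a little-endian layout is
assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch or the test case): read a
   64-bit little-endian value one byte at a time.  Every memory access
   here is a single-byte load, so no access wider than a byte is ever
   made to a possibly misaligned address.  */
static uint64_t load_u64_bytewise(const unsigned char *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);   /* byte i goes to bits 8i..8i+7 */
    return v;
}
```

 Calling load_u64_bytewise (buf + 1) on a byte buffer reads from an odd
address without ever issuing a multi-byte load.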

 This change, however, has regressed gcc.dg/vect/vect-72.c on the 
arm-linux-gnueabi target with -march=armv5te, in particular in 4.8.

 Previously the generated code fragment in question was:

.L14:
	sub	r1, r3, #16
	add	r3, r3, #16
	vld1.8	{q8}, [r1]
	cmp	r3, r0
	vst1.64	{d16-d17}, [r2:64]!
	bne	.L14

Afterwards it is:

.L14:
	vldr	d16, [r3, #-16]
	vldr	d17, [r3, #-8]
	add	r3, r3, #16
	cmp	r3, r1
	vst1.64	{d16-d17}, [r2:64]!
	bne	.L14

and the second VLDR instruction traps with SIGILL (the value in R3 is 
0x10b29, odd as you'd expect, pointing into `ib').  I don't know why it 
traps, and especially why only the second of the two does (regrettably 
I've been unable to track down an instruction reference detailed enough 
to specify what exceptions VLDR can produce and under what conditions).
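
 The effective addresses of the two VLDR instructions follow from the R3
value quoted above and the offsets in the listing.  The sketch below
works them out; the helper names are hypothetical, and the 4- and 8-byte
alignments checked are assumed candidate requirements only, not values
taken from an instruction reference.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of align, where
   align is a power of two.  */
static bool is_aligned(uintptr_t addr, uintptr_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: effective address of a base-plus-offset access
   such as "vldr dN, [r3, #offset]".  */
static uintptr_t effective_address(uintptr_t base, intptr_t offset)
{
    return base + (uintptr_t)offset;
}
```

 With R3 = 0x10b29, effective_address (0x10b29, -16) is 0x10b19 and
effective_address (0x10b29, -8) is 0x10b21.  Both are odd, so is_aligned
returns false for either address at both 4 and 8 bytes; alignment alone
does not distinguish the first load from the second.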

 Interestingly enough, the trap does not happen when the program is 
single-stepped under GDB (via gdbserver); however, the program then aborts 
once this copy loop has completed, as `ia' contains rubbish and fails the 
test.

 Is there a fix that needs backporting to 4.8, or is this an issue that 
was unknown so far?

 Hardware and Linux used:

$ cat /proc/cpuinfo
Processor	: ARMv7 Processor rev 2 (v7l)
processor	: 0
BogoMIPS	: 2013.49

processor	: 1
BogoMIPS	: 1963.08

Features	: swp half thumb fastmult vfp edsp thumbee neon vfpv3
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x1
CPU part	: 0xc09
CPU revision	: 2

Hardware	: OMAP4430 Panda Board
Revision	: 0020
Serial		: 0000000000000000
$ uname -a
Linux panda2 2.6.35-903-omap4 #14-Ubuntu SMP PREEMPT Wed Oct 6 17:23:24 UTC 2010 armv7l GNU/Linux
$ 

  Maciej


