This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/67366] Poor assembly generation for unaligned memory accesses on ARM v6 & v7 cpus
- From: "ramana at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 27 Aug 2015 11:08:46 +0000
- Subject: [Bug target/67366] Poor assembly generation for unaligned memory accesses on ARM v6 & v7 cpus
- Auto-submitted: auto-generated
- References: <bug-67366-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366
--- Comment #6 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #3)
> On Thu, 27 Aug 2015, rearnsha at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366
> >
> > --- Comment #2 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> > (In reply to Richard Biener from comment #1)
> > > I think this boils down to the fact that memcpy expansion is done too late
> > > and
> > > that (with more recent GCC) the "inlining" done on the GIMPLE level is
> > > restricted
> > > to !SLOW_UNALIGNED_ACCESS but arm defines STRICT_ALIGNMENT to 1
> > > unconditionally.
> > >
> >
> > Yep, we have to define STRICT_ALIGNMENT to 1 because not all load instructions
> > work with misaligned addresses (ldm, for example). The only way to handle
> > misaligned copies is through the movmisalign API.
>
> Are the movmisalign handled ones reasonably efficient? That is, more
> efficient than memcpy/memmove? Then we should experiment with
Minor nit: the patch is missing an include of optabs.h. With that fixed, and
with a movmisalignsi pattern added to the backend that just generates either an
unaligned load or store SI insn, I get the following for the above-mentioned
testcase.
read32:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r0, [r0]        @ unaligned
        bx      lr
I'm on holiday from this evening so don't really want to push something today
...
>
> Index: gcc/gimple-fold.c
> ===================================================================
> --- gcc/gimple-fold.c (revision 227252)
> +++ gcc/gimple-fold.c (working copy)
> @@ -708,7 +708,9 @@ gimple_fold_builtin_memory_op (gimple_st
>            /* If the destination pointer is not aligned we must be able
>               to emit an unaligned store.  */
>            && (dest_align >= GET_MODE_ALIGNMENT (TYPE_MODE (type))
> -              || !SLOW_UNALIGNED_ACCESS (TYPE_MODE (type), dest_align)))
> +              || !SLOW_UNALIGNED_ACCESS (TYPE_MODE (type), dest_align)
> +              || (optab_handler (movmisalign_optab, TYPE_MODE (type))
> +                  != CODE_FOR_nothing)))
>          {
>            tree srctype = type;
>            tree desttype = type;
> @@ -720,7 +722,10 @@ gimple_fold_builtin_memory_op (gimple_st
>                srcmem = tem;
>              else if (src_align < GET_MODE_ALIGNMENT (TYPE_MODE (type))
>                       && SLOW_UNALIGNED_ACCESS (TYPE_MODE (type),
> -                                               src_align))
> +                                               src_align)
> +                     && (optab_handler (movmisalign_optab,
> +                                        TYPE_MODE (type))
> +                         == CODE_FOR_nothing))
>                srcmem = NULL_TREE;
>              if (srcmem)
>                {