This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [AArch64] Implement movmem for the benefit of inline memcpy
- From: Andrew Pinski <pinskia at gmail dot com>
- To: James Greenhalgh <james dot greenhalgh at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <marcus dot shawcroft at arm dot com>
- Date: Thu, 31 Jul 2014 23:38:35 -0700
- Subject: Re: [AArch64] Implement movmem for the benefit of inline memcpy
- Authentication-results: sourceware.org; auth=none
- References: <1402044609-30902-1-git-send-email-james dot greenhalgh at arm dot com>
On Fri, Jun 6, 2014 at 1:50 AM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
>
> Hi,
>
> The move_by_pieces infrastructure performs a copy by repeatedly trying
> the largest safe copy it can make. So for a 15-byte copy we might see:
>
> offset amount bytes copied
> 0 8 0-7
> 8 4 8-11
> 12 2 12-13
> 14 1 14
>
> However, we can implement a 15-byte copy as so:
>
> offset amount bytes copied
> 0 8 0-7
> 7 8 7-14
>
> Which can prove more efficient for both space and speed.
>
> In this patch we set MOVE_RATIO low to avoid using move_by_pieces, and
> implement the movmem pattern name to expand small block copy cases. Note, this
> optimization does not apply for -mstrict-align targets, which must continue
> copying byte-by-byte.
>
> Setting MOVE_RATIO low in this way causes a few tests to begin failing,
> both of these are documented in the test-case as expected to fail for
> low MOVE_RATIO targets, which do not allow certain Tree-Level optimizations.
I think you should reevaluate setting MOVE_RATIO this low. It is used
for SRA and IPA-SRA which both are very useful; more useful than
memmove optimizations can do.
In fact this optimization is not even valid for volatile variables.
Here is a testcase for the volatile issue:
struct __attribute__((packed)) t15{
long long t8;
int t4;
short t2;
unsigned char t1;
};
volatile struct t15 t15;
int f(struct t15 *a)
{
t15 = *a;
}
Notice how we are writing to byte 7 twice to t15 in the outputted code.
Thanks,
Andrew Pinski
>
> Bootstrapped on aarch64-unknown-linux-gnu with no issues.
>
> OK for trunk?
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2014-06-06 James Greenhalgh <james.greenhalgh@arm.com>
>
> * config/aarch64/aarch64-protos.h (aarch64_expand_movmem): New.
> * config/aarch64/aarch64.c (aarch64_move_pointer): New.
> (aarch64_progress_pointer): Likewise.
> (aarch64_copy_one_part_and_move_pointers): Likewise.
> (aarch64_expand_movmen): Likewise.
> * config/aarch64/aarch64.h (MOVE_RATIO): Set low.
> * config/aarch64/aarch64.md (movmem<mode>): New.
>
> gcc/testsuite/
>
> 2014-06-06 James Greenhalgh <james.greenhalgh@arm.com>
>
> * gcc.dg/tree-ssa/pr42585.c: Skip for AArch64.
> * gcc.dg/tree-ssa/sra-12.c: Likewise.