This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

[PATCH][AARCH64] Emulating aligned mask loads on AArch64

From: Pawel Kupidura <pawel dot kupidura at arm dot com>
To: gcc-patches at gcc dot gnu dot org
Date: Fri, 18 Sep 2015 11:24:50 +0100
Subject: [PATCH][AARCH64] Emulating aligned mask loads on AArch64
Authentication-results: sourceware.org; auth=none

This patch uses max reductions to emulate aligned masked loads on AArch64.
It reduces the mask to a scalar that is nonzero if any mask element is true,
then uses that scalar to select between the real address and a scratchpad
address.

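The idea is that an aligned vector load cannot cross a page boundary and
so cannot partially fault: either every element is accessible or none is.
If any mask element is true, at least one element at the address must be
accessible, so it is safe to load the whole vector from the address and
use only some of the result.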
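
As a rough scalar illustration of the scheme described above (not the code
in the attached patch), the following C sketch models a 4-lane, 16-byte
vector.  The names LANES, scratchpad, mask_any and emulated_mask_load are
hypothetical, and zeroing the inactive lanes is an assumption made only so
the result is well defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4   /* assumed: four 32-bit lanes, i.e. one 16-byte vector */

/* Hypothetical scratchpad: always-mapped memory that is safe to read
   when no mask element is true.  */
static _Alignas(16) int32_t scratchpad[LANES];

/* Max reduction over the mask: the result is nonzero if and only if
   at least one mask element is true.  */
static uint32_t mask_any(const uint32_t mask[LANES])
{
    uint32_t m = 0;
    for (int i = 0; i < LANES; i++)
        if (mask[i] > m)
            m = mask[i];
    return m;
}

/* Emulate an aligned masked load of LANES elements from addr.  */
static void emulated_mask_load(int32_t dst[LANES], const int32_t *addr,
                               const uint32_t mask[LANES])
{
    /* The scheme relies on the access being aligned to the vector size,
       so that it cannot straddle a page boundary.  */
    assert((uintptr_t)addr % 16 == 0);

    /* Select the real address only if some lane is active.  */
    const int32_t *src = mask_any(mask) ? addr : scratchpad;

    /* Stands in for a single full-width vector load.  */
    int32_t tmp[LANES];
    memcpy(tmp, src, sizeof tmp);

    /* Keep the active lanes; inactive lanes are zeroed (an assumption).  */
    for (int i = 0; i < LANES; i++)
        dst[i] = mask[i] ? tmp[i] : 0;
}
```

When every mask element is false the real address is never dereferenced,
so it may point at unmapped memory without causing a fault.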

The patch provided a 15% speed improvement for simple microbenchmarks.

Several spec2k6 benchmarks were affected by the patch: 400.perlbench,
403.gcc, 436.cactusADM, 454.calculix and 464.h264.  However, the changes
had no measurable effect on performance.

Regression-tested on x86_64-linux-gnu, aarch64-linux-gnu and
arm-linux-gnueabi.


Thanks,
Pawel

Attachment: patch
Description: Text document

Follow-Ups:
- Re: [PATCH][AARCH64] Emulating aligned mask loads on AArch64
  - From: James Greenhalgh