This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH][GCC][mid-end] Fix PR85123 incorrect copies
- From: Tamar Christina <tamar dot christina at arm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: nd at arm dot com, law at redhat dot com, rguenther at suse dot de, ian at airs dot com, bergner at vnet dot ibm dot com, amodra at gmail dot com
- Date: Thu, 5 Apr 2018 13:29:06 +0100
- Subject: [PATCH][GCC][mid-end] Fix PR85123 incorrect copies
Hi All,
This patch fixes the code generation in copy_blkmode_to_reg by calculating the
bitsize per iteration: each iteration performs the largest copy allowed that does
not read more bits than remain to be copied.
This fixes the bad code generation reported and still produces better code in
most cases. For targets that don't support fast unaligned access, it falls back
to the old behaviour of copying at the type's alignment (the MIN of the
alignment and BITS_PER_WORD).
For the copy of a 3-byte structure this now produces:
fun3:
adrp x1, .LANCHOR0
add x1, x1, :lo12:.LANCHOR0
mov x0, 0
sub sp, sp, #16
ldrh w2, [x1, 16]
ldrb w1, [x1, 18]
add sp, sp, 16
bfi x0, x2, 0, 16
bfi x0, x1, 16, 8
ret
whereas before it produced:
fun3:
adrp x0, .LANCHOR0
add x2, x0, :lo12:.LANCHOR0
sub sp, sp, #16
ldrh w1, [x0, #:lo12:.LANCHOR0]
ldrb w0, [x2, 2]
strh w1, [sp, 8]
strb w0, [sp, 10]
ldr w0, [sp, 8]
add sp, sp, 16
ret
Cross-compiled on aarch64-none-elf with no issues.
Bootstrapped on powerpc64-unknown-linux-gnu, x86_64-pc-linux-gnu,
arm-none-linux-gnueabihf and aarch64-none-linux-gnu with no issues.
Regtested on aarch64-none-elf, x86_64-pc-linux-gnu, powerpc64-unknown-linux-gnu
and arm-none-linux-gnueabihf with no issues found.
Regression on powerpc (pr63594-2.c) is fixed now.
OK for trunk?
Thanks,
Tamar
gcc/
2018-04-05 Tamar Christina <tamar.christina@arm.com>
PR middle-end/85123
* expr.c (copy_blkmode_to_reg): Fix wrong code gen.
--
diff --git a/gcc/expr.c b/gcc/expr.c
index 00660293f72e5441a6421a280b04c57fca2922b8..7daeb8c91d758edf0b3dc37f6927380b6f3df877 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -2749,7 +2749,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
{
int i, n_regs;
unsigned HOST_WIDE_INT bitpos, xbitpos, padding_correction = 0, bytes;
- unsigned int bitsize;
+ unsigned int bitsize = 0;
rtx *dst_words, dst, x, src_word = NULL_RTX, dst_word = NULL_RTX;
/* No current ABI uses variable-sized modes to pass a BLKmode type. */
fixed_size_mode mode = as_a <fixed_size_mode> (mode_in);
@@ -2782,7 +2782,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
dst_words = XALLOCAVEC (rtx, n_regs);
- bitsize = BITS_PER_WORD;
+
if (targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE (src))))
bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
@@ -2791,6 +2791,17 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
bitpos < bytes * BITS_PER_UNIT;
bitpos += bitsize, xbitpos += bitsize)
{
+ /* Find the largest integer mode that can be used to copy all or as
+ many bits as possible of the structure. */
+ opt_scalar_int_mode mode_iter;
+ FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT)
+ if (GET_MODE_BITSIZE (mode_iter.require ())
+ <= ((bytes * BITS_PER_UNIT) - bitpos)
+ && GET_MODE_BITSIZE (mode_iter.require ()) <= BITS_PER_WORD)
+ bitsize = GET_MODE_BITSIZE (mode_iter.require ());
+ else
+ break;
+
/* We need a new destination pseudo each time xbitpos is
on a word boundary and when xbitpos == padding_correction
(the first time through). */