
[PATCH][GCC][mid-end] Allow larger copies when not slow_unaligned_access and no padding.


Hi All,

This allows copy_blkmode_to_reg to perform larger copies when it is safe to do so:
the bitsize is recalculated on every iteration, picking the widest copy that does
not read more bits than are left to copy.

Strictly speaking, the larger copies are only done if:

  1. the target supports fast unaligned access, and
  2. no padding correction is needed.

This should avoid the issues of the first patch (PR85123) while still benefiting
targets on which the larger copies are safe.
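
The per-iteration selection amounts to choosing the widest copy that still fits
in the bits left to copy, capped at a word.  As a minimal sketch of the idea
(pick_bitsize is a made-up helper; the real implementation walks MODE_INT with
FOR_EACH_MODE_IN_CLASS, see the diff below):

	/* Sketch only: the widest power-of-two bitsize, capped at a word,
	   that does not read past the end of the structure.  Assumes the
	   target's integer modes come in power-of-two sizes.  */
	static unsigned int
	pick_bitsize (unsigned int bits_left)
	{
	  unsigned int size = BITS_PER_UNIT;
	  while (size * 2 <= bits_left && size * 2 <= BITS_PER_WORD)
	    size *= 2;
	  return size;
	}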

Original patch: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01088.html
Previous respin: https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00239.html


For the copy of a 3-byte structure this now produces:

fun3:
	adrp	x1, .LANCHOR0
	add	x1, x1, :lo12:.LANCHOR0
	mov	x0, 0
	sub	sp, sp, #16
	ldrh	w2, [x1, 16]
	ldrb	w1, [x1, 18]
	add	sp, sp, 16
	bfi	x0, x2, 0, 16
	bfi	x0, x1, 16, 8
	ret

whereas before it produced:

fun3:
	adrp	x0, .LANCHOR0
	add	x2, x0, :lo12:.LANCHOR0
	sub	sp, sp, #16
	ldrh	w1, [x0, #:lo12:.LANCHOR0]
	ldrb	w0, [x2, 2]
	strh	w1, [sp, 8]
	strb	w0, [sp, 10]
	ldr	w0, [sp, 8]
	add	sp, sp, 16
	ret
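
The testcase is not quoted in this message, but a structure along these lines
would exercise this path (a hypothetical reconstruction; the names are invented):

	/* 3 bytes, alignment 1, no padding.  */
	struct foo3 { char a, b, c; };
	struct foo3 glob3;

	struct foo3
	fun3 (void)
	{
	  /* A small BLKmode value returned in registers goes through
	     copy_blkmode_to_reg.  */
	  return glob3;
	}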

Cross-compiled and regtested on
  aarch64_be-none-elf
  armeb-none-eabi
with no issues.

Bootstrapped and regtested on
  aarch64-none-linux-gnu
  x86_64-pc-linux-gnu
  powerpc64-unknown-linux-gnu
  arm-none-linux-gnueabihf
with no issues.

OK for trunk?

Thanks,
Tamar

gcc/
2018-07-23  Tamar Christina  <tamar.christina@arm.com>

	* expr.c (copy_blkmode_to_reg): Perform larger copies when safe.

-- 
diff --git a/gcc/expr.c b/gcc/expr.c
index f665e187ebbbc7874ec88e84ca47ed991491c3e5..17b580aabf761491d8003ac74daa014bc252ea9f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -2763,6 +2763,7 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
   int i, n_regs;
   unsigned HOST_WIDE_INT bitpos, xbitpos, padding_correction = 0, bytes;
   unsigned int bitsize;
+  bool slow_unaligned_access;
   rtx *dst_words, dst, x, src_word = NULL_RTX, dst_word = NULL_RTX;
   /* No current ABI uses variable-sized modes to pass a BLKmode type.  */
   fixed_size_mode mode = as_a <fixed_size_mode> (mode_in);
@@ -2795,6 +2796,10 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
 
   n_regs = (bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
   dst_words = XALLOCAVEC (rtx, n_regs);
+
+  slow_unaligned_access
+    = targetm.slow_unaligned_access (word_mode, TYPE_ALIGN (TREE_TYPE (src)));
+
   bitsize = MIN (TYPE_ALIGN (TREE_TYPE (src)), BITS_PER_WORD);
 
   /* Copy the structure BITSIZE bits at a time.  */
@@ -2816,6 +2821,23 @@ copy_blkmode_to_reg (machine_mode mode_in, tree src)
 	  emit_move_insn (dst_word, CONST0_RTX (word_mode));
 	}
 
+
+      /* Find the largest integer mode that can be used to copy all or as
+	 many bits as possible of the structure if the target supports larger
+	 copies.  There are too many corner cases here w.r.t. alignments on
+	 the read/writes.  So if there is any padding just use single byte
+	 operations.  */
+      opt_scalar_int_mode mode_iter;
+      FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT)
+	if (padding_correction == 0
+	    && !slow_unaligned_access
+	    && GET_MODE_BITSIZE (mode_iter.require ())
+			    <= ((bytes * BITS_PER_UNIT) - bitpos)
+	    && GET_MODE_BITSIZE (mode_iter.require ()) <= BITS_PER_WORD)
+	  bitsize = GET_MODE_BITSIZE (mode_iter.require ());
+	else
+	  break;
+
       /* We need a new source operand each time bitpos is on a word
 	 boundary.  */
       if (bitpos % BITS_PER_WORD == 0)

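To make the new loop concrete, here is a hand trace for the 3-byte example above
(assuming MODE_INT is walked narrowest first, QImode/HImode/SImode/...):

	bitpos = 0,  24 bits left:  8 fits, 16 fits, 32 does not -> bitsize = 16
	bitpos = 16,  8 bits left:  8 fits, 16 does not          -> bitsize = 8

This yields exactly the ldrh/ldrb pair shown earlier.  When there is padding or
unaligned accesses are slow, the loop breaks on the first mode without touching
bitsize, so the old alignment-based behaviour is preserved.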
