This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RFC: allowing fwprop to propagate subregs


At the moment, fwprop will propagate constants and registers
even if no further rtl simplifications are possible:

  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
    flags |= PR_CAN_APPEAR;

What do you think about extending this to subregs?  The reason for
asking is that on NEON, vector loads like vld4 are represented as a load
of a single monolithic register followed by subreg extractions of each
vector:

  (set (reg:OI FULL) (...))
  (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0))
  (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 8))
  (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 16))
  (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 24))

Nothing ever propagates these subregs, so the separate moves
survive until IRA.  This has three problems:

  - We generally want the registers allocated to V0...V3 to be the same
    as FULL, so that the four subreg moves become nops.  This often
    happens in simple examples.  But when register pressure is relatively
    high, the moves can cause IRA to spill in cases where it would not
    spill if the subregs were used directly in place of each Vi.

  - Perhaps related, register pressure becomes harder to estimate.

  - These moves can interfere with pre-reload scheduling.
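
To make the effect on the four moves concrete, here is a toy C model of
the decision that sets PR_CAN_APPEAR.  It is only a sketch: every toy_
name is hypothetical, nothing here is GCC's real rtl API, and it assumes
(optimistically) that a move becomes dead as soon as its source may be
propagated into every use.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for rtx codes; these are NOT GCC's definitions.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_SUBREG, TOY_PLUS };

/* Model of the test that sets PR_CAN_APPEAR: may a source with this
   code replace a use even when no further simplification results?
   ALLOW_SUBREG selects the behaviour proposed in this message.  */
static int
toy_can_appear (enum toy_code code, int allow_subreg)
{
  return (code == TOY_REG
          || code == TOY_CONST
          || (allow_subreg && code == TOY_SUBREG));
}

/* Count how many of the N_MOVES moves survive, assuming a move is
   deleted once its source is allowed to replace every use.  */
static size_t
toy_surviving_moves (const enum toy_code *src_codes, size_t n_moves,
                     int allow_subreg)
{
  size_t i, surviving = 0;

  for (i = 0; i < n_moves; i++)
    if (!toy_can_appear (src_codes[i], allow_subreg))
      surviving++;
  return surviving;
}

/* The sources of the four Vi moves in the vld4 example are subregs.  */
static const enum toy_code toy_vld4_srcs[4]
  = { TOY_SUBREG, TOY_SUBREG, TOY_SUBREG, TOY_SUBREG };

/* Self-check: all four moves survive without the change, none with it.  */
static void
toy_self_check (void)
{
  assert (toy_surviving_moves (toy_vld4_srcs, 4, 0) == 4);
  assert (toy_surviving_moves (toy_vld4_srcs, 4, 1) == 0);
}
```

Calling toy_self_check () from a test driver exercises both cases.  The
real pass only deletes a move once all of its uses have actually been
replaced, so this is an upper bound on what the change can remove.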

In combination with the MODES_TIEABLE_P patch that I posted here:

    http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html

this patch significantly improves the code generated for several libav
loops.  Unfortunately, I don't have a setup that can do meaningful
x86_64 performance measurements, but a diff of the before and after
output for libav showed many cases where the patch removed moves.

What do you think?  Alternatives include propagating in lower-subreg,
or maybe only in the second fwprop pass.

Richard


gcc/
	* fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs.

Index: gcc/fwprop.c
===================================================================
--- gcc/fwprop.c	2011-08-26 09:58:28.829540497 +0100
+++ gcc/fwprop.c	2011-08-26 10:14:03.767707504 +0100
@@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode 
     return NULL_RTX;
 
   flags = 0;
-  if (REG_P (new_rtx) || CONSTANT_P (new_rtx))
+  if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == SUBREG)
     flags |= PR_CAN_APPEAR;
   if (!for_each_rtx (&new_rtx, varying_mem_p, NULL))
     flags |= PR_HANDLE_MEM;

