This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
[PR 16815] RFA: Fix a BLOCK_REG_PADDING check

From: Richard Sandiford <rsandifo at redhat dot com>
To: gcc-patches at gcc dot gnu dot org
Date: Fri, 17 Sep 2004 07:55:26 +0100
Subject: [PR 16815] RFA: Fix a BLOCK_REG_PADDING check
This patch fixes PR target/16815, which complains about very inefficient
handling of floating-point arguments in big-endian n32 and n64 MIPS code.
It also fixes gcc.dg/mips-movcc-3.c for mips-sgi-irix6.5.

Although this patch does fix a testsuite failure, there's no wrong code
as such, so I suppose it might not qualify as a true bugfix.  However,
the current output is a significant code quality regression from gcc 3.3
and earlier, so I hope the patch is still acceptable for stage 3.

The problem is that, in big-endian n32 and n64 code, float arguments
are always being forced onto the stack.  For example:

    float
    sub (float f, float g)
    {
      return f + g;
    }

gives the horrendous:

        addiu   $sp,$sp,-32
        swc1    $f12,0($sp)
        swc1    $f13,16($sp)
        lwc1    $f1,0($sp)
        lwc1    $f0,16($sp)
        addiu   $sp,$sp,32
        j       $31
        add.s   $f0,$f1,$f0

when compiled with "-mabi=n32 -EB -O2".  The problem lies in the
BLOCK_REG_PADDING check in function.c:assign_parm_setup_block_p():

-------------------------------------------------------------------------
/* A subroutine of assign_parms.  Return true if the current parameter
   should be stored as a BLKmode in the current frame.  */

static bool
assign_parm_setup_block_p (struct assign_parm_data_one *data)
{
  if (data->nominal_mode == BLKmode)
    return true;
  if (GET_CODE (data->entry_parm) == PARALLEL)
    return true;

#ifdef BLOCK_REG_PADDING
  if (data->locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
      && GET_MODE_SIZE (data->promoted_mode) < UNITS_PER_WORD)
    return true;
#endif

  return false;
}
-------------------------------------------------------------------------

This check is looking for register arguments that are padded at the
least significant end.  At the moment, such arguments do need to be
handled by assign_parm_setup_block, but the check is too general in
two respects:

  (1) It doesn't check whether the argument is actually passed in a
      register.  An argument that is:

        (a) passed on the stack;
        (b) less than a word in size; and
        (c) padded at the least significant end

      will be caught by the check and thus will never be assigned a
      pseudo register.  This seems unnecessary, since the padding of any
      such argument is controlled entirely by FUNCTION_ARG_PADDING and
      will be handled in the same way as it would have been on a
      non-B_R_P target.

  (2) It doesn't depend on the value of BLOCK_REG_PADDING.  (Note that
      where_pad is the padding provided by FUNCTION_ARG_PADDING, not
      BLOCK_REG_PADDING.)

      This is the crux of the problem.  With big-endian n32 and n64,
      float arguments are passed naturally in registers but are
      left-justified when passed on the stack.  Thus FUNCTION_ARG_PADDING
      says "upward" but BLOCK_REG_PADDING says "downward".

      Now this combination _is_ handled correctly by
      assign_parm_setup_block, but it simply does a normal store of the
      register into the stack slot.  This stack slot then becomes the
      PARM_DECL's DECL_RTL.  Since no special handling is required,
      it would be much better to use a pseudo register for the PARM_DECL
      where appropriate.

The patch below fixes both (1) and (2).

FWIW, I wondered about simply shifting the affected registers right so
that they are padded in the way gcc expects, but it wasn't always a win.
It would sometimes lead to spill code like:

        dsrl    reg,reg,56
        sb      reg,stack

instead of simply:

        sd      reg,stack

It also felt less like stage 3 work.

Two other targets use BLOCK_REG_PADDING: powerpc and hppa.  Both powerpc
definitions are:

  #define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
    (!(FIRST) ? upward : FUNCTION_ARG_PADDING (MODE, TYPE))

so changing the where_pad check to use BLOCK_REG_PADDING is a no-op
in this case.  The pa.h version is more problematic:

  #define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
    (TARGET_64BIT ? upward : downward)

This seems to be a simplification of function_arg_padding that relies
on the fact that BLOCK_REG_PADDING is only used for aggregates.   At the
moment, that's true, but since non-aggregate subword values are padded
downward if TARGET_64BIT, it seems a little dangerous to rely on that.
(FWIW, BLOCK_REG_PADDING is defined in terms of block moves, but it
doesn't say what kind of arguments will and will not use a block move.)

So there seemed to be two choices:

   (a) In assign_parm_setup_block_p, check BLOCK_REG_PADDING as well as,
       not instead of, where_pad.  Only return true if both specify
       padding at the least significant end.

       This fixes the MIPS problem without changing the gcc.c-torture
       or gcc.dg assembly output for either powerpc64-linux-gnu or
       hppa64-linux-gnu.  But it doesn't feel right conceptually.
       It also fails to to handle the pathological (hopefully non-existent)
       case of an argument that is padded at the least significant end in
       registers and at the most significant end on the stack.

   (b) Make hppa's definition use function_arg_padding, just like the
       powerpc and MIPS versions do.

I went for (b) here.  I compared the gcc.c-torture and gcc.dg output
of powerpc64-linux-gnu and hppa64-linux-gnu before and after the patch
(including subdirectories like gcc.dg/compat).  There were no changes.
The patch has also been bootstrapped & regression tested on mips-sgi-irix6.5,
where it fixes the mips-movcc-3.c test and introduced no new failures.
OK to install?

Richard


	PR target/16815
	* function.c (assign_parm_setup_block_p): Tighten BLOCK_REG_PADDING
	check.
	* config/pa/pa.h (BLOCK_REG_PADDING): Define in terms of
	function_arg_padding.

Index: function.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/function.c,v
retrieving revision 1.575
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.575 function.c
*** function.c	10 Sep 2004 00:50:24 -0000	1.575
--- function.c	17 Sep 2004 06:04:03 -0000
*************** assign_parm_setup_block_p (struct assign
*** 2513,2520 ****
      return true;
  
  #ifdef BLOCK_REG_PADDING
!   if (data->locate.where_pad == (BYTES_BIG_ENDIAN ? upward : downward)
!       && GET_MODE_SIZE (data->promoted_mode) < UNITS_PER_WORD)
      return true;
  #endif
  
--- 2513,2524 ----
      return true;
  
  #ifdef BLOCK_REG_PADDING
!   /* Only assign_parm_setup_block knows how to deal with register arguments
!      that are padded at the least significant end.  */
!   if (REG_P (data->entry_parm)
!       && GET_MODE_SIZE (data->promoted_mode) < UNITS_PER_WORD
!       && (BLOCK_REG_PADDING (data->passed_mode, data->passed_type, 1)
! 	  == (BYTES_BIG_ENDIAN ? upward : downward)))
      return true;
  #endif
  
Index: config/pa/pa.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/pa/pa.h,v
retrieving revision 1.234
diff -c -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.234 pa.h
*** config/pa/pa.h	8 Sep 2004 18:45:16 -0000	1.234
--- config/pa/pa.h	17 Sep 2004 06:04:05 -0000
*************** #define FUNCTION_ARG_PADDING(MODE, TYPE)
*** 918,924 ****
     We use a DImode register in the parallel for 5 to 7 byte structures
     so that there is only one element.  This allows the object to be
     correctly padded.  */
! #define BLOCK_REG_PADDING(MODE, TYPE, FIRST) (TARGET_64BIT ? upward : downward)
  
  /* Do not expect to understand this without reading it several times.  I'm
     tempted to try and simply it, but I worry about breaking something.  */
--- 918,925 ----
     We use a DImode register in the parallel for 5 to 7 byte structures
     so that there is only one element.  This allows the object to be
     correctly padded.  */
! #define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
!   function_arg_padding ((MODE), (TYPE))
  
  /* Do not expect to understand this without reading it several times.  I'm
     tempted to try and simply it, but I worry about breaking something.  */
Follow-Ups:
- Unreviewed patch: [PR 16815] RFA: Fix a BLOCK_REG_PADDING check
  - From: Richard Sandiford
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]