This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Support Cell SPU floating point


Richard,

Thanks for the review.

Can you please give me a little more advice on how to implement the
appropriate bits in real.c.  Even though denorms are generally treated
as zero on SPU, they are a legitimate number on the hardware.  They can
be generated by double precision instructions, and should be encoded
correctly when specified as literals.  Trying to model the hardwares
behavior only in real_convert isn't sufficient.

Do you think adding a mode argument to the appropriate routines in
real.c (e.g., real_arithmetic, real_compare) is the correct approach?
That would allow me to model the hardware correctly.  Or do you have
another suggestion?

I'm ok with the name HONOR_DENORMS, but it has a slightly different
usage than the other HONOR_* macros.  The existing macros are used to
mean "when this macro is true it IS NOT safe to do this optimization".
HONOR_DENORMS will mean "when this macro is true it IS safe to do this
optimization."

ROUND_TO_ZERO is used in fp-bit.c and so cannot use a mode argument.
LARGEST_EXPONENT_IS_NORMAL is also used in fp-bit.c and uses size for an
argument, which is why I used the same approach.

I'll make the other changes you point out.

Trevor

* Richard Guenther <richard.guenther@gmail.com> [2007-09-01 13:10]:
> On 8/29/07, trevor_smigiel@playstation.sony.com
> <trevor_smigiel@playstation.sony.com> wrote:
> > This patch does three things:
> >
> > 1) Add Cell SPU extended single precision format to real.c
> > 2) Define macros for constant folding of denormalized numbers
> > 3) Define macros to prevent certain optimizations on denorms
> >
> > The latter 2 parts are necessary to get proper constant folding and
> > optimization on SPU which does not fully support IEEE floats.  A comment
> > in real.c describes the differences:
> >
> > /*  SPU Single Precision (Extended-Range Mode) format is the same as IEEE
> >     single precision with the following differences:
> >       - infinities are not supported.  Instead MAX_FLOAT or MIN_FLOAT are
> >         generated
> >       - NaNs are not supported
> >       - the range of non-zero numbers in binary is
> >          (001)[1.]000...000 to (255)[1.]111...111
> >       - denormals can be represented, but are treated as +0.0 when
> >         used as an operand and are never generated as a result
> >       - -0.0 can be represented but a zero result is always +0.0
> >       - the only supported rounding mode is trunction (towards zero)
> >
> >     SPU Double Precision format is the same as IEEE with the following
> >     differences:
> >       - denorms can be generated as a result, but are treated as zero
> >         when used as an operand
> >
> >     When converting between float and double, Infinity and NaN patterns
> >     are honored.  For example, 0x7f800000 will convert to
> >     0x7ff0000000000000, not 0x47f0000000000000;  and 0x7ff0000000000000
> >     converts to 0x7f800000, not 0x7fffffff.  */
> >
> > A couple of debatable choices are: (of course, it is all debatable)
> >
> > - the name STANDARD_DENORM.  Its purpose is to prevent optimization,
> > perhaps it would be better named as SAFE_FOR_DENORM, as in "this
> > optimization is safe even if an operand is a denorm."  There could be a
> > similar macro for negative zero, but handling denorms handles all the -0
> > cases too.  In fact, our internal implementation uses only
> > SAFE_FOR_MINUS_ZERO.
> >
> > - converting denorms to zero in const-fold.c and simplify-rtx.c.  It
> > could be done in real.c as part of the format.  The arithmetic in real.c
> > is modeless, and I didn't think it was appropriate to add mode arguments
> > to most of the functions or to the REAL_VALUE structure.  Compiler
> > internal computations might not want real.c to operate in a format
> > specific manner.   The advantage of doing it in real.c is that it is
> > only done in one place.  In our internal implementation, we did pass a
> > mode argument to most functions in real.c.
> >
> > The patch was bootstrapped on x86 by Andrew and had no new failures.
> >
> > The testsuite results on SPU do change quite a bit, as expected.  Some
> > failures become passes, and some passes become failures.  These do need
> > to be handled, and I hope to get some help doing so after the basic
> > patch is accepted.
> 
>  @itemize @bullet
>  @item
> -@code{MODE_HAS_SIGN_DEPENDENT_ROUNDING} will be false for all modes.
> +@code{MODE_HAS_SIGN_DEPENDENT_ROUNDING} will be false for
> +@var{size}-bit modes.
> 
> As this is named MODE_HAS_SIGN... it should be false for a mode
> specified, not a size.  Also the docs miss the now required mode
> argument.
> 
> Likewise ROUND_TOWARDS_ZERO should take a mode argument.
> 
> 
> +/* True when the given mode fully supports denormalized numbers. */
> +#define STANDARD_DENORMS(MODE) \
> +  (!DENORM_OPERANDS_ARE_ZERO (MODE) && !DENORM_RESULTS_ARE_ZERO (MODE))
> 
> HONOR_DENORMALS (MODE)
> 
> to match other macros.
> 
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c    (revision 127841)
> +++ gcc/fold-const.c    (working copy)
> @@ -1868,6 +1868,15 @@ const_binop (enum tree_code code, tree a
>        else if (REAL_VALUE_ISNAN (d2))
>         return arg2;
> 
> +      /* Some formats accept denorms but treat them as zero. */
> +      if (DENORM_OPERANDS_ARE_ZERO (mode))
> +       {
> +         if (real_isdenorm (mode, &d1))
> +           d1 = dconst0;
> +         if (real_isdenorm (mode, &d2))
> +           d2 = dconst0;
> +       }
> +
> 
> I think this should be done in real_arithmetic.
> 
>        inexact = real_arithmetic (&value, code, &d1, &d2);
>        real_convert (&result, mode, &value);
> 
> @@ -1890,6 +1899,11 @@ const_binop (enum tree_code code, tree a
>           && (inexact || !real_identical (&result, &value)))
>         return NULL_TREE;
> 
> +      /* Some formats always generate +0.0 for denorms and -0.0. */
> +      if ((DENORM_RESULTS_ARE_ZERO (mode) && real_isdenorm (mode, &result))
> +         || (ZERO_RESULTS_ARE_POSITIVE (mode) && real_isnegzero (&result)))
> +       result = dconst0;
> +
> 
> Likewise.
> 
>        t = build_real (type, result);
> 
> 
> @@ -2269,9 +2283,24 @@ static tree
>  fold_convert_const_real_from_real (tree type, tree arg1)
>  {
>    REAL_VALUE_TYPE value;
> +  REAL_VALUE_TYPE *r = &TREE_REAL_CST (arg1);
> +  REAL_VALUE_TYPE *r = &TREE_REAL_CST (arg1);
>    tree t;
> +  enum machine_mode to_mode = TYPE_MODE (type);
> +  enum machine_mode from_mode = TYPE_MODE (TREE_TYPE (arg1));
> +
> +  if (DENORM_OPERANDS_ARE_ZERO (from_mode) && real_isdenorm (from_mode, r))
> +    r = &dconst0;
> +
> +  /* Even though the native format does not represent IEEE exactly,
> +     treat the bit representation as if it were IEEE when converting. */
> +  if (REAL_CONVERT_AS_IEEE (to_mode, from_mode))
> +    real_convert_ieee (&value, to_mode, r, from_mode);
> +  else
> +    real_convert (&value, to_mode, r);
> +
> +  if (DENORM_RESULTS_ARE_ZERO (to_mode) && real_isdenorm (to_mode, &value))
> +    value = dconst0;
> 
> -  real_convert (&value, TYPE_MODE (type), &TREE_REAL_CST (arg1));
>    t = build_real (type, value);
> 
>    TREE_OVERFLOW (t) = TREE_OVERFLOW (arg1);
> @@ -6476,6 +6505,10 @@ fold_real_zero_addition_p (tree type, tr
>    if (!real_zerop (addend))
>      return false;
> 
> +  /* Don't allow the fold with non-standard denorms.  */
> +  if (!STANDARD_DENORMS (TYPE_MODE (type)))
> +    return false;
> +
>    /* Don't allow the fold with -fsignaling-nans.  */
>    if (HONOR_SNANS (TYPE_MODE (type)))
>      return false;
> @@ -10289,11 +10322,13 @@ fold_binary (enum tree_code code, tree t
>             return omit_one_operand (type, arg1, arg0);
>           /* In IEEE floating point, x*1 is not equivalent to x for snans.  */
>           if (!HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
> +             && STANDARD_DENORMS (TYPE_MODE (TREE_TYPE (arg0)))
>               && real_onep (arg1))
>             return non_lvalue (fold_convert (type, arg0));
> 
>           /* Transform x * -1.0 into -x.  */
>           if (!HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
> +             && STANDARD_DENORMS (TYPE_MODE (TREE_TYPE (arg0)))
>               && real_minus_onep (arg1))
>             return fold_convert (type, negate_expr (arg0));
> 
> @@ -10919,6 +10954,7 @@ fold_binary (enum tree_code code, tree t
>        if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg0))
>           && ! HONOR_NANS (TYPE_MODE (TREE_TYPE (arg0)))
>           && ! HONOR_INFINITIES (TYPE_MODE (TREE_TYPE (arg0)))
> +         && STANDARD_DENORMS (TYPE_MODE (TREE_TYPE (arg0)))
>           && operand_equal_p (arg0, arg1, 0))
>         {
>           tree r = build_real (TREE_TYPE (arg0), dconst1);
> @@ -10952,11 +10988,13 @@ fold_binary (enum tree_code code, tree t
> 
>        /* In IEEE floating point, x/1 is not equivalent to x for snans.  */
>        if (!HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
> +         && STANDARD_DENORMS (TYPE_MODE (TREE_TYPE (arg0)))
>           && real_onep (arg1))
>         return non_lvalue (fold_convert (type, arg0));
> 
>        /* In IEEE floating point, x/-1 is not equivalent to -x for snans.  */
>        if (!HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
> +         && STANDARD_DENORMS (TYPE_MODE (TREE_TYPE (arg0)))
>           && real_minus_onep (arg1))
>         return non_lvalue (fold_convert (type, negate_expr (arg0)));
> 
> @@ -14591,6 +14629,15 @@ fold_relational_const (enum tree_code co
>           return constant_boolean_node (result, type);
>         }
> 
> +      /* Some formats accept denorms but treat them as zero. */
> +      if (DENORM_OPERANDS_ARE_ZERO (TYPE_MODE (TREE_TYPE (op0))))
> +       {
> +         if (real_isdenorm (TYPE_MODE (TREE_TYPE (op0)), c0))
> +           c0 = &dconst0;
> +         if (real_isdenorm (TYPE_MODE (TREE_TYPE (op0)), c1))
> +           c1 = &dconst0;
> +       }
> +
>        return constant_boolean_node (real_compare (code, c0, c1), type);
>      }
> 
> 
> I don't like doing all the special-fixups in fold-const.c.  Please
> move them to appropriate places in real.c instead.
> 
> 
> Index: gcc/real.c
> ===================================================================
> --- gcc/real.c  (revision 127841)
> +++ gcc/real.c  (working copy)
> @@ -1186,6 +1186,15 @@ real_isnegzero (const REAL_VALUE_TYPE *r
>    return r->sign && r->cl == rvc_zero;
>  }
> 
> +/* Determine whether a floating-point value X is a denorm for MODE. */
> +
> +bool
> +real_isdenorm (enum machine_mode mode, const REAL_VALUE_TYPE * r)
> +{
> +  const struct real_format *fmt = REAL_MODE_FORMAT (mode);
> +  return fmt->has_denorm && r->cl == rvc_normal && REAL_EXP (r) < fmt->emin;
> +}
> +
>  /* Compare two floating-point objects for bitwise identity.  */
> 
>  bool
> @@ -2400,41 +2409,47 @@ round_for_format (const struct real_form
>         }
>      }
> 
> -  /* There are P2 true significand bits, followed by one guard bit,
> -     followed by one sticky bit, followed by stuff.  Fold nonzero
> -     stuff into the sticky bit.  */
> -
> -  sticky = 0;
> -  for (i = 0, w = (np2 - 1) / HOST_BITS_PER_LONG; i < w; ++i)
> -    sticky |= r->sig[i];
> -  sticky |=
> -    r->sig[w] & (((unsigned long)1 << ((np2 - 1) % HOST_BITS_PER_LONG)) - 1);
> +  if (!fmt->round_towards_zero)
> +    {
> 
> 
> Uh, can you please use context diffs?  They are easier to review.
> 
> Thanks,
> Richard.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]