This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[spu, commit] Fix TImode->SFmode conversions (Re: Enable TImode vs. float conversion routines in libgcc)
- From: "Ulrich Weigand" <uweigand at de dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Fri, 17 Dec 2010 15:12:04 +0100 (CET)
- Subject: [spu, commit] Fix TImode->SFmode conversions (Re: Enable TImode vs. float conversion routines in libgcc)
Hello,

while the previous patch fixes TImode->DFmode conversions, it does not
provide correct TImode->SFmode conversions (as shown by the additional
testing suggested by Joseph Myers). There are two issues:
- There is an unfortunate interaction between LIB2_SIDITI_CONV_FUNCS and
LIB2FUNCS_EXCLUDE in the libgcc Makefile.in that causes breakage on SPU:
sifuncs := $(filter-out $(LIB2FUNCS_EXCLUDE),$(subst XX,si,$(swfloatfuncs)))
difuncs := $(filter-out $(LIB2FUNCS_EXCLUDE),$(subst XX,di,$(dwfloatfuncs)))
tifuncs := $(filter-out $(LIB2FUNCS_EXCLUDE),$(subst XX,ti,$(dwfloatfuncs)))
iter-items := $(sifuncs) $(difuncs) $(tifuncs)
iter-labels := $(sifuncs) $(difuncs) $(difuncs)
iter-sizes := $(patsubst %,4,$(sifuncs) $(difuncs)) $(patsubst %,8,$(tifuncs))
These three lists must be of the same length, which in particular implies
that $(difuncs) and $(tifuncs) must have the same length. However, it is
possible for LIB2FUNCS_EXCLUDE to remove some items from $(difuncs) while
the corresponding items are left in $(tifuncs). This happens on the SPU,
because the DImode->SFmode conversion routines are excluded.
- The TImode->SFmode routines have to be excluded as well, for the same
reason the DImode->SFmode routines are excluded: the default
implementation in libgcc assumes round-to-nearest, while on the SPU all
single-precision floating-point operations must use round-to-zero.
Therefore the patch below implements TImode->SFmode conversions using
round-to-zero inline in the compiler, following the existing
DImode->SFmode model. As a result, there is no requirement at this point
to address the libgcc Makefile.in issue.
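
To illustrate the rounding difference described above, here is a minimal
host-side sketch in C. It is not part of the patch; the helper name
u64_to_float_rtz is hypothetical, and it only models truncation toward
zero for an unsigned 64-bit input by discarding every bit below the top
24 before converting.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): convert x to float while
   truncating toward zero, as SPU single-precision arithmetic does.  */
float u64_to_float_rtz(uint64_t x)
{
    if (x == 0)
        return 0.0f;

    /* Find the index of the highest set bit.  */
    int top = 63;
    while (!(x >> top))
        top--;

    /* A float carries 24 significant bits; clear everything below them
       so that the conversion that follows is exact.  */
    if (top > 23)
        x &= ~((UINT64_C(1) << (top - 23)) - 1);

    return (float)x;
}
```

For the input 0x1FFFFFF (2^25 - 1), a plain cast on a host running in the
default round-to-nearest mode gives 33554432.0, whereas the helper above
returns 33554430.0. A mismatch of this kind is why the generic libgcc
routines are not suitable for the SPU.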
Tested on spu-elf (including the additional testcases enabled as pointed
out by Joseph Myers) with no regressions.
Committed to mainline.
Bye,
Ulrich
ChangeLog:
* config/spu/t-spu-elf (LIB2FUNCS_EXCLUDE): Add _floattisf and
_floatunstisf.
* config/spu/spu.md ("floattisf2"): New expander.
("floatunstisf2"): New insn pattern and splitter.
("cgt_ti_m1"): New insn pattern.
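
As a reading aid for the new patterns listed above, the data flow can be
sketched in portable C. This is a rough model, not the generated code:
the function names are hypothetical, the 128-bit value is assumed to be
passed as four 32-bit words with w[0] most significant, and a plain
multiply-add stands in for the fused multiply-add (fmasf4) used on the
SPU. On a host with round-to-nearest, intermediate rounding can differ
from the SPU's round-to-zero behaviour for inputs that are not exactly
representable.

```c
#include <stdint.h>
#include <string.h>

/* Model of the unsigned case: convert each 32-bit word to float and
   combine the four results with scale 2^32, most significant first.  */
float u128_words_to_float(const uint32_t w[4])
{
    const float scale = 4294967296.0f;   /* 2^32 */
    float acc = (float)w[0];
    for (int i = 1; i < 4; i++)
        acc = acc * scale + (float)w[i]; /* SPU: fused multiply-add */
    return acc;
}

/* Model of the signed case: take the magnitude, convert it as an
   unsigned value, then OR the sign bit into the float's bit pattern.  */
float s128_words_to_float(const uint32_t w[4])
{
    uint32_t negative = w[0] >> 31;
    uint32_t mag[4];
    uint32_t carry = negative;           /* two's complement: ~x + 1 */

    for (int i = 3; i >= 0; i--) {
        uint32_t v = negative ? ~w[i] : w[i];
        mag[i] = v + carry;
        carry = (carry && mag[i] == 0);  /* carry out only on wrap */
    }

    float f = u128_words_to_float(mag);
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits |= negative << 31;              /* set the sign for negatives */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With all four words set to 0xFFFFFFFF (the value -1), the magnitude is 1
and the signed helper returns -1.0.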
Index: gcc/config/spu/t-spu-elf
===================================================================
*** gcc/config/spu/t-spu-elf (revision 167950)
--- gcc/config/spu/t-spu-elf (working copy)
*************** TARGET_LIBGCC2_CFLAGS = -fPIC -mwarn-rel
*** 26,33 ****
# We exclude those because the libgcc2.c default versions do not support
# the SPU single-precision format (round towards zero). We provide our
! # own versions below.
! LIB2FUNCS_EXCLUDE = _floatdisf _floatundisf
# We provide our own version of __divdf3 that performs better and has
# better support for non-default rounding modes.
--- 26,33 ----
# We exclude those because the libgcc2.c default versions do not support
# the SPU single-precision format (round towards zero). We provide our
! # own versions below and/or via direct expansion.
! LIB2FUNCS_EXCLUDE = _floatdisf _floatundisf _floattisf _floatunstisf
# We provide our own version of __divdf3 that performs better and has
# better support for non-default rounding modes.
Index: gcc/config/spu/spu.md
===================================================================
*** gcc/config/spu/spu.md (revision 167950)
--- gcc/config/spu/spu.md (working copy)
***************
*** 753,758 ****
--- 753,825 ----
DONE;
})
+ (define_expand "floattisf2"
+ [(set (match_operand:SF 0 "register_operand" "")
+ (float:SF (match_operand:TI 1 "register_operand" "")))]
+ ""
+ {
+ rtx c0 = gen_reg_rtx (SImode);
+ rtx r0 = gen_reg_rtx (TImode);
+ rtx r1 = gen_reg_rtx (SFmode);
+ rtx r2 = gen_reg_rtx (SImode);
+ rtx setneg = gen_reg_rtx (SImode);
+ rtx isneg = gen_reg_rtx (SImode);
+ rtx neg = gen_reg_rtx (TImode);
+ rtx mask = gen_reg_rtx (TImode);
+
+ emit_move_insn (c0, GEN_INT (-0x80000000ll));
+
+ emit_insn (gen_negti2 (neg, operands[1]));
+ emit_insn (gen_cgt_ti_m1 (isneg, operands[1]));
+ emit_insn (gen_extend_compare (mask, isneg));
+ emit_insn (gen_selb (r0, neg, operands[1], mask));
+ emit_insn (gen_andc_si (setneg, c0, isneg));
+
+ emit_insn (gen_floatunstisf2 (r1, r0));
+
+ emit_insn (gen_iorsi3 (r2, gen_rtx_SUBREG (SImode, r1, 0), setneg));
+ emit_move_insn (operands[0], gen_rtx_SUBREG (SFmode, r2, 0));
+ DONE;
+ })
+
+ (define_insn_and_split "floatunstisf2"
+ [(set (match_operand:SF 0 "register_operand" "=r")
+ (unsigned_float:SF (match_operand:TI 1 "register_operand" "r")))
+ (clobber (match_scratch:SF 2 "=r"))
+ (clobber (match_scratch:SF 3 "=r"))
+ (clobber (match_scratch:SF 4 "=r"))]
+ ""
+ "#"
+ "reload_completed"
+ [(set (match_dup:SF 0)
+ (unsigned_float:SF (match_dup:TI 1)))]
+ {
+ rtx op1_v4si = gen_rtx_REG (V4SImode, REGNO (operands[1]));
+ rtx op2_v4sf = gen_rtx_REG (V4SFmode, REGNO (operands[2]));
+ rtx op2_ti = gen_rtx_REG (TImode, REGNO (operands[2]));
+ rtx op3_ti = gen_rtx_REG (TImode, REGNO (operands[3]));
+
+ REAL_VALUE_TYPE scale;
+ real_2expN (&scale, 32, SFmode);
+
+ emit_insn (gen_floatunsv4siv4sf2 (op2_v4sf, op1_v4si));
+ emit_insn (gen_shlqby_ti (op3_ti, op2_ti, GEN_INT (4)));
+
+ emit_move_insn (operands[4],
+ CONST_DOUBLE_FROM_REAL_VALUE (scale, SFmode));
+ emit_insn (gen_fmasf4 (operands[2],
+ operands[2], operands[4], operands[3]));
+
+ emit_insn (gen_shlqby_ti (op3_ti, op3_ti, GEN_INT (4)));
+ emit_insn (gen_fmasf4 (operands[2],
+ operands[2], operands[4], operands[3]));
+
+ emit_insn (gen_shlqby_ti (op3_ti, op3_ti, GEN_INT (4)));
+ emit_insn (gen_fmasf4 (operands[0],
+ operands[2], operands[4], operands[3]));
+ DONE;
+ })
+
;; Do (double)(operands[1]+0x80000000u)-(double)0x80000000
(define_expand "floatsidf2"
[(set (match_operand:DF 0 "register_operand" "")
***************
*** 3218,3223 ****
--- 3285,3297 ----
DONE;
})
+ (define_insn "cgt_ti_m1"
+ [(set (match_operand:SI 0 "spu_reg_operand" "=r")
+ (gt:SI (match_operand:TI 1 "spu_reg_operand" "r")
+ (const_int -1)))]
+ ""
+ "cgti\t%0,%1,-1")
+
(define_insn "cgt_ti"
[(set (match_operand:SI 0 "spu_reg_operand" "=r")
(gt:SI (match_operand:TI 1 "spu_reg_operand" "r")
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com