[PATCH,rs6000] Add builtins for accessing the FPSCR
Segher Boessenkool
segher@kernel.crashing.org
Fri Aug 17 21:34:00 GMT 2018
Hi Carl,
On Fri, Aug 17, 2018 at 11:46:06AM -0700, Carl Love wrote:
> > In addition to listing
> > the builtin, I added a C style comment to describe the builtin a
> > little.  I don't see any of the other builtins documented like this.Â
> > But I felt some explanation of the builtins were
> > helpful.  Suggestions
> > on a better way to add the comments on the builtins would be
> > appreciated.
I think this is fine.
> * config/rs6000/rs6000-builtin.def: Add definitions for __builtin_mffsl,
> __builtin_mtfsb0, __builtin_mtfsb1, __builtin_set_fpscr_rn,
> __builtin_set_fpscr_drn.
* config/rs6000/rs6000-builtin.def (__builtin_mffsl): New.
(__builtin_mtfsb0): New.
(__builtin_mtfsb1): New.
(__builtin_set_fpscr_rn): New.
(__builtin_set_fpscr_drn): New.
or
* config/rs6000/rs6000-builtin.def (__builtin_mffsl, __builtin_mtfsb0,
__builtin_mtfsb1, __builtin_set_fpscr_rn, __builtin_set_fpscr_drn): New.
> * config/rs6000.c: Add functions rs6000_expand_mtfsb0_mtfsb1_builtin,
> rs6000_expand_set_fpscr_rn_builtin, rs6000_expand_set_fpscr_drn_builtin.
Same here (and further on).
> Add case statement entries for the new builtins.
To what function(s)?
> * testsuite/gcc.target/powerpc/test_mffsl-p9.c: New file.
> * testsuite/gcc.target/powerpc/test_fpscr_builtins.c: New file.
> * testsuite/gcc.target/powerpc/test_fpscr_builtins_error.c: New file.
testsuite/ has its own changelog. Entries in there do not include
"testsuite/".
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -2486,11 +2486,34 @@ BU_SPECIAL_X (RS6000_BUILTIN_MFTB, "__builtin_ppc_mftb",
> BU_SPECIAL_X (RS6000_BUILTIN_MFFS, "__builtin_mffs",
> RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
>
> +BU_SPECIAL_X (RS6000_BUILTIN_MFFSL, "__builtin_mffsl",
> + RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
Should this be RS6000_BTM_MISC_P9 (or similar) instead? Same for the other
ISA 3.0 ops.
> +static rtx
> +rs6000_expand_mtfsb0_mtfsb1_builtin (enum insn_code icode, tree exp)
I'd call this rs6000_expand_mtfsb_builtin, but please use which you think
is clearest.
> + /* Only allow bit numbers 0 to 31. */
> + if (GET_CODE (op0) != CONST_INT || INTVAL (op0) < 0 || INTVAL (op0) > 31)
if (!u5bit_cint_operand (op0, VOIDmode))
should do the trick I think.
> + {
> + error ("Argument must be a constant between 0 and 31.");
> + return const0_rtx;
> + }
> +
> + if (! (*insn_data[icode].operand[0].predicate) (op0, mode0))
> + op0 = copy_to_mode_reg (mode0, op0);
Is this correct? It must be a constant integer already, and if it fails
copying it into a register is surely not the right thing to do.
> + /* If the argument is a constant, check the range. Agrument can only be a
> + 2-bit value. Unfortunately, can't check the range of the value at
> + compile time if the argument is a variable.
> + */
> + if (GET_CODE (op0) == CONST_INT && (INTVAL (op0) < 0 || INTVAL (op0) > 3))
const_0_to_3_operand
> + /* Builtin not supported on this processor. */
> + return 0;
> +
> + /* If we got invalid arguments bail out before generating bad rtl. */
> + if (arg0 == error_mark_node)
> + return const0_rtx;
> +
> + /* If the argument is a constant, check the range. Agrument can only be a
> + 3-bit value. Unfortunately, can't check the range of the value at
> + compile time if the argument is a variable.
> + */
> + if (GET_CODE (op0) == CONST_INT && (INTVAL (op0) < 0 || INTVAL (op0) > 7))
(Typo, "argument"). const_0_to_7_operand or u3bit_cint_operand (both exist,
and they are identical. Hrm.)
>
> @@ -16370,6 +16497,30 @@ rs6000_init_builtins (void)
> ftype = build_function_type_list (double_type_node, NULL_TREE);
> def_builtin ("__builtin_mffs", ftype, RS6000_BUILTIN_MFFS);
>
> + ftype = build_function_type_list (double_type_node, NULL_TREE);
> + def_builtin ("__builtin_mffsl", ftype, RS6000_BUILTIN_MFFSL);
> +
> + ftype = build_function_type_list (void_type_node,
> + intSI_type_node,
> + NULL_TREE);
> +
> + def_builtin ("__builtin_mtfsb0", ftype, RS6000_BUILTIN_MTFSB0_SI);
No blank line between ftype and def_builtin please?
> +(define_insn "rs6000_mtfsb0_si"
Why the _si? Won't just rs6000_mtfsb0 do?
> + [(use (match_operand:SI 0 "short_cint_operand" "n"))
> + (unspec_volatile:SI [(const_int 0)] UNSPECV_MTFSFB0)]
UNSPECV_MTFSB0 please.
operands[0] should be an argument of the unspec... so something like
(define_insn "rs6000_mtfsb0"
[(unspec_volatile [(match_operand:SI 0 "u5bit_cint_operand" "n")]
UNSPECV_MTFSB0)]
"TARGET_HARD_FLOAT"
"mtfsb0 %0")
(and you should set the "type" attribute to something useful, ideally).
> +(define_insn "rs6000_mffscrn"
> + [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
> + (unspec_volatile:DF [(const_int 0)] UNSPECV_MFFSCRN))
> + (use (match_operand:DF 1 "gpc_reg_operand" "d"))]
> + "TARGET_HARD_FLOAT"
> + "mffscrn %0,%1")
(define_insn "rs6000_mffscrn"
[(set (match_operand:DF 0 "gpc_reg_operand" "=d")
(unspec_volatile:DF [(match_operand:DF 1 "gpc_reg_operand" "d")]
UNSPECV_MFFSCRN))]
"TARGET_HARD_FLOAT"
"mffscrn %0,%1")
(you also need a check for ISA 3.0).
> +(define_expand "rs6000_set_fpscr_rn"
> + [(match_operand:DI 0 "gpc_reg_operand")]
> + "TARGET_HARD_FLOAT"
> +{
> + rtx tmp_df = gen_reg_rtx (DFmode);
> +
> + /* The floating point rounding control bits are FPSCR[62:63]. Put the
> + new rounding mode bits from operands[0][62:63] into FPSCR[62:63]. */
> + if (TARGET_P9_VECTOR)
It does not depend on vector stuff (say, you use -mcpu=power9 -mno-altivec).
> + {
> + rtx tmp_rn = gen_reg_rtx (DImode);
> + rtx tmp_di = gen_reg_rtx (DImode);
> +
> + /* Extract new RN mode from operand. */
> + emit_insn (gen_anddi3_mask (tmp_rn, operands[0], GEN_INT (0x3)));
This doesn't work for -m32 afaics. Either disallow it, or make it work?
> + /* Insert new RN mode into FSCPR. */
> + emit_insn (gen_rs6000_mffs (tmp_df));
> + tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> + emit_insn (gen_anddi3_mask (tmp_di, tmp_di, GEN_INT (0xFFFFFFFC)));
> + emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> +
> + /* Need to write to field k=15. The fields are [0:15]. Hence with L=0,
> + W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W). FLM is an 8 bit
> + field[0:7]. Need to set the bit that corresponds to the value of i
> + that you want [0:7].
> + */
(The */ should not go on a new line).
The derivation isn't super clear to me, but 1 is the correct mask, yes.
> + tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> + emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
> +(define_expand "rs6000_set_fpscr_drn"
> + /* Insert new RN mode into FSCPR. */
> + emit_insn (gen_rs6000_mffs (tmp_df));
> + tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> + emit_insn (gen_anddi3_mask (tmp_di, tmp_di, GEN_INT (0xFFF8FFFFFFFF)));
> + emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
Why is this masking off the top 16 bits of the original contents? That
seems wrong.
> +;; The ISA 3.0 mffsl instruction is a lower latency instruction
> +;; for reading the FPSCR
For reading _a part_ of the FPSCR, the other bits are set to 0 in the result.
This matters, because otherwise we should just use __builtin_mffs always;
but it does not do the same thing, so we cannot.
> +(define_insn "rs6000_mffsl0"
> + [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
> + (unspec_volatile:DF [(const_int 0)] UNSPECV_MFFSL))]
> + "TARGET_HARD_FLOAT && TARGET_P9_MISC"
> + "mffsl %0")
(Please use a better name than that "0"... We have used "_hw" before).
> +(define_expand "rs6000_mffsl"
> + [(set (match_operand:DF 0 "gpc_reg_operand")
> + (unspec_volatile:DF [(const_int 0)] UNSPECV_MFFSL))]
> + "TARGET_HARD_FLOAT && TARGET_P9_MISC"
You don't want the latter...
> +{
> + /* If the low latency mffsl instruction (ISA 3.0) is available use it,
> + otherwise fall back to the older mffs instruction which does the same
> + thing but with a little more latency. */
> +
> + if (TARGET_P9_VECTOR)
... but you want it here (instead of the _VECTOR).
> + emit_insn (gen_rs6000_mffsl0 (operands[0]));
> + else
> + emit_insn (gen_rs6000_mffs (operands[0]));
> +(define_insn "rs6000_mtfsf_L0W1"
> + [(unspec_volatile [(match_operand:SI 0 "const_int_operand" "i")
> + (match_operand:DF 1 "gpc_reg_operand" "d")]
> + UNSPECV_MTFSF_L0W1)]
> + "TARGET_HARD_FLOAT"
> + "mtfsf %0,%1,0,1")
Maybe name it rs6000_mtsfs_high? L0W1 reads like "low" :-)
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -15745,6 +15745,10 @@ uint64_t __builtin_ppc_get_timebase ();
> unsigned long __builtin_ppc_mftb ();
> __ibm128 __builtin_unpack_ibm128 (__ibm128, int);
> __ibm128 __builtin_pack_ibm128 (double, double);
> +double __builtin_mffs(void);
> +void __builtin_mtfsb0(const int);
> +void __builtin_mtfsb1(const int);
> +void __builtin_set_fpscr_rn(int);
> @end smallexample
(space before opening paren)
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_builtins.c
> @@ -0,0 +1,282 @@
> +/* { dg-do run { target { powerpc64*-*-* && lp64 } } } */
> +/* { dg-require-effective-target lp64 } */
You have "lp64" in the selector already, repeating it here doesn't do
anything.
> +/* { dg-options "-pedantic" } */
Why is this?
Segher
More information about the Gcc-patches
mailing list