Replace FMA_EXPR with one internal fn per optab

H.J. Lu hjl.tools@gmail.com
Tue May 22 20:11:00 GMT 2018


On Thu, May 17, 2018 at 1:56 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>>> @@ -2698,23 +2703,26 @@ convert_mult_to_fma_1 (tree mul_result,
>>>          }
>>
>>>         if (negate_p)
>>> -       mulop1 = force_gimple_operand_gsi (&gsi,
>>> -                                          build1 (NEGATE_EXPR,
>>> -                                                  type, mulop1),
>>> -                                          true, NULL_TREE, true,
>>> -                                          GSI_SAME_STMT);
>>> +       mulop1 = gimple_build (&seq, NEGATE_EXPR, type, mulop1);
>>
>>> -      fma_stmt = gimple_build_assign (gimple_assign_lhs (use_stmt),
>>> -                                     FMA_EXPR, mulop1, op2, addop);
>>> +      if (seq)
>>> +       gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
>>> +      fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2,
>> addop);
>>> +      gimple_call_set_lhs (fma_stmt, gimple_assign_lhs (use_stmt));
>>> +      gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal
>> (use_stmt));
>>> +      gsi_replace (&gsi, fma_stmt, true);
>>> +      /* Valueize aggressively so that we generate FMS, FNMA and FNMS
>>> +        regardless of where the negation occurs.  */
>>> +      if (fold_stmt (&gsi, aggressive_valueize))
>>> +       update_stmt (gsi_stmt (gsi));
>>
>> I think it would be nice to be able to use gimple_build () with IFNs so you
>> can
>> gimple_build () the IFN and then use gsi_replace_with_seq () on it.  You
>> only need to fold with generated negates, not with negates already in the
>> IL?
>> The the folding implied with gimple_build will take care of it.
>
> The idea was to pick up existing negates that feed the multiplication
> as well as any added by the pass itself.
>
> On IRC yesterday we talked about how this should handle the ECF_NOTHROW
> flag, and whether things like IFN_SQRT and IFN_FMA should always be
> nothrow (like the built-in functions are).  But in the end I thought
> it'd be better to keep things as they are.  We already handle
> -fnon-call-exceptions for unfused a * b + c and before the patch also
> handled it for FMA_EXPR.  It'd seem like a step backwards if the new
> internal functions didn't handle it too.  If anything it seems like the
> built-in functions should change to be closer to the tree_code and
> internal_fn way of doing things, if we want to support -fnon-call-exceptions
> properly.
>
> This also surprised me when doing the if-conversion patch I sent yesterday.
> We're happy to vectorise:
>
>   for (int i = 0; i < 100; ++i)
>     x[i] = ... ? sqrt (x[i]) : 0;
>
> by doing the sqrt unconditionally and selecting on the result, even with
> the default maths flags, but refuse to vectorise the simpler:
>
>   for (int i = 0; i < 100; ++i)
>     x[i] = ... ? x[i] + 1 : 0;
>
> in the same way.
>
>> Otherwise can you please move aggressive_valueize to gimple-fold.[ch]
>> alongside no_follow_ssa_edges / follow_single_use_edges and maybe
>> rename it as follow_all_ssa_edges?
>
> Ah, yeah, that's definitely a better name.
>
> I also renamed all_scalar_fma to scalar_all_fma, since I realised
> after Andrew's reply that the old name made it sound like it was
> "all scalars", whereas it meant to mean "all fmas".
>
> Tested as before.
>
> Thanks,
> Richard
>
> 2018-05-17  Richard Sandiford  <richard.sandiford@linaro.org>
>
> gcc/
>         * doc/sourcebuild.texi (scalar_all_fma): Document.
>         * tree.def (FMA_EXPR): Delete.
>         * internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions.
>         * internal-fn.c (ternary_direct): New macro.
>         (expand_ternary_optab_fn): Likewise.
>         (direct_ternary_optab_supported_p): Likewise.
>         * Makefile.in (build/genmatch.o): Depend on case-fn-macros.h.
>         * builtins.c (fold_builtin_fma): Delete.
>         (fold_builtin_3): Don't call it.
>         * cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling.
>         * expr.c (expand_expr_real_2): Likewise.
>         * fold-const.c (operand_equal_p): Likewise.
>         (fold_ternary_loc): Likewise.
>         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
>         * gimple.c (DEFTREECODE): Likewise.
>         * gimplify.c (gimplify_expr): Likewise.
>         * optabs-tree.c (optab_for_tree_code): Likewise.
>         * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
>         * tree-eh.c (operation_could_trap_p): Likewise.
>         (stmt_could_throw_1_p): Likewise.
>         * tree-inline.c (estimate_operator_cost): Likewise.
>         * tree-pretty-print.c (dump_generic_node): Likewise.
>         (op_code_prio): Likewise.
>         * tree-ssa-loop-im.c (stmt_cost): Likewise.
>         * tree-ssa-operands.c (get_expr_operands): Likewise.
>         * tree.c (commutative_ternary_tree_code, add_expr): Likewise.
>         * fold-const-call.h (fold_fma): Delete.
>         * fold-const-call.c (fold_const_call_ssss): Handle CFN_FMS,
>         CFN_FNMA and CFN_FNMS.
>         (fold_fma): Delete.
>         * genmatch.c (combined_fn): New enum.
>         (commutative_ternary_tree_code): Remove FMA_EXPR handling.
>         (commutative_op): New function.
>         (commutate): Use it.  Handle more than 2 operands.
>         (dt_operand::gen_gimple_expr): Use commutative_op.
>         (parser::parse_expr): Allow :c to be used with non-binary
>         operators if the commutative operand is known.
>         * gimple-ssa-backprop.c (backprop::process_builtin_call_use): Handle
>         CFN_FMS, CFN_FNMA and CFN_FNMS.
>         (backprop::process_assign_use): Remove FMA_EXPR handling.
>         * hsa-gen.c (gen_hsa_insns_for_operation_assignment): Likewise.
>         (gen_hsa_fma): New function.
>         (gen_hsa_insn_for_internal_fn_call): Use it for IFN_FMA, IFN_FMS,
>         IFN_FNMA and IFN_FNMS.
>         * match.pd: Add folds for IFN_FMS, IFN_FNMA and IFN_FNMS.
>         * gimple-fold.h (follow_all_ssa_edges): Declare.
>         * gimple-fold.c (follow_all_ssa_edges): New function.
>         * tree-ssa-math-opts.c (convert_mult_to_fma_1): Use the
>         gimple_build interface and use follow_all_ssa_edges to fold the result.
>         (convert_mult_to_fma): Use direct_internal_fn_suppoerted_p
>         instead of checking for optabs directly.
>         * config/i386/i386.c (ix86_add_stmt_cost): Recognize FMAs as calls
>         rather than FMA_EXPRs.
>         * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Create a
>         call to IFN_FMA instead of an FMA_EXPR.
>
> gcc/brig/
>         * brigfrontend/brig-function.cc
>         (brig_function::get_builtin_for_hsa_opcode): Use BUILT_IN_FMA
>         for BRIG_OPCODE_FMA.
>         (brig_function::get_tree_code_for_hsa_opcode): Treat BUILT_IN_FMA
>         as a call.
>
> gcc/c/
>         * gimple-parser.c (c_parser_gimple_postfix_expression): Remove
>         __FMA_EXPR handlng.
>
> gcc/cp/
>         * constexpr.c (cxx_eval_constant_expression): Remove FMA_EXPR handling.
>         (potential_constant_expression_1): Likewise.
>
> gcc/testsuite/
>         * lib/target-supports.exp (check_effective_target_scalar_all_fma):
>         New proc.
>         * gcc.dg/fma-1.c: New test.
>         * gcc.dg/fma-2.c: Likewise.
>         * gcc.dg/fma-3.c: Likewise.
>         * gcc.dg/fma-4.c: Likewise.
>         * gcc.dg/fma-5.c: Likewise.
>         * gcc.dg/fma-6.c: Likewise.
>         * gcc.dg/fma-7.c: Likewise.
>         * gcc.dg/gimplefe-26.c: Use .FMA instead of __FMA and require
>         scalar_all_fma.
>         * gfortran.dg/reassoc_7.f: Pass -ffp-contract=off.
>         * gfortran.dg/reassoc_8.f: Likewise.
>         * gfortran.dg/reassoc_9.f: Likewise.
>         * gfortran.dg/reassoc_10.f: Likewise.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85881


H.J.



More information about the Gcc-patches mailing list