This is the mail archive of the
mailing list for the GCC project.
Re: Expansion of narrowing math built-ins into power instructions
Martin Jambor <email@example.com> writes:
> On Thu, Aug 22 2019, Segher Boessenkool wrote:
>>> > Hi Tejas,
>>> > [ Please do not top-post. ]
>> On Thu, Aug 22, 2019 at 01:27:06PM +0530, Tejas Joshi wrote:
>>> > What happens then? "It does not work" is very very vague. At least it
>>> > seems the compiler does build now?
>>> Oh, compiler builds but instruction is still "bl fadd". It should be
>>> "fadds" right?
>> Yes, but that means the problem is earlier, before it hits RTL perhaps.
>> Compile with -dap, look at the expand dump (the lowest numbered one, 234
>> or so), and see what it looked like in the final Gimple, and then in the
>> RTL generated from that. And then drill down.
> Tejas sent me his patch and I looked at why it did not work. I found
> two reasons:
> 1. associated_internal_fn (in builtins.c) does not handle
> DEF_INTERNAL_OPTAB_FN kind of internal functions, and Tejas
> (sensibly, I'd say) used that macro to define the internal function.
> But when I worked around that by manually adding a case for it in the
> switch statement, I ran into an assert because...
> 2. direct_internal_fn_supported_p on which replacement_internal_fn
> depends to expand built-ins as internal functions cannot handle
> conversion optabs... and narrowing is a kind of conversion and the
> optab is added as such with OPTAB_CD.
> Actually, the second statement is not entirely true because somehow it
> can handle optab while_ult which is a conversion optab but a) the way it
> is handled, if I can understand it at all, seems to be a big hack and
> would be even worse if we decided to copy that for all narrowing math
Think "big hack" is a bit unfair. The way that the internal function
maps argument types to the optab modes, and the way it expands calls
into rtl, depends on the "optab type" argument (the final argument to
DEF_INTERNAL_OPTAB_FN). This is relatively flexible in that it can use
a single-mode "direct" optab or a dual-mode "conversion" optab, with the
modes coming from whichever arguments are appropriate. New optab types
can be added as needed.
FWIW, several other DEF_INTERNAL_OPTAB_FNs are conversion optabs too
(e.g. IFN_LOAD_LANES, IFN_STORE_LANES, IFN_MASK_LOAD, etc.).
> and b) it gets both modes from argument types whereas we need one from
> the result type and so we would have to rewrite
> replacement_internal_fn anyway.
...yeah, I agree this breaks the current model. The reason IFN_WHILE_ULT
doesn't rely on the return type is that if you have:
_2 = .WHILE_ULT (_0, _1) // returning a vector of 4 booleans
_3 = .WHILE_ULT (_0, _1) // returning a vector of 8 booleans
then the calls look equivalent. So instead we pass an extra argument
indicating the required boolean vector "shape".
The same "problem" could in principle apply to FADD if we ever needed
to support double+double->_Float16 for example.
> Therefore, at least for now (GSoC deadline is kind of looming), I
> decided that the best way forward would be to not rely on internal
> functions but plug into expand_builtin() and I wrote the following,
> lightly tested patch - which of course misses testcases and stuff - but
> I'd be curious about any feedback now anyway. When I proposed a very
> similar approach for the roundeven x86_64 expansion, Uros actually then
> opted for a solution based on internal functions, so I am curious
> whether there are simple alternatives I do not see.
> Tejas, of course cases for other fadd variants should at least be added
> to expand_builtin.
> 2019-08-23 Tejas Joshi <firstname.lastname@example.org>
> Martin Jambor <email@example.com>
> * builtins.c (expand_builtin_binary_conversion): New function.
> (expand_builtin): Call it.
> * config/rs6000/rs6000.md (unspec): Add UNSPEC_ADD_NARROWING.
> (add_truncdfsf3): New define_insn.
> * optabs.def (fadd_optab): New.
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 9461693bcd1..3f56880c23f 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -140,6 +140,8 @@ DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
> DEF_INTERNAL_OPTAB_FN (VEC_SHL_INSERT, ECF_CONST | ECF_NOTHROW,
> vec_shl_insert, binary)
> +DEF_INTERNAL_OPTAB_FN (FADD, ECF_CONST, fadd, binary)
> DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary)
> DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
> DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
Should be dropped now.
OK with that change and the ones Segher asked for.