[AArch64] Enable generation of FRINTNZ instructions

Thu Nov 18 11:05:55 GMT 2021

On Wed, 17 Nov 2021, Andre Vieira (lists) wrote:

> 
> On 16/11/2021 12:10, Richard Biener wrote:
> > On Fri, 12 Nov 2021, Andre Simoes Dias Vieira wrote:
> >
> >> On 12/11/2021 10:56, Richard Biener wrote:
> >>> On Thu, 11 Nov 2021, Andre Vieira (lists) wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> This patch introduces two IFNs, FTRUNC32 and FTRUNC64, along with the
> >>>> corresponding optabs and mappings. It also creates a backend pattern to
> >>>> implement them for aarch64 and a match.pd pattern to idiom-recognize them.
> >>>> These IFNs (and optabs) represent a truncation towards zero, as if
> >>>> performed by first casting the value to a signed integer of 32 or 64 bits
> >>>> and then back to the same floating-point type/mode.
> >>>>
> >>>> The match.pd pattern chooses to use these, when supported, regardless of
> >>>> trapping math, since these new patterns mimic the original behavior of
> >>>> truncating through an integer.
> >>>>
> >>>> I didn't think any of the existing IFNs represented these. I know it's a
> >>>> bit late in stage 1, but I thought this might be OK given it's only used
> >>>> by a single target and should have very little impact on anything else.
> >>>>
> >>>> Bootstrapped on aarch64-none-linux.
> >>>>
> >>>> OK for trunk?
> >>> On the RTL side, ftrunc32/ftrunc64 would probably be better as a conversion
> >>> optab (with two modes), so not
> >>>
> >>> +OPTAB_D (ftrunc32_optab, "ftrunc$asi2")
> >>> +OPTAB_D (ftrunc64_optab, "ftrunc$adi2")
> >>>
> >>> but
> >>>
> >>> OPTAB_CD (ftrunc_shrt_optab, "ftrunc$a$I$b2")
> >>>
> >>> or so?  I know that gets somewhat awkward for the internal function,
> >>> but IMHO we shouldn't tie our hands because of that?
> >> I tried doing this originally, but indeed I couldn't find a way to
> >> correctly tie the internal function to it.
> >>
> >> direct_optab_supported_p with multiple types expects those to be of the same
> >> mode. I see convert_optab_supported_p does but I don't know how that is
> >> used...
> >>
> >> Any ideas?
> > No "nice" ones.  The "usual" way is to provide fake arguments that
> > specify the type/mode.  We could use an integer argument directly
> > specifying the mode (then the IL would look host dependent - ugh),
> > or specify a constant zero in the intended mode (less visibly
> > obvious - but at least with -gimple dumping you'd see the type...).
> Hi,
> 
> So I reworked this to have a single optab and IFN. This required a bit of
> fiddling with custom expander and supported_p functions for the IFN. I decided
> to pass a MAX_INT for the 'int' type to the IFN to be able to pass on the size
> of the int we use as an intermediate cast.  I tried 0 first, but gcc was being
> too smart and just demoted it to an 'int' for the long long test-cases.
> 
> Bootstrapped on aarch64-none-linux.
> 
> OK for trunk?
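
For readers following along, here is a minimal C++ sketch of the source-level
idiom under discussion and of the "fake argument" trick for carrying the
intermediate integer type. It is only an illustration in plain C++, not GCC
internals: the names trunc_via_i32, trunc_via_i64 and ftrunc_int are
hypothetical, and inputs are assumed to be in range for the intermediate
integer type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// The idiom the match.pd pattern recognizes: a floating-point value is
// converted to a signed integer and straight back, which truncates
// towards zero for in-range inputs.
inline double trunc_via_i32(double x) {
  return static_cast<double>(static_cast<std::int32_t>(x));
}

inline double trunc_via_i64(double x) {
  return static_cast<double>(static_cast<std::int64_t>(x));
}

// Hypothetical stand-in for a single "truncate through integer" operation.
// The second argument is never read; only its type matters, in the same
// spirit as passing a constant of the intermediate type to the IFN.
template <typename Int>
double ftrunc_int(double x, Int /* type carrier, value unused */) {
  static_assert(std::numeric_limits<Int>::is_integer &&
                    std::numeric_limits<Int>::is_signed,
                "intermediate type must be a signed integer");
  return static_cast<double>(static_cast<Int>(x));
}
```

Calling ftrunc_int (x, std::numeric_limits<std::int64_t>::max ()) selects the
64-bit intermediate purely from the type of the second argument, which is why
a constant that cannot be represented in a narrower type is a convenient
carrier.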

@@ -3713,12 +3713,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    trapping behaviour, so require !flag_trapping_math. */
 #if GIMPLE
 (simplify
-   (float (fix_trunc @0))
-   (if (!flag_trapping_math
-       && types_match (type, TREE_TYPE (@0))
-       && direct_internal_fn_supported_p (IFN_TRUNC, type,
-                                         OPTIMIZE_FOR_BOTH))
-      (IFN_TRUNC @0)))
+   (float (fix_trunc@1 @0))
+   (if (types_match (type, TREE_TYPE (@0)))
+    (if (TYPE_SIGN (TREE_TYPE (@1)) == SIGNED
+        && direct_internal_fn_supported_p (IFN_FTRUNC_INT, type,
+                                           TREE_TYPE (@1),
+                                           OPTIMIZE_FOR_BOTH))
+     (with {
+      tree int_type = TREE_TYPE (@1);
+      unsigned HOST_WIDE_INT max_int_c
+       = (1ULL << (element_precision (int_type) - 1)) - 1;

That's only half-way supporting vector types I fear - you use
element_precision but then build a vector integer constant
in an unsupported way.  I suppose vector support isn't present
for arm?  The cleanest way would probably be to do

       tree int_type = element_type (@1);

with element_type provided in tree.[ch], as we already do for
element_precision.

+      }
+      (IFN_FTRUNC_INT @0 { build_int_cst (int_type, max_int_c); }))

Then you could use wide_int_to_tree (int_type,
wi::max_value (TYPE_PRECISION (int_type), SIGNED))
to build the special integer constant (which seems to always be
scalar).
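
As a sanity check on the constant itself, the following plain C++ sketch
computes the same value as the shift expression in the hunk above for a given
precision. It does not use GCC's wide_int API; max_signed_for_precision is a
hypothetical helper, and the precision is assumed to be between 1 and 64.

```cpp
#include <cassert>
#include <cstdint>

// Largest value of a signed two's-complement integer with `precision`
// bits, computed as in the patch: (1 << (precision - 1)) - 1.
// Assumes 1 <= precision <= 64 so that the shift is well defined.
constexpr std::uint64_t max_signed_for_precision(unsigned precision) {
  return (std::uint64_t{1} << (precision - 1)) - 1;
}

// Compile-time checks against the familiar limits.
static_assert(max_signed_for_precision(32) == 0x7fffffffULL,
              "32-bit signed maximum");
static_assert(max_signed_for_precision(64) == 0x7fffffffffffffffULL,
              "64-bit signed maximum");
```

For scalar types this should be the same value one would expect from
wi::max_value with SIGNED, just without the host-width assumptions of a
hand-written shift.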

+     (if (!flag_trapping_math
+         && direct_internal_fn_supported_p (IFN_TRUNC, type,
+                                            OPTIMIZE_FOR_BOTH))
+      (IFN_TRUNC @0)))))
 #endif

Does IFN_FTRUNC_INT preserve the same exceptions as doing
explicit intermediate float->int conversions?  I think I'd
prefer to have !flag_trapping_math on both cases.
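
To illustrate why the exception behaviour matters, here is a small sketch in
plain C++ that observes the floating-point status flags around std::trunc.
C Annex F says a float-to-integer conversion whose integral part does not fit
the integer type raises "invalid", whereas trunc of a finite value should not.
The sketch only exercises the trunc side, because the out-of-range conversion
is undefined behaviour in ISO C++. trunc_raises_invalid is a hypothetical
helper, and it assumes the compiler does not fold or reorder the floating-point
environment accesses.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Returns true if FE_INVALID is set after computing std::trunc (x).
inline bool trunc_raises_invalid(double x) {
  volatile double in = x;             // volatile discourages constant folding
  std::feclearexcept(FE_ALL_EXCEPT);  // start from a clean set of flags
  volatile double out = std::trunc(in);
  (void)out;                          // result is irrelevant, only the flags
  return std::fetestexcept(FE_INVALID) != 0;
}
```

For a value such as 1e30 this is expected to return false, while going through
a 32-bit integer on an Annex F implementation would raise "invalid"; that
difference is what the !flag_trapping_math guard protects.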

> gcc/ChangeLog:
> 
>         * config/aarch64/aarch64.md (ftrunc<mode><frintnz_mode>2): New
>         pattern.
>         * config/aarch64/iterators.md (FRINTZ): New iterator.
>         * doc/md.texi: New entry for ftrunc pattern name.
>         * internal-fn.def (FTRUNC_INT): New IFN.
>         * match.pd: Add to the existing TRUNC pattern match.
>         * optabs.def (ftrunc_int): New entry.
> 
> gcc/testsuite/ChangeLog:
> 
>         * gcc.target/aarch64/merge_trunc1.c: Adapted to skip if frintNz
>         instruction available.
>         * lib/target-supports.exp: Added arm_v8_5a_frintnzx_ok target.
>         * gcc.target/aarch64/frintnz.c: New test.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)