111309 – va_arg alternative for _BitInt

Status:	UNCONFIRMED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	c
Version:	14.0

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2023-09-06 17:05 UTC by Jakub Jelinek
Modified:	2024-09-18 23:24 UTC
CC List:	5 users

See Also:	111280
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:

Description Jakub Jelinek 2023-09-06 17:05:07 UTC

For https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2858.pdf , I wonder if we shouldn't have a __builtin_va_arg variant which would allow reading an arbitrary _BitInt into an array of limbs.  The builtin IMHO should be passed at least the N from _BitInt(N), probably whether it is signed vs. unsigned, and a pointer to the array of limbs; dunno whether it should only support the limb type which it also uses for the libgcc APIs or whether the type should be e.g. inferred from the scalar integer type the pointer argument points to.  And whether the endianness in which limbs are ordered should be host endianness, some other argument to the builtin, or always the _BitInt endianness.
On arches like x86-64, where the passing ABI says for N <= 8 pass like char, for N <= 16 pass like short, for N <= 32 pass like int, for N <= 64 pass like long long, otherwise pass like struct { long long a[(N + 63) / 64]; }, it would need to differentiate the different cases at runtime (unless N is constant, obviously) to perform the proper VA_ARG for each, plus handle the generic case, which would always be passed in memory.
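
The runtime differentiation described above could be sketched roughly as follows. This is only an illustration of the N-based case split quoted from the x86-64 passing rules in the description; the enum and function names (bitint_pass_class, bitint_limbs) are hypothetical and are not part of GCC or of any proposed builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of how a _BitInt(N) argument is passed,
   following the case split quoted in the description.  */
enum bitint_pass_class
{
  PASS_LIKE_CHAR,      /* N <= 8 */
  PASS_LIKE_SHORT,     /* N <= 16 */
  PASS_LIKE_INT,       /* N <= 32 */
  PASS_LIKE_LONG_LONG, /* N <= 64 */
  PASS_LIKE_STRUCT     /* struct { long long a[(N + 63) / 64]; } */
};

/* Pick the passing class for a given precision N (N >= 1).  */
static enum bitint_pass_class
bitint_pass_class (unsigned n)
{
  assert (n >= 1);
  if (n <= 8)
    return PASS_LIKE_CHAR;
  if (n <= 16)
    return PASS_LIKE_SHORT;
  if (n <= 32)
    return PASS_LIKE_INT;
  if (n <= 64)
    return PASS_LIKE_LONG_LONG;
  return PASS_LIKE_STRUCT;
}

/* Number of 64-bit limbs needed to hold N bits.  */
static size_t
bitint_limbs (unsigned n)
{
  assert (n >= 1);
  return ((size_t) n + 63) / 64;
}
```

A va_arg variant taking a non-constant N would have to branch on something like bitint_pass_class (N) before fetching the argument, which is the extra cost the description refers to.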

Comment 1 jsm-csl@polyomino.org.uk 2023-09-06 17:29:50 UTC

Yes, we should have APIs for building type-generic _BitInt interfaces 
(also a width-of operation to give the width in bits of an integer type; 
also type-generic versions of operations such as clz, ctz, parity, 
popcount that work to the width in bits of any unsigned operand).  Though 
I suspect any library implementations of printf _BitInt support would end 
up needing architecture-specific workarounds for a while to avoid 
depending on having GCC new enough to support _BitInt in order to build a 
library with that support.

Comment 2 Jakub Jelinek 2023-09-06 17:40:17 UTC

For clz/ctz/parity/popcount/clrsb/ffs, it should be quite easy to implement them; the primary question is what the builtin names should be, because __builtin_clz etc. are already taken for an unsigned int argument (resp. int for clrsb).  So we either want some suffix for type-generic _BitInt variants, or make them completely type-generic and support all integral types.
The second question for clz/ctz is whether we should preserve the UB behavior for a 0 argument, or define the return value even for that.  For larger _BitInt it will be implemented using loops and so will need to take 0 limbs into account anyway, so the advantage of invoking UB on it is smaller, and for users, testing whether a _BitInt(32768) is non-zero is already quite expensive if they want a well-defined result in that case.
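
To illustrate why the loop has to look at zero limbs anyway, here is a minimal sketch of a multi-limb count-leading-zeros with a caller-supplied result for zero. It is not the GCC lowering; it assumes 64-bit limbs stored least significant first, with any padding bits above the precision in the top limb being zero, and the names limb_clz and multi_limb_clz are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading zeros of one nonzero 64-bit limb, as a plain portable loop.  */
static int
limb_clz (uint64_t x)
{
  int n = 0;
  assert (x != 0);
  while (!(x >> 63))
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Leading zeros of a PREC-bit value held in LIMBS (limbs[0] is the least
   significant limb).  Returns VALUE_AT_ZERO if every limb is zero.  */
static int
multi_limb_clz (const uint64_t *limbs, unsigned prec, int value_at_zero)
{
  assert (prec >= 1);
  size_t nlimbs = ((size_t) prec + 63) / 64;
  /* Padding bits in the top limb are not part of the value.  */
  int pad = (int) (nlimbs * 64 - prec);

  /* Walk from the most significant limb down, skipping zero limbs.  */
  for (size_t i = nlimbs; i-- > 0;)
    if (limbs[i] != 0)
      return (int) ((nlimbs - 1 - i) * 64) + limb_clz (limbs[i]) - pad;

  /* All limbs were zero: the loop already had to detect this case.  */
  return value_at_zero;
}
```

The final return is reached only after every limb has been compared against zero, so a defined result for 0 costs nothing extra in this shape of loop.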

Comment 3 jsm-csl@polyomino.org.uk 2023-09-06 17:47:39 UTC

Defined values for 0 are marginally more convenient for implementing the 
standard <stdbit.h> operations which have defined results for all 
arguments, and I think it's appropriate for the type-generic built-in 
functions to work for all integer types - at least all unsigned integer 
types (and including unsigned __int128) - rather than just _BitInt types.  
(<stdbit.h> itself - providing both functions and type-generic macros - 
makes most sense to provide in libc, I think.  The type-generic macros 
there don't actually support bit-precise types whose width doesn't match a 
standard/extended type, but providing such support, given appropriate 
built-in functions, certainly makes sense as an extension.)

Comment 4 GCC Commits 2023-11-14 09:52:35 UTC

The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:7383cb56e1170789929201b0dadc156888928fdd

commit r14-5435-g7383cb56e1170789929201b0dadc156888928fdd
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Nov 14 10:38:56 2023 +0100

    Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
    
    The following patch adds 6 new type-generic builtins,
    __builtin_clzg
    __builtin_ctzg
    __builtin_clrsbg
    __builtin_ffsg
    __builtin_parityg
    __builtin_popcountg
    The g at the end stands for generic because the unsuffixed variants
    of the builtins already have unsigned int or int arguments.
    
    The main reason to add these is to support arbitrary unsigned (for
    clrsb/ffs signed) bit-precise integer types and also __int128 which
    wasn't supported by the existing builtins, so that e.g. <stdbit.h>
    type-generic functions could then support not just bit-precise unsigned
    integer type whose width matches a standard or extended integer type,
    but others too.
    
    None of these new builtins promote their first argument, so the argument
    can be e.g. unsigned char or unsigned short or unsigned __int20 etc.
    The first 2 support either 1 or 2 arguments.  If only 1 argument is supplied,
    the behavior is undefined for argument 0, like for other __builtin_c[lt]z*
    builtins; if 2 arguments are supplied, the second argument should be an int
    that will be returned if the first argument is 0.  All other builtins have
    just one argument.  For __builtin_clrsbg and __builtin_ffsg the argument
    shall be any signed standard/extended or bit-precise integer, for the others
    any unsigned standard/extended or bit-precise integer (bool not allowed).
    
    One possibility would be to also allow signed integer types for
    the clz/ctz/parity/popcount ones (and just cast the argument to
    unsigned_type_for during folding) and similarly unsigned integer types
    for the clrsb/ffs ones, dunno what is better; for stdbit.h the current
    version is sufficient and diagnoses use of the inappropriate sign,
    though on the other side I wonder if users won't be confused by
    __builtin_clzg (1) being an error and having to write __builtin_clzg (1U).
    
    The new builtins are lowered to corresponding builtins with other suffixes
    or internal calls (plus casts and adjustments where needed) during FE
    folding or during gimplification at latest, the non-suffixed builtins
    handling precisions up to precision of int, l up to precision of long,
    ll up to precision of long long, up to __int128 precision lowered to
    double-word expansion early and the rest (which must be _BitInt) lowered
    to internal fn calls - those are then lowered during bitint lowering pass.
    
    The patch also changes representation of IFN_CLZ and IFN_CTZ calls,
    previously they were in the IL only if they are directly supported optab
    and depending on C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 they had or didn't
    have defined behavior at 0, now they are in the IL either if directly
    supported optab, or for the large/huge BITINT_TYPEs and they have either
    1 or 2 arguments.  If one, the behavior is undefined at zero, if 2, the
    second argument is an int constant that should be returned for 0.
    As there is no extra support during expansion, for directly supported optab
    the second argument if present should still match the
    C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 value, but for BITINT_TYPE arguments
    it can be arbitrary int INTEGER_CST.
    
    The intended uses in stdbit.h are e.g.
     #ifdef __has_builtin
     #if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_ctzg) && __has_builtin(__builtin_popcountg)
     #define stdc_leading_zeros(value) \
     ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
     #define stdc_leading_ones(value) \
     ((unsigned int) __builtin_clzg ((__typeof (value)) ~(value), __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
     #define stdc_first_trailing_one(value) \
     ((unsigned int) (__builtin_ctzg (value, -1) + 1))
     #define stdc_trailing_zeros(value) \
     ((unsigned int) __builtin_ctzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
     #endif
     #endif
    where __builtin_popcountg ((__typeof (x)) -1) computes the bit precision
    of x's type (kind of _Bitwidthof (x) alternative).
    
    They also allow casting of arbitrary unsigned _BitInt other than
    unsigned _BitInt(1) to corresponding signed _BitInt by using
    signed _BitInt(__builtin_popcountg ((__typeof (a)) -1))
    and of arbitrary signed _BitInt to corresponding unsigned _BitInt
    using unsigned _BitInt(__builtin_clrsbg ((__typeof (a)) -1) + 1).
    
    2023-11-14  Jakub Jelinek  <jakub@redhat.com>
    
            PR c/111309
    gcc/
            * builtins.def (BUILT_IN_CLZG, BUILT_IN_CTZG, BUILT_IN_CLRSBG,
            BUILT_IN_FFSG, BUILT_IN_PARITYG, BUILT_IN_POPCOUNTG): New
            builtins.
            * builtins.cc (fold_builtin_bit_query): New function.
            (fold_builtin_1): Use it for
            BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
            (fold_builtin_2): Use it for BUILT_IN_{CLZ,CTZ}G.
            * fold-const-call.cc: Fix comment typo on tm.h inclusion.
            (fold_const_call_ss): Handle
            CFN_BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
            (fold_const_call_sss): New function.
            (fold_const_call_1): Call it for 2 argument functions returning
            scalar when passed 2 INTEGER_CSTs.
            * genmatch.cc (cmp_operand): For function calls also compare
            number of arguments.
            (fns_cmp): New function.
            (dt_node::gen_kids): Sort fns and generic_fns.
            (dt_node::gen_kids_1): Handle fns with the same id but different
            number of arguments.
            * match.pd (CLZ simplifications): Drop checks for defined behavior
            at zero.  Add variant of simplifications for IFN_CLZ with 2 arguments.
            (CTZ simplifications): Drop checks for defined behavior at zero,
            don't optimize precisions above MAX_FIXED_MODE_SIZE.  Add variant of
            simplifications for IFN_CTZ with 2 arguments.
            (a != 0 ? CLZ(a) : CST -> .CLZ(a)): Use TREE_TYPE (@3) instead of
            type, add BITINT_TYPE handling, create 2 argument IFN_CLZ rather than
            one argument.  Add variant for matching CLZ with 2 arguments.
            (a != 0 ? CTZ(a) : CST -> .CTZ(a)): Similarly.
            * gimple-lower-bitint.cc (bitint_large_huge::lower_bit_query): New
            method.
            (bitint_large_huge::lower_call): Use it for IFN_{CLZ,CTZ,CLRSB,FFS}
            and IFN_{PARITY,POPCOUNT} calls.
            * gimple-range-op.cc (cfn_clz::fold_range): Don't check
            CLZ_DEFINED_VALUE_AT_ZERO for m_gimple_call_internal_p, instead
            assume defined value at zero if the call has 2 arguments and use
            second argument value for that case.
            (cfn_ctz::fold_range): Similarly.
            (gimple_range_op_handler::maybe_builtin_call): Use op_cfn_clz_internal
            or op_cfn_ctz_internal only if internal fn call has 2 arguments and
            set m_op2 in that case.
            * tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern,
            vect_recog_popcount_clz_ctz_ffs_pattern): For value defined at zero
            use second argument of calls if present, otherwise assume UB at zero,
            create 2 argument .CLZ/.CTZ calls if needed.
            * tree-vect-stmts.cc (vectorizable_call): Handle 2 argument .CLZ/.CTZ
            calls.
            * tree-ssa-loop-niter.cc (build_cltz_expr): Create 2 argument
            .CLZ/.CTZ calls if needed.
            * tree-ssa-forwprop.cc (simplify_count_trailing_zeroes): Create 2
            argument .CTZ calls if needed.
            * tree-ssa-phiopt.cc (cond_removal_in_builtin_zero_pattern): Handle
            2 argument .CLZ/.CTZ calls, handle BITINT_TYPE, create 2 argument
            .CLZ/.CTZ calls.
            * doc/extend.texi (__builtin_clzg, __builtin_ctzg, __builtin_clrsbg,
            __builtin_ffsg, __builtin_parityg, __builtin_popcountg): Document.
    gcc/c-family/
            * c-common.cc (check_builtin_function_arguments): Handle
            BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
            * c-gimplify.cc (c_gimplify_expr): If __builtin_c[lt]zg second
            argument hasn't been folded into constant yet, transform it to one
            argument call inside of a COND_EXPR which for first argument 0
            returns the second argument.
    gcc/c/
            * c-typeck.cc (convert_arguments): Don't promote first argument
            of BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
    gcc/cp/
            * call.cc (magic_varargs_p): Return 4 for
            BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
            (build_over_call): Don't promote first argument of
            BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
            * cp-gimplify.cc (cp_gimplify_expr): For BUILT_IN_C{L,T}ZG use
            c_gimplify_expr.
    gcc/testsuite/
            * c-c++-common/pr111309-1.c: New test.
            * c-c++-common/pr111309-2.c: New test.
            * gcc.dg/torture/bitint-43.c: New test.
            * gcc.dg/torture/bitint-44.c: New test.
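
As a small sketch of how the builtins described in the commit message above might be used outside of stdbit.h: the two-argument __builtin_clzg form returns its second argument for a zero input, and __builtin_popcountg of an all-ones value gives the width of the type. The macro names WIDTH_OF and LEADING_ZEROS and the helper fallback_clz_ull are hypothetical. The #else branch is not from the commit; it is only an assumed fallback so the example still builds with compilers lacking these builtins, and it handles only standard unsigned types without padding bits.

```c
#include <assert.h>
#include <limits.h>

#ifdef __has_builtin
# if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_popcountg)
#  define HAVE_GENERIC_BIT_BUILTINS 1
# endif
#endif

#ifdef HAVE_GENERIC_BIT_BUILTINS
/* Width in bits of x's unsigned type: popcount of the all-ones value.  */
# define WIDTH_OF(x) \
  __builtin_popcountg ((__typeof (x)) ~(__typeof (x)) 0)
/* Two-argument form: the second argument is returned when x is 0.  */
# define LEADING_ZEROS(x) \
  ((unsigned int) __builtin_clzg ((x), WIDTH_OF (x)))
#else
/* Assumed fallback (not part of the commit): count in unsigned long long
   and subtract the bits that x's narrower type does not have.  */
# define ULL_BITS ((int) (sizeof (unsigned long long) * CHAR_BIT))
static int
fallback_clz_ull (unsigned long long v)
{
  int n = ULL_BITS;
  while (v != 0)
    {
      v >>= 1;
      n--;
    }
  return n; /* ULL_BITS when v was 0 */
}
# define WIDTH_OF(x) ((int) (sizeof (x) * CHAR_BIT))
# define LEADING_ZEROS(x) \
  ((unsigned int) (fallback_clz_ull ((unsigned long long) (x)) \
                   - (ULL_BITS - WIDTH_OF (x))))
#endif
```

Because the first argument is not promoted, LEADING_ZEROS ((unsigned char) 1) counts within the width of unsigned char rather than int when the builtins are available.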

Comment 5 Christophe Lyon 2023-11-20 10:31:53 UTC

The new test pr111309-2.c has a few failures on arm-eabi:
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++14  (test for errors, line 35)
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++14  (test for errors, line 54)
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++17  (test for errors, line 35)
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++17  (test for errors, line 54)
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++20  (test for errors, line 35)
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++20  (test for errors, line 54)
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++98  (test for errors, line 35)
FAIL:g++:g++.dg/dg.exp=c-c++-common/pr111309-2.c  -std=c++98  (test for errors, line 54)

That is, no error message for lines 35 and 54:
/* { dg-error "does not have 'int' type" "" { target c++ } } */

Comment 6 Jakub Jelinek 2023-11-20 16:55:50 UTC

Does
2023-11-20  Jakub Jelinek  <jakub@redhat.com>

	PR c/111309
	* c-c++-common/pr111309-2.c (foo): Don't expect errors for C++ with
	-fshort-enums if second argument is E0.

--- gcc/testsuite/c-c++-common/pr111309-2.c.jj	2023-11-14 10:52:16.191276028 +0100
+++ gcc/testsuite/c-c++-common/pr111309-2.c	2023-11-20 17:52:30.606386073 +0100
@@ -32,7 +32,7 @@ foo (void)
   __builtin_clzg (0U, 2LL);	/* { dg-error "does not have 'int' type" } */
   __builtin_clzg (0U, 2U);	/* { dg-error "does not have 'int' type" } */
   __builtin_clzg (0U, true);
-  __builtin_clzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target c++ } } */
+  __builtin_clzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target { c++ && { ! short_enums } } } } */
   __builtin_ctzg ();		/* { dg-error "too few arguments" } */
   __builtin_ctzg (0U, 1, 2);	/* { dg-error "too many arguments" } */
   __builtin_ctzg (0);		/* { dg-error "has signed type" } */
@@ -51,7 +51,7 @@ foo (void)
   __builtin_ctzg (0U, 2LL);	/* { dg-error "does not have 'int' type" } */
   __builtin_ctzg (0U, 2U);	/* { dg-error "does not have 'int' type" } */
   __builtin_ctzg (0U, true);
-  __builtin_ctzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target c++ } } */
+  __builtin_ctzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target { c++ && { ! short_enums } } } } */
   __builtin_clrsbg ();		/* { dg-error "too few arguments" } */
   __builtin_clrsbg (0, 1);	/* { dg-error "too many arguments" } */
   __builtin_clrsbg (0U);	/* { dg-error "has unsigned type" } */
fix that?  With -fshort-enums in C++, E0 has smaller precision than int, and so even though it is unsigned, it is (or would be) promoted to int.

Comment 7 GCC Commits 2023-11-21 09:04:11 UTC

The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:1fcfd224ff67afd08ea5aa66a8bd687bb21798b2

commit r14-5639-g1fcfd224ff67afd08ea5aa66a8bd687bb21798b2
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Nov 21 10:03:26 2023 +0100

    testsuite: Fix up pr111309-2.c on arm [PR111309]
    
    ARM defaults to -fshort-enums and the following testcase FAILs there in 2
    lines.  The difference is that in C++, E0 has enum E type, which normally
    has unsigned int underlying type, so it isn't int nor something that
    promotes to int, which is why we diagnose it (in C it is promoted to int).
    But with -fshort-enums, the underlying type is unsigned char in that case,
    which promotes to int just fine.
    
    The following patch adjusts the expectations so that we don't expect
    the error on arm or when people manually test with -fshort-enums.
    
    2023-11-21  Jakub Jelinek  <jakub@redhat.com>
    
            PR c/111309
            * c-c++-common/pr111309-2.c (foo): Don't expect errors for C++ with
            -fshort-enums if second argument is E0.
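
The C side of the explanation above can be checked with a small C11 sketch. In C an enumeration constant such as E0 already has type int, and an unsigned char value promotes to int, while unsigned int does not; this is why only the C++ expectations depended on -fshort-enums. The macro HAS_INT_TYPE and the helper functions are hypothetical names, and the unsigned char case assumes a target where int is wider than unsigned char.

```c
#include <assert.h>

enum E { E0 };

/* Hypothetical helper: 1 if the expression has type int, else 0.  */
#define HAS_INT_TYPE(x) _Generic ((x), int: 1, default: 0)

/* In C the enumeration constant E0 has type int.  */
static int
enumerator_is_int (void)
{
  return HAS_INT_TYPE (E0);
}

/* Unary plus applies the integer promotions: unsigned char becomes int.  */
static int
promoted_uchar_is_int (void)
{
  return HAS_INT_TYPE (+(unsigned char) 0);
}

/* unsigned int is unchanged by the integer promotions.  */
static int
promoted_uint_is_int (void)
{
  return HAS_INT_TYPE (+0U);
}
```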