Jakub Jelinek [Sat, 24 Sep 2022 07:24:26 +0000 (09:24 +0200)]
openmp: Fix ICE with taskgroup at -O0 -fexceptions [PR107001]
The following testcase ICEs because with -O0 -fexceptions GOMP_taskgroup_end
call isn't directly followed by GOMP_RETURN statement, but there are some
conditionals to handle exceptions and we fail to find the correct GOMP_RETURN.
The fix is to treat taskgroup similarly to target data, both of these constructs
emit a try { body } finally { end_call } around the construct's body during
gimplification and we need to see proper construct nesting during gimplification
and omp lowering (including nesting of regions checks), but during omp expansion
we don't really need their nesting anymore, all we need is emit something at
the start of the region and the end of the region is the end API call we've
already emitted during gimplification. For target data, we weren't adding
GOMP_RETURN statement during omp lowering, so after that pass it is treated
merely like stand-alone omp directives. This patch does the same for
taskgroup too.
2022-09-24 Jakub Jelinek <jakub@redhat.com>
PR c/107001
* omp-low.cc (lower_omp_taskgroup): Don't add GOMP_RETURN statement
at the end.
* omp-expand.cc (build_omp_regions_1): Clarify GF_OMP_TARGET_KIND_DATA
is not stand-alone directive. For GIMPLE_OMP_TASKGROUP, also don't
update parent.
(omp_make_gimple_edges) <case GIMPLE_OMP_TASKGROUP>: Reset
cur_region back after new_omp_region.
Jakub Jelinek [Sat, 24 Sep 2022 07:19:26 +0000 (09:19 +0200)]
openmp, c: Tighten up c_tree_equal [PR106981]
This patch changes c_tree_equal to work more like cp_tree_equal, be
more strict in what it accepts. The ICE on the first testcase was
due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which
ICEs if the two constants have different precision, but as the second
testcase shows, being too lenient in it can also lead to miscompilation
of valid OpenMP programs where we think certain expression is the same
even when it isn't and can be guaranteed at runtime to represent different
memory location. So, the patch looks through only NON_LVALUE_EXPRs
and for constants as well as casts requires that the types match before
actually comparing the constant values or recursing on the cast operands.
2022-09-24 Jakub Jelinek <jakub@redhat.com>
PR c/106981
gcc/c/
* c-typeck.cc (c_tree_equal): Only strip NON_LVALUE_EXPRs at the
start. For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and
t2 have different types.
gcc/testsuite/
* c-c++-common/gomp/pr106981.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/pr106981.c: New test.
Jonathan Wakely [Fri, 23 Sep 2022 12:28:37 +0000 (13:28 +0100)]
libstdc++: Fix std::is_nothrow_invocable_r for uncopyable prvalues [PR91456]
This is the last missing piece of PR 91456.
This also removes the only use of the C++11 version of
std::is_nothrow_invocable, which was just renamed to
__is_nothrow_invocable_lib. We can remove that now.
libstdc++-v3/ChangeLog:
PR libstdc++/91456
* include/std/type_traits (__is_nothrow_invocable_lib): Remove.
(__is_invocable_impl::__nothrow_type): New member type which
checks if the conversion can throw.
(__is_nt_invocable_impl): Replace class template with alias
template to __is_nt_invocable_impl::__nothrow_type.
* testsuite/20_util/is_nothrow_invocable/91456.cc: New test.
* testsuite/20_util/is_nothrow_convertible/value.cc: Remove
macro used by value_ext.cc test.
* testsuite/20_util/is_nothrow_convertible/value_ext.cc: Remove
test for non-standard __is_nothrow_invocable_lib trait.
Joseph Myers [Fri, 23 Sep 2022 21:23:01 +0000 (21:23 +0000)]
testsuite: Add more C2x tests
There are some new requirements in C2x where GCC already behaves as
required (for all standard versions), where previous standard versions
either had weaker requirements permitting the GCC behavior, or were
arguably defective in what they said in that area. Add tests that
specifically verify GCC behaves as expected for C2x. (There may be
further such tests to be added in future for already-supported C2x
features.)
* Compound literals in function parameter lists have automatic storage
duration. (This is a case where strictly this wasn't specified by
previous standard versions, but it seems to make more sense to treat
this as a defect in those versions than to implement something
different conditionally for those versions.)
* Concatenation of string literals with different prefixes is a
constraint violation (previously it was implementation-defined
whether it was permitted, and GCC did not permit it).
* UCNs above 0x10ffff are a constraint violation; previously they were
implicitly undefined behavior by virtue of wording about "designates
the character" referring to code points outside the ISO/IEC 10646
range; GCC diagnosed such UCNs since commit 0900e29cdbc533fecf2a311447bbde17f101bbd6 (Sep 2019). The test I
added also has more detailed coverage of what lower-valued UCNs are
accepted.
Tested for x86_64-pc-linux-gnu.
* gcc.dg/c2x-complit-1.c, gcc.dg/c2x-concat-1.c,
gcc.dg/cpp/c2x-ucn-1.c: New tests.
In the test cases, it's clearly written that intrinsics is not
implemented on arm*. A simple xfail does not help since there are
link error and that would cause an UNRESOLVED testcase rather than
XFAIL.
By changing to dg-skip-if, the entire test case is omitted.
To improve compile times, the C++ library could use compiler built-ins
rather than implementing std::is_convertible (and _nothrow) as class
templates. This patch adds the built-ins. We already have
__is_constructible and __is_assignable, and the nothrow forms of those.
Microsoft (and clang, for compatibility) also provide an alias called
__is_convertible_to. I did not add it, but it would be trivial to do
so.
I noticed that our __is_assignable doesn't implement the "Access checks
are performed as if from a context unrelated to either type" requirement,
therefore std::is_assignable / __is_assignable give two different results
here:
class S {
operator int();
friend void g(); // #1
};
* include/std/type_traits: Rename __is_nothrow_convertible to
__is_nothrow_convertible_lib.
* testsuite/20_util/is_nothrow_convertible/value_ext.cc: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/ext/has-builtin-1.C: Enhance to test __is_convertible and
__is_nothrow_convertible.
* g++.dg/ext/is_convertible1.C: New test.
* g++.dg/ext/is_convertible2.C: New test.
* g++.dg/ext/is_nothrow_convertible1.C: New test.
* g++.dg/ext/is_nothrow_convertible2.C: New test.
The following extends the skipping of same valued stores to
handle an arbitrary number of them as long as they are from the
same value (which we now record). That's an obvious extension
which allows to optimize the m_engaged member of std::optional
more reliably.
PR tree-optimization/106922
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Allow
an arbitrary number of same valued skipped stores.
The frange setter does all its work in trees. This incurs a penalty
for the real_value variants because they must wrap their arguments
into a tree and pass it to the tree setter, which will then do the
opposite. This is leftovers from the irange setter.
Even though the we still need constructors taking trees so we can
interact with the tree world, there's no sense penalizing the rest of
the implementation.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range.cc (frange::set): Swap setters such that the one
accepting REAL_VALUE_TYPE does all the work.
Jonathan Wakely [Fri, 23 Sep 2022 09:12:09 +0000 (10:12 +0100)]
libstdc++: Enable constexpr std::bitset for debug mode
We already disable all debug mode checks for C++11 and later, so we can
easily make everything constexpr. This fixes the FAIL results for the
new tests when using -D_GLIBCXX_DEBUG.
Also fix some other tests failing with non-default test flags.
libstdc++-v3/ChangeLog:
* include/debug/bitset (__debug::bitset): Add constexpr to all
member functions.
(operator&, operator|, operator^): Add inline and constexpr.
(operator>>, operator<<): Add inline.
* testsuite/20_util/bitset/access/constexpr.cc: Only check using
constexpr std::string for the cxx11 ABI.
* testsuite/20_util/bitset/cons/constexpr_c++23.cc: Likewise.
* testsuite/20_util/headers/bitset/synopsis.cc: Check constexpr
for C++23.
Jonathan Wakely [Thu, 22 Sep 2022 17:36:04 +0000 (18:36 +0100)]
libstdc++: Optimize std::bitset<N>::to_string
This makes to_string approximately twice as fast at any optimization
level. Instead of iterating through every bit, jump straight to the next
bit that is set, by using _Find_first and _Find_next.
libstdc++-v3/ChangeLog:
* include/std/bitset (bitset::_M_copy_to_string): Find set bits
instead of iterating over individual bits.
This patch adds -mcpu/-mtune support for the Arm Neoverse V2 core.
This updates the internal references to "demeter", but leaves "demeter" as an
accepted value to -mcpu/-mtune as it appears in the released GCC 12 series.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (neoverse-v2): New entry.
(demeter): Update tunings to neoversev2.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64.cc (demeter_addrcost_table): Rename to
neoversev2_addrcost_table.
(demeter_regmove_cost): Rename to neoversev2_addrcost_table.
(demeter_advsimd_vector_cost): Rename to neoversev2_advsimd_vector_cost.
(demeter_sve_vector_cost): Rename to neoversev2_sve_vector_cost.
(demeter_scalar_issue_info): Rename to neoversev2_scalar_issue_info.
(demeter_advsimd_issue_info): Rename to neoversev2_advsimd_issue_info.
(demeter_sve_issue_info): Rename to neoversev2_sve_issue_info.
(demeter_vec_issue_info): Rename to neoversev2_vec_issue_info.
Update references to above.
(demeter_vector_cost): Rename to neoversev2_vector_cost.
(demeter_tunings): Rename to neoversev2_tunings.
(aarch64_vec_op_count::rename_cycles_per_iter): Use
neoversev2_sve_issue_info instead of demeter_sve_issue_info.
* doc/invoke.texi (AArch64 Options): Document neoverse-v2.
frange: drop endpoints to min/max representable numbers for -ffinite-math-only.
Similarly to how we drop NANs to UNDEFINED when -ffinite-math-only, I
think we can drop the numbers outside of the min/max representable
numbers to the representable number.
This means the endpoings to VR_VARYING for -ffinite-math-only can now
be the min/max representable, instead of -INF and +INF.
Saturating in the setter means that the upcoming implementation for
binary operators no longer have to worry about doing the right
thing for -ffinite-math-only. If the range goes outside the limits,
it'll get chopped down.
Tested on x86-64 Linux.
gcc/ChangeLog:
* range-op-float.cc (build_le): Use vrp_val_*.
(build_lt): Same.
(build_ge): Same.
(build_gt): Same.
* value-range.cc (frange::set): Chop ranges outside of the
representable numbers for -ffinite-math-only.
(frange::normalize_kind): Use vrp_val*.
(frange::verify_range): Same.
(frange::set_nonnegative): Same.
(range_tests_floats): Remove tests that depend on -INF and +INF.
* value-range.h (real_max_representable): Add prototype.
(real_min_representable): Same.
(vrp_val_max): Set max representable number for
-ffinite-math-only.
(vrp_val_min): Same but for min.
(frange::set_varying): Use vrp_val*.
It has been suggested that if we start bumping numbers by an ULP when
calculating open ranges (for example the numbers less than 3.0) that
dumping these will become increasingly harder to read, and instead we
should opt for the hex representation. I still find the floating
point representation easier to read for most numbers, but perhaps we
could have both?
With this patch this is the representation for [15.0, 20.0]:
Martin Liska [Thu, 22 Sep 2022 13:57:48 +0000 (15:57 +0200)]
opts: fix --help=common with '\t' description
Fixes -flto-compression option:
- -flto-compression-level=<number> Use z Use zlib/zstd compression level <number> for IL.
+ -flto-compression-level=<0,19> Use zlib/zstd compression level <number> for IL.
gcc/ChangeLog:
* common.opt: Update -flto-compression-level documentation.
* opts.cc (print_filtered_help): Do not append range to an
option that uses \t syntax.
I've noticed
+UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++20 scan-tree-dump-not dce3 "m_initialized"
+UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++2b scan-tree-dump-not dce3 "m_initialized"
with this change, both on x86_64 and i686.
The dump is still cddce3, additionally as the last reference to the pre
dump is gone, not sure it is worth creating that dump.
With the following patch, there aren't FAILs nor UNRESOLVED tests with
GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ RUNTESTFLAGS="--target_board=unix\{-m32,-m64\} dg.exp='pr106922.C'"
2022-09-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/106922
* g++.dg/tree-ssa/pr106922.C: Scan in cddce3 dump rather than
dce3. Remove -fdump-tree-pre-details from dg-options.
Jakub Jelinek [Fri, 23 Sep 2022 07:10:16 +0000 (09:10 +0200)]
attribs: Improve diagnostics
When looking at the attribs code, I've noticed weird diagnostics
like
int a __attribute__((section ("foo", "bar")));
a.c:1:1: error: wrong number of arguments specified for ‘section’ attribute
1 | int a __attribute__((section ("foo", "bar")));
| ^~~
a.c:1:1: note: expected between 1 and 1, found 2
As roughly 50% of attributes that accept any arguments have
spec->min_length == spec->max_length, I think it is worth it to have
separate wording for such common case and just write simpler
a.c:1:1: note: expected 1, found 2
2022-09-23 Jakub Jelinek <jakub@redhat.com>
* attribs.cc (decl_attributes): Improve diagnostics, instead of
saying expected between 1 and 1, found 2 just say expected 1, found 2.
i386: Optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1))
gcc/ChangeLog:
PR target/94962
* config/i386/constraints.md (BH): New define_constraint.
* config/i386/i386.cc (standard_sse_constant_p): Add return
3/4 when operand matches new predicate.
(standard_sse_constant_opcode): Add new alternative branch to
return "vpcmpeqd".
* config/i386/predicates.md
(vector_all_ones_zero_extend_half_operand): New define_predicate.
(vector_all_ones_zero_extend_quarter_operand): Ditto.
* config/i386/sse.md: Add constraint to insn "mov<mode>_internal".
Thomas Neumann [Mon, 19 Sep 2022 16:10:02 +0000 (18:10 +0200)]
Avoid depending on destructor order
In some scenarios (e.g., when mixing gcc and clang code), it can
happen that frames are deregistered after the lookup structure
has already been destroyed. That in itself would be fine, but
it triggers an assert in __deregister_frame_info_bases that
expects to find the frame.
To avoid that, we now remember that the btree as already been
destroyed and disable the assert in that case.
libgcc/ChangeLog:
* unwind-dw2-fde.c: (release_register_frames) Remember
when the btree has been destroyed.
(__deregister_frame_info_bases) Disable the assert when
shutting down.
Andrew MacLeod [Tue, 20 Sep 2022 23:19:30 +0000 (19:19 -0400)]
Convert CFN_BUILT_IN_GOACC_DIM_* to range-ops.
* gimple-range-fold.cc (range_of_builtin_int_call): Remove case
for CFN_GOACC_DIM_*.
* gimple-range-op.cc (class cfn_goacc_dim): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Tue, 20 Sep 2022 23:05:03 +0000 (19:05 -0400)]
Convert CFN_BUILT_IN_STRLEN to range-ops.
* gimple-range-fold.cc (range_of_builtin_int_call): Remove case
for CFN_BUILT_IN_STRLEN.
* gimple-range-op.cc (class cfn_strlen): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Tue, 20 Sep 2022 22:21:04 +0000 (18:21 -0400)]
Convert CFN_BUILT_IN_CLRSB to range-ops.
* gimple-range-fold.cc (range_of_builtin_int_call): Remove case
for CFN_BUILT_IN_CLRSB.
* gimple-range-op.cc (class cfn_clrsb): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Tue, 20 Sep 2022 22:19:30 +0000 (18:19 -0400)]
Convert CFN_CTZ builtins to range-ops.
* gimple-range-fold.cc (range_of_builtin_int_call): Remove case
for CFN_CTZ.
* gimple-range-op.cc (class cfn_ctz): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Tue, 20 Sep 2022 22:12:25 +0000 (18:12 -0400)]
Convert CFN_CLZ builtins to range-ops.
* gimple-range-fold.cc (range_of_builtin_int_call): Remove case
for CFN_CLZ.
* gimple-range-op.cc (class cfn_clz): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Tue, 20 Sep 2022 22:07:14 +0000 (18:07 -0400)]
Convert CFN_BUILT_FFS and CFN_POPCOUNT to range-ops.
* gimple-range-fold.cc (range_of_builtin_int_call): Remove case
for CFN_FFS and CFN_POPCOUNT.
* gimple-range-op.cc (class cfn_pocount): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Tue, 20 Sep 2022 21:14:30 +0000 (17:14 -0400)]
Convert CFN_BUILT_IN_TOUPPER and TOLOWER to range-ops.
* gimple-range-fold.cc (get_letter_range): Move to new class.
(range_of_builtin_int_call): Remove case for CFN_BUILT_IN_TOUPPER
and CFN_BUILT_IN_TOLOWER.
* gimple-range-op.cc (class cfn_toupper_tolower): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Wed, 21 Sep 2022 13:29:40 +0000 (09:29 -0400)]
Convert CFN_BUILT_IN_SIGNBIT to range-ops.
* gimple-range-fold.cc (range_of_builtin_int_call): Remove case
for CFN_BUILT_IN_SIGNBIT.
* gimple-range-op.cc (class cfn_signbit): New.
(gimple_range_op_handler::maybe_builtin_call): Set arguments.
Andrew MacLeod [Tue, 20 Sep 2022 20:53:37 +0000 (16:53 -0400)]
Add range-ops support for builtin functions.
Convert CFN_BUILT_IN_CONSTANT_P as first POC.
* gimple-range-fold.cc
(fold_using_range::range_of_builtin_int_call): Remove case for
CFN_BUILT_IN_CONSTANT_P.
* gimple-range-op.cc (gimple_range_op_handler::supported_p):
Check if a call also creates a range-op object.
(gimple_range_op_handler): Also check builtin calls.
(class cfn_constant_float_p): New. Float CFN_BUILT_IN_CONSTANT_P.
(class cfn_constant_p): New. Integral CFN_BUILT_IN_CONSTANT_P.
(gimple_range_op_handler::maybe_builtin_call): Set arguments and
handler for supported built-in calls.
* gimple-range-op.h (maybe_builtin_call): New prototype.
Andrew MacLeod [Wed, 21 Sep 2022 20:15:02 +0000 (16:15 -0400)]
Always check the return value of fold_range.
The fold_range routine in range-ops returns FALSE if the operation
fails. There are a few places which assume the operation was
successful. Fix those.
* gimple-range-fold.cc (range_of_range_op): Set result to
VARYING if the call to fold_range fails.
* tree-data-ref.cc (compute_distributive_range): Ditto.
* tree-vrp.cc (range_fold_binary_expr): Ditto.
(range_fold_unary_expr): Ditto.
* value-query.cc (range_query::get_tree_range): Ditto.
Andrew MacLeod [Tue, 20 Sep 2022 16:34:08 +0000 (12:34 -0400)]
Add missing float fold_range prototype for floats.
Unary operations require op2 to be the range of the type of the LHS.
This is so the type for the LHS can be properly set.
* range-op-float.cc (range_operator_float::fold_range): New base
method for "int = float op int".
* range-op.cc (range_op_handler::fold_range): New case.
* range-op.h: Update prototypes.
Andrew MacLeod [Thu, 22 Sep 2022 14:27:17 +0000 (10:27 -0400)]
Fix calc_op1 for undefined op2_range.
Unary operations pass the type of operand 1 into op1_range. If that
range is undefined, the routine blindly picks the type of operand 2,
which in the case of a unary op, does not exist and traps.
* gimple-range-op.cc (gimple_range_op_handler::calc_op1): Use
operand 1 for second range if there is no operand 2.
Andrew MacLeod [Thu, 1 Sep 2022 14:34:55 +0000 (10:34 -0400)]
Create gimple_range_op_handler in a new source file.
Range-ops is meant to be IL independent. Some gimple processing has
be placed in range-ops, and some is located in gori. Split it all into
a file and isolate it in a new class gimple_range_op_handler.
* Makefile.in (OBJS): Add gimple-range-op.o.
* gimple-range-edge.cc (gimple_outgoing_range_stmt_p): Use
gimple_range_op_handler.
* gimple-range-fold.cc (gimple_range_base_of_assignment): Move
to a method in gimple_range_op_handler.
(gimple_range_operand1): Ditto.
(gimple_range_operand2): Ditto.
(fold_using_range::fold_stmt): Use gimple_range_op_handler.
(fold_using_range::range_of_range_op): Ditto.
(fold_using_range::relation_fold_and_or): Ditto.
(fur_source::register_outgoing_edges): Ditto.
(gimple_range_ssa_names): Relocate to gimple-range-op.cc.
* gimple-range-fold.h: Adjust prototypes.
* gimple-range-gori.cc (gimple_range_calc_op1): Move
to a method in gimple_range_op_handler.
(gimple_range_calc_op2): Ditto.
(gori_compute::compute_operand_range): Use
gimple_range_op_handler.
(gori_compute::compute_logical_operands): Ditto.
(compute_operand1_range): Ditto.
(gori_compute::compute_operand2_range): Ditto.
(gori_compute::compute_operand1_and_operand2_range): Ditto.
* gimple-range-gori.h: Adjust protoypes.
* gimple-range-op.cc: New. Supply gimple_range_op_handler methods.
* gimple-range-op.h: New. Supply gimple_range_op_handler class.
* gimple-range.cc (gimple_ranger::prefill_name): Use
gimple_range_op_handler.
(gimple_ranger::prefill_stmt_dependencies): Ditto.
* gimple-range.h: Include gimple-range-op.h.
* range-op.cc (range_op_handler::range_op_handler): Adjust and
remove gimple * parameter option.
* range-op.h: Adjust prototypes.
Andrew MacLeod [Wed, 31 Aug 2022 18:07:13 +0000 (14:07 -0400)]
Adjust range_op_handler to store the handler directly.
Range_op_handler currently stores a tree code and a type. It defers
checking to see if there is a valid handler until asked.
This change checks at constuctor time and store a pointer to
the handler if there is one.
* range-op.cc (range_op_handler::set_op_handler): Set new fields.
(ange_op_handler::range_op_handler): Likewise.
(range_op_handler::operator bool): Remove.
(range_op_handler::fold_range): Use appropriate handler.
(range_op_handler::op1_range): Likewise.
(range_op_handler::op2_range): Likewise.
(range_op_handler::lhs_op1_relation): Likewise.
(range_op_handler::lhs_op2_relation): Likewise.
(range_op_handler::op1_op2_relation): Likewise.
* range-op.h (class range_op_handler): Store handler pointers.
(range_op_handler:: operator bool): Inline.
Jonathan Wakely [Thu, 22 Sep 2022 14:00:08 +0000 (15:00 +0100)]
libstdc++: Implement constexpr std::bitset for C++23 (P2417R2)
Also add _GLIBCXX_HOSTED checks to simplify making <bitset>
freestanding in the near future.
libstdc++-v3/ChangeLog:
* include/std/bitset (bitset): Add constexpr for C++23. Guard
members using std::string with _GLIBCXX_HOSTED.
* include/std/version (__cpp_lib_constexpr_bitset): Define.
* testsuite/20_util/bitset/access/constexpr.cc: New test.
* testsuite/20_util/bitset/cons/constexpr_c++23.cc: New test.
* testsuite/20_util/bitset/count/constexpr.cc: New test.
* testsuite/20_util/bitset/ext/constexpr.cc: New test.
* testsuite/20_util/bitset/operations/constexpr_c++23.cc: New test.
* testsuite/20_util/bitset/version.cc: New test.
Jonathan Wakely [Thu, 22 Sep 2022 13:37:58 +0000 (14:37 +0100)]
libstdc++: Rearrange tests for <bitset>
In C++03 std::bitset was in the Container clause, but since C++11 it has
been in the Utilties clause. This moves the tests to the 20_util
directory, where most people probably expect to find them.
Also create 'access', 'observers', and 'io' subdirectories and group
some tests under there, rather than having one directory per function
name, and only a single test in that directory.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/bitset/18604.cc: Moved to...
* testsuite/20_util/bitset/18604.cc: ...here.
* testsuite/23_containers/bitset/45713.cc: Moved to...
* testsuite/20_util/bitset/45713.cc: ...here.
* testsuite/23_containers/bitset/to_string/dr396.cc: Moved to...
* testsuite/20_util/bitset/access/dr396.cc: ...here.
* testsuite/23_containers/bitset/to_string/1.cc: Moved to...
* testsuite/20_util/bitset/access/to_string.cc: ...here.
* testsuite/23_containers/bitset/to_ullong/1.cc: Moved to...
* testsuite/20_util/bitset/access/to_ullong.cc: ...here.
* testsuite/23_containers/bitset/to_ulong/1.cc: Moved to...
* testsuite/20_util/bitset/access/to_ulong.cc: ...here.
* testsuite/23_containers/bitset/cons/1.cc: Moved to...
* testsuite/20_util/bitset/cons/1.cc: ...here.
* testsuite/23_containers/bitset/cons/16020.cc: Moved to...
* testsuite/20_util/bitset/cons/16020.cc: ...here.
* testsuite/23_containers/bitset/cons/2.cc: Moved to...
* testsuite/20_util/bitset/cons/2.cc: ...here.
* testsuite/23_containers/bitset/cons/3.cc: Moved to...
* testsuite/20_util/bitset/cons/3.cc: ...here.
* testsuite/23_containers/bitset/cons/38244.cc: Moved to...
* testsuite/20_util/bitset/cons/38244.cc: ...here.
* testsuite/23_containers/bitset/cons/50268.cc: Moved to...
* testsuite/20_util/bitset/cons/50268.cc: ...here.
* testsuite/23_containers/bitset/cons/6282.cc: Moved to...
* testsuite/20_util/bitset/cons/6282.cc: ...here.
* testsuite/23_containers/bitset/cons/constexpr.cc: Moved to...
* testsuite/20_util/bitset/cons/constexpr.cc: ...here.
* testsuite/23_containers/bitset/cons/dr1325-1.cc: Moved to...
* testsuite/20_util/bitset/cons/dr1325-1.cc: ...here.
* testsuite/23_containers/bitset/cons/dr1325-2.cc: Moved to...
* testsuite/20_util/bitset/cons/dr1325-2.cc: ...here.
* testsuite/23_containers/bitset/cons/dr396.cc: Moved to...
* testsuite/20_util/bitset/cons/dr396.cc: ...here.
* testsuite/23_containers/bitset/debug/invalidation/1.cc: Moved to...
* testsuite/20_util/bitset/debug/invalidation/1.cc: ...here.
* testsuite/23_containers/bitset/ext/15361.cc: Moved to...
* testsuite/20_util/bitset/ext/15361.cc: ...here.
* testsuite/23_containers/bitset/hash/1.cc: Moved to...
* testsuite/20_util/bitset/hash/1.cc: ...here.
* testsuite/23_containers/bitset/input/1.cc: Moved to...
* testsuite/20_util/bitset/io/input.cc: ...here.
* testsuite/23_containers/bitset/count/6124.cc: Moved to...
* testsuite/20_util/bitset/observers/6124.cc: ...here.
* testsuite/23_containers/bitset/all/1.cc: Moved to...
* testsuite/20_util/bitset/observers/all.cc: ...here.
* testsuite/23_containers/bitset/test/1.cc: Moved to...
* testsuite/20_util/bitset/observers/test.cc: ...here.
* testsuite/23_containers/bitset/operations/1.cc: Moved to...
* testsuite/20_util/bitset/operations/1.cc: ...here.
* testsuite/23_containers/bitset/operations/13838.cc: Moved to...
* testsuite/20_util/bitset/operations/13838.cc: ...here.
* testsuite/23_containers/bitset/operations/2.cc: Moved to...
* testsuite/20_util/bitset/operations/2.cc: ...here.
* testsuite/23_containers/bitset/operations/96303.cc: Moved to...
* testsuite/20_util/bitset/operations/96303.cc: ...here.
* testsuite/23_containers/bitset/operations/constexpr-2.cc: Moved to...
* testsuite/20_util/bitset/operations/constexpr-2.cc: ...here.
* testsuite/23_containers/bitset/operations/constexpr.cc: Moved to...
* testsuite/20_util/bitset/operations/constexpr.cc: ...here.
* testsuite/23_containers/bitset/requirements/constexpr_functions.cc: Moved to...
* testsuite/20_util/bitset/requirements/constexpr_functions.cc: ...here.
* testsuite/23_containers/bitset/requirements/explicit_instantiation/1.cc: Moved to...
* testsuite/20_util/bitset/requirements/explicit_instantiation/1.cc: ...here.
* testsuite/23_containers/bitset/requirements/explicit_instantiation/1_c++0x.cc: Moved to...
* testsuite/20_util/bitset/requirements/explicit_instantiation/1_c++0x.cc: ...here.
* testsuite/23_containers/headers/bitset/synopsis.cc: Moved to...
* testsuite/20_util/headers/bitset/synopsis.cc: ...here.
Ian Lance Taylor [Wed, 21 Sep 2022 23:26:08 +0000 (16:26 -0700)]
cmd/cgo: add and use runtime/cgo.Incomplete instead of //go:notinheap
This ports https://go.dev/cl/421879 to libgo. This is a quick port to
update gofrontend to work with the version of cgo in gc mainline.
A more complete port will follow, changing the gc version of cmd/cgo to
choose an approach based on feature testing the gccgo in use.
Patrick Palka [Thu, 22 Sep 2022 12:46:23 +0000 (08:46 -0400)]
c++ modules: partial variable template specializations [PR106826]
With partial variable template specializations, it looks like we
stream the VAR_DECL (i.e. the DECL_TEMPLATE_RESULT of the corresponding
TEMPLATE_DECL) since process_partial_specialization adds it to the
specializations table, but we end up never streaming the corresponding
TEMPLATE_DECL itself that's reachable only from the primary template's
DECL_TEMPLATE_SPECIALIZATIONS list, which leads to this list being
incomplete on stream-in.
The modules machinery already has special logic for streaming partial
specializations of class templates; this patch attempts to generalize
it to handle those of variable templates as well.
PR c++/106826
gcc/cp/ChangeLog:
* module.cc (trees_out::decl_value): Use get_template_info in
the MK_partial case to handle both VAR_DECL and TYPE_DECL.
(trees_out::key_mergeable): Likewise.
(trees_in::key_mergeable): Likewise.
(has_definition): Consider DECL_INITIAL of a partial variable
template specialization.
(depset::hash::make_dependency): Handle partial variable template
specializations too.
gcc/testsuite/ChangeLog:
* g++.dg/modules/partial-2_a.C: New test.
* g++.dg/modules/partial-2_b.C: New test.
David Malcolm [Thu, 22 Sep 2022 12:35:26 +0000 (08:35 -0400)]
c: fix uninitialized c_expr::m_decimal [PR106830]
I added c_expr::m_decimal in r13-2386-gbedfca647a9e9c1a as part of the
implementation of -Wxor-used-as-pow, but I missed various places where
the field needed to be initialized.
Fixed thusly.
gcc/c-family/ChangeLog:
PR c/106830
* c-warn.cc (check_for_xor_used_as_pow): Don't try checking
values that don't fit in uhwi.
Richard Biener [Wed, 21 Sep 2022 11:52:56 +0000 (13:52 +0200)]
tree-optimization/106922 - missed FRE/PRE
The following enhances the store-with-same-value trick in
vn_reference_lookup_3 by not only looking for
a = val;
*ptr = val;
.. = a;
but also
*ptr = val;
other = x;
.. = a;
where the earlier store is more than one hop away. It does this
by queueing the actual value to compare until after the walk but
as disadvantage only allows a single such skipped store from a
constant value.
Unfortunately we cannot handle defs from non-constants this way
since we're prone to pick up values from the past loop iteration
this way and we have no good way to identify values that are
invariant in the currently iterated cycle. That's why we keep
the single-hop lookup for those cases. gcc.dg/tree-ssa/pr87126.c
would be a testcase that's un-XFAILed when we'd handle those
as well.
PR tree-optimization/106922
* tree-ssa-sccvn.cc (vn_walk_cb_data::same_val): New member.
(vn_walk_cb_data::finish): Perform delayed verification of
a skipped may-alias.
(vn_reference_lookup_pieces): Likewise.
(vn_reference_lookup): Likewise.
(vn_reference_lookup_3): When skipping stores of the same
value also handle constant stores that are more than a
single VDEF away by delaying the verification.
* gcc.dg/tree-ssa/ssa-fre-100.c: New testcase.
* g++.dg/tree-ssa/pr106922.C: Adjust.
Max Filippov [Thu, 14 Jul 2022 09:39:59 +0000 (02:39 -0700)]
xtensa: gcc: implement MI thunk generation for call0 ABI
gcc/
* config/xtensa/xtensa.cc (xtensa_can_output_mi_thunk)
(xtensa_output_mi_thunk): New functions.
(TARGET_ASM_CAN_OUTPUT_MI_THUNK)
(TARGET_ASM_OUTPUT_MI_THUNK): New macro definitions.
(xtensa_prepare_expand_call): Use fixed register a8 as temporary
when called with reload_completed set to 1.
Richard Biener [Thu, 22 Sep 2022 07:40:40 +0000 (09:40 +0200)]
tree-optimization/99407 - DSE with data-ref analysis
The following resolves the issue that DSE cannot handle references
with variable offsets well when identifying possible uses of a store.
Instead of just relying on ref_maybe_used_by_stmt_p we use data-ref
analysis, making sure to perform that at most once per stmt. The
new mode is only exercised by the DSE pass before loop optimization
as specified by a new pass parameter and when expensive optimizations
are enabled, so it's disabled below -O2.
PR tree-optimization/99407
* tree-ssa-dse.cc (dse_stmt_to_dr_map): New global.
(dse_classify_store): Use data-ref analysis to disambiguate more uses.
(pass_dse::use_dr_analysis_p): New pass parameter.
(pass_dse::set_pass_param): Implement.
(pass_dse::execute): Allocate and deallocate dse_stmt_to_dr_map.
* passes.def: Allow DR analysis for the DSE pass before loop.
Jonathan Wakely [Wed, 21 Sep 2022 13:59:18 +0000 (14:59 +0100)]
libstdc++: Fix accidental duplicate test [PR91456]
It looks like I committed the testcase for std::function twice, instead
of one for std::function and one for std::is_invocable_r. This replaces
the is_invocable_r one with the example from the PR.
libstdc++-v3/ChangeLog:
PR libstdc++/91456
* testsuite/20_util/function/91456.cc: Add comment with PR
number.
* testsuite/20_util/is_invocable/91456.cc: Likewise. Replace
std::function checks with std::is_invocable_r checks.
[PR106967] Set known NANs to undefined for flag_finite_math_only.
Explicit NANs in the IL can be treated as undefined for
flag_finite_math_only. This causes all the right things to happen wrt
threading, folding, etc. It also saves us special casing throughout.
PR tree-optimization/106967
gcc/ChangeLog:
* value-range.cc (frange::set): Set known NANs to undefined for
flag_finite_math_only.
Clear unused flags in frange for undefined ranges.
gcc/ChangeLog:
* value-range.cc (frange::combine_zeros): Call set_undefined.
(frange::intersect_nans): Same.
(frange::intersect): Same.
(frange::verify_range): Undefined ranges do not have a type.
* value-range.h (frange::set_undefined): Clear NAN flags and type.
aarch64: Rewrite -march=native to -mcpu if no other -mcpu or -mtune is given
We have received requests to improve the out-of-the box experience and
performance of AArch64 GCC users, particularly those porting software from other
architectures. This has many aspects. One such aspect are apps built natively
with an -march=native used as a tuning flag in the Makefile.
On AArch64 this selects the right architecture features on GNU+Linux for the
host system but tunes for the "generic" CPU target.
This patch makes GCC also tune for the host CPU, as well as selecting its
architecture. That is, it translates -march=native into -mcpu=native.
This maintains the documentation that it "causes the compiler to pick the
architecture of the host system" since -mcpu=native does that, but it also
gives a better performance experience for the user.
If the user explicitly asked for a particular CPU tuning through -mcpu or
-mtune then we don't do this rewriting so that the user option is honoured.
This would have been a one-line patch if it wasn't for --with-tune
configure-time arguments. When GCC is configured with --with-tune=<CORE> the
OPTION_DEFAULT_SPECS will insert an -mtune=<CORE> in the options if no other
-mcpu or -mtune options were given. This will spook the aforementioned desired
rewriting of -march=native into -mcpu=native, though I'd argue that we want to
do the rewrite even then. Therefore, this patch moves some specs in aarch64.h
around and refactors the --with-tune rewriting into CONFIG_TUNE_SPEC so that
the materialization of the implicit -mtune=<CORE> does not happen if -march=native
is used.
Bootstrapped and tested on aarch64-none-linux-gnu and checked with the output
of -### from the driver that the option rewriting works as expected on
aarch64-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.h (HAVE_LOCAL_CPU_DETECT,
EXTRA_SPEC_FUNCTIONS, MCPU_MTUNE_NATIVE_SPECS): Move definitions up before
OPTION_DEFAULT_SPECS.
(MCPU_MTUNE_NATIVE_SPECS): Pass "cpu" to
local_cpu_detect when rewriting -march=native and no -mcpu or -mtune
is given.
(CONFIG_TUNE_SPEC): Define.
(OPTION_DEFAULT_SPECS): Use CONFIG_TUNE_SPEC for "tune".
[PR106967] frange: revamp relational operators for NANs.
Since NANs can be inserted by other passes even for -ffinite-math-only,
we can't depend on the flag to determine if a NAN is a possiblity.
Instead, we must explicitly check for them.
In the case of -ffinite-math-only, paths leading up to a NAN are
undefined and can be considered unreachable. I have audited all the
relational code and made sure we're handling the known NAN case before
anything else, setting undefined when appropriate.
In the process, I revamped all the relational code handling NANs to
correctly notice paths that are unreachable.
The basic structure for ordered relational operators (except != of
course) is this:
If either operand is a known NAN, return FALSE.
The true side of a relop when one operand is a NAN is
unreachable.
On the false side of a relop when one operand is a NAN, we
know nothing about the other operand.
Regstrapped on x86-64 and ppc64le Linux.
lapack testing on x86-64 with and without -ffinite-math-only.
Don't check can_vec_perm_const_p for nonlinear iv_init when it's constant.
When init_expr is INTEGER_CST or REAL_CST, can_vec_perm_const_p is not
necessary since there's no real vec_perm needed, but
vec_gen_perm_mask_checked will gcc_assert (can_vec_perm_const_p). So
it's better to use vec_gen_perm_mask_any in
vect_create_nonlinear_iv_init.
gcc/ChangeLog:
PR tree-optimization/106963
* tree-vect-loop.cc (vect_create_nonlinear_iv_init): Use
vec_gen_perm_mask_any instead of vec_gen_perm_mask_check.
Jonathan Wakely [Tue, 20 Sep 2022 23:46:04 +0000 (00:46 +0100)]
libstdc++: Add <initializer_list> to ranges_base.h header
The header should be included explicitly to use std::initializer_list.
With the upcoming changes to make <ranges> available for freestanding
this becomes an error, because <initializer_list> is no longer provided
by any of the other headers involved here.
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h: Include <initializer_list>.
Fortran: F2018 type(*),dimension(*) with scalars [PR104143]
Assumed-size dummy arguments accept arrays and array elements as actual
arguments. There are also a few exceptions when real scalars are permitted.
Since F2018, this includes scalar arguments to assumed-type dummies; while
type(*) was added in TS29113, this change is only in F2018 itself.
PR fortran/104143
gcc/fortran/ChangeLog:
* interface.cc (compare_parameter): Permit scalar args to
'type(*), dimension(*)'.
gcc/testsuite/ChangeLog:
* gfortran.dg/c-interop/c407b-2.f90: Remove dg-error.
* gfortran.dg/assumed_type_16.f90: New test.
* gfortran.dg/assumed_type_17.f90: New test.
Patrick Palka [Tue, 20 Sep 2022 20:13:48 +0000 (16:13 -0400)]
c++: xtreme-header modules tests cleanups
This adds some recently implemented C++20/23 library headers to the
xtreme-header tests as appropriate. Also, it looks like we can safely
re-add <execution> and remove the NO_ASSOCIATED_LAMBDA workaround.
gcc/testsuite/ChangeLog:
* g++.dg/modules/xtreme-header-2.h: Include <execution>.
* g++.dg/modules/xtreme-header-6.h: Include implemented
C++20 library headers.
* g++.dg/modules/xtreme-header.h: Likewise. Remove
NO_ASSOCIATED_LAMBDA workaround. Include implemented C++23
library headers.
Patrick Palka [Tue, 20 Sep 2022 20:08:14 +0000 (16:08 -0400)]
c++: modules and non-dependent auto deduction
The modules streaming code seems to rely on the invariant that a
TEMPLATE_DECL and its DECL_TEMPLATE_RESULT have the same TREE_TYPE.
But for a non-dependent VAR_DECL with deduced type, the two TREE_TYPEs
end up diverging: cp_finish_decl deduces the type of the initializer
ahead of time and updates the TREE_TYPE of the VAR_DECL, but neglects to
update the corresponding TEMPLATE_DECL as well, which leads to a
"conflicting global module declaration" error for each of the
__phase_alignment decls in the below testcase (and for the xtreme-header
tests if we try including <barrier>).
This patch makes cp_finish_decl update the TREE_TYPE of the corresponding
TEMPLATE_DECL so that the invariant is maintained.
gcc/cp/ChangeLog:
* decl.cc (cp_finish_decl): After updating the deduced type of a
VAR_DECL, also update the corresponding TEMPLATE_DECL if there
is one.
gcc/testsuite/ChangeLog:
* g++.dg/modules/auto-3.h: New test.
* g++.dg/modules/auto-3_a.H: New test.
* g++.dg/modules/auto-3_b.C: New test.
frange::maybe_isnan() should return FALSE for undefined ranges.
Undefined ranges have undefined NAN bits. We can't depend on them,
as they may contain garbage. This patch returns false from
maybe_isnan() for undefined ranges (the empty set).
gcc/ChangeLog:
* value-range.h (frange::maybe_isnan): Return false for
undefined ranges.
A specifically nonnegative range should not contain -NAN, otherwise
signbit_p() would return false, because we'd be unsure of the sign.
PR 68097/tree-optimization
gcc/ChangeLog:
* value-range.cc (frange::set_nonnegative): Set +NAN.
(range_tests_signed_zeros): New test.
* value-range.h (frange::update_nan): New overload to set NAN sign.
It turns out that GTY(()) markers in definitions like:
GTY(()) tree scalar_types[NUM_VECTOR_TYPES];
are not effective and are silently ignored. The GTY(()) has
to come after an extern or static.
The externs associated with the SVE ACLE GTY variables are in
aarch64-sve-builtins.h. This file is not in tm_include_list because
we don't want every target-facing file to include it. It therefore
isn't in the list of GC header files either.
In this case that's a blessing in disguise, since the variables
belong to a namespace and gengtype doesn't understand namespaces.
I think the fix is instead to add an extra extern before each
variable declaration, similarly to varasm.cc and vtable-verify.cc.
(This works due to a "using namespace" at the end of the file.)
gcc/
PR target/106491
* config/aarch64/aarch64-sve-builtins.cc (scalar_types)
(acle_vector_types, acle_svpattern, acle_svprfop): Add GTY
markup to (new) extern declarations instead of to the main
definition.
vect: Fix SLP layout handling of masked loads [PR106794]
PR106794 shows that I'd forgotten about masked loads when
doing the SLP layout changes. These loads can't currently
be permuted independently of their mask input, so during
construction they never get a load permutation.
(If we did support permuting masked loads in future, the mask
would need to be in the right order for the load, rather than in
the order implied by the result of the permutation. Since masked
loads can't be partly or fully scalarised in the way that normal
permuted loads can be, there's probably no benefit to fusing the
permutation and the load. Permutation after the fact is probably
good enough.)
gcc/
PR tree-optimization/106794
PR tree-optimization/106914
* tree-vect-slp.cc (vect_optimize_slp_pass::internal_node_cost):
Only consider loads that already have a permutation.
(vect_optimize_slp_pass::start_choosing_layouts): Assert that
loads with permutations are leaf nodes. Prevent any kind of grouped
access from changing layout if it doesn't have a load permutation.
gcc/testsuite/
* gcc.dg/vect/pr106914.c: New test.
* g++.dg/vect/pr106794.cc: Likewise.
While writing a testcase for PR106794, I noticed that we failed
to vectorise the testcase in the patch for SVE. The code that
recognises gather loads tries to optimise the point at which
the offset is calculated, to avoid unnecessary extensions or
truncations:
/* Don't include the conversion if the target is happy with
the current offset type. */
But breaking only makes sense if we're at an SSA_NAME (which could
then be vectorised). We shouldn't break on a conversion embedded
in a generic expression.
gcc/
* tree-vect-data-refs.cc (vect_check_gather_scatter): Restrict
early-out optimisation to SSA_NAMEs.
gcc/testsuite/
* gcc.dg/vect/vect-gather-5.c: New test.
Patrick Palka [Tue, 20 Sep 2022 14:19:30 +0000 (10:19 -0400)]
c++: stream PACK_EXPANSION_EXTRA_ARGS [PR106761]
It looks like after the libstdc++ commit r13-2158-g02f6b405f0e9dc
some xtreme-header-* tests are failing with "conflicting global module
declaration" errors ultimately because we're neglecting to stream
PACK_EXPANSION_EXTRA_ARGS, which leads to wrong equivalences of
different partial instantiations of _TupleConstraints::__constructible.