gcc.gnu.org Git - gcc.git/log

Update ChangeLog and version files for release

Daily bump.

Revert "c++: DMI in template with virtual base [PR106890]"

The PR106890 patch caused PR109666; for 12.3 let's just revert it.

PR c++/109666

This reverts commit 94569d91bd4c604da755b4aae84256e7fe21196a.

tree-optimization/109724 - new testcase

The following adds a testcase for PR109724 which was caused by
backporting r13-2375-gbe1b42de9c151d and fixed by r11-199-g2b42509f8b7bdf.

PR tree-optimization/109724
* g++.dg/torture/pr109724.C: New testcase.

(cherry picked from commit ee99aaae4aeecd55f1d945a959652cf07e3b2e9e)

Daily bump.

libstdc++: Set _M_string_length before calling _M_dispose() [PR109703]

This always sets _M_string_length in the constructor for ranges of input
iterators, such as stream iterators.

We copy from the source range to the local buffer, and then repeatedly
reallocate a larger one if necessary. When disposing the old buffer,
_M_is_local() is used to tell if the buffer is the local one or not (and
so must be deallocated). In addition to comparing the buffer address
with the local buffer, _M_is_local() has an optimization hint so that
the compiler knows that for a string using the local buffer, there is an
invariant that _M_string_length <= _S_local_capacity (added for PR109299
via r13-6915-gbf78b43873b0b7). But we failed to set _M_string_length in
the constructor taking a pair of iterators, so the invariant might not
hold, and __builtin_unreachable() is reached. This causes UBsan errors,
and potentially misoptimization.

To ensure the invariant holds, _M_string_length is initialized to zero
before doing anything else, so that _M_is_local() doesn't see an
uninitialized value.

This issue only surfaces when constructing a string with a range of
input iterator, and the uninitialized _M_string_length happens to be
greater than _S_local_capacity, i.e., 15 for the std::string
specialization.

libstdc++-v3/ChangeLog:

PR libstdc++/109703
* include/bits/basic_string.h (basic_string(Iter, Iter, Alloc)):
Initialize _M_string_length.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit cbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2)

libstdc++: Strip absolute paths from files shown in Doxygen docs

This avoids showing absolute paths from the expansion of
@srcdir@/libsupc++/ in the doxygen File List view.

libstdc++-v3/ChangeLog:

* doc/doxygen/user.cfg.in (STRIP_FROM_PATH): Remove prefixes
from header paths.

(cherry picked from commit 975e8e836ead0e9055a125a2a23463db5d847cb3)

c++: Fix up VEC_INIT_EXPR gimplification after r12-7069

During patch backporting, I've noticed that while most cp_walk_tree calls
with cp_fold_r callback callers were changed from &pset to cp_fold_data
&data, the VEC_INIT_EXPR gimplifications has not, so it still passes just
address of a hash_set<tree> and so if during the folding we ever touch
data->genericize, we use uninitialized data there.

The following patch changes it to do the same thing as cp_fold_function
because the VEC_INIT_EXPR gimplifications will happen on function bodies
only.

2023-05-03 Jakub Jelinek <jakub@redhat.com>

* cp-gimplify.cc (cp_fold_data): Move definition earlier.
(cp_gimplify_expr): Pass address of genericize = true
constructed data rather than &pset to cp_walk_tree with cp_fold_r.

(cherry picked from commit 8d193b12d6f07ae0196db8296a49c881c1638c01)

Daily bump.

c++, coroutines: Fix block nests when the function has no top-level bind.

When the function contains no local vars and also no nested scopes, there
is no top-level bind expression. Because the rewritten coroutine body will
require both local vars and contain nested scopes, we add a bind expression
to such functions. When this was done the necessary scope blocks were
omitted which leads to disconnected function content.

Fixed by adding a new block to the added bind expression.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (coro_rewrite_function_body): Ensure that added
bind expressions have scope blocks.

(cherry picked from commit a8d7631d333c22e38a067d32d11fd2b60cf1d960)

c++,coroutines: Stabilize names of promoted slot vars [PR101118].

When we need to 'promote' a value (i.e. store it in the coroutine frame) it
is given a frame entry name. This was based on the DECL_UID for slot vars.
However, when LTO is used, the names from multiple TUs become visible at the
same time, and the DECL_UIDs usually differ between units. This leads to a
"ODR mismatch" warning for the frame type.

The fix here is to use the current promoted temporaries count to produce
the name, this is stable between TUs and computed per coroutine.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/101118

gcc/cp/ChangeLog:

* coroutines.cc (flatten_await_stmt): Use the current count of
promoted temporaries to build a unique name for the frame entries.

(cherry picked from commit fc4cde2e6aa4d6ebdf7f70b7b4359fb59a1915ae)

coroutines: Build pointer initializers with nullptr_node [PR107768]

The PR reports that using integer_zero_node triggers a warning for
-Wzero-as-null-pointer-constant which comes from compiler-generated code so
makes no sense to the end user.

Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk>
PR c++/107768

gcc/cp/ChangeLog:

* coroutines.cc (coro_rewrite_function_body): Initialize pointers
from nullptr_node. (morph_fn_to_coro): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr107768.C: New test.

(cherry picked from commit 0b1d66658ecdcc3d9251641a0b902b4c73ace303)

Daily bump.

libstdc++: Implement LWG 3904 change to lazy_split_view's iterator

libstdc++-v3/ChangeLog:

* include/std/ranges (lazy_split_view::_OuterIter::_OuterIter):
Propagate _M_trailing_empty in the const-converting constructor
as per LWG 3904.
* testsuite/std/ranges/adaptors/lazy_split.cc (test12): New test.

(cherry picked from commit aa65771427d32299cffecea64cbb766411aa8faf)

libstdc++: Implement P2520R0 changes to move_iterator's iterator_concept

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (move_iterator::_S_iter_concept):
Define.
(__cpp_lib_move_iterator_concept): Define for C++20.
(move_iterator::iterator_concept): Strengthen as per P2520R0.
* include/std/version (__cpp_lib_move_iterator_concept): Define
for C++20.
* testsuite/24_iterators/move_iterator/p2520r0.cc: New test.

(cherry picked from commit 2b204accd07a3185b58b1edc6e9b019472857a5d)

libstdc++: Make views::single/iota/istream SFINAE-friendly [PR108362]

PR libstdc++/108362

libstdc++-v3/ChangeLog:

* include/std/ranges (__detail::__can_single_view): New concept.
(_Single::operator()): Constrain it.  Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_iota_view): New concept.
(_Iota::operator()): Constrain it.  Move [[nodiscard]] to the
end of the function declarator.
(__detail::__can_istream_view): New concept.
(_Istream::operator()): Constrain it.  Move [[nodiscard]] to the
end of the function declarator.
* testsuite/std/ranges/iota/lwg3292_neg.cc: Prune "in
requirements" diagnostic.
* testsuite/std/ranges/iota/iota_view.cc (test07): New test.
* testsuite/std/ranges/istream_view.cc (test08): New test.
* testsuite/std/ranges/single_view.cc (test07): New test.

(cherry picked from commit 95827e1b9f7d5dd5a697bd60292e3876a7e8c15c)

Daily bump.

libstdc++: Fix __max_diff_type::operator>>= for negative values

This patch fixes sign bit propagation when right-shifting a negative
__max_diff_type value by more than one, a bug that our existing test
coverage didn't expose until r14-159-g03cebd304955a6 fixed the front
end's 'signed typedef-name' handling that the test relies on (which is
a non-standard extension to the language grammar).

libstdc++-v3/ChangeLog:

* include/bits/max_size_type.h (__max_diff_type::operator>>=):
Fix propagation of sign bit.
* testsuite/std/ranges/iota/max_size_type.cc: Avoid using the
non-standard 'signed typedef-name'. Add some compile-time tests
for right-shifting a negative __max_diff_type value by more than
one.

(cherry picked from commit 83470a5cd4c3d233e1d55b5e5553e1b9c553bf28)

c++: outer 'this' leaking into local class [PR106969]

Here when resolving the implicit object for '&wrapped' within the
local class Foo, we expect to obtain a dummy object of type Foo& since
there's no 'this' available in this context.  And yet at this point
current_class_ref still corresponds to the outer class Context (and is
const), which confuses maybe_dummy_object into propagating the cv-quals
of current_class_ref and returning an object of type const Foo&.  Thus
decltype(&wrapped) wrongly yields const int* instead of int*.

The problem ultimately seems to be that the 'this' from the enclosing
class appears available for use when parsing the local class, but 'this'
shouldn't persist across classes like that.  This patch fixes this by
clearing current_class_ptr/ref before parsing a class definition.

After this change, for the test name-clash11.C in C++98 mode we would
now complain about an invalid use of 'this' in e.g.

  ASSERT (sizeof (this->A) == 16);

due to the way the test defines the ASSERT macro via a local class.
This patch redefines the macro using a local typedef instead.

PR c++/106969

gcc/cp/ChangeLog:

* parser.cc (cp_parser_class_specifier): Clear current_class_ptr
and current_class_ref sooner, before parsing a class definition.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/name-clash11.C: Fix ASSERT macro definition in
C++98 mode.
* g++.dg/lookup/this2.C: New test.

(cherry picked from commit bbf2424c57c2e13d1a972c4ef4e871c3119b9cb4)

c++: unevaluated array new-expr size constantness [PR108219]

Here we're mishandling the unevaluated array new-expressions due to a
supposed non-constant array size ever since r12-5253-g4df7f8c79835d569
made us no longer perform constant evaluation of non-manifestly-constant
expressions within unevaluated contexts.  This shouldn't make a difference
here since the array sizes are constant literals, except they're expressed
as NON_LVALUE_EXPR location wrappers around INTEGER_CST, wrappers which
used to get stripped as part of constant evaluation and now no longer do.
Moreover it means build_vec_init can't constant fold the MINUS_EXPR
'maxindex' passed from build_new_1 when in an unevaluated context (since
it tries reducing it via maybe_constant_value called with mce_unknown).

This patch fixes these issues by making maybe_constant_value (and
fold_non_dependent_expr) try folding an unevaluated non-manifestly-constant
operand via fold(), as long as it simplifies to a simple constant, rather
than doing no simplification at all.  This covers e.g. simple arithmetic
and casts including stripping of location wrappers around INTEGER_CST.

In passing, this patch also fixes maybe_constant_value to avoid constant
evaluating an unevaluated operand when called with mce_false, by adjusting
the early exit test appropriately.

Co-authored-by: Jason Merrill <jason@redhat.com>
PR c++/108219
PR c++/108218

gcc/cp/ChangeLog:

* constexpr.cc (fold_to_constant): Define.
(maybe_constant_value): Move up early exit test for unevaluated
operands.  Try reducing an unevaluated operand to a constant via
fold_to_constant.
(fold_non_dependent_expr_template): Add early exit test for
CONSTANT_CLASS_P nodes.  Try reducing an unevaluated operand
to a constant via fold_to_constant.
* cp-tree.h (fold_to_constant): Declare.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/new6.C: New test.
* g++.dg/cpp2a/concepts-new1.C: New test.

(cherry picked from commit 096f034a8f5df41f610e62c1592fb90a3f551cd5)

Daily bump.

libstdc++: Fix error in doxygen comments in <atomic>

libstdc++-v3/ChangeLog:

* include/std/atomic: Add missing @endcond doxygen comment.

libstdc++: Call predicate with non-const values in std::erase_if [PR107850]

As specified in the standard, the predicate for std::erase_if has to be
invocable as non-const with a non-const lvalues argument. Restore
support for predicates that only accept non-const arguments.

It's not strictly nevessary to change it for the set and unordered_set
overloads, because they only give const access to the elements anyway.
I've done it for them too just to keep them all consistent.

libstdc++-v3/ChangeLog:

PR libstdc++/107850
* include/bits/erase_if.h (__erase_nodes_if): Use non-const
reference to the container.
* include/experimental/map (erase_if): Likewise.
* include/experimental/set (erase_if): Likewise.
* include/experimental/unordered_map (erase_if): Likewise.
* include/experimental/unordered_set (erase_if): Likewise.
* include/std/map (erase_if): Likewise.
* include/std/set (erase_if): Likewise.
* include/std/unordered_map (erase_if): Likewise.
* include/std/unordered_set (erase_if): Likewise.
* testsuite/23_containers/map/erasure.cc: Check with
const-incorrect predicate.
* testsuite/23_containers/set/erasure.cc: Likewise.
* testsuite/23_containers/unordered_map/erasure.cc: Likewise.
* testsuite/23_containers/unordered_set/erasure.cc: Likewise.
* testsuite/experimental/map/erasure.cc: Likewise.
* testsuite/experimental/set/erasure.cc: Likewise.
* testsuite/experimental/unordered_map/erasure.cc: Likewise.
* testsuite/experimental/unordered_set/erasure.cc: Likewise.

(cherry picked from commit f54ceb2062c7fef294f85ae093914fa6c7ca35b8)

libstdc++: Replace non-ASCII character in comment

This has an unnecessary UTF-8 non-breaking space.

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
Replace non-ASCII character.

(cherry picked from commit c775e2b81fca39f366040d423e3e44f4abecf753)

libstdc++: Fix uses_allocator_construction_args for pair<T&&, U&&> [PR108952]

This implements LWG 3527 which fixes the handling of pair<T&&, U&&> in
std::uses_allocator_construction_args.

libstdc++-v3/ChangeLog:

PR libstdc++/108952
* include/bits/uses_allocator_args.h
(uses_allocator_construction_args): Implement LWG 3527.
* testsuite/20_util/pair/astuple/get-2.cc: New test.
* testsuite/20_util/scoped_allocator/108952.cc: New test.
* testsuite/20_util/uses_allocator/lwg3527.cc: New test.

(cherry picked from commit 8e342c04550466ab088c33746091ce7f3498ee44)

libstdc++: Avoid -Wmaybe-uninitialized warning in std::stop_source [PR109339]

We pass a const-reference to *this before it's constructed, and GCC
assumes that all const-references are accessed. Add the access attribute
to say it's not accessed.

libstdc++-v3/ChangeLog:

PR libstdc++/109339
* include/std/stop_token (_Stop_state_ptr(const stop_source&)):
Add attribute access with access-mode 'none'.
* testsuite/30_threads/stop_token/stop_source/109339.cc: New test.

(cherry picked from commit a35e8042fbc7a3eb9cece1fba4cdd3b6cdfb906f)

libstdc++: Make 16-bit std::subtract_with_carry_engine work [PR107466]

This implements the proposed resolution of LWG 3809, so that
std::subtract_with_carry_engine can be used with a 16-bit result_type.
Currently this produces a narrowing error when instantiating the
std::linear_congruential_engine to create the initial state. It also
truncates the default_seed constant when passing it as a result_type
argument.

Change the type of the constant to uint_least32_t and pass 0u when the
default_seed should be used.

libstdc++-v3/ChangeLog:

PR libstdc++/107466
* include/bits/random.h (subtract_with_carry_engine): Use 32-bit
type for default seed. Use 0u as default argument for
subtract_with_carry_engine(result_type) constructor and
seed(result_type) member function.
* include/bits/random.tcc (subtract_with_carry_engine): Use
32-bit type for default seed and engine used for initial state.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
New test.

(cherry picked from commit a64775a0edd46980036b757041f0c065ed9f8d22)

libstdc++: Fix typo in doxygen comment

libstdc++-v3/ChangeLog:

* include/bits/mofunc_impl.h: Fix typo in doxygen comment.

(cherry picked from commit 481281ccf41aa2bc596e548edaad4e57833f3340)

libstdc++: Reduce Doxygen output for PDF

Including the header source code in the doxygen-generated PDF file makes
it too large, and causes pdflatex to run out of memory. If we only set
SOURCE_BROWSER=YES for the HTML docs then we won't include the sources
in the PDF file.

There are several macros defined for std::valarray that are only used to
generate repetitive code and then #undef'd. Those aren't useful in the
doxygen docs, especially the ones that reuse the same name in different
files. Omitting them avoids warnings about duplicate labels in the
refman.tex file.

libstdc++-v3/ChangeLog:

* doc/doxygen/user.cfg.in (SOURCE_BROWSER): Only set to YES for
HTML docs.
* include/bits/gslice_array.h (_DEFINE_VALARRAY_OPERATOR): Omit
from doxygen docs.
* include/bits/indirect_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/mask_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/slice_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/std/valarray (_DEFINE_VALARRAY_UNARY_OPERATOR)
(_DEFINE_VALARRAY_AUGMENTED_ASSIGNMENT)
(_DEFINE_VALARRAY_EXPR_AUGMENTED_ASSIGNMENT)
(_DEFINE_BINARY_OPERATOR): Likewise.

(cherry picked from commit afa69618d1627435841c9164b019ef98000e0365)

PR tree-optimization/109392

If we have an object with SSA_NAME_OCCURS_IN_ABNORMAL_PHI, then
maybe_push_res_to_seq may fail. Directly build the extraction
for that case.

PR tree-optimization/109392

gcc/
* tree-vect-generic.cc (tree_vec_extract): Handle failure
of maybe_push_res_to_seq better.

gcc/testsuite/

* gcc.dg/pr109392.c: New test.

(cherry picked from commit 101380a8394c22a7a2ea70de2060ee93716156e2)

tree-optimization/108791 - checking ICE with sloppy ADDR_EXPR

The following fixes a checking ICE by choosing a more appropriate
type for an ADDR_EXPR built by forwprop.

PR tree-optimization/108791
* tree-ssa-forwprop.cc (optimize_vector_load): Build
the ADDR_EXPR of a TARGET_MEM_REF using a more meaningful
type.

* gcc.dg/torture/pr108791.c: New testcase.

(cherry picked from commit 441c466fd4d8b9afd99f585f7c4bfade911c4652)

PR rtl-optimization/106421: ICE in bypass_block from non-local goto.

This patch fixes PR rtl-optimization/106421, an ICE-on-valid (but
undefined) regression. The fix, as proposed by Richard Biener, is to
defend against BLOCK_FOR_INSN returning NULL in cprop's bypass_block.

2023-01-10 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
PR rtl-optimization/106421
* cprop.cc (bypass_block): Check that DEST is local to this
function (non-NULL) before calling find_edge.

gcc/testsuite/ChangeLog
PR rtl-optimization/106421
* gcc.dg/pr106421.c: New test case.

(cherry picked from commit 851e1ba03f9de699a754dd8648fc151c3e26d697)

x86: Disable -mforce-indirect-call for PIC in 32-bit mode

-mforce-indirect-call generates invalid instruction in 32-bit MI thunk
since there are no available scratch registers in 32-bit PIC mode.
Disable -mforce-indirect-call for PIC in 32-bit mode when generating
MI thunk.

gcc/

PR target/105980
* config/i386/i386.cc (x86_output_mi_thunk): Disable
-mforce-indirect-call for PIC in 32-bit mode.

gcc/testsuite/

PR target/105980
* g++.target/i386/pr105980.C: New test.

(cherry picked from commit a396a123596d82d4a2f14dc43a382cb17826411c)

Fix invalid devirtualization when combining final keyword and anonymous types

this patch fixes a wrong code issue where we incorrectly devirtualize to
__builtin_unreachable.  The problem occurs in combination of anonymous
namespaces and final keyword used on methods.  We do two optimizations here
1) when reacing final method we cut the search for possible new targets
2) if the type is anonymous we detect whether it is ever instatiated by
    looking if its vtable is referred to.
Now this goes wrong when thre is an anonymous type with final method that
is not instantiated while its derived type is.  So if 1 triggers we need
to make 2 to look for vtables of all derived types as done by this patch.

Bootstrpaped/regtested x86_64-linux

Honza

gcc/ChangeLog:

2022-08-10  Jan Hubicka  <hubicka@ucw.cz>

PR middle-end/106057
* ipa-devirt.cc (type_or_derived_type_possibly_instantiated_p): New
function.
(possible_polymorphic_call_targets): Use it.

gcc/testsuite/ChangeLog:

2022-08-10  Jan Hubicka  <hubicka@ucw.cz>

PR middle-end/106057
* g++.dg/tree-ssa/pr101839.C: New test.

(cherry picked from commit 0f2c7ccd14a29a8af8318f50b8296098fb0ab218)

Daily bump.

ipa: Fix double reference-count decrements for the same edge (PR 107769, PR 109318)

It turns out that since addition of the code that can identify globals
which are only read from, the code that keeps track of the references
can decrement their count for the same calls, once during IPA-CP and
then again during inlining.  Fixed by adding a special flag to the
pass-through variant and simply wiping out the reference to the
refdesc structure from the constant ones.

Moreover, during debugging of the issue I have discovered that the
code removing references could remove a reference associated with the
same statement but of a wrong type.  In all cases it wanted to remove
an IPA_REF_ADDR reference so removing a lesser one instead should do
no harm in practice, but we should try to be consistent and so this
patch extends symtab_node::find_reference so that it searches for a
reference of a given type only.

gcc/ChangeLog:

2023-04-14  Martin Jambor  <mjambor@suse.cz>

PR ipa/107769
PR ipa/109318
* cgraph.h (symtab_node::find_reference): Add parameter use_type.
* ipa-prop.h (ipa_pass_through_data): New flag refdesc_decremented.
(ipa_zap_jf_refdesc): New function.
(ipa_get_jf_pass_through_refdesc_decremented): Likewise.
(ipa_set_jf_pass_through_refdesc_decremented): Likewise.
* ipa-cp.cc (ipcp_discover_new_direct_edges): Provide a value for
the new parameter of find_reference.
(adjust_references_in_caller): Likewise. Make sure the constant jump
function is not used to decrement a refdec counter again.  Only
decrement refdesc counters when the pass_through jump function allows
it.  Added a detailed dump when decrementing refdesc counters.
* ipa-prop.cc (ipa_print_node_jump_functions_for_edge): Dump new flag.
(ipa_set_jf_simple_pass_through): Initialize the new flag.
(ipa_set_jf_unary_pass_through): Likewise.
(ipa_set_jf_arith_pass_through): Likewise.
(remove_described_reference): Provide a value for the new parameter of
find_reference.
(update_jump_functions_after_inlining): Zap refdesc of new jfunc if
the previous pass_through had a flag mandating that we do so.
(propagate_controlled_uses): Likewise.  Only decrement refdesc
counters when the pass_through jump function allows it.
(ipa_edge_args_sum_t::duplicate): Provide a value for the new
parameter of find_reference.
(ipa_write_jump_function): Assert the new flag does not have to be
streamed.
* symtab.cc (symtab_node::find_reference): Add parameter use_type, use
it in searching.

gcc/testsuite/ChangeLog:

2023-04-06  Martin Jambor  <mjambor@suse.cz>

PR ipa/107769
PR ipa/109318
* gcc.dg/ipa/pr109318.c: New test.
* gcc.dg/lto/pr107769_0.c: Likewise.

(cherry picked from commit 8e08c7886eed5824bebd0e011526ec302d622844)

powerpc: Fix up *branch_anddi3_dot for -m32 -mpowerpc64 [PR109566]

The following testcase reduced from newlib ICEs on powerpc-linux,
with -O2 -m32 -mpowerpc64 since r12-6433 PR102239 optimization was
added and on the original testcase since some ranger improvements in
GCC 13 made it no longer latent on newlib.
The problem is that the *branch_anddi3_dot define_insn_and_split
relies on the *rotldi3_mask_dot define_insn_and_split being recognized
during splitting.  The rs6000_is_valid_rotate_dot_mask function checks whether
the mask is a CONST_INT which is a valid mask, but *rotl<mode>3_mask_dot in
addition to checking that it is a valid mask also has
  (<MODE>mode == Pmode || UINTVAL (operands[3]) <= 0x7fffffff)
test in the condition.  For TARGET_64BIT that doesn't add any further
requirements, but for !TARGET_64BIT && TARGET_POWERPC64 if the AND
second operand is larger than INT_MAX it will not be recognized.

The rs6000_is_valid_rotate_dot_mask function is used solely in one spot,
condition of *branch_anddi3_dot, so the following patch adjusts it
to check for that as well.

2023-04-25  Jakub Jelinek  <jakub@redhat.com>

PR target/109566
* config/rs6000/rs6000.cc (rs6000_is_valid_rotate_dot_mask): For
!TARGET_64BIT, don't return true if UINTVAL (mask) << (63 - nb)
is larger than signed int maximum.

* gcc.target/powerpc/pr109566.c: New test.

(cherry picked from commit 97f8f2d0a0384d377ca46da88495f9a3d18d4415)

tree-optimization/109609 - correctly interpret arg size in fnspec

By majority vote and a hint from the API name which is
arg_max_access_size_given_by_arg_p this interprets a memory access
size specified as given as other argument such as for strncpy
in the testcase which has "1cO313" as specifying the _maximum_
size read/written rather than the exact size. There are two
uses interpreting it that way already and one differing. The
following adjusts the differing and clarifies the documentation.

PR tree-optimization/109609
* attr-fnspec.h (arg_max_access_size_given_by_arg_p):
Clarify semantics.
* tree-ssa-alias.cc (check_fnspec): Correctly interpret
the size given by arg_max_access_size_given_by_arg_p as
maximum, not exact, size.

* gcc.dg/torture/pr109609.c: New testcase.

(cherry picked from commit e8d00353017f895d03a9eabae3506fd126ce1a2d)

rtl-optimization/109585 - alias analysis typo

When r10-514-gc6b84edb6110dd2b4fb improved access path analysis
it introduced a typo that triggers when there's an access to a
trailing array in the first access path leading to false
disambiguation.

PR rtl-optimization/109585
* tree-ssa-alias.cc (aliasing_component_refs_p): Fix typo.

* gcc.dg/torture/pr109585.c: New testcase.

(cherry picked from commit 6d4bd27a60447c7505cb4783e675e98a191a8904)

tree-optimization/109573 - avoid ICEing on unexpected live def

The following relaxes the assert in vectorizable_live_operation
where we catch currently unhandled cases to also allow an
intermediate copy as it happens here but also relax the assert
to checking only.

PR tree-optimization/109573
* tree-vect-loop.cc (vectorizable_live_operation): Allow
unhandled SSA copy as well. Demote assert to checking only.

* g++.dg/vect/pr109573.cc: New testcase.

(cherry picked from commit cddfe6bc40b3dc0806e260bbfb4cac82d609a258)

Remove obsolete configure code in gnattools

It was recently pointed out that we generate symbolic links to ghost files
when building the GNAT tools, as the mlib-tgt-specific-*.adb files are gone.

gnattools/
* configure.ac (TOOLS_TARGET_PAIRS): Remove obsolete settings.
(EXTRA_GNATTOOLS): Likewise.
* configure: Regenerate.

Daily bump.

Fortran: resolve correct generic with TYPE(C_PTR) arguments [PR61615,PR99982]

gcc/fortran/ChangeLog:

PR fortran/61615
PR fortran/99982
* interface.cc (compare_parameter): Enable type and rank checks for
arguments of derived type from the intrinsic module ISO_C_BINDING.

gcc/testsuite/ChangeLog:

PR fortran/61615
PR fortran/99982
* gfortran.dg/interface_49.f90: New test.

(cherry picked from commit c482995cc5bac4a2168ea0049041e712544e474b)

Fortran: handle zero-sized arrays in ctors with typespec [PR108010]

gcc/fortran/ChangeLog:

PR fortran/108010
* arith.cc (reduce_unary): Handle zero-sized arrays.
(reduce_binary_aa): Likewise.

gcc/testsuite/ChangeLog:

PR fortran/108010
* gfortran.dg/pr108010.f90: New test.

(cherry picked from commit 7d6512d102a5a4fb3939de95bce38d154512a19f)

Daily bump.

c++: attribute on dtor in template [PR108795]

Since r7-2549 we were throwing away the explicit C:: when we found that ~C
has an attribute that we treat as making its type dependent.

PR c++/108795

gcc/cp/ChangeLog:

* semantics.cc (finish_id_expression_1): Check scope before
returning id_expression.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-tsafe1.C: New test.

libstdc++: Optimize std::try_facet and std::use_facet [PR103755]

The std::try_facet and std::use_facet functions were optimized in
r13-3888-gb3ac43a3c05744 to avoid redundant checking for all facets that
are required to always be present in every locale.

This performs a simpler version of the optimization that only applies to
std::ctype<char>, std::num_get<char>, std::num_put<char>, and the
wchar_t specializations of those facets. Those are the facets that are
cached by std::basic_ios, which means they're used on construction for
every iostream object. This smaller change is suitable for the gcc-12
branch, and mitigates the performance loss for powerpc64le-linux on the
gcc-12 branch caused by r12-9454-g24cf9f4c6f45f7 for PR 103387. It also
greatly improves the performance of constructing iostreams objects, for
all targets.

libstdc++-v3/ChangeLog:

PR libstdc++/103755
* include/bits/locale_classes.tcc (try_facet, use_facet): Do not
check array index or dynamic type when accessing required
specializations of std::ctype, std::num_get, or std::num_put.
* testsuite/22_locale/ctype/is/string/89728_neg.cc: Adjust
expected errors.

Fix handling of large arguments passed by value.

2023-04-15  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR target/109478
* config/pa/pa-protos.h (pa_function_arg_size): Update prototype.
* config/pa/pa.cc (pa_function_arg): Return NULL_RTX if argument
size is zero.
(pa_arg_partial_bytes): Don't call pa_function_arg_size twice.
(pa_function_arg_size): Change return type to int.  Return zero
for arguments larger than 1 GB.  Update comments.

rs6000: correct vector sign extend builtins on Big Endian

gcc/
PR target/108812
* config/rs6000/vsx.md (vsx_sign_extend_qi_<mode>): Rename to...
(vsx_sign_extend_v16qi_<mode>): ... this.
(vsx_sign_extend_hi_<mode>): Rename to...
(vsx_sign_extend_v8hi_<mode>): ... this.
(vsx_sign_extend_si_v2di): Rename to...
(vsx_sign_extend_v4si_v2di): ... this.
(vsignextend_qi_<mode>): Remove.
(vsignextend_hi_<mode>): Remove.
(vsignextend_si_v2di): Remove.
(vsignextend_v2di_v1ti): Remove.
(*xxspltib_<mode>_split): Replace gen_vsx_sign_extend_qi_v2di with
gen_vsx_sign_extend_v16qi_v2di and gen_vsx_sign_extend_qi_v4si
with gen_vsx_sign_extend_v16qi_v4si.
* config/rs6000/rs6000.md (split for DI constant generation):
Replace gen_vsx_sign_extend_qi_si with gen_vsx_sign_extend_v16qi_si.
(split for HSDI constant generation): Replace gen_vsx_sign_extend_qi_di
with gen_vsx_sign_extend_v16qi_di and gen_vsx_sign_extend_qi_si
with gen_vsx_sign_extend_v16qi_si.
* config/rs6000/rs6000-builtins.def (__builtin_altivec_vsignextsb2d):
Set bif-pattern to vsx_sign_extend_v16qi_v2di.
(__builtin_altivec_vsignextsb2w): Set bif-pattern to
vsx_sign_extend_v16qi_v4si.
(__builtin_altivec_visgnextsh2d): Set bif-pattern to
vsx_sign_extend_v8hi_v2di.
(__builtin_altivec_vsignextsh2w): Set bif-pattern to
vsx_sign_extend_v8hi_v4si.
(__builtin_altivec_vsignextsw2d): Set bif-pattern to
vsx_sign_extend_si_v2di.
(__builtin_altivec_vsignext): Set bif-pattern to
vsx_sign_extend_v2di_v1ti.
* config/rs6000/rs6000-builtin.cc (lxvrse_expand_builtin): Replace
gen_vsx_sign_extend_qi_v2di with gen_vsx_sign_extend_v16qi_v2di,
gen_vsx_sign_extend_hi_v2di with gen_vsx_sign_extend_v8hi_v2di and
gen_vsx_sign_extend_si_v2di with gen_vsx_sign_extend_v4si_v2di.

gcc/testsuite/
PR target/108812
* gcc.target/powerpc/p9-sign_extend-runnable.c: Set corresponding
expected vectors for Big Endian.
* gcc.target/powerpc/int_128bit-runnable.c: Likewise.

(cherry picked from commit a213e2c965382c24fe391ee5798effeba8da0fdf)

Daily bump.

Revert "c++: signed __int128_t [PR108099]"

Given the remaining issues even after these patches, let's leave this bug
alone on the GCC 12 branch for now.

This reverts commit 32955c1c2246aa336d3fd2423c32546c39a6ca30.
This reverts commit 7b30f13b904f137c77e5180357af7917a3b47af0.

libstdc++: Avoid bogus warning in std::vector::insert [PR107852]

GCC assumes that any global variable might be modified by operator new,
and so in the testcase for this PR all data members get reloaded after
allocating new storage. By making local copies of the _M_start and
_M_finish members we avoid that, and then the compiler has enough info
to remove the dead branches that trigger bogus -Warray-bounds warnings.

libstdc++-v3/ChangeLog:

PR libstdc++/107852
PR libstdc++/106199
PR libstdc++/100366
* include/bits/vector.tcc (vector::_M_fill_insert): Copy
_M_start and _M_finish members before allocating.
(vector::_M_default_append): Likewise.
(vector::_M_range_insert): Likewise.

(cherry picked from commit cca06f0d6d76b08ed4ddb7667eda93e2e9f2589e)

libstdc++: Fix outdated docs about demangling exception messages

The string returned by std::bad_exception::what() hasn't been a mangled
name since PR libstdc++/14493 was fixed for GCC 4.2.0, so remove the
docs showing how to demangle it.

libstdc++-v3/ChangeLog:

* doc/xml/manual/extensions.xml: Remove std::bad_exception from
example program.
* doc/html/manual/ext_demangling.html: Regenerate.

(cherry picked from commit 688d126b69215db29774c249b052e52d765782b3)

libstdc++: Define std::sub_match::swap member function (LWG 3204)

This was approved at the C++ meeting in February.

libstdc++-v3/ChangeLog:

* include/bits/regex.h (sub_match::swap): New function.
* testsuite/28_regex/sub_match/lwg3204.cc: New test.

(cherry picked from commit 44e17b8d8999a658af9f86681504d74a119a5f6f)

libstdc++: Do not use facets cached in ios for ALT128 build [PR103387]

For the powerpc64le build with two different long double
representations, we cannot use the ios_base::_M_num_put and
ios_base::_M_num_get pointers, because they might have been initialized
in a translation unit using the other long double type. With the changes
to add __try_use_facet to GCC 13 the cache isn't really needed now, we
can just access the right facet in the locale directly, without needing
RTTI checks.

libstdc++-v3/ChangeLog:

PR libstdc++/103387
* include/bits/istream.tcc (istream::_M_extract(ValueT&)): Use
std::use_facet instead of cached _M_num_get facet.
(istream::operator>>(short&)): Likewise.
(istream::operator>>(int&)): Likewise.
* include/bits/ostream.tcc (ostream::_M_insert(ValueT)): Use
std::use_facet instead of cached _M_num_put facet.

(cherry picked from commit ec12639c82e944d37200a744156e183ea19add00)

libstdc++: Remove unconditional -pthread from test options

This shouldn't be in the common options, it's already added for the
relevant targets using dg-additional-options.

libstdc++-v3/ChangeLog:

* testsuite/30_threads/jthread/jthread.cc: Remove -pthread from
dg-options.

(cherry picked from commit 5ba715ed1aaf24d4f07c0d250a0a9df03aa0acb2)

libstdc++: Fix assigning nullptr to std::atomic<shared_ptr<T>> (LWG 3893)

LWG voted this to Tentatively Ready recently.

libstdc++-v3/ChangeLog:

* include/bits/shared_ptr_atomic.h (atomic::operator=(nullptr_t)):
Add overload, as per LWG 3893.
* testsuite/20_util/shared_ptr/atomic/atomic_shared_ptr.cc:
Check assignment from nullptr.

(cherry picked from commit a495b738e4a89a8104798d005fd09474bbb916ff)

libstdc++: Apply small fix from LWG 3843 to std::expected

LWG 3843 adds some type requirements to std::expected::value to ensure
that it can correctly copy the error value if it needs to throw an
exception. We don't need to do anything to enforce that, because it will
already be ill-formed if the type can't be copied. The issue also makes
a small drive-by fix to ensure that a const E& is copied from the
non-const value()& overload, which this change implements.

libstdc++-v3/ChangeLog:

* include/std/expected (expected::value() &): Use const lvalue
for unex member passed to bad_expected_access constructor, as
per LWG 3843.

(cherry picked from commit ce39714a1ce58f2f32e8a44a224061290670db0f)

libstdc++: Use emplace in std::variant::operator=(T&&) as per LWG 3585

This was approved at the October 2021 plenary.

libstdc++-v3/ChangeLog:

* include/std/variant (variant::operator=): Implement resolution
of LWG 3585.
* testsuite/20_util/variant/lwg3585.cc: New test.

(cherry picked from commit 1395415fdc2d60e5346dbcf476749daf42d5b724)

testsuite: Fix up g++.dg/ext/int128-8.C testcase [PR109560]

The testcase needs to be restricted to int128 effective targets,
it expectedly fails on i386 and other 32-bit targets.

2023-04-20 Jakub Jelinek <jakub@redhat.com>

PR c++/108099
PR testsuite/109560
* g++.dg/ext/int128-8.C: Require int128 effective target.

(cherry picked from commit bd4a1a547242a924663712ac7a13799433cdf476)

Daily bump.

c++: fix 'unsigned __int128_t' semantics [PR108099]

My earlier patch for 108099 made us accept this non-standard pattern but
messed up the semantics, so that e.g. unsigned __int128_t was not a 128-bit
type.

PR c++/108099

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Keep typedef_decl for __int128_t.

gcc/testsuite/ChangeLog:

* g++.dg/ext/int128-8.C: New test.

Daily bump.

c++: constexpr aggregate destruction [PR109357]

We were assuming that the result of evaluation of TARGET_EXPR_INITIAL would
always be the new value of the temporary, but that's not necessarily true
when the initializer is complex (i.e. target_expr_needs_replace). In that
case evaluating the initializer initializes the temporary as a side-effect.

PR c++/109357

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression) [TARGET_EXPR]:
Check for complex initializer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-dtor15.C: New test.

c-family: -Wsequence-point and COMPONENT_REF [PR107163]

The patch for PR91415 fixed -Wsequence-point to treat shifts and ARRAY_REF
as sequenced in C++17, and COMPONENT_REF as well. But this is unnecessary
for COMPONENT_REF, since the RHS is just a FIELD_DECL with no actual
evaluation, and in this testcase handling COMPONENT_REF as sequenced blows
up fast in a deep inheritance tree. Instead, look through it.

PR c++/107163

gcc/c-family/ChangeLog:

* c-common.cc (verify_tree): Don't use sequenced handling
for COMPONENT_REF.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wsequence-point-5.C: New test.

c++: default template arg, partial ordering [PR105481]

The default argument code in type_unification_real was assuming that all
targs we've deduced by that point are non-dependent, but that's not the case
for partial ordering.

PR c++/105481

gcc/cp/ChangeLog:

* pt.cc (type_unification_real): Adjust for partial ordering.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/fntmpdefarg-partial1.C: New test.

c++: constexpr PMF conversion [PR105996]

Here, we were calling build_reinterpret_cast regardless of whether there was
actually a cast, and that now sets REINTERPRET_CAST_P. But that
optimization seems dodgy anyway, as it involves NOP_EXPR from one
RECORD_TYPE to another and we try to reserve NOP_EXPR for fundamental types.
And the generated code seems the same, so let's drop it. And also strip
location wrappers.

PR c++/105996

gcc/cp/ChangeLog:

* typeck.cc (build_ptrmemfunc): Drop 0-offset optimization
and location wrappers.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-pmf3.C: New test.

c++: DMI in template with virtual base [PR106890]

When parsing a default member init we just build a CONVERT_EXPR for
converting to a virtual base, and then expand that into the more complex
form when we actually use the DMI in a constructor.  But that wasn't working
for the template case where we are considering the conversion at the point
that the constructor needs the DMI instantiation, so it seemed like we were
in a constructor already.  And then when the other constructor tries to
reuse the instantiation, it sees uses of the first constructor's parameters,
and dies.  So ensure that we get the CONVERT_EXPR in this case, too.

PR c++/106890

gcc/cp/ChangeLog:

* init.cc (maybe_instantiate_nsdmi_init): Don't leave
current_function_decl set to a constructor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template25.C: New test.

c++: constant, array, lambda, template [PR108975]

When a lambda refers to a constant local variable in the enclosing scope, we
tentatively capture it, but if we end up pulling out its constant value, we
go back at the end of the lambda and prune any unneeded captures. Here
while parsing the template we decided that the dim capture was unneeded,
because we folded it away, but then we brought back the use in the template
trees that try to preserve the source representation with added type info.
So then when we tried to instantiate that use, we couldn't find what it was
trying to use, and crashed.

Fixed by not trying to prune when parsing a template; we'll prune at
instantiation time.

PR c++/108975

gcc/cp/ChangeLog:

* lambda.cc (prune_lambda_captures): Don't bother in a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-const11.C: New test.

c++: namespace-scoped friend in local class [PR69410]

do_friend was only considering class-qualified identifiers for the
qualified-id case, but we also need to skip local scope when there's an
explicit namespace scope.

PR c++/69410

gcc/cp/ChangeLog:

* friend.cc (do_friend): Handle namespace as scope argument.
* decl.cc (grokdeclarator): Pass down in_namespace.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/friend24.C: New test.

c++: __func__ and local class DMI [PR105809]

As in 108242, we need to instantiate in the context of the enclosing
function, not after it's gone.

PR c++/105809

gcc/cp/ChangeLog:

* init.cc (get_nsdmi): Split out...
(maybe_instantiate_nsdmi_init): ...this function.
* cp-tree.h: Declare it.
* pt.cc (tsubst_expr): Use it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-__func__3.C: New test.

c++: generic lambda, local class, __func__ [PR108242]

Here we are trying to do name lookup in a deferred instantiation of t() and
failing to find __func__. tsubst_expr already tries to instantiate members
of local classes, but was failing with the partial instantiation of generic
lambdas.

PR c++/108242

gcc/cp/ChangeLog:

* pt.cc (tsubst_expr) [TAG_DEFN]: Handle partial instantiation.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/lambda-generic-func2.C: New test.

c++: &enum::enumerator [PR101869]

We don't want to call build_offset_ref with an enum.

PR c++/101869

gcc/cp/ChangeLog:

* semantics.cc (finish_qualified_id_expr): Don't try to build a
pointer-to-member if the scope is an enumeration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/enum43.C: New test.

c++: co_await and move-only type [PR105406]

Here we were building a temporary MoveOnlyAwaitable to hold the result of
evaluating 'o', but since 'o' is an lvalue we should build a reference
instead.

PR c++/105406

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Handle lvalue 'o'.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/co-await-moveonly1.C: New test.

c++: co_await and initializer_list [PR103871]

When flatten_await_stmt processes the backing array for an initializer_list,
we call cp_build_modify_expr to initialize the promoted variable from the
TARGET_EXPR; that needs to be accepted.

PR c++/103871
PR c++/98056

gcc/cp/ChangeLog:

* typeck.cc (cp_build_modify_expr): Allow array initialization of
DECL_ARTIFICIAL variable.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/co-await-initlist1.C: New test.

c++: variable tmpl partial specialization [PR108468]

Generally we expect TPARMS_PRIMARY_TEMPLATE to be set, but sometimes it
isn't for partial instantiations. This ought to be improved, but it's
trivial to work around it in this case.

PR c++/108468

gcc/cp/ChangeLog:

* pt.cc (unify_pack_expansion): Check that TPARMS_PRIMARY_TEMPLATE
is non-null.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ78.C: New test.

c++: -Wreturn-type with if (true) throw [PR107310]

I removed this folding in GCC 12 because it was interfering with an
experiment of richi's, but that never went in and the change causes
regressions, so let's put it back.

This reverts commit r12-5638-ga3e75c1491cd2d.

PR c++/107310

gcc/cp/ChangeLog:

* cp-gimplify.cc (genericize_if_stmt): Restore folding
of constant conditions.

gcc/testsuite/ChangeLog:

* c-c++-common/Wimplicit-fallthrough-39.c: Adjust warning.
* g++.dg/warn/Wreturn-6.C: New test.

c++: class NTTP and nested anon union [PR108566]

We were failing to come up with the name for the anonymous union. It seems
like unfortunate redundancy, but the ABI does say that the name of an
anonymous union is its first named member.

PR c++/108566

gcc/cp/ChangeLog:

* mangle.cc (anon_aggr_naming_decl): New.
(write_unqualified_name): Use it.

gcc/testsuite/ChangeLog:

* g++.dg/abi/anon6.C: New test.

c++: fix debug info for array temporary [PR107154]

In the testcase the elaboration of the array init that happens at genericize
time was getting the location info for the end of the function; fixed by
doing the expansion at the location of the original expression.

PR c++/107154

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_genericize_init_expr): Use iloc_sentinel.
(cp_genericize_target_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/lineno-array1.C: New test.

c++: signed __int128_t [PR108099]

The code for handling signed + typedef was breaking on __int128_t, because
it isn't a proper typedef: it doesn't have DECL_ORIGINAL_TYPE.

PR c++/108099

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Handle non-typedef typedef_decl.

gcc/testsuite/ChangeLog:

* g++.dg/ext/int128-7.C: New test.

reassoc: Fix up another ICE with returns_twice call [PR109410]

The following testcase ICEs in reassoc, unlike the last case I've fixed
there here SSA_NAME_USED_IN_ABNORMAL_PHI is not the case anywhere.
build_and_add_sum places new statements after the later appearing definition
of an operand but if both operands are default defs or constants, we place
statement at the start of the function.

If the very first statement of a function is a call to returns_twice
function, this doesn't work though, because that call has to be the first
thing in its basic block, so the following patch splits the entry successor
edge such that the new statements are added into a different block from the
returns_twice call.

I think we should in stage1 reconsider such placements, I think it
unnecessarily enlarges the lifetime of the new lhs if its operand(s)
are used more than once in the function.  Unless something sinks those
again.  Would be nice to place it closer to the actual uses (or where
they will be placed).

2023-04-12  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/109410
* tree-ssa-reassoc.cc (build_and_add_sum): Split edge from entry
block if first statement of the function is a call to returns_twice
function.

* gcc.dg/pr109410.c: New test.

(cherry picked from commit 51856718a82ce60f067910d9037ca255645b37eb)

c++: Fix Solaris bootstraps across midnight

When working on the PR109040 fix, I wanted to test it on some
WORD_REGISTER_OPERATIONS target and tried sparc-solaris on GCC Farm.
My bootstrap failed in comparison failure on cp/module.o, because
Solaris date doesn't support the -r option and one stage's cp/module.o
was built before midnight and next stage's cp/module.o after midnight,
so they had different -DMODULE_VERSION= value.

Now, I think the advice (don't bootstrap at midnight) is something
we shouldn't have, so the following patch stores the module version
(still generated through the same way, date -r cp/module.cc
if it works, otherwise just date) into a temporary file, makes sure
that temporary file is updated when cp/module.cc source is updated
and when date -r doesn't work copies file from previous stage
if it is newer than cp/module.cc.

2023-04-12 Jakub Jelinek <jakub@redhat.com>

* Make-lang.in (s-cp-module-version): New target.
(cp/module.o): Depend on it.
(MODULE_VERSION): Remove variable.
(CFLAGS-cp/module.o): For -DMODULE_VERSION= argument just
cat s-cp-module-version.

(cherry picked from commit 56529056cb42baa382c40de7d239d02dbf72c94f)

libiberty: Make strstr.c in libiberty ANSI compliant

On Fri, Nov 13, 2020 at 11:53:43AM -0700, Jeff Law via Gcc-patches wrote:
>
> On 5/1/20 6:06 PM, Seija Kijin via Gcc-patches wrote:
> > The original code in libiberty says "FIXME" and then says it has not been
> > validated to be ANSI compliant. However, this patch changes the function to
> > match implementations that ARE compliant, and such code is in the public
> > domain.
> >
> > I ran the test results, and there are no test failures.
>
> Thanks.  This seems to be the standard "simple" strstr implementation.
> There's significantly faster implementations available, but I doubt it's
> worth the effort as the version in this file only gets used if there is
> no system strstr.c.

Except that PR109306 says the new version is non-compliant and
is certainly slower than what we used to have.  The only problem I see
on the old version (sure, it is not very fast version) is that for
strstr ("abcd", "") it returned "abcd"+4 rather than "abcd" because
strchr in that case changed p to point to the last character and then
strncmp returned 0.

The question reported in PR109306 is whether memcmp is required not to
access characters beyond the first difference or not.
For all of memcmp/strcmp/strncmp, C17 says:
"The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp
is determined by the sign of the difference between the values of the first pair of characters (both
interpreted as unsigned char) that differ in the objects being compared."
but then in memcmp description says:
"The memcmp function compares the first n characters of the object pointed to by s1 to the first n
characters of the object pointed to by s2."
rather than something similar to strncmp wording:
"The strncmp function compares not more than n characters (characters that follow a null character
are not compared) from the array pointed to by s1 to the array pointed to by
s2."
So, while for strncmp it seems clearly well defined when there is zero
terminator before reaching the n, for memcmp it is unclear if say
int
memcmp (const void *s1, const void *s2, size_t n)
{
  int ret = 0;
  size_t i;
  const unsigned char *p1 = (const unsigned char *) s1;
  const unsigned char *p2 = (const unsigned char *) s2;

  for (i = n; i; i--)
    if (p1[i - 1] != p2[i - 1])
      ret = p1[i - 1] < p2[i - 1] ? -1 : 1;
  return ret;
}
wouldn't be valid implementation (one which always compares all characters
and just returns non-zero from the first one that differs).

So, shouldn't we just revert and handle the len == 0 case correctly?

I think almost nothing really uses it, but still, the old version
at least worked nicer with a fast strchr.
Could as well strncmp (p + 1, s2 + 1, len - 1) if that is preferred
because strchr already compared the first character.

2023-04-02  Jakub Jelinek  <jakub@redhat.com>

PR other/109306
* strstr.c: Revert the 2020-11-13 changes.
(strstr): Return s1 if len is 0.

(cherry picked from commit 1719fa40c4ee4def60a2ce2f27e17f8168cf28ba)

c++: Fix up ICE in build_min_non_dep_op_overload [PR109319]

The following testcase ICEs, because grok_array_decl during
processing_template_decl handling of a non-dependent subscript
emits a -Wcomma-subscript pedwarn, we decide to pass to the
single index argument the index expressions as if it was wrapped
with () around it, but then when preparing it for later instantiation
we don't actually take that into account and ICE on a mismatch of
number of index arguments (the overload expects a single index,
testcase has two index expressions in this case).
For non-dependent subscript which are builtin subscripts we also
emit the same pedwarn and don't ICE, but emit the same pedwarn
again whenever we instantiate it, which is also IMHO undesirable,
it is enough to warn once during parsing the template.

The following patch fixes it by turning even the original index expressions
(those which didn't go through make_args_non_dependent) into a single
index using comma expression(s).

2023-03-30 Jakub Jelinek <jakub@redhat.com>

PR c++/109319
* decl2.cc (grok_array_decl): After emitting a pedwarn for
-Wcomma-subscript, if processing_template_decl set orig_index_exp
to compound expr from orig_index_exp_list.

* g++.dg/cpp23/subscript14.C: New test.

(cherry picked from commit c016887c91a79d67b6a3c7e19b9219f5ab1e2a4d)

sanopt: Return TODO_cleanup_cfg if any .{UB,HWA,A}SAN_* calls were lowered [PR106190]

The following testcase ICEs, because without optimization eh lowering
decides not to duplicate finally block of try/finally and so we end up
with variable guarded cleanup.  The sanopt pass creates a cfg that ought
to be cleaned up (some IFN_UBSAN_* functions are lowered in this case with
constant conditions in gcond and when not allowing recovery some bbs which
end with noreturn calls actually have successor edges), but the cfg cleanup
is actually (it is -O0) done only during the optimized pass.  We notice
there that the d[1][a] = 0; statement which has an EH edge is unreachable
(because ubsan would always abort on the out of bounds d[1] access), remove
the EH landing pad and block, but because that block just sets a variable
and jumps to another one which tests that variable and that one is reachable
from normal control flow, the __builtin_eh_pointer (1) later in there is
kept in the IL and we ICE during expansion of that statement because the
EH region has been removed.

The following patch fixes it by doing the cfg cleanup already during
sanopt pass if we create something that might need it, while the EH
landing pad is then removed already during sanopt pass, there is ehcleanup
later and we don't ICE anymore.

2023-03-28  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/106190
* sanopt.cc (pass_sanopt::execute): Return TODO_cleanup_cfg if any
of the IFN_{UB,HWA,A}SAN_* internal fns are lowered.

* gcc.dg/asan/pr106190.c: New test.

(cherry picked from commit 39a43dc336561e0eba0de477b16c7355f19d84ee)

i386: Require just 32-bit alignment for SLOT_FLOATxFDI_387 -m32 -mpreferred-stack-boundary=2 DImode temporaries [PR109276]

The following testcase ICEs since r11-2259 because assign_386_stack_local
-> assign_stack_local -> ix86_local_alignment now uses 64-bit alignment
for DImode temporaries rather than 32-bit as before.
Most of the spots in the backend which ask for DImode temporaries are during
expansion and those apparently are handled fine with -m32
-mpreferred-stack-boundary=2, we dynamically realign the stack in that case
(most of the spots actually don't need that alignment but at least one
does), then 2 spots are in STV which I assume also work correctly.
But during splitting we can create a DImode slot which doesn't need to be
64-bit alignment (it is nicer for performance though), when we apparently
aren't able to detect it for dynamic stack realignment purposes.

The following patch just makes the slot 32-bit aligned in that rare case.

2023-03-28 Jakub Jelinek <jakub@redhat.com>

PR target/109276
* config/i386/i386.cc (assign_386_stack_local): For DImode
with SLOT_FLOATxFDI_387 and -m32 -mpreferred-stack-boundary=2 pass
align 32 rather than 0 to assign_stack_local.

* gcc.target/i386/pr109276.c: New test.

(cherry picked from commit 4b5ef857f5faf09f274c0a95c67faaa80d198124)

predict: Don't emit -Wsuggest-attribute=cold warning for functions which already have that attribute [PR105685]

In the following testcase, we predict baz to have cold
entry regardless of the user supplied attribute (as it call
unconditionally a cold function), but still issue
a -Wsuggest-attribute=cold warning despite it having that attribute
already.

The following patch avoids that.

2023-03-26 Jakub Jelinek <jakub@redhat.com>

PR ipa/105685
* predict.cc (compute_function_frequency): Don't call
warn_function_cold if function already has cold attribute.

* c-c++-common/cold-2.c: New test.

(cherry picked from commit 7eca91d4781bb3df941f25c30b971dac66ba1b3d)

tree-vect-generic: Fix up expand_vector_condition [PR109176]

The following testcase ICEs on aarch64-linux, because
expand_vector_condition attempts to piecewise lower SVE
  d_3 = a_1(D) < b_2(D);
  _5 = VEC_COND_EXPR <d_3, c_4(D), d_3>;
which isn't possible - nunits_for_known_piecewise_op ICEs but
the rest of the code assumes constant number of elements too.

expand_vector_condition attempts to find if a (rhs1) is a SSA_NAME
for comparison and calls expand_vec_cond_expr_p (type, TREE_TYPE (a1), code)
where a1 is one of the operands of the comparison and code is the comparison
code.  That one indeed isn't supported here, but what aarch64 SVE supports
are the individual statements, comparison (expand_vec_cmp_expr_p) and
expand_vec_cond_expr_p (type, TREE_TYPE (a), SSA_NAME), the latter because
that function starts with
  if (VECTOR_BOOLEAN_TYPE_P (cmp_op_type)
      && get_vcond_mask_icode (TYPE_MODE (value_type),
                               TYPE_MODE (cmp_op_type)) != CODE_FOR_nothing)
    return true;

In an earlier version of the patch (in the PR), we did this
  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (a))
      && expand_vec_cond_expr_p (type, TREE_TYPE (a), ERROR_MARK))
    return true;
before the code == SSA_NAME handling plus some further tweaks later.
While that fixed the ICE, it broke quite a few tests on x86 and some on
aarch64 too.  The problem is that expand_vector_comparison doesn't lower
comparisons which aren't supported and only feed VEC_COND_EXPR first operand
and expand_vector_condition succeeds for those, so with the above mentioned
change we'd verify the VEC_COND_EXPR is implementable using optab alone,
but nothing would verify the tcc_comparison which relied on
expand_vector_condition to verify.

So, the following patch instead queries whether optabs can handle the
comparison and VEC_COND_EXPR together (if a (rhs1) is a comparison;
otherwise as before it checks only the VEC_COND_EXPR) and if that fails,
also checks whether the two operations could be supported individually
and only if even that fails does the piecewise lowering.

2023-03-23  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/109176
* tree-vect-generic.cc (expand_vector_condition): If a has
vector boolean type and is a comparison, also check if both
the comparison and VEC_COND_EXPR could be successfully expanded
individually.

* gcc.target/aarch64/sve/pr109176.c: New test.

(cherry picked from commit 484c41c747d95f9cee15a33b75b32ae2e7eb45f3)

c++: Drop TREE_READONLY on vars (possibly) initialized by tls wrapper [PR109164]

The following two testcases are miscompiled, because we keep TREE_READONLY
on the vars even when they are (possibly) dynamically initialized by a TLS
wrapper function.  Normally cp_finish_decl drops TREE_READONLY from vars
which need dynamic initialization, but for TLS we do this kind of
initialization upon every access to those variables.  Keeping them
TREE_READONLY means e.g. PRE can hoist loads from those before loops
which contain the TLS wrapper calls, so we can access the TLS variables
before they are initialized.

2023-03-20  Jakub Jelinek  <jakub@redhat.com>

PR c++/109164
* cp-tree.h (var_needs_tls_wrapper): Declare.
* decl2.cc (var_needs_tls_wrapper): No longer static.
* decl.cc (cp_finish_decl): Clear TREE_READONLY on TLS variables
for which a TLS wrapper will be needed.

* g++.dg/tls/thread_local13.C: New test.
* g++.dg/tls/thread_local13-aux.cc: New file.
* g++.dg/tls/thread_local14.C: New test.
* g++.dg/tls/thread_local14-aux.cc: New file.

(cherry picked from commit 0a846340b99675d57fc2f2923a0412134eed09d3)

Daily bump.

PR target/108589 - Check REG_P for AARCH64_FUSE_ADDSUB_2REG_CONST1

This adds a check for REG_P on SET_DEST for the new idiom recognizer
for AARCH64_FUSE_ADDSUB_2REG_CONST1. The reported ICE is only
observable with checking=rtl.

Bootstrapped/regtested aarch64-linux, committed.

PR target/108589

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Check
REG_P on SET_DEST.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr108589.c: New test.

(cherry picked from commit a39c6ec97906766ad65d15d4856fd41121ee7a45)

aarch64: disable LDP via tuning structure for -mcpu=ampere1

AmpereOne (-mcpu=ampere1) breaks LDP instructions into two uops.
Given the chance that this causes instructions to slip into the next
decoding cycle and the additional overheads when handling
cacheline-crossing LDP instructions, we disable the generation of LDP
isntructions through the tuning structure from instruction combining
(such as in peephole2).

Given the code-density benefits in builtins and prologue/epilogue
expansion, we allow LDPs there.

This commit:
* adds a new tuning option AARCH64_EXTRA_TUNE_NO_LDP_COMBINE
* allows -moverride=tune=... to override this

These changes are benchmark-driven, yielding the following changes
(with a net-overall improvement):
   503.bwaves_r.      -0.88%
   507.cactuBSSN_r     0.35%
   508.namd_r          3.09%
   510.parest_r       -2.99%
   511.povray_r        5.54%
   519.lbm_r          15.83%
   521.wrf_r           0.56%
   526.blender_r       2.47%
   527.cam4_r          0.70%
   538.imagick_r       0.00%
   544.nab_r          -0.33%
   549.fotonik3d_r.   -0.42%
   554.roms_r          0.00%
   -------------------------
   = total             1.79%

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-Authored-By: Di Zhao <di.zhao@amperecomputing.com>
gcc/ChangeLog:

* config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNING_OPTION):
Add AARCH64_EXTRA_TUNE_NO_LDP_COMBINE.
* config/aarch64/aarch64.cc (aarch64_operands_ok_for_ldpstp):
Check for the above tuning option when processing loads.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ampere1-no_ldp_combine.c: New test.

(cherry picked from commit f200c56787f2c6f93ffb739d57d01a294ab72f68)