gcc.gnu.org Git - gcc.git/log

c++: alias CTAD and copy deduction guide [PR115198]

Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C<B, T> but it should be __dguide_A with TREE_TYPE A<T>
(i.e. C<false, T>). This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias/inherited CTAD.

PR c++/115198

gcc/cp/ChangeLog:

* pt.cc (alias_ctad_tweaks): Update DECL_NAME of the transformed
guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias22.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 06ebb7c6f31fe42ffdea6f51ac1ba1f6b058c090)

c++: using non-dep array var of unknown bound [PR115358]

For a non-dependent array variable of unknown bound, it seems we need to
try instantiating its definition upon use in a template context for sake
of proper checking and typing of the overall expression, like we do for
function specializations with deduced return type.

PR c++/115358

gcc/cp/ChangeLog:

* decl2.cc (mark_used): Call maybe_instantiate_decl for an array
variable with unknown bound.
* semantics.cc (finish_decltype_type): Remove now redundant
handling of array variables with unknown bound.
* typeck.cc (cxx_sizeof_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/template/array37.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit e3915c1ad56591cbd68229a64c941c38330abd69)

libstdc++: Fix std::format for chrono::duration with unsigned rep [PR115668]

Using std::chrono::abs is only valid if numeric_limits<rep>::is_signed
is true, so using it unconditionally made it ill-formed to format a
duration with an unsigned rep.

The duration formatter might as well negate the duration itself instead
of using chrono::abs, because it already needs to check for a negative
value.

libstdc++-v3/ChangeLog:

PR libstdc++/115668
* include/bits/chrono_io.h (formatter<duration<R,P, C>::format):
Do not use chrono::abs.
* testsuite/20_util/duration/io.cc: Check formatting a duration
with unsigned rep.

(cherry picked from commit dafa750c8a6f0a088677871bfaad054881737ab1)

rs6000: Fix wrong RTL patterns for vector merge high/low word on LE

Commit r12-4496 changes some define_expands and define_insns
for vector merge high/low word, which are altivec_vmrg[hl]w,
vsx_xxmrg[hl]w_<VSX_W:mode>.  These defines are mainly for
built-in function vec_merge{h,l}, __builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si and some internal gen function
needs.  These functions should consider endianness, taking
vec_mergeh as example, as PVIPR defines, vec_mergeh "Merges
the first halves (in element order) of two vectors", it does
note it's in element order.  So it's mapped into vmrghw on
BE while vmrglw on LE respectively.  Although the mapped
insns are different, as the discussion in PR106069, the RTL
pattern should be still the same, it is conformed before
commit r12-4496, define_expand altivec_vmrghw got expanded
into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
        (parallel [(const_int 0) (const_int 4)
                   (const_int 1) (const_int 5)])))]

on both BE and LE then.  But commit r12-4496 changed it to
expand into:

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
        (parallel [(const_int 0) (const_int 4)
                   (const_int 1) (const_int 5)])))]

on BE, and

  (vec_select:VSX_W
     (vec_concat:<VS_double>
        (match_operand:VSX_W 1 "register_operand" "wa,v")
        (match_operand:VSX_W 2 "register_operand" "wa,v"))
        (parallel [(const_int 2) (const_int 6)
                   (const_int 3) (const_int 7)])))]

on LE, although the mapped insn are still vmrghw on BE and
vmrglw on LE, the associated RTL pattern is completely
wrong and inconsistent with the mapped insn.  If optimization
passes leave this pattern alone, even if its pattern doesn't
represent its mapped insn, it's still fine, that's why simple
testing on bif doesn't expose this issue.  But once some
optimization pass such as combine does some changes basing
on this wrong pattern, because the pattern doesn't match the
semantics that the expanded insn is intended to represent,
it would cause the unexpected result.

So this patch is to fix the wrong RTL pattern, ensure the
associated RTL patterns become the same as before which can
have the same semantic as their mapped insns.  With the
proposed patch, the expanders like altivec_vmrghw expands
into altivec_vmrghb_direct_be or altivec_vmrglb_direct_le
depending on endianness, "direct" can easily show which
insn would be generated, _be and _le are mainly for the
different RTL patterns as endianness.

Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>
PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghw_direct_<VSX_W:mode>): Rename
to ...
(altivec_vmrghw_direct_<VSX_W:mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrghw_direct_<VSX_W:mode>_le): New define_insn.
(altivec_vmrglw_direct_<VSX_W:mode>): Rename to ...
(altivec_vmrglw_direct_<VSX_W:mode>_be): ... this.  Add the condition
BYTES_BIG_ENDIAN.
(altivec_vmrglw_direct_<VSX_W:mode>_le): New define_insn.
(altivec_vmrghw): Adjust by calling gen_altivec_vmrghw_direct_v4si_be
for BE and gen_altivec_vmrglw_direct_v4si_le for LE.
(altivec_vmrglw): Adjust by calling gen_altivec_vmrglw_direct_v4si_be
for BE and gen_altivec_vmrghw_direct_v4si_le for LE.
(vec_widen_umult_hi_v8hi): Adjust the call to
gen_altivec_vmrghw_direct_v4si by gen_altivec_vmrghw for BE
and by gen_altivec_vmrglw for LE.
(vec_widen_smult_hi_v8hi): Likewise.
(vec_widen_umult_lo_v8hi): Adjust the call to
gen_altivec_vmrglw_direct_v4si by gen_altivec_vmrglw for BE
and by gen_altivec_vmrghw for LE
(vec_widen_smult_lo_v8hi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghw_direct_v4si by
CODE_FOR_altivec_vmrghw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrghw_direct_v4si_le for LE.  And replace
CODE_FOR_altivec_vmrglw_direct_v4si by
CODE_FOR_altivec_vmrglw_direct_v4si_be for BE and
CODE_FOR_altivec_vmrglw_direct_v4si_le for LE.
* config/rs6000/vsx.md (vsx_xxmrghw_<VSX_W:mode>): Adjust by calling
gen_altivec_vmrghw_direct_v4si_be for BE and
gen_altivec_vmrglw_direct_v4si_le for LE.
(vsx_xxmrglw_<VSX_W:mode>): Adjust by calling
gen_altivec_vmrglw_direct_v4si_be for BE and
gen_altivec_vmrghw_direct_v4si_le for LE.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr106069.C: New test.
* gcc.target/powerpc/pr115355.c: New test.

(cherry picked from commit 52c112800d9f44457c4832309a48c00945811313)

Daily bump.

libstdc++: Replace viewcvs links in docs with cgit links

For this backport to the release branch, the links to the git repo refer
to the branch.

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Replace viewcvs links with cgit links.
* doc/xml/manual/allocator.xml: Likewise.
* doc/xml/manual/mt_allocator.xml: Likewise.
* doc/html/*: Regenerate.

(cherry picked from commit 9d8021d1875677286c3dde90dfed2aca864edad0)

[libstdc++] [testsuite] defer to check_vect_support* [PR115454]

The newly-added testcase overrides the default dg-do action set by
check_vect_support_and_set_flags (in libstdc++-dg/conformance.exp), so
it attempts to run the test even if runtime vector support is not
available.

Remove the explicit dg-do directive, so that the default is honored,
and the test is run if vector support is found, and only compiled
otherwise.

for libstdc++-v3/ChangeLog

PR libstdc++/115454
* testsuite/experimental/simd/pr115454_find_last_set.cc: Defer
to check_vect_support_and_set_flags's default dg-do action.

(cherry picked from commit 95faa1bea7bdc7f92fcccb3543bfcbc8184c5e5b)

aarch64: Add support for -mcpu=grace

This adds support for the NVIDIA Grace CPU to aarch64.
We reuse the tuning decisions for the Neoverse V2 core, but include a
number of architecture features that are not enabled by default in
-mcpu=neoverse-v2.

This allows Grace users to more simply target the CPU with -mcpu=grace
rather than remembering what extensions to tag on top of
-mcpu=neoverse-v2.

Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/

* config/aarch64/aarch64-cores.def (grace): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document the above.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>

tree-ssa-pre.c/115214(ICE in find_or_generate_expression, at tree-ssa-pre.c:2780): Return NULL_TREE when deal special cases.

Return NULL_TREE when genop3 equal EXACT_DIV_EXPR.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652641.html

version log v3: remove additional POLY_INT_CST check.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652795.html

gcc/ChangeLog:

* tree-ssa-pre.cc (create_component_ref_by_pieces_1): New conditions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr115214.c: New test.

Daily bump.

libstdc++: Remove confusing text from status tables for release branch

When I tried to make the release branch versions of these docs refer to
the release branch instead of "mainline GCC", for some reason I left the
text "not any particular release" there. That's just confusing, because
the docs are for a particular release, the latest on that branch. Remove
that confusing text in several places.

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx1998.xml: Remove confusing "not in
any particular release" text.
* doc/xml/manual/status_cxx2011.xml: Likewise.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise.
* doc/xml/manual/status_cxx2020.xml: Likewise.
* doc/xml/manual/status_cxx2023.xml: Likewise.
* doc/xml/manual/status_cxxtr1.xml: Likewise.
* doc/xml/manual/status_cxxtr24733.xml: Likewise.
* doc/html/manual/status.html: Regenerate.

Fix PR c/115587, uninitialized variable in c_parser_omp_loop_nest

This function had a reference to an uninitialized variable on the
error path. The problem was diagnosed by clang but not gcc. It seems
the cleanest solution is to initialize all the loop-clause variables
at the point of declaration rather than at different places in the
code.

The C++ front end didn't have this problem, but I've made similar
changes there to keep the code in sync.

gcc/c/ChangeLog:

PR c/115587
* c-parser.cc (c_parser_omp_loop_nest): Move initializations to
point of declaration.

gcc/cp/ChangeLog:

PR c/115587
* parser.cc (cp_parser_omp_loop_nest): Move initializations to
point of declaration.

(cherry picked from commit 21f1073d388af8af207183b0ed592e1cc47d20ab)

SPARC: fix internal error with -mv8plus on 64-bit Linux

This passes -m32 when -mv8plus is specified on Linux (like on Solaris).

gcc/
PR target/115608
* config/sparc/linux64.h (CC1_SPEC): Pass -m32 for -mv8plus.

c-family: Add Warning property to Wnrvo option [PR115624]

This was missing when Wnrvo was added in
r14-1594-g2ae5384d457b9c67586de012816dfc71a6943164 .

Pushed after a bootstrap/test on x86_64-linux-gnu.

gcc/c-family/ChangeLog:

PR c++/115624
* c.opt (Wnrvo): Add Warning property.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit f7747210947a7c66e865c6ac571cce39e2b87caf)

Daily bump.

rs6000: Don't clobber return value when eh_return called [PR114846]

As the associated test case in PR114846 shows, currently
with eh_return involved some register restoring for EH
RETURN DATA in epilogue can clobber the one which holding
the return value. Referring to the existing handlings in
some other targets, this patch makes eh_return expander
call one new define_insn_and_split eh_return_internal which
directly calls rs6000_emit_epilogue with epilogue_type
EPILOGUE_TYPE_EH_RETURN instead of the previous treating
normal return with crtl->calls_eh_return specially.

PR target/114846

gcc/ChangeLog:

* config/rs6000/rs6000-logue.cc (rs6000_emit_epilogue): As
EPILOGUE_TYPE_EH_RETURN would be passed as epilogue_type directly
now, adjust the relevant handlings on it.
* config/rs6000/rs6000.md (eh_return expander): Append by calling
gen_eh_return_internal and emit_barrier.
(eh_return_internal): New define_insn_and_split, call function
rs6000_emit_epilogue with epilogue type EPILOGUE_TYPE_EH_RETURN.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr114846.c: New test.

(cherry picked from commit e5fc5d42d25c86ae48178db04ce64d340a834614)

Daily bump.

AArch64: Fix cpu features initialization [PR115342]

The CPU features initialization code uses CPUID registers (rather than
HWCAP).  The equality comparisons it uses are incorrect: for example FEAT_SVE
is not set if SVE2 is available.  Using HWCAPs for these is both simpler and
correct.  The initialization must also be done atomically to avoid multiple
threads causing corruption due to non-atomic RMW accesses to the global.

libgcc:
PR target/115342
* config/aarch64/cpuinfo.c (__init_cpu_features_constructor):
Use HWCAP where possible.  Use atomic write for initialization.
Fix FEAT_PREDRES comparison.
(__init_cpu_features_resolver): Use atomic load for correct
initialization.
(__init_cpu_features): Likewise.
(cherry picked from commit d7cbcfe7c33645eaf95f175f19884d443817857b)

libstdc++: Fix test on x86_64 and non-simd targets

* Running a test compiled with AVX512 instructions requires
avx512f_runtime not just avx512f.

* The 'reduce2' test violated an invariant of fixed_size_simd_mask and
thus failed on all targets without 16-Byte vector builtins enabled (in
bits/simd.h).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/115575
* testsuite/experimental/simd/pr115454_find_last_set.cc: Require
avx512f_runtime. Don't memcpy fixed_size masks.

(cherry picked from commit 77f321435b4ac37992c2ed6737ca0caa1dd50551)

Build: Set gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel

We check gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel
only, while gcc_cv_as_mips_explicit_relocs is used by later code.

gcc
* configure.ac: Set gcc_cv_as_mips_explicit_relocs if
gcc_cv_as_mips_explicit_relocs_pcrel.
* configure: Regenerate.

(cherry picked from commit 573f11ec34eeb6a6c3bd3d7619738f927236727b)

tree-optimization/115278 - fix DSE in if-conversion wrt volatiles

The following adds the missing guard for volatile stores to the
embedded DSE in the loop if-conversion pass.

PR tree-optimization/115278
* tree-if-conv.cc (ifcvt_local_dce): Do not DSE volatile stores.

* g++.dg/vect/pr115278.cc: New testcase.

(cherry picked from commit 65dbe0ab7cdaf2aa84b09a74e594f0faacf1945c)

tree-optimization/115508 - fix ICE with SLP scheduling and extern vector

When there's a permute after an extern vector we can run into a case
that didn't consider the scheduled node being a permute which lacks
a representative.

PR tree-optimization/115508
* tree-vect-slp.cc (vect_schedule_slp_node): Guard check on
representative.

* gcc.target/i386/pr115508.c: New testcase.

(cherry picked from commit 65e72b95c63a5501cf1482f3814ae8c8e672bf06)

Avoid SLP_REPRESENTATIVE access for VEC_PERM in SLP scheduling

SLP permute nodes can end up without a SLP_REPRESENTATIVE now,
the following avoids touching it in this case in vect_schedule_slp_node.

* tree-vect-slp.cc (vect_schedule_slp_node): Avoid looking
at SLP_REPRESENTATIVE for VEC_PERM nodes.

(cherry picked from commit 31e9bae0ea5e5413abfa3ca9050e66cc6760553e)

Daily bump.

libstdc++: Fix find_last_set(simd_mask) to ignore padding bits

With the change to the AVX512 find_last_set implementation, the change
to AVX512 operator!= is unnecessary. However, the latter was not
producing optimal code and unnecessarily set the padding bits. In
theory, the compiler could determine that with the new !=
implementation, the bit operation for clearing the padding bits is a
no-op and can be elided.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/115454
* include/experimental/bits/simd_x86.h (_S_not_equal_to): Use
neq comparison instead of bitwise negation after eq.
(_S_find_last_set): Clear unused high bits before computing
bit_width.
* testsuite/experimental/simd/pr115454_find_last_set.cc: New
test.

(cherry picked from commit 1340ddea0158de3f49aeb75b4013e5fc313ff6f4)

vshuf-mem.C: Make -march=z14 depend on s390_vxe

gcc/testsuite/ChangeLog:

* g++.dg/torture/vshuf-mem.C: Use -march=z14 only, if the we are
on a machine which can actually run it.

(cherry picked from commit 7e59f0c05da840ca13ba73d25947df8a4eaf199e)

testsuite: Add -Wno-psabi to vshuf-mem.C test

The newly added test FAILs on i686-linux.
On x86_64-linux
make check-g++ RUNTESTFLAGS='--target_board=unix\{-m64,-m32/-msse2,-m32/-mno-sse/-mno-mmx\} dg-torture.exp=vshuf-mem.C'
shows that as well.

The problem is that without SSE2/MMX the vector is passed differently
than normally and so GCC warns about that.
-Wno-psabi is the usual way to shut it up.

Also wonder about the
// { dg-additional-options "-march=z14" { target s390*-*-* } }
line, doesn't that mean the test will FAIL on all pre-z14 HW?
Shouldn't it use some z14_runtime or similar effective target, or
check in main (in that case copied over to g++.target/s390) whether
z14 instructions can be actually used at runtime?

2024-06-14 Jakub Jelinek <jakub@redhat.com>

* g++.dg/torture/vshuf-mem.C: Add -Wno-psabi to dg-options.

(cherry picked from commit 1bb2535c7cb279e6aab731e79080d8486dd50cce)

IBM Z: Fix ICE in expand_perm_as_replicate

The current implementation assumes to always be invoked with register
operands. For memory operands we even have an instruction
though (vlrep). With the patch we try this first and only if it fails
force the input into a register and continue.

vec_splats generation fails for single element 128bit types which are
allowed for vec_splat. This is something to sort out with another
patch I guess.

gcc/ChangeLog:

* config/s390/s390.cc (expand_perm_as_replicate): Handle memory
operands.
* config/s390/vx-builtins.md (vec_splats<mode>): Turn into parameterized expander.
(@vec_splats<mode>): New expander.

gcc/testsuite/ChangeLog:

* g++.dg/torture/vshuf-mem.C: New test.

(cherry picked from commit 21fd8c67ad297212e3cb885883cc8df8611f3040)

bitint: Fix up lowering of COMPLEX_EXPR [PR115544]

We don't really support _Complex _BitInt(N), the only place we use
bitint complex types is for the .{ADD,SUB,MUL}_OVERFLOW internal function
results and COMPLEX_EXPR in the usual case should be either not present
yet because the ifns weren't folded and will be lowered, or optimized
into something simpler, because normally the complex bitint should be
used just for extracting the 2 subparts from it.
Still, with disabled optimizations it can occassionally happen that it
appears in the IL and that is why there is support for lowering those,
but it doesn't handle optimizing those too much, so if it uses SSA_NAME,
it relies on them having a backing VAR_DECL during the lowering.
This is normally achieves through the
                      && ((is_gimple_assign (use_stmt)
                           && (gimple_assign_rhs_code (use_stmt)
                               != COMPLEX_EXPR))
                          || gimple_code (use_stmt) == GIMPLE_COND)
hunk in gimple_lower_bitint, but as the following testcase shows, there
is one thing I've missed, the load optimization isn't guarded by the
above stuff.  So, either we'd need to add support for loads to
lower_complexexpr_stmt, or because they should be really rare, this
patch just disables the load optimization if at least one load use is
a COMPLEX_EXPR (like we do already for PHIs, calls, asm).

2024-06-19  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/115544
* gimple-lower-bitint.cc (gimple_lower_bitint): Disable optimizing
loads used by COMPLEX_EXPR operands.

* gcc.dg/bitint-107.c: New test.

(cherry picked from commit 25860fd2a674373a6476af5ff0bd92354fc53d06)

diagnostics: Fix add_misspelling_candidates [PR115440]

The option_map array for most entries contains just non-NULL opt0
    { "-Wno-", NULL, "-W", false, true },
    { "-fno-", NULL, "-f", false, true },
    { "-gno-", NULL, "-g", false, true },
    { "-mno-", NULL, "-m", false, true },
    { "--debug=", NULL, "-g", false, false },
    { "--machine-", NULL, "-m", true, false },
    { "--machine-no-", NULL, "-m", false, true },
    { "--machine=", NULL, "-m", false, false },
    { "--machine=no-", NULL, "-m", false, true },
    { "--machine", "", "-m", false, false },
    { "--machine", "no-", "-m", false, true },
    { "--optimize=", NULL, "-O", false, false },
    { "--std=", NULL, "-std=", false, false },
    { "--std", "", "-std=", false, false },
    { "--warn-", NULL, "-W", true, false },
    { "--warn-no-", NULL, "-W", false, true },
    { "--", NULL, "-f", true, false },
    { "--no-", NULL, "-f", false, true }
and so add_misspelling_candidates works correctly for it, but 3 out of
these,
    { "--machine", "", "-m", false, false },
    { "--machine", "no-", "-m", false, true },
and
    { "--std", "", "-std=", false, false },
use non-NULL opt1.  That says that
--machine foo
should map to
-mfoo
and
--machine no-foo
should map to
-mno-foo
and
--std c++17
should map to
-std=c++17
add_misspelling_canidates was not handling this, so it hapilly
registered say
--stdc++17
or
--machineavx512
(twice) as spelling alternatives, when those options aren't recognized.
Instead we support
--std c++17
or
--machine avx512
--machine no-avx512

The following patch fixes that.  On this particular testcase, we no longer
suggest anything, even when among the suggestion is say that
--std c++17
or
-std=c++17
etc.

2024-06-17  Jakub Jelinek  <jakub@redhat.com>

PR driver/115440
* opts-common.cc (add_misspelling_candidates): If opt1 is non-NULL,
add a space and opt1 to the alternative suggestion text.

* g++.dg/cpp1z/pr115440.C: New test.

(cherry picked from commit 96db57948b50f45235ae4af3b46db66cae7ea859)

Daily bump.

c-family: Fix -Warray-compare warning ICE [PR115290]

The warning code uses %D to print the ARRAY_REF first operands.
That works in the most common case where those operands are decls, but
as can be seen on the following testcase, they can be other expressions
with array type.
Just changing %D to %E isn't enough, because then the diagnostics can
suggest something like
note: use '&(x) != 0 ? (int (*)[32])&a : (int (*)[32])&b[0] == &(y) != 0 ? (int (*)[32])&a : (int (*)[32])&b[0]' to compare the addresses
which is a bad suggestion, the %E printing doesn't know that the
warning code will want to add & before it and [0] after it.
So, the following patch adds ()s around the operand as well, but does
that only for non-decls, for decls keeps it as &arr[0] like before.

2024-06-17 Jakub Jelinek <jakub@redhat.com>

PR c/115290
* c-warn.cc (do_warn_array_compare): Use %E rather than %D for
printing op0 and op1; if those operands aren't decls, also print
parens around them.

* c-c++-common/Warray-compare-3.c: New test.

(cherry picked from commit b63c7d92012f92e0517190cf263d29bbef8a06bf)

c++: Fix up floating point conversion rank comparison for _Float32 and float if float/double are same size [PR115511]

On AVR and SH with some options sizeof (float) == sizeof (double) and
the 2 types have the same set of values.
http://eel.is/c++draft/conv.rank#2.2 for this says that double still
has bigger rank than float and http://eel.is/c++draft/conv.rank#2.2
says that extended type with the same set of values as more than one
standard floating point type shall have the same rank as double.
I've implemented the latter rule as
   if (cnt > 1 && mv2 == long_double_type_node)
     return -2;
with the _Float64/double/long double case having same mode case (various
targets with -mlong-double-64) in mind.
But never thought there are actually targets where float and double
are the same, that needs handling too, if cnt > 1 (that is the extended
type mv1 has same set of values as 2 or 3 of float/double/long double)
and mv2 is float, we need to return 2, because mv1 in that case should
have same rank as double and double has bigger rank than float.

2024-06-17  Jakub Jelinek  <jakub@redhat.com>

PR target/111343
PR c++/115511
* typeck.cc (cp_compare_floating_point_conversion_ranks): If an
extended floating point type mv1 has same set of values as more
than one standard floating point type and mv2 is float, return 2.

* g++.dg/cpp23/ext-floating18.C: New test.

(cherry picked from commit 8584c98f370cd91647c184ce58141508ca478a12)

c++: undeclared identifier in requires-clause [PR99678]

Since the terms of a requires-clause are grammatically primary-expressions
and not e.g. postfix-expressions, it seems we need to explicitly handle
and diagnose the case where a term parses to a bare unresolved identifier,
like cp_parser_postfix_expression does, since cp_parser_primary_expression
leaves that up to its callers. Otherwise we incorrectly accept the first
three requires-clauses below.

Note that the only occurrences of primary-expression in the grammar are
postfix-expression and constraint-logical-and-expression, so it's not too
surprising that we need this special handling here.

PR c++/99678

gcc/cp/ChangeLog:

* parser.cc (cp_parser_constraint_primary_expression): Diagnose
a bare unresolved unqualified-id.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires38.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit d387ecb2b2f44f33fd6a7c5ec7eadaf6dd70efc9)

c++: ICE w/ ambig and non-strictly-viable cands [PR115239]

Here during overload resolution we have two strictly viable ambiguous
candidates #1 and #2, and two non-strictly viable candidates #3 and #4
which we hold on to ever since r14-6522. These latter candidates have
an empty second arg conversion since the first arg conversion was deemed
bad, and this trips up joust when called on #3 and #4 which assumes all
arg conversions are there.

We can fix this by making joust robust to empty arg conversions, but in
this situation we shouldn't need to compare #3 and #4 at all given that
we have a strictly viable candidate. To that end, this patch makes
tourney shortcut considering non-strictly viable candidates upon
encountering ambiguity between two strictly viable candidates (taking
advantage of the fact that the candidates list is sorted according to
viability via splice_viable).

PR c++/115239

gcc/cp/ChangeLog:

* call.cc (tourney): Don't consider a non-strictly viable
candidate as the champ if there was ambiguity between two
strictly viable candidates.

gcc/testsuite/ChangeLog:

* g++.dg/overload/error7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 7fed7e9bbc57d502e141e079a6be2706bdbd4560)

c++: visibility wrt concept-id as targ [PR115283]

Like with alias templates, it seems we don't maintain visibility flags
for concepts either, so min_vis_expr_r should ignore them for now.
Otherwise after r14-6789 we may incorrectly give a function template that
uses a concept-id in its signature internal linkage.

PR c++/115283

gcc/cp/ChangeLog:

* decl2.cc (min_vis_expr_r) <case TEMPLATE_DECL>: Ignore
concepts.

gcc/testsuite/ChangeLog:

* g++.dg/template/linkage5.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit b1fe718cbe0c8883af89f52e0aad3ebf913683de)

s390: testsuite: Fix ifcvt-one-insn-bool.c

With the change of r15-787-g57e04879389f9c I forgot to also update this
test.

gcc/testsuite/ChangeLog:

* gcc.target/s390/ifcvt-one-insn-bool.c: Fix loc.

(cherry picked from commit ac66736bf2f8a10d2f43e83ed6377e4179027a39)

s390: Implement TARGET_NOCE_CONVERSION_PROFITABLE_P [PR109549]

Consider a NOCE conversion as profitable if there is at least one
conditional move.

gcc/ChangeLog:

PR target/109549
* config/s390/s390.cc (TARGET_NOCE_CONVERSION_PROFITABLE_P):
Define.
(s390_noce_conversion_profitable_p): Implement.

gcc/testsuite/ChangeLog:

* gcc.target/s390/ccor.c: Order of loads are reversed, now, as a
consequence the condition has to be reversed.

(cherry picked from commit 57e04879389f9c0d5d53f316b468ce1bddbab350)

Daily bump.

riscv: Allocate enough space to strcpy() string

I triggered an ICE on Ubuntu 24.04 when compiling code that uses
function attributes. Looking into the sources shows that we have
a systematic issue in the attribute handling code:
* we determine the length with strlen() (excluding the terminating null)
* we allocate a buffer with this length
* we copy the original string using strcpy() (incl. the terminating null)

To quote the man page of strcpy():
"The programmer is responsible for allocating a  destination  buffer
large  enough,  that  is, strlen(src)  + 1."

The ICE looks like this:

*** buffer overflow detected ***: terminated
xtheadmempair_bench.c:14:1: internal compiler error: Aborted
   14 | {
      | ^
0xaf3b99 crash_signal
        /home/ubuntu/src/gcc/scaleff/gcc/toplev.cc:319
0xe5b957 strcpy
        /usr/include/riscv64-linux-gnu/bits/string_fortified.h:79
0xe5b957 riscv_process_target_attr
        /home/ubuntu/src/gcc/scaleff/gcc/config/riscv/riscv-target-attr.cc:339
0xe5baaf riscv_process_target_attr
        /home/ubuntu/src/gcc/scaleff/gcc/config/riscv/riscv-target-attr.cc:314
0xe5bc5f riscv_option_valid_attribute_p(tree_node*, tree_node*, tree_node*, int)
        /home/ubuntu/src/gcc/scaleff/gcc/config/riscv/riscv-target-attr.cc:389
0x6a31e5 handle_target_attribute
        /home/ubuntu/src/gcc/scaleff/gcc/c-family/c-attribs.cc:5915
0x5d3a07 decl_attributes(tree_node**, tree_node*, int, tree_node*)
        /home/ubuntu/src/gcc/scaleff/gcc/attribs.cc:900
0x5db403 c_decl_attributes
        /home/ubuntu/src/gcc/scaleff/gcc/c/c-decl.cc:5501
0x5e8965 start_function(c_declspecs*, c_declarator*, tree_node*)
        /home/ubuntu/src/gcc/scaleff/gcc/c/c-decl.cc:10562
0x6318ed c_parser_declaration_or_fndef
        /home/ubuntu/src/gcc/scaleff/gcc/c/c-parser.cc:2914
0x63a8ad c_parser_external_declaration
        /home/ubuntu/src/gcc/scaleff/gcc/c/c-parser.cc:2048
0x63b219 c_parser_translation_unit
        /home/ubuntu/src/gcc/scaleff/gcc/c/c-parser.cc:1902
0x63b219 c_parse_file()
        /home/ubuntu/src/gcc/scaleff/gcc/c/c-parser.cc:27277
0x68fec5 c_common_parse_file()
        /home/ubuntu/src/gcc/scaleff/gcc/c-family/c-opts.cc:1311
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

gcc/ChangeLog:

* config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch):
Fix allocation size of buffer.
(riscv_process_one_target_attr): Likewise.
(riscv_process_target_attr): Likewise.

(cherry picked from commit 6762d5738b02d84ad3f51e89979b48acb68db65b)
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Daily bump.

libstdc++: Fix declaration of posix_memalign for freestanding

Thanks to Jérôme Duval for noticing this.

libstdc++-v3/ChangeLog:

* libsupc++/new_opa.cc [!_GLIBCXX_HOSTED]: Fix declaration of
posix_memalign.

(cherry picked from commit 161efd677458f20d13ee1018a4d5e3964febd508)

Daily bump.

arm: Add .type and .size to __gnu_cmse_nonsecure_call [PR115360]

This patch adds missing assembly directives to the CMSE library wrapper to call
functions with attribute cmse_nonsecure_call. Without the .type directive the
linker will fail to produce the correct veneer if a call to this wrapper
function is to far from the wrapper itself. The .size was added for
completeness, though we don't necessarily have a usecase for it.

libgcc/ChangeLog:

PR target/115360
* config/arm/cmse_nonsecure_call.S: Add .type and .size directives.

(cherry picked from commit c559353af49fe5743d226ac3112a285b27a50f6a)

testsuite: Fix expand-return CMSE test for Armv8.1-M [PR115253]

For Armv8.1-M, the clearing of the registers is handled differently than
for Armv8-M, so update the test case accordingly.

gcc/testsuite/ChangeLog:

PR target/115253
* gcc.target/arm/cmse/extend-return.c: Update test case
condition for Armv8.1-M.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>
(cherry picked from commit cf5f9171bae1f5f3034dc9a055b77446962f1a8c)

arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

Properly handle zero and sign extension for Armv8-M.baseline as
Cortex-M23 can have the security extension active.
Currently, there is an internal compiler error on Cortex-M23 for the
epilog processing of sign extension.

This patch addresses the following CVE-2024-0151 for Armv8-M.baseline.

gcc/ChangeLog:

PR target/115253
* config/arm/arm.cc (cmse_nonsecure_call_inline_register_clear):
Sign extend for Thumb1.
(thumb1_expand_prologue): Add zero/sign extend.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>
(cherry picked from commit 65bd0655ece268895e5018e393bafb769e201c78)

Daily bump.

Fix building JIT with musl libc [PR115442]

Just like r13-6662-g0e6f87835ccabf but this time for jit/jit-recording.cc.

Pushed as obvious after a quick build to make sure jit still builds.

gcc/jit/ChangeLog:

PR jit/115442
* jit-recording.cc: Define INCLUDE_SSTREAM before including
system.h and don't directly incldue sstream.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit e4244b88d75124f6957bfa080c8ad34017364e53)

ira: Fix go_through_subreg offset calculation [PR115281]

go_through_subreg used:

  else if (!can_div_trunc_p (SUBREG_BYTE (x),
     REGMODE_NATURAL_SIZE (GET_MODE (x)), offset))

to calculate the register offset for a pseudo subreg x.  In the blessed
days before poly-int, this was:

    *offset = (SUBREG_BYTE (x) / REGMODE_NATURAL_SIZE (GET_MODE (x)));

But I think this is testing the wrong natural size.  If we exclude
paradoxical subregs (which will get an offset of zero regardless),
it's the inner register that is being split, so it should be the
inner register's natural size that we use.

This matters in the testcase because we have an SFmode lowpart
subreg into the last of three variable-sized vectors.  The
SUBREG_BYTE is therefore equal to the size of two variable-sized
vectors.  Dividing by the vector size gives a register offset of 2,
as expected, but dividing by the size of a scalar FPR would give
a variable offset.

I think something similar could happen for fixed-size targets if
REGMODE_NATURAL_SIZE is different for vectors and integers (say),
although that case would trade an ICE for an incorrect offset.

gcc/
PR rtl-optimization/115281
* ira-conflicts.cc (go_through_subreg): Use the natural size of
the inner mode rather than the outer mode.

gcc/testsuite/
PR rtl-optimization/115281
* gfortran.dg/pr115281.f90: New test.

(cherry picked from commit 46d931b3dd31cbba7c3355ada63f155aa24a4e2b)

Daily bump.

c++: lambda in pack expansion [PR115378]

Here find_parameter_packs_r is incorrectly treating the 'auto' return
type of a lambda as a parameter pack due to Concepts-TS specific logic
added in r6-4517, leading to confusion later when expanding the pattern.

Since we intend on removing Concepts TS support soon anyway, this patch
fixes this by restricting the problematic logic with flag_concepts_ts.
Doing so revealed that add_capture was relying on this logic to set
TEMPLATE_TYPE_PARAMETER_PACK for the 'auto' type of an pack expansion
init-capture, which we now need to do explicitly.

PR c++/115378

gcc/cp/ChangeLog:

* lambda.cc (lambda_capture_field_type): Set
TEMPLATE_TYPE_PARAMETER_PACK on the auto type of an init-capture
pack expansion.
* pt.cc (find_parameter_packs_r) <case TEMPLATE_TYPE_PARM>:
Restrict TEMPLATE_TYPE_PARAMETER_PACK promotion with
flag_concepts_ts.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto-103497.C: Adjust expected diagnostic.
* g++.dg/template/pr95672.C: Likewise.
* g++.dg/cpp2a/lambda-targ5.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 5c761395402a730535983a5e49ef1775561ebc61)

Fix crash on access-to-incomplete type

This just adds the missing guard.

gcc/ada/
PR ada/114708
* exp_util.adb (Finalize_Address): Add guard for incomplete types.

gcc/testsuite/
* gnat.dg/incomplete8.adb: New test.

Add testcase for PR ada/114398

gcc/testsuite/
PR ada/114398
* gnat.dg/access11.adb: New test.

ada: Storage_Error in indirect call to function returning limited type

At runtime the code generated by the compiler reports the
exception Storage_Error in an indirect call through an
access-to-subprogram variable that references a function
returning a limited tagged type object.

gcc/ada/

* sem_ch6.adb (Might_Need_BIP_Task_Actuals): Add support
for access-to-subprogram parameter types.
* exp_ch6.adb (Add_Task_Actuals_To_Build_In_Place_Call):
Add dummy BIP parameters to access-to-subprogram types
that may reference a function that has BIP parameters.

libgcc/aarch64: also provide AT_HWCAP2 fallback

Much like AT_HWCAP is already provided in case the platform headers
don't have the value (yet).

libgcc/

* config/aarch64/cpuinfo.c: Provide AT_HWCAP2.

libstdc++: Fix simd<char> conversion for -fno-signed-char for Clang

The special case for Clang in the trait producing a signed integer type
lead to the trait returning 'char' where it should have been 'signed
char'. This workaround was introduced because on Clang the return type
of vector compares was not convertible to '_SimdWrapper<
__int_for_sizeof_t<...' unless '__int_for_sizeof_t<char>' was an alias
for 'char'. In order to not rewrite the complete mask type code (there
is code scattered around the implementation assuming signed integers),
this needs to be 'signed char'; so the special case for Clang needs to
be removed.
The conversion issue is now solved in _SimdWrapper, which now
additionally allows conversion from vector types with compatible
integral type.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/115308
* include/experimental/bits/simd.h (__int_for_sizeof): Remove
special cases for __clang__.
(_SimdWrapper): Change constructor overload set to allow
conversion from vector types with integral conversions via bit
reinterpretation.

(cherry picked from commit 8e36cf4c5c9140915d0019999db132a900b48037)

libstdc++: Avoid MMX return types from __builtin_shufflevector

This resolves a regression on i686 that was introduced with
r15-429-gfb1649f8b4ad50.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/115247
* include/experimental/bits/simd.h (__as_vector): Don't use
vector_size(8) on __i386__.
(__vec_shuffle): Never return MMX vectors, widen to 16 bytes
instead.
(concat): Fix padding calculation to pick up widening logic from
__as_vector.

(cherry picked from commit 241a6cc88d866fb36bd35ddb3edb659453d6322e)

libstdc++: Use __builtin_shufflevector for simd split and concat

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/114958
* include/experimental/bits/simd.h (__as_vector): Return scalar
simd as one-element vector. Return vector from single-vector
fixed_size simd.
(__vec_shuffle): New.
(__extract_part): Adjust return type signature.
(split): Use __extract_part for any split into non-fixed_size
simds.
(concat): If the return type stores a single vector, use
__vec_shuffle (which calls __builtin_shufflevector) to produce
the return value.
* include/experimental/bits/simd_builtin.h
(__shift_elements_right): Removed.
(__extract_part): Return single elements directly. Use
__vec_shuffle (which calls __builtin_shufflevector) to for all
non-trivial cases.
* include/experimental/bits/simd_fixed_size.h (__extract_part):
Return single elements directly.
* testsuite/experimental/simd/pr114958.cc: New test.

(cherry picked from commit fb1649f8b4ad5043dd0e65e4e3a643a0ced018a9)

Daily bump.

Fortran: fix ALLOCATE with SOURCE=, zero-length character [PR83865]

gcc/fortran/ChangeLog:

PR fortran/83865
* trans-stmt.cc (gfc_trans_allocate): Restrict special case for
source-expression with zero-length character to rank 0, so that
the array shape is not discarded.

gcc/testsuite/ChangeLog:

PR fortran/83865
* gfortran.dg/allocate_with_source_32.f90: New test.

(cherry picked from commit 7f21aee0d4ef95eee7d9f7f42e9a056715836648)

Daily bump.

arm: Fix CASE_VECTOR_SHORTEN_MODE for thumb2.

The CASE_VECTOR_SHORTEN_MODE query is missing some equals signs
which causes suboptimal codegen due to missed optimisation
opportunities. This patch also adds a test for thumb2
switch statements as none exist currently.

gcc/ChangeLog:
PR target/115353
* config/arm/arm.h (enum arm_auto_incmodes):
Correct CASE_VECTOR_SHORTEN_MODE query.

gcc/testsuite/ChangeLog:

* gcc.target/arm/thumb2-switchstatement.c: New test.

(cherry picked from commit 2963c76e8e24d4ebaf2b1b4ac4d7ca44eb0a9025)

bitint: Fix up lower_addsub_overflow [PR115352]

The following testcase is miscompiled because of a flawed optimization.
If one changes the 65 in the testcase to e.g. 66, one gets:
...
  _25 = .USUBC (0, _24, _14);
  _12 = IMAGPART_EXPR <_25>;
  _26 = REALPART_EXPR <_25>;
  if (_23 >= 1)
    goto <bb 8>; [80.00%]
  else
    goto <bb 11>; [20.00%]

  <bb 8> :
  if (_23 != 1)
    goto <bb 10>; [80.00%]
  else
    goto <bb 9>; [20.00%]

  <bb 9> :
  _27 = (signed long) _26;
  _28 = _27 >> 1;
  _29 = (unsigned long) _28;
  _31 = _29 + 1;
  _30 = _31 > 1;
  goto <bb 11>; [100.00%]

  <bb 10> :
  _32 = _26 != _18;
  _33 = _22 | _32;

  <bb 11> :
  # _17 = PHI <_30(9), _22(7), _33(10)>
  # _19 = PHI <_29(9), _18(7), _18(10)>
...
so there is one path for limbs below the boundary (in this case there are
actually no limbs there, maybe we could consider optimizing that further,
say with simply folding that _23 >= 1 condition to 1 == 1 and letting
cfg cleanup handle it), another case where it is exactly the limb on the
boundary (that is the bb 9 handling where it extracts the interesting
bits (the first 3 statements) and then checks if it is zero or all ones and
finally the case of limbs above that where it compares the current result
limb against the previously recorded 0 or all ones and ors differences into
accumulated result.

Now, the optimization which the first hunk removes was based on the idea
that for that case the extraction of the interesting bits from the limb
don't need anything special, so the _27/_28/_29 statements above aren't
needed, the whole limb is interesting bits, so it handled the >= 1
case like the bb 9 above without the first 3 statements and bb 10 wasn't
there at all.  There are 2 problems with that, for the higher limbs it
only checks if the the result limb bits are all zeros or all ones, but
doesn't check if they are the same as the other extension bits, and
it forgets the previous flag whether there was an overflow.
First I wanted to fix it just by adding the _33 = _22 | _30; statement
to the end of bb 9 above, which fixed the originally filed huge testcase
and the first 2 foo calls in the testcase included in the patch, it no
longer forgets about previously checked differences from 0/1.
But as the last 2 foo calls show, it still didn't check whether each
even (or each odd depending on the exact position) result limb is
equal to the first one, so every second limb it could choose some other
0 vs. all ones value and as long as it repeated in another limb above it
it would be ok.

So, the optimization just can't work properly and the following patch
removes it.

2024-06-07  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/115352
* gimple-lower-bitint.cc (lower_addsub_overflow): Don't disable
single_comparison if cmp_code is GE_EXPR.

* gcc.dg/torture/bitint-71.c: New test.

(cherry picked from commit a47b1aaa7a76201da7e091d9f8d4488105786274)

Daily bump.

c: Fix up pointer types to may_alias structures [PR114493]

The following testcase ICEs in ipa-free-lang, because the
fld_incomplete_type_of
          gcc_assert (TYPE_CANONICAL (t2) != t2
                      && TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
assertion doesn't hold.
This is because t is a struct S * type which was created while struct S
was still incomplete and without the may_alias attribute (and TYPE_CANONICAL
of a pointer type is a type created with can_alias_all = false argument),
while later on on the struct definition may_alias attribute was used.
fld_incomplete_type_of then creates an incomplete distinct copy of the
structure (but with the original attributes) but pointers created for it
are because of the "may_alias" attribute TYPE_REF_CAN_ALIAS_ALL, including
their TYPE_CANONICAL, because while that is created with !can_alias_all
argument, we later set it because of the "may_alias" attribute on the
to_type.

This doesn't ICE with C++ since PR70512 fix because the C++ FE sets
TYPE_REF_CAN_ALIAS_ALL on all pointer types to the class type (and its
variants) when the may_alias is added.

The following patch does that in the C FE as well.

2024-06-06  Jakub Jelinek  <jakub@redhat.com>

PR c/114493
* c-decl.cc (c_fixup_may_alias): New function.
(finish_struct): Call it if "may_alias" attribute is
specified.

* gcc.dg/pr114493-1.c: New test.
* gcc.dg/pr114493-2.c: New test.

(cherry picked from commit d5a3c6d43acb8b2211d9fb59d59482d74c010f01)

aarch64: Add missing ACLE macro for NEON-SVE Bridge

__ARM_NEON_SVE_BRIDGE was missed in the original patch and is
added by this patch.

gcc/ChangeLog:

* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
Add missing __ARM_NEON_SVE_BRIDGE.

(cherry picked from commit 43530bc40b1d0465911e493e56a6631202ce85b1)

Daily bump.

testsuite: i386: Require ifunc support in gcc.target/i386/avx10_1-25.c etc.

Two new AVX10.1 tests FAIL on Solaris/x86:

FAIL: gcc.target/i386/avx10_1-25.c (test for excess errors)
FAIL: gcc.target/i386/avx10_1-26.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/avx10_1-25.c:6:9: error: the call requires 'ifunc', which is not supported by this target

Fixed by requiring ifunc support.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2024-06-04 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/avx10_1-25.c: Require ifunc support.
* gcc.target/i386/avx10_1-26.c: Likewise.

Daily bump.

libstdc++: Only define std::span::at for C++26 [PR115335]

In r14-5689-g1fa85dcf656e2f I added std::span::at and made the correct
changes to the __cpp_lib_span macro (with tests for the correct value in
C++20/23/26). But I didn't make the declaration of std::span::at
actually depend on the macro, so it was defined for C++20 and C++23, not
only for C++26. This fixes that oversight.

libstdc++-v3/ChangeLog:

PR libstdc++/115335
* include/std/span (span::at): Guard with feature test macro.

(cherry picked from commit 2197814011eec75022aa8550f10621409b69d4a1)

fold-const: Fix up CLZ handling in tree_call_nonnegative_warnv_p [PR115337]

The function currently incorrectly assumes all the __builtin_clz* and .CLZ
calls have non-negative result. That is the case of the former which is UB
on zero and has [0, prec-1] return value otherwise, and is the case of the
single argument .CLZ as well (again, UB on zero), but for two argument
.CLZ is the case only if the second argument is also nonnegative (or if we
know the argument can't be zero, but let's do that just in the ranger IMHO).

The following patch does that.

2024-06-04 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/115337
* fold-const.cc (tree_call_nonnegative_warnv_p) <CASE_CFN_CLZ>:
If arg1 is non-NULL, RECURSE on it, otherwise return true.

* gcc.dg/bitint-106.c: New test.

(cherry picked from commit b82a816000791e7a286c7836b3a473ec0e2a577b)

builtins: Force SAVE_EXPR for __builtin_{add,sub,mul}_overflow and __builtin{add,sub}c [PR108789]

The following testcase is miscompiled, because we use save_expr
on the .{ADD,SUB,MUL}_OVERFLOW call we are creating, but if the first
two operands are not INTEGER_CSTs (in that case we just fold it right away)
but are TREE_READONLY/!TREE_SIDE_EFFECTS, save_expr doesn't actually
create a SAVE_EXPR at all and so we lower it to
*arg2 = REALPART_EXPR (.ADD_OVERFLOW (arg0, arg1)), \
IMAGPART_EXPR (.ADD_OVERFLOW (arg0, arg1))
which evaluates the ifn twice and just hope it will be CSEd back.
As *arg2 aliases *arg0, that is not the case.
The builtins are really never const/pure as they store into what
the third arguments points to, so after handling the INTEGER_CST+INTEGER_CST
case, I think we should just always use SAVE_EXPR. Just building SAVE_EXPR
by hand and setting TREE_SIDE_EFFECTS on it doesn't work, because
c_fully_fold optimizes it away again, so the following patch marks the
ifn calls as TREE_SIDE_EFFECTS (but doesn't do it for the
__builtin_{add,sub,mul}_overflow_p case which were designed for use
especially in constant expressions and don't really evaluate the
realpart side, so we don't really need a SAVE_EXPR in that case).

2024-06-04 Jakub Jelinek <jakub@redhat.com>

PR middle-end/108789
* builtins.cc (fold_builtin_arith_overflow): For ovf_only,
don't call save_expr and don't build REALPART_EXPR, otherwise
set TREE_SIDE_EFFECTS on call before calling save_expr.
(fold_builtin_addc_subc): Set TREE_SIDE_EFFECTS on call before
calling save_expr.

* gcc.c-torture/execute/pr108789.c: New test.

(cherry picked from commit b8e28381cb5c0cddfe5201faf799d8b27f5d7d6c)

invoke.texi: Clarify -march=lujiazui

I was recently searching which exact CPUs are affected by the PR114576
wrong-code issue and went from the PTA_* bitmasks in GCC, so arrived
at the goldmont, goldmont-plus, tremont and lujiazui CPUs (as -march=
cases which do enable -maes and don't enable -mavx).
But when double-checking that against the invoke.texi documentation,
that was true for the first 3, but lujiazui said it supported AVX.
I was really confused by that, until I found the
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604407.html
explanation. So, seems the CPUs do have AVX and F16C but -march=lujiazui
doesn't enable those and even activelly attempts to filter those out from
the announced CPUID features, in glibc as well as e.g. in libgcc.

Thus, I think we should document what actually happens, otherwise
users could assume that
gcc -march=lujiazui predefines __AVX__ and __F16C__, which it doesn't.

2024-06-04 Jakub Jelinek <jakub@redhat.com>

* doc/invoke.texi (lujiazui): Clarify that while the CPUs do support
AVX and F16C, -march=lujiazui actually doesn't enable those.

(cherry picked from commit 09b4ab53155ea16e1fb12c2afcd9b6fe29a31c74)

rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
  {
    &rs6000_instance_info[0].fntype,
    1 * (RS6000_INST_MAX),
    sizeof (rs6000_instance_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
  {
    &rs6000_builtin_info[0].fntype,
    1 * (RS6000_BIF_MAX),
    sizeof (rs6000_builtin_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them back
in full.  While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when it is
loaded from the PCH file, it can point in a completely different addresses
where nothing is mapped at all or some random different thing appears at.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.

So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.

Size-wise in .data sections it is (in bytes):

                             vanilla    patched
rs6000_builtin_info          130832     110704
rs6000_instance_info          81568      40784
rs6000_overload_info           7392       7392
rs6000_builtin_info_fntype        0      10064
rs6000_instance_info_fntype       0      20392
sum                          219792     189336

where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.

Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.

When I applied
void
rs6000_init_generated_builtins ()
{
+  bifdata *rs6000_builtin_info_p;
+  tree *rs6000_builtin_info_fntype_p;
+  ovlddata *rs6000_instance_info_p;
+  tree *rs6000_instance_info_fntype_p;
+  ovldrecord *rs6000_overload_info_p;
+  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0" (rs6000_builtin_info_fntype));
+  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0" (rs6000_instance_info_fntype));
+  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+  #define rs6000_builtin_info rs6000_builtin_info_p
+  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+  #define rs6000_instance_info rs6000_instance_info_p
+  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+  #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments.  If you want such a change,
could that be done incrementally?

2024-06-03  Jakub Jelinek  <jakub@redhat.com>

PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members.  Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make static.
(altivec_resolve_overloaded_builtin): Adjust for the above changes.

(cherry picked from commit 4cf2de9b5268224816a3d53fdd2c3d799ebfd9c8)

combine: Fix up simplify_compare_const [PR115092]

The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is power of 2
triggers. That optimization is fine for power of 2s which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).

The following patch just disables the optimization in that case.

2024-05-15 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/114902
PR rtl-optimization/115092
* combine.cc (simplify_compare_const): Don't optimize
GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
EQ op0 const0_rtx.

* gcc.dg/pr114902.c: New test.
* gcc.dg/pr115092.c: New test.

(cherry picked from commit 0b93a0ae153ef70a82ff63e67926a01fdab9956b)

testsuite: gm2: Remove timeout overrides [PR114886]

A large number of gm2 tests are timing out even on current Solaris/SPARC
systems.  As detailed in the PR, the problem is that the gm2 testsuite
artificially lowers many timeouts way below the DejaGnu default of 300
seconds, often as short as 10 seconds.  The problem lies both in the
values (they may be appropriate for some targets, but too low for
others, especially under high load) and the fact that it uses absolute
values, overriding e.g. settings from a build-wide site.exp.

Therefore this patch removes all those overrides, restoring the
defaults.

Tested on sparc-sun-solaris2.11 (where all the previous timeouts are
gone) and i386-pc-solaris2.11.

2024-04-29  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR modula2/114886
* lib/gm2.exp: Don't load timeout-dg.exp.
Don't set gm2_previous_timeout.
Don't call dg-timeout.
(gm2_push_timeout, gm2_pop_timeout): Remove.
(gm2_init): Don't call dg-timeout.
* lib/gm2-torture.exp: Don't load timeout-dg.exp.
Don't set gm2_previous_timeout.
Don't call dg-timeout.
(gm2_push_timeout, gm2_pop_timeout): Remove.

* gm2/coroutines/pim/run/pass/coroutines-pim-run-pass.exp: Don't
load timeout-dg.exp.
Don't call gm2_push_timeout, gm2_pop_timeout.
* gm2/examples/map/pass/examples-map-pass.exp: Don't call
gm2_push_timeout, gm2_pop_timeout.
* gm2/iso/run/pass/iso-run-pass.exp: Don't load timeout-dg.exp.
Don't call gm2_push_timeout, gm2_pop_timeout.
* gm2/pimlib/base/run/pass/pimlib-base-run-pass.exp: Don't load
timeout-dg.exp.
Don't call gm2_push_timeout, gm2_pop_timeout.
* gm2/projects/iso/run/pass/halma/projects-iso-run-pass-halma.exp:
Don't call gm2_push_timeout, gm2_pop_timeout.
* gm2/switches/whole-program/pass/run/switches-whole-program-pass-run.exp:
Don't load timeout-dg.exp.
Don't call gm2_push_timeout, gm2_pop_timeout.

(cherry picked from commit aff63ac11099d100b6891f3bcc3dc6cbc4fad654)

libstdc++: Build libbacktrace and 19_diagnostics/stacktrace with -funwind-tables [PR111641]

Several of the 19_diagnostics/stacktrace tests FAIL on Solaris/SPARC (32
and 64-bit), Solaris/x86 (32-bit only), and several other targets:

FAIL: 19_diagnostics/stacktrace/current.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/current.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/entry.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/entry.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/output.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/output.cc  -std=gnu++26 execution test
FAIL: 19_diagnostics/stacktrace/stacktrace.cc  -std=gnu++23 execution test
FAIL: 19_diagnostics/stacktrace/stacktrace.cc  -std=gnu++26 execution test

As it turns out, both the copy of libbacktrace in libstdc++ and the
testcases proper need to compiled with -funwind-tables, as is done for
libbacktrace itself.

This isn't an issue on Linux/x86_64 and Solaris/amd64 since 64-bit x86
always defaults to -funwind-tables.  32-bit x86 does, too, when
-fomit-frame-pointer is enabled as on Linux/i686, but unlike
Solaris/i386.

So this patch always enables the option both for the libbacktrace copy
and the testcases.

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.

2024-05-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

libstdc++-v3:
PR libstdc++/111641
* src/libbacktrace/Makefile.am (AM_CFLAGS): Add -funwind-tables.
* src/libbacktrace/Makefile.in: Regenerate.

* testsuite/19_diagnostics/stacktrace/current.cc (dg-options): Add
-funwind-tables.
* testsuite/19_diagnostics/stacktrace/entry.cc: Likewise.
* testsuite/19_diagnostics/stacktrace/hash.cc: Likewise.
* testsuite/19_diagnostics/stacktrace/output.cc: Likewise.
* testsuite/19_diagnostics/stacktrace/stacktrace.cc: Likewise.

(cherry picked from commit a99ebb88f8f25e76ebed5afc22e64fa77a2f0d3f)

Daily bump.

libstdc++: Fix -Wstringop-overflow warning coming from std::vector [PR109849]

libstdc++-v3/ChangeLog:

PR libstdc++/109849
* include/bits/vector.tcc
(std::vector<>::_M_range_insert(iterator, _FwdIt, _FwdIt,
forward_iterator_tag))[__cplusplus < 201103L]: Add __builtin_unreachable
expression to tell the compiler that the allocated buffer is large enough to
receive current elements plus the elements of the range to insert.

(cherry picked from commit 0426be454448f8cfb9db21f4f669426afb7b57c8)

Add AVX10.1 target_clones support

Since AVX10 is the first major ISA introduced after AVX-512, we propose
to add target_clones support for it.

Although AVX10.1-256 won't cover 512-bit part of AVX512F, but since
it is only for priority but not for implication, it won't be an issue.

gcc/ChangeLog:

* common/config/i386/i386-common.cc: Change Granite Rapids
series CPU type to P_PROC_AVX10_1_512.
* common/config/i386/i386-cpuinfo.h (enum feature_priority):
Revise comment part. Add P_AVX10_1_256, P_AVX10_1_512,
P_PROC_AVX10_1_512.
* common/config/i386/i386-isas.h: Link to avx10.1-256, avx10.1-512.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_1-25.c: New test.
* gcc.target/i386/avx10_1-26.c: Ditto.

Daily bump.

AVR: target/115317 - Make isinf(-Inf) return -1.

PR target/115317
libgcc/config/avr/libf7/
* libf7-asm.sx (__isinf): Map -Inf to -1.

gcc/testsuite/
* gcc.target/avr/torture/pr115317-isinf.c: New test.

(cherry picked from commit f12454278dc725fec3520a5d870e967d79292ee6)

libstdc++: Replace link to gcc-4.3.2 docs in manual [PR115269]

Link to the docs for GCC trunk instead. For the release branches, the
link should be to the docs for appropriate release branch.

Also replace the incomplete/outdated list of explicit -std options with
a single entry for the -std option.

libstdc++-v3/ChangeLog:

PR libstdc++/115269
* doc/xml/manual/using.xml: Replace link to gcc-4.3.2 docs.
Replace list of -std=... options with a single entry for -std.
* doc/html/manual/using.html: Regenerate.

(cherry picked from commit b460ede64f9471589822831e04eecff4a3dbecf2)

AVR: tree-optimization/115307 - Work around isinf bloat from early passes.

PR tree-optimization/115307
gcc/
* config/avr/avr.md (SFDF): New mode iterator.
(isinf<mode>2) [sf, df]: New expanders.

gcc/testsuite/
* gcc.target/avr/torture/pr115307-isinf.c: New test.

(cherry picked from commit fabb545026f714b7d1cbe586f4c5bbf6430bdde3)

Daily bump.

alpha: Fix invalid RTX in divmodsi insn patterns [PR115297]

any_divmod instructions are modelled with invalid RTX:

  [(set (match_operand:DI 0 "register_operand" "=c")
        (sign_extend:DI (match_operator:SI 3 "divmod_operator"
                        [(match_operand:DI 1 "register_operand" "a")
                         (match_operand:DI 2 "register_operand" "b")])))
   (clobber (reg:DI 23))
   (clobber (reg:DI 28))]

where SImode divmod_operator (div,mod,udiv,umod) has DImode operands.

Wrap input operand with truncate:SI to make machine modes consistent.

PR target/115297

gcc/ChangeLog:

* config/alpha/alpha.md (<any_divmod:code>si3): Wrap DImode
operands 3 and 4 with truncate:SI RTX.
(*divmodsi_internal_er): Ditto for operands 1 and 2.
(*divmodsi_internal_er_1): Ditto.
(*divmodsi_internal): Ditto.
* config/alpha/constraints.md ("b"): Correct register
number in the description.

gcc/testsuite/ChangeLog:

* gcc.target/alpha/pr115297.c: New test.

(cherry picked from commit 0ac802064c2a018cf166c37841697e867de65a95)

vect: Fix access size alignment assumption [PR115192]

create_intersect_range_checks checks whether two access ranges
a and b are alias-free using something equivalent to:

  end_a <= start_b || end_b <= start_a

It has two ways of doing this: a "vanilla" way that calculates
the exact exclusive end pointers, and another way that uses the
last inclusive aligned pointers (and changes the comparisons
accordingly).  The comment for the latter is:

      /* Calculate the minimum alignment shared by all four pointers,
then arrange for this alignment to be subtracted from the
exclusive maximum values to get inclusive maximum values.
This "- min_align" is cumulative with a "+ access_size"
in the calculation of the maximum values.  In the best
(and common) case, the two cancel each other out, leaving
us with an inclusive bound based only on seg_len.  In the
worst case we're simply adding a smaller number than before.

The problem is that the associated code implicitly assumed that the
access size was a multiple of the pointer alignment, and so the
alignment could be carried over to the exclusive end pointer.

The testcase started failing after g:9fa5b473b5b8e289b6542
because that commit improved the alignment information for
the accesses.

gcc/
PR tree-optimization/115192
* tree-data-ref.cc (create_intersect_range_checks): Take the
alignment of the access sizes into account.

gcc/testsuite/
PR tree-optimization/115192
* gcc.dg/vect/pr115192.c: New test.

(cherry picked from commit a0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba)

i386: Fix ix86_option override after change [PR 113719]

In ix86_override_options_after_change, calls to ix86_default_align
and ix86_recompute_optlev_based_flags will cause mismatched target
opt_set when doing cl_optimization_restore. Move them back to
ix86_option_override_internal to solve the issue.

gcc/ChangeLog:

PR target/113719
* config/i386/i386-options.cc (ix86_override_options_after_change):
Remove call to ix86_default_align and
ix86_recompute_optlev_based_flags.
(ix86_option_override_internal): Call ix86_default_align and
ix86_recompute_optlev_based_flags.

(cherry picked from commit 499d00127d39ba894b0f7216d73660b380bdc325)

Daily bump.

MIPS16: Mark $2/$3 as clobbered if GP is used

PR Target/84790.
The gp init sequence
        li      $2,%hi(_gp_disp)
        addiu   $3,$pc,%lo(_gp_disp)
        sll     $2,16
        addu    $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

So the IRA/IPA passes are not aware that $2/$3 have been clobbered,
so they may be used for cross (local) function call.

Let's mark $2/$3 clobber both:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.

Reported-by: Matthias Schiffer <mschiffer@universe-factory.net>
Origin-Patch-by: Felix Fietkau <nbd@nbd.name>.
gcc
* config/mips/mips.cc(mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.

(cherry picked from commit 915440eed21de367cb41857afb5273aff5bcb737)

Daily bump.