gcc.gnu.org Git - gcc.git/log

LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_load_tls):
Load all types of tls symbols through one function.
(loongarch_got_load_tls_gd): Delete.
(loongarch_got_load_tls_ld): Delete.
(loongarch_got_load_tls_ie): Delete.
(loongarch_got_load_tls_le): Delete.
(loongarch_call_tls_get_addr): Modify the called function name.
(loongarch_legitimize_tls_address): Likewise.
* config/loongarch/loongarch.md (@got_load_tls_gd<mode>): Delete.
(@load_tls<mode>): New template.
(@got_load_tls_ld<mode>): Delete.
(@got_load_tls_le<mode>): Delete.
(@got_load_tls_ie<mode>): Delete.

LoongArch: Modify the address calculation logic for obtaining array element values through fp.

Modify address calculation logic from (((a x C) + fp) + offset) to ((fp + offset) + a x C).
Thereby modifying the register dependencies and optimizing the code.
The value of C is 2 4 or 8.

The following is the assembly code before and after a loop modification in spec2006 401.bzip:

                 old                      |                 new
735 .L71:                                |  735 .L71:
736         slli.d  $r12,$r15,2          |  736         slli.d  $r12,$r15,2
737         ldx.w   $r13,$r22,$r12       |  737         ldx.w   $r13,$r22,$r12
738         addi.d  $r15,$r15,-1         |  738         addi.d  $r15,$r15,-1
739         slli.w  $r16,$r15,0          |  739         slli.w  $r16,$r15,0
740         addi.w  $r13,$r13,-1         |  740         addi.w  $r13,$r13,-1
741         slti    $r14,$r13,0          |  741         slti    $r14,$r13,0
742         add.w   $r12,$r26,$r13       |  742         add.w   $r12,$r26,$r13
743         maskeqz $r12,$r12,$r14       |  743         maskeqz $r12,$r12,$r14
744         masknez $r14,$r13,$r14       |  744         masknez $r14,$r13,$r14
745         or      $r12,$r12,$r14       |  745         or      $r12,$r12,$r14
746         ldx.bu  $r14,$r30,$r12       |  746         ldx.bu  $r14,$r30,$r12
747         lu12i.w $r13,4096>>12        |  747         alsl.d  $r14,$r14,$r18,2
748         ori     $r13,$r13,432        |  748         ldptr.w $r13,$r14,0
749         add.d   $r13,$r13,$r3        |  749         addi.w  $r17,$r13,-1
750         alsl.d  $r14,$r14,$r13,2     |  750         stptr.w $r17,$r14,0
751         ldptr.w $r13,$r14,-1968      |  751         slli.d  $r13,$r13,2
752         addi.w  $r17,$r13,-1         |  752         stx.w   $r12,$r22,$r13
753         st.w    $r17,$r14,-1968      |  753         ldptr.w $r12,$r19,0
754         slli.d  $r13,$r13,2          |  754         blt     $r12,$r16,.L71
755         stx.w   $r12,$r22,$r13       |  755         .align  4
756         ldptr.w $r12,$r18,-2048      |  756
757         blt     $r12,$r16,.L71       |  757
758         .align  4                    |  758

This patch is ported from riscv's commit r14-3111.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (mem_shadd_or_shadd_rtx_p): New function.
(loongarch_legitimize_address): Add logical transformation code.

Daily bump.

c++: -Wdangling-reference tweak to unbreak aarch64

My recent -Wdangling-reference change to not warn on std::span-like classes
unfortunately caused a new warning: extending reference_like_class_p also
opens the door to new warnings since we use reference_like_class_p for
checking the return type of the function: either it must be a reference
or a reference_like_class_p.

We can consider even non-templates as std::span-like to get rid of the
warning here.

gcc/cp/ChangeLog:

* call.cc (reference_like_class_p): Consider even non-templates for
std::span-like classes.

gcc/ChangeLog:

* doc/invoke.texi: Update -Wdangling-reference documentation.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference21.C: New test.

i386: Improve *cmp<dwi>_doubleword splitter [PR113701]

The fix for PR70321 introduced a splitter that split a doubleword
comparison into a pair of XORs followed by an IOR to set the (zero)
flags register.  To help the reload, splitter forced SUBREG pieces of
double-word input values to a pseudo, but this regressed
gcc.target/i386/pr82580.c:

int f0 (U x, U y) { return x == y; }

from:
xorq    %rdx, %rdi
xorq    %rcx, %rsi
xorl    %eax, %eax
orq     %rsi, %rdi
sete    %al
ret

to:
xchgq   %rdi, %rsi
movq    %rdx, %r8
movq    %rcx, %rax
movq    %rsi, %rdx
movq    %rdi, %rcx
xorq    %rax, %rcx
xorq    %r8, %rdx
xorl    %eax, %eax
orq     %rcx, %rdx
sete    %al
ret

To mitigate the regression, remove this legacy heuristic (workaround?).
There have been many incremental changes and improvements to x86 TImode
and register allocation, so this legacy workaround is not only no longer
useful, but it actually hurts register allocation.  The patched compiler
now produces:

        xchgq   %rdi, %rsi
        xorl    %eax, %eax
        xorq    %rsi, %rdx
        xorq    %rdi, %rcx
        orq     %rcx, %rdx
        sete    %al
        ret

PR target/113701

gcc/ChangeLog:

* config/i386/i386.md (*cmp<dwi>_doubleword):
Do not force SUBREG pieces to pseudos.

libgcc: Avoid warnings on __gcc_nested_func_ptr_created [PR113402]

I'm seeing hundreds of
In file included from ../../../libgcc/libgcc2.c:56:
../../../libgcc/libgcc2.h:32:13: warning: conflicting types for built-in function ‘__gcc_nested_func_ptr_created’; expected ‘void(void *, void *, void *)’
+[-Wbuiltin-declaration-mismatch]
   32 | extern void __gcc_nested_func_ptr_created (void *, void *, void **);
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
warnings.

Either we need to add like in r14-6218
  #pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
(but in that case because of the libgcc2.h prototype (why is it there?)
it would need to be also with #pragma GCC diagnostic push/pop around),
or we could go with just following how the builtins are prototyped on the
compiler side and only cast to void ** when dereferencing (which is in
a single spot in each TU).

2024-02-01  Jakub Jelinek  <jakub@redhat.com>

PR libgcc/113402
* libgcc2.h (__gcc_nested_func_ptr_created): Change type of last
argument from void ** to void *.
* config/i386/heap-trampoline.c (__gcc_nested_func_ptr_created):
Change type of dst from void ** to void * and cast dst to void **
before dereferencing it.
* config/aarch64/heap-trampoline.c (__gcc_nested_func_ptr_created):
Likewise.

libgcc: Fix up i386/t-heap-trampoline [PR113403]

I'm seeing
../../../libgcc/shared-object.mk:14: warning: overriding recipe for target 'heap-trampoline.o'
../../../libgcc/shared-object.mk:14: warning: ignoring old recipe for target 'heap-trampoline.o'
../../../libgcc/shared-object.mk:17: warning: overriding recipe for target 'heap-trampoline_s.o'
../../../libgcc/shared-object.mk:17: warning: ignoring old recipe for target 'heap-trampoline_s.o'

This patch fixes that.

2024-02-01 Jakub Jelinek <jakub@redhat.com>

PR libgcc/113403
* config/i386/t-heap-trampoline: Add to LIB2ADDEHSHARED
i386/heap-trampoline.c rather than aarch64/heap-trampoline.c.

libstdc++: Implement P2165R4 changes to std::pair/tuple/etc [PR113309]

This implements the C++23 paper P2165R4 Compatibility between tuple,
pair and tuple-like objects, which builds upon many changes from the
earlier C++23 paper P2321R2 zip.

Some declarations had to be moved around so that they're visible from
<bits/stl_pair.h> without introducing new includes and bloating the
header. In the end, the only new include is for <bits/utility.h> from
<bits/stl_iterator.h>, for tuple_element_t.

PR libstdc++/113309
PR libstdc++/109203

libstdc++-v3/ChangeLog:

* include/bits/ranges_util.h (__detail::__pair_like): Don't
define in C++23 mode.
(__detail::__pair_like_convertible_from): Adjust as per P2165R4.
(__detail::__is_subrange<subrange>): Moved from <ranges>.
(__detail::__is_tuple_like_v<subrange>): Likewise.
* include/bits/stl_iterator.h: Include <bits/utility.h> for
C++23.
(__different_from): Move to <concepts>.
(__iter_key_t): Adjust for C++23 as per P2165R4.
(__iter_val_t): Likewise.
* include/bits/stl_pair.h (pair, array): Forward declare.
(get): Forward declare all overloads relevant to P2165R4
tuple-like constructors.
(__is_tuple_v): Define for C++23.
(__is_tuple_like_v): Define for C++23.
(__tuple_like): Define for C++23 as per P2165R4.
(__pair_like): Define for C++23 as per P2165R4.
(__eligibile_tuple_like): Define for C++23.
(__eligibile_pair_like): Define for C++23.
(pair::_S_constructible_from_pair_like): Define for C++23.
(pair::_S_convertible_from_pair_like): Define for C++23.
(pair::_S_dangles_from_pair_like): Define for C++23.
(pair::pair): Define overloads taking a tuple-like type for
C++23 as per P2165R4.
(pair::_S_assignable_from_tuple_like): Define for C++23.
(pair::_S_const_assignable_from_tuple_like): Define for C++23.
(pair::operator=): Define overloads taking a tuple-like type for
C++23 as per P2165R4.
* include/bits/utility.h (ranges::__detail::__is_subrange):
Moved from <ranges>.
* include/bits/version.def (tuple_like): Define for C++23.
* include/bits/version.h: Regenerate.
* include/std/concepts (__different_from): Moved from
<bits/stl_iterator.h>.
(ranges::__swap::__adl_swap): Clarify which __detail namespace.
* include/std/map (__cpp_lib_tuple_like): Define C++23.
* include/std/ranges (__detail::__is_subrange): Moved to
<bits/utility.h>.
(__detail::__is_subrange<subrange>): Moved to <bits/ranges_util.h>
(__detail::__has_tuple_element): Adjust for C++23 as per P2165R4.
(__detail::__tuple_or_pair): Remove as per P2165R4. Replace all
uses with plain tuple as per P2165R4.
* include/std/tuple (__cpp_lib_tuple_like): Define for C++23.
(__tuple_like_tag_t): Define for C++23.
(__tuple_cmp): Forward declare for C++23.
(_Tuple_impl::_Tuple_impl): Define overloads taking
__tuple_like_tag_t and a tuple-like type for C++23.
(_Tuple_impl::_M_assign): Likewise.
(tuple::__constructible_from_tuple_like): Define for C++23.
(tuple::__convertible_from_tuple_like): Define for C++23.
(tuple::__dangles_from_tuple_like): Define for C++23.
(tuple::tuple): Define overloads taking a tuple-like type for
C++23 as per P2165R4.
(tuple::__assignable_from_tuple_like): Define for C++23.
(tuple::__const_assignable_from_tuple_like): Define for C++23.
(tuple::operator=): Define overloads taking a tuple-like type
for C++23 as per P2165R4.
(tuple::__tuple_like_common_comparison_category): Define for C++23.
(tuple::operator<=>): Define overload taking a tuple-like type
for C++23 as per P2165R4.
(array, get): Forward declarations moved to <bits/stl_pair.h>.
(tuple_cat): Constrain with __tuple_like for C++23 as per P2165R4.
(apply): Likewise.
(make_from_tuple): Likewise.
(__tuple_like_common_reference): Define for C++23.
(basic_common_reference): Adjust as per P2165R4.
(__tuple_like_common_type): Define for C++23.
(common_type): Adjust as per P2165R4.
* include/std/unordered_map (__cpp_lib_tuple_like): Define for
C++23.
* include/std/utility (__cpp_lib_tuple_like): Define for C++23.
* testsuite/std/ranges/zip/1.cc (test01): Adjust to handle pair
and 2-tuple interchangeably.
(test05): New test.
* testsuite/20_util/pair/p2165r4.cc: New test.
* testsuite/20_util/tuple/p2165r4.cc: New test.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++/pair: Factor out const-assignability helper for C++20

This is consistent with std::tuple's __const_assignable helper, and will
be useful for implementing the new pair::operator= overloads from P2165R4.

libstdc++-v3/ChangeLog:

* include/bits/stl_pair.h (pair::_S_const_assignable): Define,
factored out from ...
(pair::operator=): ... the constraints of the const overloads.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

Set num_threads to 50 on 32-bit hppa in two libgomp loop tests

We support a maximum of 50 threads on 32-bit hppa.

2024-02-01 John David Anglin <danglin@gcc.gnu.org>

libgomp/ChangeLog:

* testsuite/libgomp.c++/loop-3.C: Set num_threads to 50
on 32-bit hppa.
* testsuite/libgomp.c/omp-loop03.c: Likewise.

xfail gnat.dg/trampoline3.adb scan-assembler-not check on hppa*-*-*

We still require an executable stack for trampolines on hppa*-*-*.

2024-02-01 John David Anglin <danglin@gcc.gnu.org>

gcc/testsuite/ChangeLog:

* gnat.dg/trampoline3.adb: xfail scan-assembler-not
check on hppa*-*-*.

hppa: Fix bug in atomic_storedi_1 pattern

The first alternative stores the floating-point status register
in the destination. It should store zero. We need to copy %fr0
to another floating-point register to initialize it to zero.

2024-02-01 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.md (atomic_storedi_1): Fix bug in
alternative 1.

c++: ttp TEMPLATE_DECL equivalence [PR112737]

Here during declaration matching we undesirably consider the two TT{42}
CTAD expressions to be non-equivalent ultimately because for CTAD
placeholder equivalence we compare the TEMPLATE_DECLs via pointer identity,
and here the corresponding TEMPLATE_DECLs for TT are different since
they're from different scopes.  On the other hand, the corresponding
TEMPLATE_TEMPLATE_PARMs are deemed equivalent according to cp_tree_equal
(since they have the same position and template parameters).  This turns
out to be the root cause of some of the xtreme-header modules regressions.

So this patch relaxes ttp CTAD placeholder equivalence accordingly, by
comparing the TEMPLATE_TEMPLATE_PARM instead of the TEMPLATE_DECL.  It
turns out this issue also affects function template-id equivalence as
with g<TT> in the second testcase, so it makes sense to relax TEMPLATE_DECL
equivalence more generally in cp_tree_equal.  In passing this patch
improves ctp_hasher::hash for CTAD placeholders, so that they don't
all get the same hash.

PR c++/112737

gcc/cp/ChangeLog:

* pt.cc (iterative_hash_template_arg) <case TEMPLATE_DECL>:
Adjust hashing to match cp_tree_equal.
(ctp_hasher::hash): Also hash CLASS_PLACEHOLDER_TEMPLATE.
* tree.cc (cp_tree_equal) <case TEMPLATE_DECL>: Return true
for ttp TEMPLATE_DECLs if their TEMPLATE_TEMPLATE_PARMs are
equivalent.
* typeck.cc (structural_comptypes) <case TEMPLATE_TYPE_PARM>:
Use cp_tree_equal to compare CLASS_PLACEHOLDER_TEMPLATE.

gcc/testsuite/ChangeLog:

* g++.dg/template/ttp42.C: New test.
* g++.dg/template/ttp43.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

AVR: Tabify avr.cc

gcc/
* config/avr/avr.cc: Tabify.

middle-end: Fix ICE in poly-int.h due to SLP.

Adds a check to ensure that the input vector arguments
to a function are not variable length. Previously, only the
output vector of a function was checked.

The ICE in question is within the neon-sve-bridge.c test,
and is related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111268

gcc/ChangeLog:
PR tree-optimization/111268
* tree-vect-slp.cc (vectorizable_slp_permutation_1):
Add variable-length check for vector input arguments
to a function.

libstdc++: Do not use def-file-line for each macro in <bits/version.h>

These line markers are not needed, because searching <bits/version.def>
for a macro name works fine. Removing them means that small changes to
<bits/version.def> do not result in large diffs to <bits/version.h>
because of all the changed line numbers.

libstdc++-v3/ChangeLog:

* include/bits/version.tpl: Do not use def-file-line for each
macro being defined.
* include/bits/version.h: Regenerate.

libstdc++: Update expected error for debug/constexpr*_neg.cc tests

We no longer hit a __builtin_unreachable() in these tests, so we need to
update the dg-error patterns to match _Error_formatter::_M_error().

We can also remove some dg-prune-output directives matching notes saying
"in 'constexpr' expansion" because that's done globally in prune.exp.

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/copy/debug/constexpr_neg.cc: Adjust
dg-error pattern.
* testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc:
Likewise.
* testsuite/25_algorithms/equal/debug/constexpr_neg.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc:
Likewise.
* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_neg.cc:
Likewise.
* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_pred_neg.cc:
Likewise.
* testsuite/25_algorithms/upper_bound/debug/constexpr_valid_range_neg.cc:
Likewise.

libstdc++: Fix -Wdeprecated warning about implicit capture of 'this'

In C++20 it's deprecated for a [=] lambda capture to capture the 'this'
pointer. Using resize_and_overwrite with a lambda seems like overkill to
write three chars to the string anyway. Just resize the string and
overwrite the end of it directly.

libstdc++-v3/ChangeLog:

* include/experimental/internet (network_v4::to_string()):
Remove lambda and use of resize_and_overwrite.

GCN: Don't hard-code number of SGPR/VGPR/AVGPR registers

Also add 'STATIC_ASSERT's for number of SGPR/VGPR/AVGPR registers (in
'#ifndef USED_FOR_TARGET', as otherwise 'STATIC_ASSERT' isn't available).

gcc/
* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Don't
hard-code number of SGPR/VGPR/AVGPR registers.
* config/gcn/gcn.h: Add a 'STATIC_ASSERT's for number of
SGPR/VGPR/AVGPR registers.

libcpp: Stabilize the location for macros restored after PCH load [PR105608]

libcpp currently lacks the infrastructure to assign correct locations to
macros that were defined prior to loading a PCH and then restored
afterwards. While I plan to address that fully for GCC 15, this patch
improves things by using at least a valid location, even if it's not the
best one. Without this change, libcpp uses pfile->directive_line as the
location for the restored macros, but this location_t applies to the old
line map, not the one that was just restored from the PCH, so the resulting
location is unpredictable and depends on what was stored in the line maps
before. With this change, all restored macros get assigned locations at the
line of the #include that triggered the PCH restore. A future patch will
store the actual file name and line number of each definition and then
synthesize locations in the new line map pointing to the right place.

gcc/c-family/ChangeLog:

PR preprocessor/105608
* c-pch.cc (c_common_read_pch): Adjust line map so that libcpp
assigns a location to restored macros which is the same location
that triggered the PCH include.

libcpp/ChangeLog:

PR preprocessor/105608
* pch.cc (cpp_read_state): Set a valid location for restored
macros.

c++: ICE with throw inside concept [PR112437]

We crash in the loop at the end of treat_lvalue_as_rvalue_p for code
like

template <class T>
concept Throwable = requires(T x) { throw x; };

because the code assumes that we eventually reach sk_function_parms or
sk_try and bail, but in a concept we're in a sk_namespace.

We're already checking sk_try so we don't crash in a function-try-block,
but I've added a test anyway.

PR c++/112437

gcc/cp/ChangeLog:

* typeck.cc (treat_lvalue_as_rvalue_p): Bail out on sk_namespace in
the move on throw of parms loop.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-throw1.C: New test.
* g++.dg/eh/throw4.C: New test.

RISC-V: Support scheduling for sifive p600 series

Add sifive p600 series scheduler module. For more information
see https://www.sifive.com/cores/performance-p650-670.
Add sifive-p650, sifive-p670 for mcpu option will come in separate patches.

gcc/ChangeLog:

* config/riscv/riscv.md: Add "fcvt_i2f", "fcvt_f2i" type
attribute, and include sifive-p600.md.
* config/riscv/generic-ooo.md: Update type attribute.
* config/riscv/generic.md: Update type attribute.
* config/riscv/sifive-7.md: Update type attribute.
* config/riscv/sifive-p600.md: New file.
* config/riscv/riscv-cores.def (RISCV_TUNE): Add parameter.
* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type):
Add sifive_p600.
* config/riscv/riscv.cc (sifive_p600_tune_info): New.
* config/riscv/riscv.h (TARGET_SFB_ALU): Update.
* doc/invoke.texi (RISC-V Options): Add sifive-p600-series

RISC-V: Add minimal support for 7 new unprivileged extensions

The RISC-V Profiles specification here:
https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#7-new-isa-extensions

These extensions don't add any new features but
describe existing features. So this patch only adds parsing.

Za64rs: Reservation set size of 64 bytes
Za128rs: Reservation set size of 128 bytes
Ziccif: Main memory supports instruction fetch with atomicity requirement
Ziccrse: Main memory supports forward progress on LR/SC sequences
Ziccamoa: Main memory supports all atomics in A
Zicclsm: Main memory supports misaligned loads/stores
Zic64b: Cache block size isf 64 bytes

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add Za64rs, Za128rs,
Ziccif, Ziccrse, Ziccamoa, Zicclsm, Zic64b items.
* config/riscv/riscv.opt: New macro for 7 new unprivileged
extensions.
* doc/invoke.texi (RISC-V Options): Add Za64rs, Za128rs,
Ziccif, Ziccrse, Ziccamoa, Zicclsm, Zic64b extensions.
gcc/testsuite/ChangeLog:

* gcc.target/riscv/za-ext.c: New test.
* gcc.target/riscv/zi-ext.c: New test.

Link shared libasan with -z now on Solaris

g++.dg/asan/default-options-1.C FAILs on Solaris/SPARC and x86:

FAIL: g++.dg/asan/default-options-1.C   -O0  execution test
FAIL: g++.dg/asan/default-options-1.C   -O1  execution test
FAIL: g++.dg/asan/default-options-1.C   -O2  execution test
FAIL: g++.dg/asan/default-options-1.C   -O2 -flto  execution test
FAIL: g++.dg/asan/default-options-1.C   -O2 -flto -flto-partition=none  execution test
FAIL: g++.dg/asan/default-options-1.C   -O3 -g  execution test
FAIL: g++.dg/asan/default-options-1.C   -Os  execution test

The failure is always the same:

AddressSanitizer: CHECK failed: asan_rtl.cpp:397 "((!AsanInitIsRunning() && "ASan init calls itself!")) != (0)" (0x0, 0x0) (tid=1)

This happens because libasan makes unportable assumptions about
initialization order that don't hold on Solaris.  The problem has
already been fixed in clang by

[Driver] Link shared asan runtime lib with -z now on Solaris/x86
https://reviews.llvm.org/D156325

where it was way more prevalent.

This patch applies the same fix to gcc.

Tested on i386-pc-solaris2.11 (ld and gld) and sparc-sun-solaris2.11.

2024-01-30  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc:
* config/sol2.h (LIBASAN_EARLY_SPEC): Add -z now unless
-static-libasan.  Add missing whitespace.

testsuite: i386: Fix gcc.target/i386/pr38534-1.c etc. on Solaris/x86

The gcc.target/i386/pr38534-1.c etc. tests FAIL on 32 and 64-bit
Solaris/x86:

FAIL: gcc.target/i386/pr38534-1.c scan-assembler-not push
FAIL: gcc.target/i386/pr38534-2.c scan-assembler-not push
FAIL: gcc.target/i386/pr38534-3.c scan-assembler-not push
FAIL: gcc.target/i386/pr38534-4.c scan-assembler-not push

The tests assume the Linux/x86 default of -fomit-frame-pointer, while
Solaris/x86 defaults to -fno-omit-frame-pointer.

Fixed by specifying -fomit-frame-pointer explicitly.

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

2024-01-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/pr38534-1.c: Add -fomit-frame-pointer to
dg-options.
* gcc.target/i386/pr38534-2.c: Likewise.
* gcc.target/i386/pr38534-3.c: Likewise.
* gcc.target/i386/pr38534-4.c: Likewise.

testsuite: i386: Fix gcc.target/i386/no-callee-saved-1.c etc. on Solaris/x86

The gcc.target/i386/no-callee-saved-[12].c tests FAIL on Solaris/x86:

FAIL: gcc.target/i386/no-callee-saved-1.c scan-assembler-not push
FAIL: gcc.target/i386/no-callee-saved-2.c scan-assembler-not push

In both cases, the test expect the Linux/x86 default of
-fomit-frame-pointer, while Solaris/x86 defaults to
-fno-omit-frame-pointer.

So this patch explicitly specifies -fomit-frame-pointer.

Tested on i386-pc-solaris2.11 (as and gas) and i686-pc-linux-gnu.

2024-01-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/no-callee-saved-1.c: Add -fomit-frame-pointer to
dg-options.
* gcc.target/i386/no-callee-saved-2.c: Likewise.

testsuite: i386: Fix gcc.target/i386/avx512vl-stv-rotatedi-1.c on 32-bit Solaris/x86

gcc.target/i386/avx512vl-stv-rotatedi-1.c FAILs on 32-bit Solaris/x86
since its introduction in

commit 4814b63c3c2326cb5d7baa63882da60ac011bd97
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon Jul 10 09:04:29 2023 +0100

    i386: Add AVX512 support for STV of SI/DImode rotation by constant.

FAIL: gcc.target/i386/avx512vl-stv-rotatedi-1.c scan-assembler-times vpro[lr]q 29

While the test depends on -mstv, 32-bit Solaris/x86 defaults to
-mstackrealign which is incompatible.

The patch thus specifies -mstv -mno-stackrealign explicitly.

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

2024-01-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/avx512vl-stv-rotatedi-1.c: Add -mstv
-mno-stackrealign to dg-options.

testsuite: i386: Fix gcc.target/i386/pr70321.c on 32-bit Solaris/x86

gcc.target/i386/pr70321.c FAILs on 32-bit Solaris/x86 since its
introduction in

commit 43201f2c2173894bf7c423cad6da1c21567e06c0
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon May 30 21:20:09 2022 +0100

    PR target/70321: Split double word equality/inequality after STV on x86.

FAIL: gcc.target/i386/pr70321.c scan-assembler-times mov 1

The failure happens because 32-bit Solaris/x86 defaults to
-fno-omit-frame-pointer.

Fixed by specifying -fomit-frame-pointer explicitly.

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

2024-01-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/pr70321.c: Add -fomit-frame-pointer to
dg-options.

c++: Fix g++.dg/ext/attr-section2.C etc. with Solaris/SPARC as

The new g++.dg/ext/attr-section2*.C tests FAIL on Solaris/SPARC with the
native assembler:

+FAIL: g++.dg/ext/attr-section2.C -std=c++14 scan-assembler
.(section|csect)[ \\\\t]+.foo
+FAIL: g++.dg/ext/attr-section2.C -std=c++17 scan-assembler
.(section|csect)[ \\\\t]+.foo
+FAIL: g++.dg/ext/attr-section2.C -std=c++20 scan-assembler
.(section|csect)[ \\\\t]+.foo

The problem is that the SPARC assembler requires the section name to be
double-quoted, like

        .section        ".foo%_Z3varIiE",#alloc,#write,#progbits

This patch allows for that.  At the same time, it quotes literal dots in
the REs.

Tested on sparc-sun-solaris2.11 (as and gas) and i386-pc-solaris2.11 (as
and gas).

2024-01-18  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* g++.dg/ext/attr-section2.C (scan-assembler): Quote dots.  Allow
for double-quoted section name.
* g++.dg/ext/attr-section2a.C: Likewise.
* g++.dg/ext/attr-section2b.C: Likewise.

Daily bump.

GCN: Remove 'FIRST_{SGPR,VGPR,AVGPR}_REG', 'LAST_{SGPR,VGPR,AVGPR}_REG' from machine description

They're not used there, and we avoid potentially out-of-sync definitions.

gcc/
* config/gcn/gcn.md (FIRST_SGPR_REG, LAST_SGPR_REG)
(FIRST_VGPR_REG, LAST_VGPR_REG, FIRST_AVGPR_REG, LAST_AVGPR_REG):
Don't 'define_constants'.

GCN: Remove 'SGPR_OR_VGPR_REGNO_P' definition

..., which was always (a) unused, and (b) bogus: always-false.

gcc/
* config/gcn/gcn.h (SGPR_OR_VGPR_REGNO_P): Remove.

GCN, RDNA 3: Adjust 'sync_compare_and_swap<mode>_lds_insn'

For OpenACC/GCN '-march=gfx1100', a lot of libgomp OpenACC test cases FAIL:

    /tmp/ccGfLJ8a.mkoffload.2.s:406:2: error: instruction not supported on this GPU
            ds_cmpst_rtn_b32 v0, v0, v4, v3
            ^

In RDNA 3, 'ds_cmpst_[...]' has been replaced by 'ds_cmpstore_[...]', and the
notes for 'ds_cmpst_[...]' in pre-RDNA 3 ISA manuals:

    Caution, the order of src and cmp are the *opposite* of the BUFFER_ATOMIC_CMPSWAP opcode.

..., have been resolved for 'ds_cmpstore_[...]' in the RDNA 3 ISA manual:

    In this architecture the order of src and cmp agree with the BUFFER_ATOMIC_CMPSWAP opcode.

..., and therefore '%2', '%3' now swapped with regards to GCC operand order.
Most of the affected libgomp OpenACC test cases then PASS their execution test.

gcc/
* config/gcn/gcn.md (sync_compare_and_swap<mode>_lds_insn)
[TARGET_RDNA3]: Adjust.

PR modula2/111627 defend against ICE

Although PR 111627 can be fixed by renaming testsuite modules it
highlighted that a possible ICE can occur if a malformed
implementation module is actually a program module. This small
patch defends against this ICE and checks to see whether the module
is a DefImp before testing IsDefinitionForC.

gcc/m2/ChangeLog:

PR modula2/111627
PR modula2/112506
* gm2-compiler/M2Comp.mod (Pass0CheckMod): Test IsDefImp before
checking IsDefinitionForC.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

tree-optimization/113693 - LC SSA and region VN

The following fixes LC SSA preserving with region VN which was broken
when availability checking was enhanced to treat not visited value
numbers as available. The following makes sure to honor availability
data we put in place for LC SSA preserving instead.

PR tree-optimization/113693
* tree-ssa-sccvn.cc (rpo_elim::eliminate_avail): Honor avail
data when available.

* gcc.dg/pr113693.c: New testcase.

aarch64: libgcc: Cleanup ELF marking in asm

Use aarch64-asm.h in asm code consistently, this was started in

  commit c608ada288ced0268bbbbc1fd4136f56c34b24d4
  Author:     Zac Walker <zacwalker@microsoft.com>
  CommitDate: 2024-01-23 15:32:30 +0000

  Ifdef `.hidden`, `.type`, and `.size` pseudo-ops for `aarch64-w64-mingw32` target

But that commit failed to remove some existing markings from asm files,
which means some objects got double marked with gnu property notes.

libgcc/ChangeLog:

* config/aarch64/crti.S: Remove stack marking.
* config/aarch64/crtn.S: Remove stack marking, include aarch64-asm.h
* config/aarch64/lse.S: Remove stack and GNU property markings.

gimple-low: Remove .ASAN_MARK calls on TREE_STATIC variables [PR113531]

Since the r14-1500-g4d935f52b0d5c0 commit we promote an initializer_list
backing array to static storage where appropriate, but this happens after
we decided to add it to asan_poisoned_variables.  As a result we add
unpoison/poison for it to the gimple.  But then sanopt removes the unpoison.
So the second time we call the function and want to load from the array asan
still considers it poisoned.

The following patch fixes it by removing the .ASAN_MARK internal calls
during gimple lowering if they refer to TREE_STATIC vars.

2024-02-01  Jakub Jelinek  <jakub@redhat.com>
    Jason Merrill  <jason@redhat.com>

PR c++/113531
* gimple-low.cc (lower_stmt): Remove .ASAN_MARK calls
on variables which were promoted to TREE_STATIC.

* g++.dg/asan/initlist1.C: New test.

Co-authored-by: Jason Merrill <jason@redhat.com>

PR target/113560: Enhance is_widening_mult_rhs_p.

This patch resolves PR113560, a code quality regression from GCC12
affecting x86_64, by enhancing the middle-end's tree-ssa-math-opts.cc
to recognize more instances of widening multiplications.

The widening multiplication perception code identifies cases like:

_1 = (unsigned __int128) x;
__res = _1 * 100;

but in the reported test case, the original input looks like:

_1 = (unsigned long long) x;
_2 = (unsigned __int128) _1;
__res = _2 * 100;

which gets optimized by constant folding during tree-ssa to:

_2 = x & 18446744073709551615;  // x & 0xffffffffffffffff
__res = _2 * 100;

where the BIT_AND_EXPR hides (has consumed) the extension operation.
This reveals the more general deficiency (missed optimization
opportunity) in widening multiplication perception that additionally
both

__int128 foo(__int128 x, __int128 y) {
  return (x & 1000) * (y & 1000)
}

and

unsigned __int128 bar(unsigned __int128 x, unsigned __int128) {
  return (x >> 80) * (y >> 80);
}

should be recognized as widening multiplications.  Hence rather than
test explicitly for BIT_AND_EXPR (as in the first version of this patch)
the more general solution is to make use of range information, as
provided by tree_non_zero_bits.

As a demonstration of the observed improvements, function foo above
currently with -O2 compiles on x86_64 to:

foo: movq    %rdi, %rsi
        movq    %rdx, %r8
        xorl    %edi, %edi
        xorl    %r9d, %r9d
        andl    $1000, %esi
        andl    $1000, %r8d
        movq    %rdi, %rcx
        movq    %r9, %rdx
        imulq   %rsi, %rdx
        movq    %rsi, %rax
        imulq   %r8, %rcx
        addq    %rdx, %rcx
        mulq    %r8
        addq    %rdx, %rcx
        movq    %rcx, %rdx
        ret

with this patch, GCC recognizes the *w and instead generates:

foo:    movq    %rdi, %rsi
        movq    %rdx, %r8
        andl    $1000, %esi
        andl    $1000, %r8d
        movq    %rsi, %rax
        imulq   %r8
        ret

which is perhaps easier to understand at the tree-level where

__int128 foo (__int128 x, __int128 y)
{
  __int128 _1;
  __int128 _2;
  __int128 _5;

  <bb 2> [local count: 1073741824]:
  _1 = x_3(D) & 1000;
  _2 = y_4(D) & 1000;
  _5 = _1 * _2;
  return _5;

}

gets transformed to:

__int128 foo (__int128 x, __int128 y)
{
  __int128 _1;
  __int128 _2;
  __int128 _5;
  signed long _7;
  signed long _8;

  <bb 2> [local count: 1073741824]:
  _1 = x_3(D) & 1000;
  _2 = y_4(D) & 1000;
  _7 = (signed long) _1;
  _8 = (signed long) _2;
  _5 = _7 w* _8;
  return _5;
}

2023-02-01  Roger Sayle  <roger@nextmovesoftware.com>
    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
PR target/113560
* tree-ssa-math-opts.cc (is_widening_mult_rhs_p): Use range
information via tree_non_zero_bits to check if this operand
is suitably extended for a widening (or highpart) multiplication.
(convert_mult_to_widen): Insert explicit casts if the RHS or LHS
isn't already of the claimed type.

gcc/testsuite/ChangeLog
PR target/113560
* g++.target/i386/pr113560.C: New test case.
* gcc.target/i386/pr113560.c: Likewise.
* gcc.dg/pr87954.c: Update test case.

Revert "RISC-V: Add non-vector types to dfa pipelines"

This reverts commit 26c34b809cd1a6249027730a8b52bbf6a1c0f4a8.

Revert "RISC-V: Add vector related pipelines"

This reverts commit e56fb037d9d265682f5e7217d8a4c12a8d3fddf8.

Revert "RISC-V: Use default cost model for insn scheduling"

This reverts commit 4b799a16ae59fc0f508c5931ebf1851a3446b707.

Revert "RISC-V: Enable assert for insn_has_dfa_reservation"

This reverts commit 23cd2961bd2ff63583f46e3499a07bd54491d45c.

RISC-V: Enable assert for insn_has_dfa_reservation

Enables assert that every typed instruction is associated with a
dfa reservation

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_sched_variable_issue): enable assert

RISC-V: Use default cost model for insn scheduling

Use default cost model scheduling on these test cases. All these tests
introduce scan dump failures with -mtune generic-ooo. Since the vector
cost models are the same across all three tunes, some of the tests
in PR113249 will be fixed with this patch series.

PR target/113249

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/bug-1.C: use default scheduling
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-2.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-102.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-108.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-114.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-119.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-12.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-16.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-17.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-19.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-21.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-23.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-25.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-27.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-29.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-31.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-33.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-35.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-4.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-40.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-44.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-50.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-56.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-62.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-68.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-74.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-79.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-8.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-84.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-90.c: ditto
* gcc.target/riscv/rvv/base/binop_vx_constraint-96.c: ditto
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-30.c: ditto
* gcc.target/riscv/rvv/base/pr108185-1.c: ditto
* gcc.target/riscv/rvv/base/pr108185-2.c: ditto
* gcc.target/riscv/rvv/base/pr108185-3.c: ditto
* gcc.target/riscv/rvv/base/pr108185-4.c: ditto
* gcc.target/riscv/rvv/base/pr108185-5.c: ditto
* gcc.target/riscv/rvv/base/pr108185-6.c: ditto
* gcc.target/riscv/rvv/base/pr108185-7.c: ditto
* gcc.target/riscv/rvv/base/shift_vx_constraint-1.c: ditto
* gcc.target/riscv/rvv/vsetvl/pr111037-3.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-29.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-32.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-33.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-17.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-10.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-11.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-4.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-5.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-6.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-7.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-8.c: ditto
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-9.c: ditto
* gfortran.dg/vect/vect-8.f90: ditto

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

RISC-V: Add vector related pipelines

Creates new generic vector pipeline file common to all cpu tunes.
Moves all vector related pipelines from generic-ooo to generic-vector-ooo.
Creates new vector crypto related insn reservations.

gcc/ChangeLog:

* config/riscv/generic-ooo.md (generic_ooo): Move reservation
(generic_ooo_vec_load): ditto
(generic_ooo_vec_store): ditto
(generic_ooo_vec_loadstore_seg): ditto
(generic_ooo_vec_alu): ditto
(generic_ooo_vec_fcmp): ditto
(generic_ooo_vec_imul): ditto
(generic_ooo_vec_fadd): ditto
(generic_ooo_vec_fmul): ditto
(generic_ooo_crypto): ditto
(generic_ooo_perm): ditto
(generic_ooo_vec_reduction): ditto
(generic_ooo_vec_ordered_reduction): ditto
(generic_ooo_vec_idiv): ditto
(generic_ooo_vec_float_divsqrt): ditto
(generic_ooo_vec_mask): ditto
(generic_ooo_vec_vesetvl): ditto
(generic_ooo_vec_setrm): ditto
(generic_ooo_vec_readlen): ditto
* config/riscv/riscv.md: include generic-vector-ooo
* config/riscv/generic-vector-ooo.md: New file. to here

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
Co-authored-by: Robin Dapp <rdapp.gcc@gmail.com>

RISC-V: Add non-vector types to dfa pipelines

This patch adds non-vector related insn reservations and updates/creates
new insn reservations so all non-vector typed instructions have a reservation.

gcc/ChangeLog:

* config/riscv/generic-ooo.md (generic_ooo_sfb_alu): Add reservation
(generic_ooo_branch): ditto
* config/riscv/generic.md (generic_sfb_alu): ditto
(generic_fmul_half): ditto
* config/riscv/riscv.md: Remove cbo, pushpop, and rdfrm types
* config/riscv/sifive-7.md (sifive_7_hfma):Add reservation
(sifive_7_popcount): ditto
* config/riscv/vector.md: change rdfrm to fmove
* config/riscv/zc.md: change pushpop to load/store

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

aarch64: -mstrict-align vs __arm_data512_t [PR113657]

After r14-1187-gd6b756447cd58b, simplify_gen_subreg can return
NULL for "unaligned" memory subreg. Since V8DI has an alignment of 8 bytes,
using TImode causes simplify_gen_subreg to return NULL.
This fixes the issue by using DImode instead for the loop. And then we will have
later on the STP/LDP pass combine it back into STP/LDP if needed.
Since strict align is less important (usually used for firmware and early boot only),
not doing LDP/STP here is ok.

Built and tested for aarch64-linux-gnu with no regressions.

PR target/113657

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (split for movv8di):
For strict aligned mode, use DImode instead of TImode.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/ls64_strict_align.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

analyzer: fix skipping of debug stmts [PR113253]

PR analyzer/113253 reports a case where the analyzer output varied
with and without -g enabled.

The root cause was that debug stmts were in the
FOR_EACH_IMM_USE_FAST list for SSA names, leading to the analyzer's
state purging logic differing between the -g and non-debugging cases,
and thus leading to differences in the exploration of the user's code.

Fix by skipping such stmts in the state-purging logic, and removing
debug stmts when constructing the supergraph.

gcc/analyzer/ChangeLog:
PR analyzer/113253
* region-model.cc (region_model::on_stmt_pre): Add gcc_unreachable
for debug statements.
* state-purge.cc
(state_purge_per_ssa_name::state_purge_per_ssa_name): Skip any
debug stmts in the FOR_EACH_IMM_USE_FAST list.
* supergraph.cc (supergraph::supergraph): Don't add debug stmts
to the supernodes.

gcc/testsuite/ChangeLog:
PR analyzer/113253
* gcc.dg/analyzer/deref-before-check-pr113253.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c: Fix ICE for nested enum redefinitions with/without fixed underlying type [PR112571]

Bug 112571 reports an ICE-on-invalid for cases where an enum is
defined, without a fixed underlying type, inside the enum type
specifier for a definition of that same enum with a fixed underlying
type.

The ultimate cause is attempting to access ENUM_UNDERLYING_TYPE in a
case where it is NULL. Avoid this by clearing
ENUM_FIXED_UNDERLYING_TYPE_P in thie case of inconsistent definitions.

Bootstrapped wth no regressions for x86_64-pc-linux-gnu.

PR c/112571

gcc/c/
* c-decl.cc (start_enum): Clear ENUM_FIXED_UNDERLYING_TYPE_P when
defining without a fixed underlying type an enumeration previously
declared with a fixed underlying type.

gcc/testsuite/
* gcc.dg/c23-enum-9.c, gcc.dg/c23-enum-10.c: New tests.

match: Fix vcond into conditional op folding [PR113607].

In PR113607 we see an invalid fold of

  _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... });
  vect_prephitmp_129.51_282 = _429;
  vect_iftmp.55_287 = VEC_COND_EXPR <mask_patt_209.54_286, vect_prephitmp_129.51_282, vect_cst__262>;

to

  Applying pattern match.pd:9607, gimple-match-10.cc:3817
  gimple_simplified to vect_iftmp.55_287 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... });

where we essentially use COND_SHL's else instead of VEC_COND_EXPR's.

This patch adjusts the corresponding match.pd pattern and makes it only
match when the else values are the same.

That, however, causes the exact test case for which this pattern was
introduced for to fail.  Therefore XFAIL it for now.

gcc/ChangeLog:

PR middle-end/113607

* match.pd: Make sure else values match when folding a
vec_cond into a conditional operation.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pre_cond_share_1.c: XFAIL.
* gcc.target/riscv/rvv/autovec/pr113607-run.c: New test.
* gcc.target/riscv/rvv/autovec/pr113607.c: New test.

c++: add deprecation notice for -fconcepts-ts

We plan to remove -fconcepts-ts in GCC 15 and thus remove the flag_concepts_ts
code. This note is an admonishing reminder to convert the Concepts TS
code to C++20 Concepts.

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): Add an inform saying that
-fconcepts-ts is deprecated and will be removed in GCC 15.

gcc/ChangeLog:

* doc/invoke.texi: Mention that -fconcepts-ts was deprecated in GCC 14.

Fix ICE with -g and -std=c23 when forming composite types [PR113438]

Set TYPE_STUB_DECL to an artificial decl when creating a new structure
as a composite type.

PR c/113438

gcc/c/
* c-typeck.cc (composite_type_internal): Set TYPE_STUB_DECL.

gcc/testsuite/
* gcc.dg/pr113438.c: New test.

uninit-pr108968-register.c: use __UINTPTR_TYPE__ for LLP64

Ensure sp variable is long enough by using __UINTPTR_TYPE__ for
rsp.

gcc/testsuite/ChangeLog:

* c-c++-common/analyzer/uninit-pr108968-register.c:
Use __UINTPTR_TYPE__ instead of unsigned long for LLP64.

modula2: tidyup patch

This patch improves a comment and also adds the location tokenno to
possibly exported idents as they are encountered.

gcc/m2/ChangeLog:

* gm2-compiler/M2Comp.mod (Pass0CheckMod): Tidy up comment.
* gm2-compiler/P1Build.bnf (PossiblyExportIdent): Replace
PushTF with PushTFtok.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

PR modula2/111627 Excess test fails with a case-preserving-case-insensitive source tree

This patch renames gm2 testsuite modules whose names conflict with library
modules. The conflict is not seen on case preserving case sensitive file
systems.

gcc/testsuite/ChangeLog:

PR modula2/111627
* gm2/pim/pass/stdio.mod: Moved to...
* gm2/pim/pass/teststdio.mod: ...here.
* gm2/pim/run/pass/builtins.mod: Moved to...
* gm2/pim/run/pass/testbuiltins.mod: ...here.
* gm2/pim/run/pass/math.mod: Moved to...
* gm2/pim/run/pass/testmath.mod: ...here.
* gm2/pim/run/pass/math2.mod: Moved to...
* gm2/pim/run/pass/testmath2.mod: ...here.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

AArch64: relax cbranch tests to accepted inverted branches [PR113502]

Recently something in the midend had started inverting the branches by inverting
the condition and the branches.

While this is fine, it makes it hard to actually test.  In RTL I disable
scheduling and BB reordering to prevent this.  But in GIMPLE there seems to be
nothing I can do.  __builtin_expect seems to have no impact on the change since
I suspect this is happening during expand where conditions can be flipped
regardless of probability during compare_and_branch.

Since the mid-end has plenty of correctness tests, this weakens the backend
tests to just check that a correct looking sequence is emitted.

gcc/testsuite/ChangeLog:

PR testsuite/113502
* gcc.target/aarch64/sve/vect-early-break-cbranch.c: Ignore exact branch.
* gcc.target/aarch64/vect-early-break-cbranch.c: Likewise.

hwasan: Remove testsuite check for a complaint message [PR112644]

With recent updates to hwasan runtime libraries, the error reporting for
this particular check is has been reworked.

I would question why it has lost this message. To me it looks strange
that num_descriptions_printed is incremented whenever we call
PrintHeapOrGlobalCandidate whether that function prints anything or not.
(See PrintAddressDescription in libsanitizer/hwasan/hwasan_report.cpp).

The message is no longer printed because we increment this
num_descriptions_printed variable indicating that we have found some
description.

I would like to question this upstream, but it doesn't look that much of
a problem and if pressed for time we should just change our testsuite.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

gcc/testsuite/ChangeLog:

PR sanitizer/112644
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: Update testcase.

hwasan: instrument new memory and string functions [PR112644]

Recent libhwasan updates[1] intercept various string and memory functions.
These functions have checking in them, which means there's no need to
inline the checking.

This patch marks said functions as intercepted, and adjusts a testcase
to handle the difference.  It also looks for HWASAN in a check in
expand_builtin.  This check originally is there to avoid using expand to
inline the behaviour of builtins like memset which are intercepted by
ASAN and hence which we rely on the function call staying as a function
call.  With the new reliance on function calls in HWASAN we need to do
the same thing for HWASAN too.

HWASAN and ASAN don't seem to however instrument the same functions.

Looking into libsanitizer/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc
it looks like the common ones are memset, memmove and memcpy.

The rest of the routines for asan seem to be defined in
compiler-rt/lib/asan/asan_interceptors.h however compiler-rt/lib/hwasan/
does not have such a file but it does have
compiler-rt/lib/hwasan/hwasan_platform_interceptors.h which it looks like is
forcing off everything but memset, memmove, memcpy, memcmp and bcmp.

As such I've taken those as the final list that hwasan currently supports.
This also means that on future updates this list should be cross checked.

[1] https://discourse.llvm.org/t/hwasan-question-about-the-recent-interceptors-being-added/75351

gcc/ChangeLog:

PR sanitizer/112644
* asan.h (asan_intercepted_p): Incercept memset, memmove, memcpy and
memcmp.
* builtins.cc (expand_builtin): Include HWASAN when checking for
builtin inlining.

gcc/testsuite/ChangeLog:

PR sanitizer/112644
* c-c++-common/hwasan/builtin-special-handling.c: Update testcase.

Co-Authored-By: Matthew Malcomson <matthew.malcomson@arm.com>

libsanitizer: Sync fixes for asan interceptors from upstream

This cherry-picks and squashes the differences between commits

d3e5c20ab846303874a2a25e5877c72271fc798b..76e1e45922e6709392fb82aac44bebe3dbc2ea63
from LLVM upstream from compiler-rt/lib/hwasan/ to GCC on the changes relevant
for GCC.

This is required to fix the linked PR.

As mentioned in the PR the last sync brought in a bug from upstream[1] where
operations became non-recoverable and as such the tests in AArch64 started
failing. This cherry picks the fix and there are minor updates needed to GCC
after this to fix the cases.

[1] https://github.com/llvm/llvm-project/pull/74000

PR sanitizer/112644
Cherry-pick llvm-project revision
672b71cc1003533460a82f06b7d24fbdc02ffd58,
5fcf3bbb1acfe226572474636714ede86fffcce8,
3bded112d02632209bd55fb28c6c5c234c23dec3 and
76e1e45922e6709392fb82aac44bebe3dbc2ea63.

middle-end/110176 - wrong zext (bool) <= (int) 4294967295u folding

The following fixes a wrong pattern that didn't match the behavior
of the original fold_widened_comparison in that get_unwidened
returned a constant always in the wider type. But here we're
using (int) 4294967295u without the conversion applied. Fixed
by doing as earlier in the pattern - matching constants only
if the conversion was actually applied.

PR middle-end/110176
* match.pd (zext (bool) <= (int) 4294967295u): Make sure
to match INTEGER_CST only without outstanding conversion.

* gcc.dg/torture/pr110176.c: New testcase.

aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

The PR shows us ICEing due to an unrecognizable TFmode save emitted by
aarch64_process_components.  The problem is that for T{I,F,D}mode we
conservatively require mems to be in range for x-register ldp/stp.  That
is because (at least for TImode) it can be allocated to both GPRs and
FPRs, and in the GPR case that is an x-reg ldp/stp, and the FPR case is
a q-register load/store.

As Richard pointed out in the PR, aarch64_get_separate_components
already checks that the offsets are suitable for a single load, so we
just need to choose a mode in aarch64_reg_save_mode that gives the full
q-register range.  In this patch, we choose V16QImode as an alternative
16-byte "bag-of-bits" mode that doesn't have the artificial range
restrictions imposed on T{I,F,D}mode.

For T{F,D}mode in GCC 15 I think we could consider relaxing the
restriction imposed in aarch64_classify_address, as typically T{F,D}mode
should be allocated to FPRs.  But such a change seems too invasive to
consider for GCC 14 at this stage (let alone backports).

Fortunately the new flexible load/store pair patterns in GCC 14 allow
this mode change to work without further changes.  The backports are
more involved as we need to adjust the load/store pair handling to cater
for V16QImode in a few places.

Note that for the testcase we are relying on the torture options to add
-funroll-loops at -O3 which is necessary to trigger the ICE on trunk
(but not on the 13 branch).

gcc/ChangeLog:

PR target/111677
* config/aarch64/aarch64.cc (aarch64_reg_save_mode): Use
V16QImode for the full 16-byte FPR saves in the vector PCS case.

gcc/testsuite/ChangeLog:

PR target/111677
* gcc.target/aarch64/torture/pr111677.c: New test.

testsuite: i386: Disable .eh_frame in gcc.target/i386/auto-init-5.c etc.

The gcc.target/i386/auto-init-5.c and gcc.target/i386/auto-init-6.c
tests FAIL on 64-bit Solaris/x86 with the native assembler:

FAIL: gcc.target/i386/auto-init-5.c scan-assembler-times \\\\.long\\t0 14
FAIL: gcc.target/i386/auto-init-6.c scan-assembler-times long\\t0 8

/bin/as doesn't fully support the CFI directives, so the .eh_frame
sections are emitted directly and contain .long. Since .eh_frame
doesn't matter for those tests, this patch disables its generation in
the first place.

Tested on i386-pc-solaris2.11 (as and gas) and i686-pc-linux-gnu.

2024-01-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/auto-init-5.c: Add
-fno-asynchronous-unwind-tables to dg-options.
* gcc.target/i386/auto-init-6.c: Likewise.

tree-optimization/111444 - avoid insertions when skipping defs

The following avoids inserting expressions for IPA CP discovered
equivalences into the VN hashtables when we are optimistically
skipping may-defs in the attempt to prove it's redundant.

PR tree-optimization/111444
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Do not use
vn_reference_lookup_2 when optimistically skipping may-defs.

* gcc.dg/torture/pr111444.c: New testcase.

testsuite: Require ucn in g++.dg/cpp0x/udlit-extended-id-1.C

g++.dg/cpp0x/udlit-extended-id-1.C FAILs on Solaris/SPARC and x86 with
the native assembler:

UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++14 compilation failed to produce executable
FAIL: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++17 (test for excess errors)
UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++17 compilation failed to produce executable
FAIL: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++20 (test for excess errors)
UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++20 compilation failed to produce executable

/bin/as doesn't support UCN identifiers:

/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: invalid character (0xcf)
/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: invalid character (0x80)
/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: statement syntax
/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: statement syntax
[...]

To avoid this, this patch requires ucn support.

Tested on i386-pc-solaris2.11 (as and gas), sparc-sun-solaris2.11 (as
and gas), and i686-pc-linux-gnu.

2024-01-30  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* g++.dg/cpp0x/udlit-extended-id-1.C: Require ucn support.

tree-optimization/113630 - invalid code hoisting

The following avoids code hoisting (but also PRE insertion) of
expressions that got value-numbered to another one that are not
a valid replacement (but still compute the same value). This time
because the access path ends in a structure with different size,
meaning we consider a related access as not trapping because of the
size of the base of the access.

PR tree-optimization/113630
* tree-ssa-pre.cc (compute_avail): Avoid registering a
reference with a representation with not matching base
access size.

* gcc.dg/torture/pr113630.c: New testcase.

simplify-rtx: Fix up last argument to simplify_gen_unary [PR113656]

When simplifying e.g. (float_truncate:SF (float_truncate:DF (reg:XF))
or (float_truncate:SF (float_extend:XF (reg:DF)) etc. into
(float_truncate:SF (reg:XF)) or (float_truncate:SF (reg:DF)) we call
simplify_gen_unary with incorrect op_mode argument, it should be
the argument's mode, but we call it with the outer mode instead.
As these are all floating point operations, the argument always
has non-VOIDmode and so we can just use that mode (as done in similar
simplifications a few lines later), but neither FLOAT_TRUNCATE nor
FLOAT_EXTEND are operations that should have the same modes of operand
and result. This bug hasn't been a problem for years because normally
op_mode is used only if the mode of op is VOIDmode, otherwise it is
redundant, but r10-2139 added an assertion in some spots that op_mode
is right even in such cases.

2024-01-31 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/113656
* simplify-rtx.cc (simplify_context::simplify_unary_operation_1)
<case FLOAT_TRUNCATE>: Fix up last argument to simplify_gen_unary.

* gcc.target/i386/pr113656.c: New test.

dwarf2out: Fix ICE on large _BitInt in loc_list_from_tree_1 [PR113637]

This spot uses SCALAR_INT_TYPE_MODE which obviously ICEs for large/huge
BITINT_TYPE types which have BLKmode. But such large BITINT_TYPEs certainly
don't fit into DWARF2_ADDR_SIZE either, so we can just assume it would be
false if type has BLKmode.

2024-01-31 Jakub Jelinek <jakub@redhat.com>

PR debug/113637
* dwarf2out.cc (loc_list_from_tree_1): Assume integral types
with BLKmode are larger than DWARF2_ADDR_SIZE.

* gcc.dg/bitint-80.c: New test.

lower-bitint: Fix up VIEW_CONVERT_EXPR handling in handle_operand_addr [PR113639]

Yet another spot where we need to treat VIEW_CONVERT_EXPR differently
from NOP_EXPR/CONVERT_EXPR.

2024-01-31 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113639
* gimple-lower-bitint.cc (bitint_large_huge::handle_operand_addr):
For VIEW_CONVERT_EXPR set rhs1 to its operand.

* gcc.dg/bitint-79.c: New test.

libstdc++: Enable std::text_encoding for darwin and FreeBSD

The <xlocale.h> header is needed for newlocale and locale_t on these
targets.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_CHECK_TEXT_ENCODING): Use <xlocale.h> if
needed for newlocale.
* configure: Regenerate.
* src/c++26/text_encoding.cc: Use <xlocale.h>.

Reviewed-by: Iain Sandoe <iain@sandoe.co.uk>

libstdc++: Add "ASCII" as an alias for std::text_encoding::id::ASCII

As noted in LWG 4043, "ASCII" is not an alias for any known registered
character encoding, so std::text_encoding("ASCII").mib() == id::other.
Add the alias "ASCII" to the implementation-defined superset of aliases
for that encoding.

libstdc++-v3/ChangeLog:

* include/bits/text_encoding-data.h: Regenerate.
* scripts/gen_text_encoding_data.py: Add extra_aliases dict
containing "ASCII".
* testsuite/std/text_encoding/cons.cc: Check "ascii" is known.

Co-authored-by: Ewan Higgs <ewan.higgs@gmail.com>
Signed-off-by: Ewan Higgs <ewan.higgs@gmail.com>

libstdc++: Add all supported headers to lists in the manual

libstdc++-v3/ChangeLog:

* doc/xml/manual/using.xml: Update tables of supported headers.
* doc/html/*: Regenerate.

libstdc++: Fix -Wshift-count-overflow warning in std::bitset

This shift only happens if the unsigned long long type is wider than
unsigned long but the compiler warns when it sees the shift, without
caring if it's reachable.

Use the preprocessor to compare the sizes and just reuse _M_to_ulong()
if sizeof(long) == sizeof(long long).

libstdc++-v3/ChangeLog:

* include/std/bitset (_Base_bitset::_M_do_to_ullong): Avoid
-Wshift-count-overflow warning.

tree-optimization/113670 - gather/scatter to/from hard registers

The following makes sure we're not taking the address of hard
registers when vectorizing appearant gathers or scatters to/from
them.

PR tree-optimization/113670
* tree-vect-data-refs.cc (vect_check_gather_scatter):
Make sure we can take the address of the reference base.

* gcc.target/i386/pr113670.c: New testcase.

AVR: Add AVR64DU and some older devices.

gcc/
* config/avr/avr-mcus.def: Add AVR64DU28, AVR64DU32, ATA5787,
ATA5835, ATtiny64AUTO, ATA5700M322.
* doc/avr-mmcu.texi: Rebuild.

strub: drop nonaliased parm from build_ref_type_for [PR113394]

Variant type copies can't have their own alias sets any more, and it's
not like setting them affected the pointed-to objects anyway.

for  gcc/ChangeLog

PR debug/113394
* ipa-strub.cc (build_ref_type_for): Drop nonaliased.  Adjust
caller.

for  gcc/testsuite/ChangeLog

PR debug/113394
* gcc.dg/strub-internal-pr113394.c: New.

0From: Alexandre Oliva <oliva@adacore.com>

strub: introduce STACK_ADDRESS_OFFSET

Since STACK_POINTER_OFFSET is not necessarily at the boundary between
caller- and callee-owned stack, as desired by
__builtin_stack_address(), and using it as if it were or not causes
problems, introduce a new macro so that ports can define it suitably,
without modifying STACK_POINTER_OFFSET.

for gcc/ChangeLog

PR middle-end/112917
PR middle-end/113100
* builtins.cc (expand_builtin_stack_address): Use
STACK_ADDRESS_OFFSET.
* doc/extend.texi (__builtin_stack_address): Adjust.
* config/sparc/sparc.h (STACK_ADDRESS_OFFSET): Define.
* doc/tm.texi.in (STACK_ADDRESS_OFFSET): Document.
* doc/tm.texi: Rebuilt.

c: Fix ICEs casting expressions with integer constant operands to bool [PR111059, PR111911]

C front-end bugs 111059 and 111911 both report ICEs with conversions
to boolean of expressions with integer constant operands that can
appear in an integer constant expression as long as they are not
evaluated (such as division by zero).

The issue is a nested C_MAYBE_CONST_EXPR, with the inner one generated
in build_binary_op to indicate that a subexpression has been fully
folded and should not be folded again, and the outer one in
build_c_cast to indicate that the expression has integer constant
operands. To avoid the inner one from build_binary_op,
c_objc_common_truthvalue_conversion should be given an argument
properly marked as having integer constant operands rather than that
information having been removed by the caller - but because c_convert
would then also wrap a C_MAYBE_CONST_EXPR with a NOP_EXPR converting
to boolean, it seems most convenient to have
c_objc_common_truthvalue_conversion produce the NE_EXPR directly in
the desired type (boolean in this case), before generating any
C_MAYBE_CONST_EXPR there, rather than it always producing a comparison
in integer_type_node and doing a conversion to boolean in the caller.

The same issue as in those PRs also applies for conversion to enums
with a boolean fixed underlying type; that case is also fixed and
tests added for it. Note that not all the tests added failed before
the patch (in particular, the issue was specific to casts and did not
apply for implicit conversions, but some tests of those are added as
well).

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

PR c/111059
PR c/111911

gcc/c/
* c-tree.h (c_objc_common_truthvalue_conversion): Add third
argument.
* c-convert.cc (c_convert): For conversions to boolean, pass third
argument to c_objc_common_truthvalue_conversion rather than
converting here.
* c-typeck.cc (build_c_cast): Ensure arguments with integer
operands are marked as such for conversion to boolean.
(c_objc_common_truthvalue_conversion): Add third argument TYPE.

gcc/testsuite/
* gcc.c-torture/compile/pr111059-1.c,
gcc.c-torture/compile/pr111059-2.c,
gcc.c-torture/compile/pr111059-3.c,
gcc.c-torture/compile/pr111059-4.c,
gcc.c-torture/compile/pr111059-5.c,
gcc.c-torture/compile/pr111059-6.c,
gcc.c-torture/compile/pr111059-7.c,
gcc.c-torture/compile/pr111059-8.c,
gcc.c-torture/compile/pr111059-9.c,
gcc.c-torture/compile/pr111059-10.c,
gcc.c-torture/compile/pr111059-11.c,
gcc.c-torture/compile/pr111059-12.c,
gcc.c-torture/compile/pr111911-1.c,
gcc.c-torture/compile/pr111911-2.c: New tests.

analyzer: handle null "var" in state_change_event::get_desc [PR113509]

Avoid ICE with -fanalyzer-verbose-state-changes when
region_model::get_representative_tree returns nullptr in
state_change_event::get_desc.

gcc/analyzer/ChangeLog:
PR analyzer/113509
* checker-event.cc (state_change_event::get_desc): Don't assume
"var" is non-NULL.

gcc/testsuite/ChangeLog:
PR analyzer/113509
* c-c++-common/analyzer/stdarg-pr113509.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

RISC-V: Fix VSETLV PASS compile-time issue

The compile time issue was discovered in SPEC 2017 wrf:

Use time and -ftime-report to analyze the profile data of SPEC 2017 wrf compilation .

Before this patch (Lazy vsetvl):

scheduling                         : 121.89 ( 15%)   0.53 ( 11%) 122.72 ( 15%)    13M (  1%)
machine dep reorg                  : 424.61 ( 53%)   1.84 ( 37%) 427.44 ( 53%)  5290k (  0%)
real    13m27.074s
user    13m19.539s
sys     0m5.180s

Simple vsetvl:

machine dep reorg                  :   0.10 (  0%)   0.00 (  0%)   0.11 (  0%)  4138k (  0%)
real    6m5.780s
user    6m2.396s
sys     0m2.373s

The machine dep reorg is the compile time of VSETVL PASS (424 seconds) which counts 53% of
the compilation time, spends much more time than scheduling.

After investigation, the critical patch of VSETVL pass is compute_lcm_local_properties which
is called every iteration of phase 2 (earliest fusion) and phase 3 (global lcm).

This patch optimized the codes of compute_lcm_local_properties to reduce the compilation time.

After this patch:

scheduling                         : 117.51 ( 27%)   0.21 (  6%) 118.04 ( 27%)    13M (  1%)
machine dep reorg                  :  80.13 ( 18%)   0.91 ( 26%)  81.26 ( 18%)  5290k (  0%)
real    7m25.374s
user    7m20.116s
sys     0m3.795s

The optimization of this patch is very obvious, lazy VSETVL PASS: 424s (53%) -> 80s (18%) which
spend less time than scheduling.

Tested on both RV32 and RV64 no regression.  Ok for trunk ?

PR target/113495

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (extract_single_source): Remove.
(pre_vsetvl::compute_vsetvl_def_data): Fix compile time issue.
(pre_vsetvl::compute_transparent): New function.
(pre_vsetvl::compute_lcm_local_properties): Fix compile time time issue.

Daily bump.

i386: Add "Ws" constraint for symbolic address/label reference [PR105576]

Printing the raw symbol is useful in inline asm (e.g. in C++ to get the
mangled name). Similar constraints are available in other targets (e.g.
"S" for aarch64/riscv, "Cs" for m68k).

There isn't a good way for x86 yet, e.g. "i" doesn't work for
PIC/-mcmodel=large. This patch adds "Ws". Here are possible use cases:

```
namespace ns { extern int var; }
asm (".pushsection .xxx,\"aw\"; .dc.a %0; .popsection" :: "Ws"(&var));
asm (".reloc ., BFD_RELOC_NONE, %0" :: "Ws"(&var));
```

gcc/ChangeLog:

PR target/105576
* config/i386/constraints.md: Define constraint "Ws".
* doc/md.texi: Document it.

gcc/testsuite/ChangeLog:

PR target/105576
* gcc.target/i386/asm-raw-symbol.c: New testcase.

c++: avoid -Wdangling-reference for std::span-like classes [PR110358]

Real-world experience shows that -Wdangling-reference triggers for
user-defined std::span-like classes a lot.  We can easily avoid that
by considering classes like

    template<typename T>
    struct Span {
      T* data_;
      std::size len_;
    };

to be std::span-like, and not warning for them.  Unlike the previous
patch, this one considers a non-union class template that has a pointer
data member and a trivial destructor as std::span-like.

PR c++/110358
PR c++/109640

gcc/cp/ChangeLog:

* call.cc (reference_like_class_p): Don't warn for std::span-like
classes.

gcc/ChangeLog:

* doc/invoke.texi: Update -Wdangling-reference description.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference18.C: New test.
* g++.dg/warn/Wdangling-reference19.C: New test.
* g++.dg/warn/Wdangling-reference20.C: New test.

xtensa: Make full transition to LRA

gcc/ChangeLog:

* config/xtensa/constraints.md (R, T, U):
Change define_constraint to define_memory_constraint.
* config/xtensa/predicates.md (move_operand): Don't check that a
constant pool operand size is a multiple of UNITS_PER_WORD.
* config/xtensa/xtensa.cc
(xtensa_lra_p, TARGET_LRA_P): Remove.
(xtensa_emit_move_sequence): Remove "if (reload_in_progress)"
clause as it can no longer be true.
(fixup_subreg_mem): Drop function.
(xtensa_output_integer_literal_parts): Consider 16-bit wide
constants.
(xtensa_legitimate_constant_p): Add short-circuit path for
integer load instructions. Don't check that mode size is
at least UNITS_PER_WORD.
* config/xtensa/xtensa.md (movsf): Use can_create_pseudo_p()
rather reload_in_progress and reload_completed.
(doloop_end): Drop operand 2.
(movhi_internal): Add alternative loading constant from a
literal pool.
(define_split for DI register_operand): Don't limit to
!TARGET_AUTO_LITPOOLS.
* config/xtensa/xtensa.opt (mlra): Change to no effect.

c++: add original testcase [PR67898]

The original testcase from this PR (fixed by r14-8291) seems rather
different from the others, so let's add it to the testsuite.

PR c++/67898

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/temp_default8.C: New test.

testsuite: fix anon6 mangling [PR112846]

As with r14-6796-g2fa122cae50cd8, avoid mangling compatibility aliases in
mangling tests, and test the new mangling.

PR c++/112846

gcc/testsuite/ChangeLog:

* g++.dg/abi/anon6.C: Specify ABI v18.
* g++.dg/abi/anon6a.C: New test for ABI v19.

testsuite: mangle-reparm1a options [PR113451]

When I added -fabi-compat-version=8 to avoid mangling aliases it also
suppressed the -Wabi warning.

PR c++/113451

gcc/testsuite/ChangeLog:

* g++.dg/abi/mangle-regparm1a.C: Use -Wabi=0.

c++: duplicated side effects of xobj arg [PR113640]

We miscompile the below testcase because keep_unused_object_arg thinks
the object argument of an xobj member function is unused, and so it ends
up duplicating the argument's side effects.

PR c++/113640

gcc/cp/ChangeLog:

* call.cc (keep_unused_object_arg): Punt for an xobj member
function.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-lambda14.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: unifying integer parm with type-dep arg [PR113644]

Here when trying to unify P=42 A=T::value we ICE due to the latter's
empty type, which same_type_p dislikes.

PR c++/113644

gcc/cp/ChangeLog:

* pt.cc (unify) <case INTEGER_CST>: Handle NULL_TREE type.

gcc/testsuite/ChangeLog:

* g++.dg/template/nontype30.C: New test.

libstdc++: Fix check in testsuite/std/time/clock/gps/io.cc

The test_format() function contained an incorrect assertion but wasn't
actually being called from main.

libstdc++-v3/ChangeLog:

* testsuite/std/time/clock/gps/io.cc: Fix expected result in
assertion and call test_format() from main.

RISC-V: Bugfix for vls mode aggregated in GPR calling convention

According to the issue as below.

https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416

When the mode size of vls integer mode is less than 2 * XLEN, we will
take the gpr for both the args and the return values. Instead of the
reference. For example the below code:

typedef short v8hi __attribute__ ((vector_size (16)));

v8hi __attribute__((noinline))
add (v8hi a, v8hi b)
{
  v8hi r = a + b;
  return r;
}

Before this patch:
add:
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v1,0(a1) <== arg by reference
  vle16.v  v2,0(a2) <== arg by reference
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(a0) <== return by reference
  ret

After this patch:
add:
  addi     sp,sp,-32
  sd       a0,0(sp)  <== arg by register a0 - a3
  sd       a1,8(sp)
  sd       a2,16(sp)
  sd       a3,24(sp)
  addi     a5,sp,16
  vsetivli zero,8,e16,m1,ta,ma
  vle16.v  v2,0(sp)
  vle16.v  v1,0(a5)
  vadd.vv  v1,v1,v2
  vse16.v  v1,0(sp)
  ld       a0,0(sp)  <== return by a0 - a1.
  ld       a1,8(sp)
  addi     sp,sp,32
  jr       ra

For vls floating point, we take the same rules as integer and passed by
the gpr or reference.

The riscv regression passed for this patch.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_vls_mode_aggregate_gpr_count): New function to
calculate the gpr count required by vls mode.
(riscv_v_vls_to_gpr_mode): New function convert vls mode to gpr mode.
(riscv_pass_vls_aggregate_in_gpr): New function to return the rtx of gpr
for vls mode.
(riscv_get_arg_info): Add vls mode handling.
(riscv_pass_by_reference): Return false if arg info has no zero gpr count.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add new helper macro.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-9.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-run-6.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

analyzer: fix -Wanalyzer-allocation-size false +ve on Linux kernel's round_up macro [PR113654]

gcc/analyzer/ChangeLog:
PR analyzer/113654
* region-model.cc (is_round_up): New.
(is_multiple_p): New.
(is_dubious_capacity): New.
(region_model::check_region_size): Move usage of size_visitor into
is_dubious_capacity.

gcc/testsuite/ChangeLog:
PR analyzer/113654
* c-c++-common/analyzer/allocation-size-pr113654-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: add SARIF property bag to -Wanalyzer-allocation-size

This is useful for debugging the analyzer.

gcc/analyzer/ChangeLog:
* region-model.cc
(dubious_allocation_size::dubious_allocation_size): Add
"capacity_sval" param. Drop unused ctor.
(dubious_allocation_size::maybe_add_sarif_properties): New.
(dubious_allocation_size::m_capacity_sval): New field.
(region_model::check_region_size): Pass capacity svalue to
dubious_allocation_size ctor.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

gccrs: Fix output line ending patterns.

gcc/testsuite/ChangeLog:

* rust/execute/torture/builtin_macros1.rs: Fix output pattern.
* rust/execute/torture/coercion3.rs: Likewise.
* rust/execute/torture/issue-2080.rs: Likewise.
* rust/execute/torture/issue-2179.rs: Likewise.
* rust/execute/torture/issue-2180.rs: Likewise.
* rust/execute/torture/iter1.rs: Likewise.

gccrs: Remove TraitImplItem

gcc/rust/ChangeLog:

* ast/rust-ast-full-decls.h
(class TraitImplItem): Remove forward declaration.
(class AssociatedItem): Add forward declaration.
* ast/rust-ast.h
(class TraitImplItem): Remove.
(class TraitItem): Inherit from AssociatedItem.
(SingleASTNode::take_trait_impl_item):
Return std::unique_ptr<AssociatedItem> instead of
std::unique_ptr<TraitImplItem>.
* ast/rust-item.h
(class Function): Inherit from AssociatedItem instead of
TraitImplItem.
(class TypeAlias): Likewise.
(class ConstantItem): Likewise.
(class TraitImpl): Store items as AssociatedItem.
* expand/rust-derive-clone.cc
(DeriveClone::clone_fn): Return std::unique_ptr<AssociatedItem>.
(DeriveClone::clone_impl): Take std::unique_ptr<AssociatedItem>.
* expand/rust-derive-clone.h
(DeriveClone::clone_fn): Return std::unique_ptr<AssociatedItem>.
(DeriveClone::clone_impl): Take std::unique_ptr<AssociatedItem>.
* expand/rust-expand-visitor.cc
(ExpandVisitor::visit): Handle changes to
SingleASTNode::take_trait_impl_item.
* parse/rust-parse-impl.h
(Parser::parse_impl): Parse TraitImpl as containing AssociatedItem.
(Parser::parse_trait_impl_item): Return
std::unique_ptr<AssociatedItem>.
(Parser::parse_trait_impl_function_or_method): Likewise.
* parse/rust-parse.h
(Parser::parse_trait_impl_item): Return
std::unique_ptr<AssociatedItem>.
(Parser::parse_trait_impl_function_or_method): Likewise.

Signed-off-by: Owen Avery <powerboat9.gamer@gmail.com>

gccrs: Add improved error when no fields in initializer

If a struct type with a variant that has fields is initialized with some fields the expression HIR StructExprStructFields is checked that all the fields are assigned. However, if no fields are initialized the HIR StructExprStruct is generated. This doesn't check if the struct is a unit during typechekc and only fails at the compile stage with a ICE.

Add a check at the typecheck stage that makes sure the struct does not have a variant with fields and give an error message based on the rustc one.

We have also updated the message given in the case where one field was present to list the missing fields and match more closely the new message.

gcc/rust/ChangeLog:

* typecheck/rust-hir-type-check-expr.cc (TypeCheckExpr::visit) Add additional check
* typecheck/rust-hir-type-check-struct-field.h: A helper method to make error added
* typecheck/rust-hir-type-check-struct.cc (TypeCheckStructExpr::resolve) Update message

gcc/testsuite/ChangeLog:

* rust/compile/missing_constructor_fields.rs: Added case with no initializers

Signed-off-by: Robert Goss <goss.robert@gmail.com>

gccrs: Test: check implemented for lifetime handling

gcc/testsuite/ChangeLog:

* rust/compile/for_lifetimes.rs: New test.

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>

gccrs: AST: Fix for lifetime lowering

gcc/rust/ChangeLog:

* hir/rust-ast-lower-type.cc (ASTLoweringTypeBounds::visit): fix for lifetimes
(ASTLowerWhereClauseItem::visit): fix for lifetimes

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>

gccrs: AST: Fix for lifetime parsing

gcc/rust/ChangeLog:

* parse/rust-parse-impl.h (Parser::parse_where_clause): fix parsing
(Parser::parse_where_clause_item): fix parsing
(Parser::parse_type_bound_where_clause_item): fix parsing
(Parser::parse_trait_bound): fix parsing
* parse/rust-parse.h: fix parsing

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>

gccrs: Test: fix missing lifetime in a test

This test did not compile with rustc.

gcc/testsuite/ChangeLog:

* rust/compile/torture/utf8_identifiers.rs: add mising lifetime

Signed-off-by: Jakub Dupak <dev@jakubdupak.com>

gccrs: Added newline to get more readable lexdump

Fixes #2783

gcc/rust/ChangeLog:

* lex/rust-lex.cc (Lexer::dump_and_skip):
Changed " " to '\n'

Signed-off-by: Kushal Pal <kushalpal109@gmail.com>