gcc.gnu.org Git - gcc.git/log

libstdc++-v3: do not duplicate some math functions when using newlib

When running the libstdc++ testsuite on AArch64 RTEMS, we noticed
that about 25 tests are failing during the link, due to the "sqrtl"
function being defined twice:
  - once inside RTEMS' libm;
  - once inside our libstdc++.

One test that fails, for instance, would be 26_numerics/complex/13450.cc.

In comparing libm and libstdc++, we found that libstc++ also
duplicates "hypotf", and "hypotl".

For "sqrtl" and "hypotl", the symbosl come a unit called
from math_stubs_long_double.cc, while "hypotf" comes from
the equivalent unit for the float version, called math_stubs_float.cc.
Those units are always compiled in libstdc++ and provide our own
version of various math routines when those are missing from
the target system. The definition of those symbols is predicated
on the existance of various macros provided by c++config.h, which
themselves are predicated by the corresponding HAVE_xxx macros
in config.h.

One key element behind what's happening, here, is that the target
uses newlib, and therefore GCC was configured --with-newlib.
The section of libstdc++v3's configure script that handles which math
functions are available has a newlib-specific section, and that
section provides a hardcoded list of symbols.

For "hypotf", this commit fixes the issue by doing the same
as for the other routines already declared in that section.
I verified by inspection in the newlib code that this function
should always be present, so hardcoding it in our configure
script should not be an issue.

For the math routines handling doubles ("sqrtl" and "hypotl"),
however, I do not believe we can assume that newlib's libm
will always provide them. Therefore, this commit fixes that
part of the issue by ading a compile-check for "sqrtl" and "hypotl".
And while at it, we also include checks for all the other math
functions that math_stubs_long_double.cc re-implements, allowing
us to be resilient to future newlib enhancements adding support
for more functions.

libstdc++-v3/ChangeLog:

* configure.ac ["x${with_newlib}" = "xyes"]: Define
HAVE_HYPOTF.  Add compile-checks for various long double
math functions as well.
* configure: Regenerate.

c: add name hints to c_parser_declspecs [PR107583]

PR c/107583 notes that we weren't issuing a hint for

  struct foo {
    time_t mytime; /* missing <time.h> include should trigger fixit */
  };

in the C frontend.

The root cause is that one of the "unknown type name" diagnostics
was missing logic to emit hints, which this patch fixes.

gcc/c/ChangeLog:
PR c/107583
* c-parser.cc (c_parser_declspecs): Add hints to "unknown type
name" error.

gcc/testsuite/ChangeLog:
PR c/107583
* c-c++-common/spellcheck-pr107583.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Daily bump.

configure: Implement --enable-host-pie

[ This is my third attempt to add this configure option.  The first
version was approved but it came too late in the development cycle.
The second version was also approved, but I had to revert it:
<https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607082.html>.
I've fixed the problem (by moving $(PICFLAG) from INTERNAL_CFLAGS to
ALL_COMPILERFLAGS).  Another change is that since r13-4536 I no longer
need to touch Makefile.def, so this patch is simplified. ]

This patch implements the --enable-host-pie configure option which
makes the compiler executables PIE.  This can be used to enhance
protection against ROP attacks, and can be viewed as part of a wider
trend to harden binaries.

It is similar to the option --enable-host-shared, except that --e-h-s
won't add -shared to the linker flags whereas --e-h-p will add -pie.
It is different from --enable-default-pie because that option just
adds an implicit -fPIE/-pie when the compiler is invoked, but the
compiler itself isn't PIE.

Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
regressions.

When building the compiler, the build process may use various in-tree
libraries; these need to be built with -fPIE so that it's possible to
use them when building a PIE.  For instance, when --with-included-gettext
is in effect, intl object files must be compiled with -fPIE.  Similarly,
when building in-tree gmp, isl, mpfr and mpc, they must be compiled with
-fPIE.

With this patch and --enable-host-pie used to configure gcc:

$ file gcc/cc1{,plus,obj,gm2} gcc/f951 gcc/lto1 gcc/cpp gcc/go1 gcc/rust1 gcc/gnat1
gcc/cc1:     ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=98e22cde129d304aa6f33e61b1c39e144aeb135e, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=859d1ea37e43dfe50c18fd4e3dd9a34bb1db8f77, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/cc1obj:  ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=1964f8ecee6163182bc26134e2ac1f324816e434, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/cc1gm2:  ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a396672c7ff913d21855829202e7b02ecf42ff4c, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/f951:    ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=59c523db893186547ac75c7a71f48be0a461c06b, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/lto1:    ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=084a7b77df7be2d63c2d4c655b5bbc3fcdb6038d, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/cpp:     ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=3503bf8390d219a10d6653b8560aa21158132168, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/go1:     ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=988cc673af4fba5dcb482f4b34957b99050a68c5, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/rust1:   ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b6a5d3d514446c4dcdee0707f086ab9b274a8a3c, for GNU/Linux 3.2.0, with debug_info, not stripped
gcc/gnat1:   ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bb11ccdc2c366fe3fe0980476bcd8ca19b67f9dc, for GNU/Linux 3.2.0, with debug_info, not stripped

I plan to add an option to link with -Wl,-z,now.

Bootstrapped on x86_64-pc-linux-gnu with --with-included-gettext
--enable-host-pie as well as without --enable-host-pie.  Also tested
on a Debian system where the system gcc was configured with
--enable-default-pie.

Co-Authored by: Iain Sandoe  <iain@sandoe.co.uk>

ChangeLog:

* configure.ac (--enable-host-pie): New check.  Set PICFLAG after this
check.
* configure: Regenerate.

c++tools/ChangeLog:

* Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
Use pic/libiberty.a if PICFLAG is set.
* configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
(--enable-host-pie): New check.
* configure: Regenerate.

fixincludes/ChangeLog:

* Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
build of libiberty if PICFLAG is set.
* configure.ac:
* configure: Regenerate.

gcc/ChangeLog:

* Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.
* doc/install.texi: Document --enable-host-pie.

gcc/ada/ChangeLog:

* gcc-interface/Make-lang.in (ALL_ADAFLAGS): Remove NO_PIE_CFLAGS.  Add
PICFLAG.  Use PICFLAG when building ada/b_gnat1.o and ada/b_gnatb.o.
* gcc-interface/Makefile.in: Use pic/libiberty.a if PICFLAG is set.
Remove NO_PIE_FLAG.

gcc/m2/ChangeLog:

* Make-lang.in: New var, GM2_PICFLAGS.  Use it.

gcc/d/ChangeLog:

* Make-lang.in: Remove NO_PIE_CFLAGS.

intl/ChangeLog:

* Makefile.in: Use @PICFLAG@ in COMPILE as well.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libcody/ChangeLog:

* Makefile.in: Pass LD_PICFLAG to LDFLAGS.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.

libcpp/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libdecnumber/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libiberty/ChangeLog:

* configure.ac: Also set shared when enable_host_pie.
* configure: Regenerate.

zlib/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

cprop_hardreg: Enable propagation of the stack pointer if possible

Propagation of the stack pointer in cprop_hardreg is currenty
forbidden in all cases, due to maybe_mode_change returning NULL.
Relax this restriction and allow propagation when no mode change is
requested.

gcc/ChangeLog:

* regcprop.cc (maybe_mode_change): Enable stack pointer
propagation.

Add another testcase for PR 110266

Since the combining of sin/cos into cexpi is depedent
on the target, this adds another testcase which had failed (earlier in
evpr rather than vrp2) that will fail on all targets rather than
ones which have sincos or C99 math functions.

Committed as obvious after a quick test.

gcc/testsuite/ChangeLog:

PR tree-optimization/110266
* gcc.c-torture/compile/pr110266.c: New test.

Check for integer only complex.

With the expanded capabilities of range-op dispatch, floating point
complex objects can appear when folding, whic they couldn't before.
In the processig for extracting integers from complex ints, make sure it
is an integer complex.

PR tree-optimization/110266
gcc/
* gimple-range-fold.cc (adjust_imagpart_expr): Check for integer
complex type.
(adjust_realpart_expr): Ditto.

gcc/testsuite/
* gcc.dg/pr110266.c: New.

libcpp: Diagnose #include after failed __has_include [PR80753]

As can be seen in the testcase, we don't diagnose #include/#include_next
of a non-existent header if __has_include/__has_include_next is done for
that header first.
The problem is that we normally error the first time some header is not
found, but in the _cpp_FFK_HAS_INCLUDE case obviously don't want to diagnose
it, just expand it to 0. And libcpp caches both successful includes and
unsuccessful ones.

The following patch fixes that by remembering that we haven't diagnosed
error when using __has_include* on it, and diagnosing it when using the
cache entry in normal mode the first time.

I think _cpp_FFK_NORMAL is the only mode in which we normally diagnose
errors, for _cpp_FFK_PRE_INCLUDE that open_file_failed isn't reached
and for _cpp_FFK_FAKE neither.

2023-06-15 Jakub Jelinek <jakub@redhat.com>

PR preprocessor/80753
libcpp/
* files.cc (struct _cpp_file): Add deferred_error bitfield.
(_cpp_find_file): When finding a file in cache with deferred_error
set in _cpp_FFK_NORMAL mode, call open_file_failed and clear the flag.
Set deferred_error in _cpp_FFK_HAS_INCLUDE mode if open_file_failed
hasn't been called.
gcc/testsuite/
* c-c++-common/missing-header-5.c: New test.

libgomp: Extend OMP_ALLOCATOR, add affinity env var doc

Support OpenMP 5.1's syntax for OMP_ALLOCATOR as well,
which permits besides predefined allocators also
predefined memspaces optionally followed by traits.

Additionally, this commit adds the previously lacking
documentation for OMP_ALLOCATOR, OMP_AFFINITY_FORMAT
and OMP_DISPLAY_AFFINITY.

libgomp/ChangeLog:

* env.c (gomp_def_allocator_envvar): New var.
(parse_allocator): Handle OpenMP 5.1 syntax.
(cleanup_env): New.
(omp_display_env): Output gomp_def_allocator_envvar
for an allocator with traits.
* libgomp.texi (OMP_ALLOCATOR, OMP_AFFINITY_FORMAT,
OMP_DISPLAY_AFFINITY): New.
* testsuite/libgomp.c/allocator-1.c: New test.
* testsuite/libgomp.c/allocator-2.c: New test.
* testsuite/libgomp.c/allocator-3.c: New test.
* testsuite/libgomp.c/allocator-4.c: New test.
* testsuite/libgomp.c/allocator-5.c: New test.
* testsuite/libgomp.c/allocator-6.c: New test.

x86/AVX512: use VMOVDDUP for broadcast to V2DF

Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on
double precision floating values - is more appropriate to use here, and
it can also result in shorter insn encodings when source is memory or
%xmm0...%xmm7, and no masking is applied (in allowing a 2-byte VEX
prefix then instead of a 3-byte one).

gcc/

* config/i386/sse.md (<avx512>_vec_dup<mode><mask_name>): Use
vmovddup.

x86: add Bk and Br to comment list B's sub-chars

gcc/

* config/i386/constraints.md: Mention k and r for B.

LoongArch: Avoid non-returning indirect jumps through $ra [PR110136]

Micro-architecture unconditionally treats a "jr $ra" as "return from subroutine",
hence doing "jr $ra" would interfere with both subroutine return prediction and
the more general indirect branch prediction.

Therefore, a problem like PR110136 can cause a significant increase in branch error
prediction rate and affect performance. The same problem exists with "indirect_jump".

gcc/ChangeLog:

PR target/110136
* config/loongarch/loongarch.md: Modify the register constraints for template
"jumptable" and "indirect_jump" from "r" to "e".

Co-authored-by: Andrew Pinski <apinski@marvell.com>

ada: Remove unused files

gcc/ada/ChangeLog:

* vxworks7-cert-rtp-base-link.spec: Removed.
* vxworks7-cert-rtp-base-link__ppc64.spec: Removed.
* vxworks7-cert-rtp-base-link__x86.spec: Removed.
* vxworks7-cert-rtp-base-link__x86_64.spec: Removed.
* vxworks7-cert-rtp-link.spec: Removed.
* vxworks7-cert-rtp-link__ppcXX.spec: Removed.

ada: Fix wrong code for ACATS cd1c03i on Morello target

gcc/ada/

* gcc-interface/utils2.cc (build_binary_op) <MODIFY_EXPR>: Do not
remove a VIEW_CONVERT_EXPR on the LHS if it is also on the RHS.

ada: Fix wrong finalization for double subtype of bounded vector

The special handling of temporaries created for return values and subject
to a renaming needs to be restricted to the top level, where it is needed
to prevent dangling references to the frame of the elaboration routine from
being created, because, at a lower level, the front-end may create implicit
renamings of objects as these temporaries, so a copy is not allowed.

gcc/ada/

* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Variable>: Restrict
the special handling of temporaries created for return values and
subject to a renaming to the top level.

ada: Make minor improvements to user's guide

gcc/ada/

* doc/gnat_ugn/about_this_guide.rst: Fix typo. Uniformize punctuation.
* doc/gnat_ugn/the_gnat_compilation_model.rst: Uniformize punctuation.
Fix capitalization. Fix indentation of code block. Fix RST formatting
syntax errors.
* gnat_ugn.texi: Regenerate.

ada: Reject Loop_Entry inside prefix of Loop_Entry

This rule was incompletely stated in SPARK RM and not checked.
This is now fixed.

gcc/ada/

* sem_attr.adb (Analyze_Attribute): Reject case of Loop_Entry
inside the prefix of Loop_Entry, as per SPARK RM 5.5.3.1(4,8).

ada: Fix too small secondary stack allocation for returned conversion

The previous fix did not address a latent issue whereby the allocation
would be made using the (static) subtype of the conversion instead of
the (dynamic) subtype of the return object, so this change rewrites the
code responsible for determining the type used for the allocation, and
also contains a small improvement to the Has_Tag_Of_Type predicate.

gcc/ada/

* exp_ch3.adb (Make_Allocator_For_Return): Rewrite the logic that
determines the type used for the allocation and add assertions.
* exp_util.adb (Has_Tag_Of_Type): Also return true for extension
aggregates.

ada: Fix internal error on loop iterator filter with -gnatVa

The problem is that the condition of the iterator filter is expanded early,
before it is integrated into an if statement of the loop body, so there is
no place to attach the actions generated by this expansion.

This happens only for simple loops, i.e. with a parameter specification, so
the fix uses the same approach for them as for loops based on iterators.

gcc/ada/

* sinfo.ads (Iterator_Filter): Document field.
* sem_ch5.adb (Analyze_Iterator_Specification): Move comment around.
(Analyze_Loop_Parameter_Specification): Only preanalyze the iterator
filter, if any.
* exp_ch5.adb (Expand_N_Loop_Statement): Analyze the new list built
when an iterator filter is present.

ada: Revert latest change to Find_Hook_Context

The issue is that, if an aggregate is both below a conditional expression
and above another conditional expression in the tree, we have currently no
place to put the finalization actions generated by the innermost expression
in the context of the aggregate before it is expanded, so they end up being
placed after the outermost expression.

But it is not clear whether that's really problematic because this does not
seem to happen for array aggregates with multiple or others choices: in this
case the aggregate is expanded first and the code path is not taken.

gcc/ada/

* exp_util.adb (Find_Hook_Context): Revert latest change.

ada: Fix too small secondary stack allocation for returned aggregate

This restores the specific treatment of aggregates that are returned through
an extended return statement in a function returning a class-wide type, and
which was incorrectly dropped in an earlier change.

gcc/ada/

* exp_ch3.adb (Make_Allocator_For_Return): Deal again specifically
with an aggregate returned through an object of a class-wide type.

ada: Remove dead code in Expand_Iterator_Loop_Over_Container

The Condition_Actions field can only be populated for while loops.

gcc/ada/

* exp_ch5.adb (Expand_Iterator_Loop_Over_Container): Do not insert
an always empty list. Remove unused parameter Isc.
(Expand_Iterator_Loop): Adjust call to above procedure.

ada: Add escape hatch to configurable run-time

Before this patch, the fact that Restrictions pragmas had to fit on
a single line in system.ads was difficult to reconcile with the
80-character line limit that is enforced in that file.

The special rules for pragmas in system.ads made it impossible to us
the Style_Checks pragma to allow long Restrictions pragmas. This patch
relaxes those rules so the Style_Checks pragma can be used in
system.ads.

gcc/ada/

* targparm.adb: Allow pragma Style_Checks in some forms.
* targparm.ads: Document new pragma permission.

ada: Fix missing finalization for aggregates nested in conditional expressions

The finalization actions for the components of the aggregates are blocked
by Expand_Ctrl_Function_Call, which sets Is_Ignored_Transient on all the
temporaries generated from within a conditional expression whatever the
intermediate constructs. Now aggregates and their expansion in the form
of block and loop statements are "impenetrable" as far as temporaries are
concerned, i.e. the lifetime of temporaries generated within them does
not extend beyond them, so their finalization must not be blocked there.

gcc/ada/

* exp_util.ads (Within_Case_Or_If_Expression): Adjust description.
* exp_util.adb (Find_Hook_Context): Stop the search for the topmost
conditional expression, if within one, at contexts where temporaries
may be contained.
(Within_Case_Or_If_Expression): Return false upon first encoutering
contexts where temporaries may be contained.

ada: Adjust QNX Ada priorities to match QNX system priorities

The Ada priority range of the QNX runtime started from 0, differing from
the QNX system priorities range starting from 1. As this may cause
confusion, especially if used in a mixed language environment, the Ada
priority range now starts at 1.

The default priority of Ada tasks as mandated is the middle of the
priority range. On QNX this means the default priority of Ada tasks is
30. This is much higher than the default QNX priority of 10 and may
cause unexpected system interruptions when Ada tasks take a lot of CPU time.

gcc/ada/

* libgnarl/s-osinte__qnx.adb: Adjust priority conversion function.
* libgnat/system-qnx-arm.ads: Adjust priority range and default
priority.

ada: Adjust comments in targparm.ads

This patch removes a few dangling references to the late front-end
implementation of exceptions from the comments of targparm.ads, and
also fixes a thinko there.

gcc/ada/

* targparm.ads: Remove references to front-end-based exceptions. Fix
thinko.

ada: Accept aspect Always_Terminates on packages

The recently added aspect Always_Terminates is now allowed on packages
and generic packages, but only when it has no arguments. The intuitive
meaning is that all subprograms declared in such a package are always
terminating.

gcc/ada/

* contracts.adb (Add_Contract_Item): Add pragma Always_Terminates to
package contract.
* sem_prag.adb (Analyze_Pragma): Accept pragma Always_Terminates on
packages and generic packages, but only when it has no arguments.

ada: Accept aspect Always_Terminates on entries

The recently added aspect Always_Terminates is allowed on both
procedures and entries.

gcc/ada/

* sem_prag.adb (Analyze_Pragma): Accept pragma Always_Terminates when
it applies to an entry.

ada: Reject aspect Always_Terminates on functions and generic functions

The recently added aspect Always_Terminates is only allowed on
procedures.

gcc/ada/

* sem_prag.adb (Analyze_Pragma): Reject pragma Always_Terminates when
it applies to a function or generic function.

ada: Fix missing error on function call returning incomplete view

Testing for the presence of Non_Limited_View is not sufficient to detect
whether the nonlimited view has been analyzed because Build_Limited_Views
always sets the field on the limited view. Instead the discriminant is
whether this nonlimited view is itself an incomplete type.

gcc/ada/

* sem_ch4.adb (Analyze_Call): Adjust the test to detect the presence
of an incomplete view of a type on a function call.

ada: Fix minor issues in comments

The package Ttypef has been removed but a reference to it was left
over in a comment. This patch removes that reference, and also fixes
a typo.

gcc/ada/

* ttypes.ads: Remove reference to Ttypef in comment. Fix typo in
comment.

ada: Remove Ttypes.Max_Unaligned_Field

This constant has been unused for ages. The corresponding getter function
is also removed from the Get_Targ package, but the corresponding constant
declared in Set_Targ is preserved for the sake of backward compatibility
of the target file format.

gcc/ada/

* get_targ.ads (Get_Max_Unaligned_Field): Delete.
* ada_get_targ.adb (Get_Max_Unaligned_Field): Likewise.
* get_targ.adb (Get_Max_Unaligned_Field): Likewise.
* set_targ.ads (Max_Unaligned_Field): Adjust comment.
* set_targ.adb: Set Max_Unaligned_Field to 1 during elaboration.
* ttypes.ads (Max_Unaligned_Field): Delete.

ada: Fix inverted implementation of RM 8.4(10) clause for operators

The comment is correct but the code implements the opposite outcome.

gcc/ada/

* sem_type.adb (Disambiguate): Fix pasto in the implementation of
the RM 8.4(10) clause for operators.

ada: Accept aspect Always_Terminates without expression

The recently added aspect Always_Terminates is now accepted without
explicit boolean expression, where a missing expression implicitly means
True, similar to aspects Async_Readers, Async_Writers, etc.

gcc/ada/

* aspects.adb
(Base_Aspect): Fix layout.
* aspects.ads
(Aspect_Argument): Expression for Always_Terminates is optional.
* sem_prag.adb
(Analyze_Always_Terminates_In_Decl_Part): Only analyze expression when
pragma argument is present.
(Analyze_Pragma): Argument for Always_Terminates is optional; fix
whitespace for Async_Readers.

ada: Crash on C++ constructor of private type

The compiler crashes compiling a function that has pragma
CPP_constructor when its return type is a private type.

gcc/ada/

* sem_util.adb
(Is_CPP_Constructor_Call): Add missing support for calls to
functions returning a private type.

ada: Remove obsolete references for Build_Transient_Object_Statements

gcc/ada/

* exp_util.ads (Build_Transient_Object_Statements): Remove obsolete
references to array and record aggregates in documentation.

ada: Fix aspect Linker_Section ignored on subprogram body

The compiler is waiting for the freeze node of the body, but it is never
generated since the freezing of the body is not delayed. The change also
removes an obsolete piece of code.

gcc/ada/

* sem_ch13.adb (Analyze_Aspect_Specifications): Add missing items
in the list of aspects handled by means of Insert_Pragma.
<Aspect_Linker_Section>: Remove obsolete code. Do not delay the
processing of the aspect if the entity is already frozen.

ada: Cleanup analysis of iterated component association

Cleanups related to analysis of iterated component association for
GNATprove.

gcc/ada/

* sem_aggr.adb
(Resolve_Array_Aggregate): Simplify comment.
(Resolve_Iterated_Component_Association): Tune comment; change variable
to constant.

LoongArch: Set default alignment for functions and labels with -mtune

The LA464 micro-architecture is sensitive to alignment of code.  The
Loongson team has benchmarked various combinations of function, the
results [1] show that 16-byte label alignment together with 32-byte
function alignment gives best results in terms of SPEC score.

Add a mtune-based table-driven mechanism to set the default of
-falign-{functions,labels}.  As LA464 is the first (and the only for
now) uarch supported by GCC, the same setting is also used for
the "generic" -mtune=loongarch64.  In the future we may set different
settings for LA{2,3,6}64 once we add the support for them.

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

gcc/ChangeLog:

* config/loongarch/loongarch-tune.h (loongarch_align): New
struct.
* config/loongarch/loongarch-def.h (loongarch_cpu_align): New
array.
* config/loongarch/loongarch-def.c (loongarch_cpu_align): Define
the array.
* config/loongarch/loongarch.cc
(loongarch_option_override_internal): Set the value of
-falign-functions= if -falign-functions is enabled but no value
is given.  Likewise for -falign-labels=.

Fix 'dg-warning' in 'c-c++-common/Wfree-nonheap-object-3.c' for C++

    [...]/c-c++-common/Wfree-nonheap-object-3.c:57:24: warning: 'malloc (dealloc_float)' attribute ignored with deallocation functions declared 'inline' [-Wattributes]
    [...]/c-c++-common/Wfree-nonheap-object-3.c:51:1: note: deallocation function declared here
    [...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_int(int)':
    [...]/c-c++-common/Wfree-nonheap-object-3.c:25:20: warning: 'void __builtin_free(void*)' called on pointer 'p' with nonzero offset 4 [-Wfree-nonheap-object]
    [...]/c-c++-common/Wfree-nonheap-object-3.c:24:24: note: returned from 'int* alloc_int(int)'
    [...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_long(int)':
    [...]/c-c++-common/Wfree-nonheap-object-3.c:45:18: warning: 'void dealloc_long(long int*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object]
    [...]/c-c++-common/Wfree-nonheap-object-3.c:44:26: note: returned from 'long int* alloc_long(int)'
    In function 'void dealloc_float(float*)',
        inlined from 'void test_nowarn_float(int)' at [...]/c-c++-common/Wfree-nonheap-object-3.c:68:19:
    [...]/c-c++-common/Wfree-nonheap-object-3.c:53:18: warning: 'void __builtin_free(void*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object]
    [...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_float(int)':
    [...]/c-c++-common/Wfree-nonheap-object-3.c:67:28: note: returned from 'float* alloc_float(int)'
    PASS: c-c++-common/Wfree-nonheap-object-3.c  -std=gnu++98  (test for warnings, line 25)
    FAIL: c-c++-common/Wfree-nonheap-object-3.c  -std=gnu++98  (test for warnings, line 45)
    PASS: c-c++-common/Wfree-nonheap-object-3.c  -std=gnu++98  (test for warnings, line 51)
    PASS: c-c++-common/Wfree-nonheap-object-3.c  -std=gnu++98  (test for warnings, line 53)
    PASS: c-c++-common/Wfree-nonheap-object-3.c  -std=gnu++98  (test for warnings, line 57)
    FAIL: c-c++-common/Wfree-nonheap-object-3.c  -std=gnu++98 (test for excess errors)
    Excess errors:
    [...]/c-c++-common/Wfree-nonheap-object-3.c:45:18: warning: 'void dealloc_long(long int*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object]

..., that is: decorated 'void dealloc_long(long int*)' instead of plain
'dealloc_long' -- similar to how all the other 'dg-warning's allow for the
decorated function signature in addition to the plain one.

This issue was latent since the test case was added in
commit fe7f75cf16783589eedbab597e6d0b8d35d7e470
"Correct/improve maybe_emit_free_warning (PR middle-end/98166, PR c++/57111, PR middle-end/98160)",
and was finally exposed by my recent
commit 9c03391ba447ff86038d6a34c90ae737c3915b5f
"Tighten 'dg-warning' alternatives in 'c-c++-common/Wfree-nonheap-object{,-2,-3}.c'".

gcc/testsuite/
* c-c++-common/Wfree-nonheap-object-3.c: Fix 'dg-warning' for C++.

middle-end, i386: Pattern recognize add/subtract with carry [PR79173]

The following patch introduces {add,sub}c5_optab and pattern recognizes
various forms of add with carry and subtract with carry/borrow, see
pr79173-{1,2,3,4,5,6}.c tests on what is matched.
Primarily forms with 2 __builtin_add_overflow or __builtin_sub_overflow
calls per limb (with just one for the least significant one), for
add with carry even when it is hand written in C (for subtraction
reassoc seems to change it too much so that the pattern recognition
doesn't work).  __builtin_{add,sub}_overflow are standardized in C23
under ckd_{add,sub} names, so it isn't any longer a GNU only extension.

Note, clang has for these (IMHO badly designed)
__builtin_{add,sub}c{b,s,,l,ll} builtins which don't add/subtract just
a single bit of carry, but basically add 3 unsigned values or
subtract 2 unsigned values from one, and result in carry out of 0, 1, or 2
because of that.  If we wanted to introduce those for clang compatibility,
we could and lower them early to just two __builtin_{add,sub}_overflow
calls and let the pattern matching in this patch recognize it later.

I've added expanders for this on ix86 and in addition to that
added various peephole2s (in preparation patches for this patch) to make
sure we get nice (and small) code for the common cases.  I think there are
other PRs which request that e.g. for the _{addcarry,subborrow}_u{32,64}
intrinsics, which the patch also improves.

Would be nice if support for these optabs was added to many other targets,
arm/aarch64 and powerpc* certainly have such instructions, I'd expect
in fact that most targets do.

The _BitInt support I'm working on will also need this to emit reasonable
code.

2023-06-15  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/79173
* internal-fn.def (UADDC, USUBC): New internal functions.
* internal-fn.cc (expand_UADDC, expand_USUBC): New functions.
(commutative_ternary_fn_p): Return true also for IFN_UADDC.
* optabs.def (uaddc5_optab, usubc5_optab): New optabs.
* tree-ssa-math-opts.cc (uaddc_cast, uaddc_ne0, uaddc_is_cplxpart,
match_uaddc_usubc): New functions.
(math_opts_dom_walker::after_dom_children): Call match_uaddc_usubc
for PLUS_EXPR, MINUS_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR unless
other optimizations have been successful for those.
* gimple-fold.cc (gimple_fold_call): Handle IFN_UADDC and IFN_USUBC.
* fold-const-call.cc (fold_const_call): Likewise.
* gimple-range-fold.cc (adjust_imagpart_expr): Likewise.
* tree-ssa-dce.cc (eliminate_unnecessary_stmts): Likewise.
* doc/md.texi (uaddc<mode>5, usubc<mode>5): Document new named
patterns.
* config/i386/i386.md (uaddc<mode>5, usubc<mode>5): New
define_expand patterns.
(*setcc_qi_addqi3_cconly_overflow_1_<mode>, *setccc): Split
into NOTE_INSN_DELETED note rather than nop instruction.
(*setcc_qi_negqi_ccc_1_<mode>, *setcc_qi_negqi_ccc_2_<mode>):
Likewise.

* gcc.target/i386/pr79173-1.c: New test.
* gcc.target/i386/pr79173-2.c: New test.
* gcc.target/i386/pr79173-3.c: New test.
* gcc.target/i386/pr79173-4.c: New test.
* gcc.target/i386/pr79173-5.c: New test.
* gcc.target/i386/pr79173-6.c: New test.
* gcc.target/i386/pr79173-7.c: New test.
* gcc.target/i386/pr79173-8.c: New test.
* gcc.target/i386/pr79173-9.c: New test.
* gcc.target/i386/pr79173-10.c: New test.

i386: Add peephole2 patterns to improve subtract with borrow with memory destination [PR79173]

This patch adds subborrow<mode> alternative so that it can have memory
destination and adds various peephole2s which help to match it.

2023-06-15 Jakub Jelinek <jakub@redhat.com>

PR middle-end/79173
* config/i386/i386.md (subborrow<mode>): Add alternative with
memory destination and add for it define_peephole2
TARGET_READ_MODIFY_WRITE/-Os patterns to prefer using memory
destination in these patterns.

i386: Add peephole2 patterns to improve add with carry or subtract with borrow with memory destination [PR79173]

This patch adds various peephole2s which help to recognize add with
carry or subtract with borrow with memory destination.

2023-06-14 Jakub Jelinek <jakub@redhat.com>

PR middle-end/79173
* config/i386/i386.md (*sub<mode>_3, @add<mode>3_carry,
addcarry<mode>, @sub<mode>3_carry, *add<mode>3_cc_overflow_1): Add
define_peephole2 TARGET_READ_MODIFY_WRITE/-Os patterns to prefer
using memory destination in these patterns.

middle-end: Move constant args folding of .UBSAN_CHECK_* and .*_OVERFLOW into fold-const-call.cc

Here is an incremental patch to handle constant folding of these
in fold-const-call.cc rather than gimple-fold.cc.
Not really sure if that is the way to go because it is replacing 28
lines of former code with 65 of new code, for the overall benefit that say
int
foo (long long *p)
{
  int one = 1;
  long long max = __LONG_LONG_MAX__;
  return __builtin_add_overflow (one, max, p);
}
can be now fully folded already in ccp1 pass while before it was only
cleaned up in forwprop1 pass right after it.

On Wed, Jun 14, 2023 at 12:25:46PM +0000, Richard Biener wrote:
> I think that's still very much desirable so this followup looks OK.
> Maybe you can re-base it as prerequesite though?

Rebased then (of course with the UADDC/USUBC handling removed from this
first patch, will be added in the second one).

2023-06-15  Jakub Jelinek  <jakub@redhat.com>

* gimple-fold.cc (gimple_fold_call): Move handling of arg0
as well as arg1 INTEGER_CSTs for .UBSAN_CHECK_{ADD,SUB,MUL}
and .{ADD,SUB,MUL}_OVERFLOW calls from here...
* fold-const-call.cc (fold_const_call): ... here.

AArch64: New RTL for ABD

This patch adds new RTL and tests for sabd and uabd

PR tree-optimization/109156

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (aarch64_<su>abd<mode>):
Rename to <su>abd<mode>3.
* config/aarch64/aarch64-sve.md (<su>abd<mode>_3): Rename
to <su>abd<mode>3.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/abd.h: New file.
* gcc.target/aarch64/abd_2.c: New test.
* gcc.target/aarch64/abd_3.c: New test.
* gcc.target/aarch64/abd_4.c: New test.
* gcc.target/aarch64/abd_none_2.c: New test.
* gcc.target/aarch64/abd_none_3.c: New test.
* gcc.target/aarch64/abd_none_4.c: New test.
* gcc.target/aarch64/abd_run_1.c: New test.
* gcc.target/aarch64/sve/abd_1.c: New test.
* gcc.target/aarch64/sve/abd_none_1.c: New test.
* gcc.target/aarch64/sve/abd_2.c: New test.
* gcc.target/aarch64/sve/abd_none_2.c: New test.

Missed opportunity to use [SU]ABD

This adds a recognition pattern for the non-widening
absolute difference (ABD).

gcc/ChangeLog:

* doc/md.texi (sabd, uabd): Document them.
* internal-fn.def (ABD): Use new optab.
* optabs.def (sabd_optab, uabd_optab): New optabs,
* tree-vect-patterns.cc (vect_recog_absolute_difference):
Recognize the following idiom abs (a - b).
(vect_recog_sad_pattern): Refactor to use
vect_recog_absolute_difference.
(vect_recog_abd_pattern): Use patterns found by
vect_recog_absolute_difference to build a new ABD
internal call.

LoongArch: Change the default value of LARCH_CALL_RATIO to 6.

During the regression testing of the LoongArch architecture GCC, it was found
that the tests in the pr90883.C file failed. The problem was modulated and
found that the error was caused by setting the macro LARCH_CALL_RATIO to a too
large value. Combined with the actual LoongArch architecture, the different
thresholds for meeting the test conditions were tested using the engineering method
(SPEC CPU 2006), and the results showed that its optimal threshold should be set
to 6.

gcc/ChangeLog:

* config/loongarch/loongarch.h (LARCH_CALL_RATIO): Modify the value
of macro LARCH_CALL_RATIO on LoongArch to make it perform optimally.

RISC-V: Use merge approach to optimize vector permutation

This patch is to optimize the permuation case that is suiteable use
merge approach.

Consider this following case:
typedef int8_t vnx16qi __attribute__((vector_size (16)));

void __attribute__ ((noipa))
merge0 (vnx16qi x, vnx16qi y, vnx16qi *out)
{
  vnx16qi v = __builtin_shufflevector ((vnx16qi) x, (vnx16qi) y, MASK_16);
  *(vnx16qi*)out = v;
}

The gimple IR:
v_3 = VEC_PERM_EXPR <x_1(D), y_2(D), { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }>;

Selector = { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }, the common expression:
{ 0, nunits + 1, 2, nunits + 3, 4, nunits + 5, ...  }

For this selector, we can use vmsltu + vmerge to optimize the codegen.

Before this patch:
merge0:
        addi    a5,sp,16
        vl1re8.v        v3,0(a5)
        li      a5,31
        vsetivli        zero,16,e8,m1,ta,mu
        vmv.v.x v2,a5
        lui     a5,%hi(.LANCHOR0)
        addi    a5,a5,%lo(.LANCHOR0)
        vl1re8.v        v1,0(a5)
        vl1re8.v        v4,0(sp)
        vand.vv v1,v1,v2
        vmsgeu.vi       v0,v1,16
        vrgather.vv     v2,v4,v1
        vadd.vi v1,v1,-16
        vrgather.vv     v2,v3,v1,v0.t
        vs1r.v  v2,0(a0)
        ret

After this patch:
merge0:
        addi    a5,sp,16
        vl1re8.v        v1,0(a5)
        lui     a5,%hi(.LANCHOR0)
        addi    a5,a5,%lo(.LANCHOR0)
        vsetivli        zero,16,e8,m1,ta,ma
        vl1re8.v        v0,0(a5)
        vl1re8.v        v2,0(sp)
        vmsltu.vi       v0,v0,16
        vmerge.vvm      v1,v1,v2,v0
        vs1r.v  v1,0(a0)
        ret

The key of this optimization is that:
1. mask = vmsltu (selector, nunits)
2. result = vmerge (op0, op1, mask)

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_merge_patterns): New pattern.
(expand_vec_perm_const_1): Add merge optmization.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: New test.

RISC-V: Ensure vector args and return use function stack to pass [PR110119]

The V2 patch address comments from Juzhe, thanks.

Hi,

The reason for this bug is that in the case where the vector register is set
to a fixed length (with `--param=riscv-autovec-preference=fixed-vlmax` option),
TARGET_PASS_BY_REFERENCE thinks that variables of type vint32m1 can be passed
through two scalar registers, but when GCC calls FUNCTION_VALUE (call function
riscv_get_arg_info inside) it returns NULL_RTX. These two functions are not
unified. The current treatment is to pass all vector arguments and returns
through the function stack, and a new calling convention for vector registers
will be added in the future.

https://github.com/riscv-non-isa/riscv-elf-psabi-doc/
https://github.com/palmer-dabbelt/riscv-elf-psabi-doc/commit/126fa719972ff998a8a239c47d506c7809aea363

Best,
Lehua

gcc/ChangeLog:
PR target/110119
* config/riscv/riscv.cc (riscv_get_arg_info): Return NULL_RTX for vector mode
(riscv_pass_by_reference): Return true for vector mode

gcc/testsuite/ChangeLog:
PR target/110119
* gcc.target/riscv/rvv/base/pr110119-1.c: New test.
* gcc.target/riscv/rvv/base/pr110119-2.c: New test.

RISC-V: Align the predictor style for define_insn_and_split

This patch is considered as the follow up of the below PATCH.

https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621347.html

We aligned the predictor style for the define_insn_and_split suggested
by Kito. To avoid potential issues before we hit.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/autovec-opt.md: Align the predictor sytle.
* config/riscv/autovec.md: Ditto.

RISC-V: Bugfix for vec_init repeating auto vectorization in RV32

When constructing a vector mask from individual elements we wrongly
assumed that we can broadcast BITS_PER_WORD (i.e. XLEN). The maximum is
actually the vector element length (i.e. ELEN). This patch fixes this.

After this patch, below failures on RV32 will be fixed.

FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/repeat_run-3.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
Take elen instead of scalar BITS_PER_WORD.
(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
instead of scaler BITS_PER_WORD.

Daily bump.

Remove MFWRAP_SPEC remnant

This patch removes a remnant of mudflap.

gcc/ChangeLog:
* config/moxie/uclinux.h (MFWRAP_SPEC): Remove

MAINTAINERS: Add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval

aarch64: Fix -Werror=sign-compare bootstrap failure

Pushing to fix bootstrap.

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold):
Fix signed comparison warning in loop from npats to enelts.

[contrib] validate_failures.py: Ignore stray filesystem paths in results

This patch simplifies comparison of results that have filesystem
paths. E.g., (assuming different values of <N>):
<cut>
Running /home/user/gcc-N/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
ERROR: tcl error sourcing /home/user/gcc-N/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp.
</cut>

We add "--srcpath <regex>", option, and set it by default to
"[^ ]+/testsuite/", which works well for all components of the GNU
Toolchain. We then remove substrings matching <regex> from paths of
.exp files and from occasional "ERROR:" results.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (TestResult,)
(ParseManifestWorker, ParseSummary, Main): Handle new option
"--srcpath <regex>".

[contrib] validate_failures.py: Add "--expiry_date YYYYMMDD" option

This option sets "today" date to compare expiration entries against.
Setting expiration date into the future allows re-detection of flaky
tests and creating fresh entries for them before the current flaky
entries expire.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (TestResult): Update.
(Main): Handle new option "--expiry_date YYYYMMDD".

[contrib] validate_failures.py: Add new option --invert_match

This option is used to detect flaky tests that FAILed in the clean
build (or manifest), but PASSed in the current build (or manifest).

The option inverts output logic similar to what "-v/--invert-match"
does for grep.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (ResultSet.update,)
(ResultSet.HasTestsuite): New methods.
(GetResults): Update.
(ParseSummary, CompareResults, PerformComparison, Main): Handle new
option --invert_match.

[contrib] validate_failures.py: Improve error output

- Print message in case of broken sum file error.
- Print error messages to stderr. The script's stdout is, usually,
redirected to a file, and error messages shouldn't go there.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (TestResult): Improve error
output.

[contrib] validate_failures.py: Support "$tool:" prefix in exp names

This makes it easier to extract the $tool:$exp pair when iterating
over failures/flaky tests, which, in turn, simplifies re-running
testsuite parts that have unexpected failures or passes.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (_EXP_LINE_FORMAT,)
(_EXP_LINE_REX, ResultSet): Support "$tool:" prefix in exp names.

[contrib] validate_failures.py: Use exit code "2" to indicate regression

... in the results. Python exits with code "1" on exceptions and
internal errors, which we use to detect failure to parse results.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (Main): Use exit code "2"
to indicate regression.

[contrib] validate_failures.py: Be more stringent in parsing result lines

Before this patch we would identify malformed line
"UNRESOLVEDTest run by tcwg-buildslave on Mon Aug 23 10:17:50 2021"
as an interesting result, only to fail in TestResult:__init__ due
to missing ":" after UNRESOLVED.

This patch makes all places that parse result lines use a single
compiled regex.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (_VALID_TEST_RESULTS_REX):
Update.
(TestResult): Use _VALID_TEST_RESULTS_REX.

[contrib] validate_failures.py: Add more verbosity levels

... to control validate_failures.py output

contrib/ChangeLog:

* testsuite-management/validate_failures.py: Add more verbosity levels.

[contrib] validate_failures.py: Simplify GetManifestPath()

... and don't require a valid build directory when no data from it
is necessary.

contrib/ChangeLog:

* testsuite-management/validate_failures.py: Simplify GetManifestPath().

[contrib] validate_failures.py: Read in manifest when comparing build dirs

This allows comparison of two build directories with a manifest
listing known flaky tests on the side.

contrib/ChangeLog:

* testsuite-management/validate_failures.py (GetResults): Update.
(CompareBuilds): Read in manifest.

[contrib] validate_failures.py: Support expiry attributes in manifests

contrib/ChangeLog:

* testsuite-management/validate_failures.py (ParseManifestWorker):
Support expiry attributes in manifests.
(ParseSummary): Add a comment.

[contrib] validate_failures.py: Avoid testsuite aliasing

This patch adds tracking of current testsuite "tool" and "exp"
to the processing of .sum files. This avoids aliasing between
tests from different testsuites with same name+description.

E.g., this is necessary for testsuite/c-c++-common, which is ran
for both gcc and g++ "tools".

This patch changes manifest format from ...
<cut>
FAIL: gcc_test
FAIL: g++_test
</cut>
... to ...
<cut>
=== gcc tests ===
Running gcc/foo.exp ...
FAIL: gcc_test
=== gcc Summary ==
=== g++ tests ===
Running g++/bar.exp ...
FAIL: g++_test
=== g++ Summary ==
</cut>.

The new format uses same formatting as DejaGnu's .sum files
to specify which "tool" and "exp" the test belongs to.

contrib/ChangeLog:

* testsuite-management/validate_failures.py: Avoid testsuite
aliasing.

c++: tweak c++17 ctor/conversion tiebreaker [DR2327]

In discussion of this issue CWG decided that the change of behavior on
well-formed code like overload-conv-4.C is undesirable. In further
discussion of possible resolutions, we discovered that we can avoid that
change while still getting the desired behavior on overload-conv-3.C by
making this a tiebreaker after comparing conversions, rather than before.
This also simplifies the implementation.

The issue resolution has not yet been finalized, but this seems like a clear
improvement.

DR 2327
PR c++/86521

gcc/cp/ChangeLog:

* call.cc (joust_maybe_elide_copy): Don't change cand.
(joust): Move the elided tiebreaker later.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/overload-conv-4.C: Remove warnings.
* g++.dg/cpp1z/elide7.C: New test.

libstdc++: Clarify manual demangle doc

libstdc++-v3/ChangeLog:

* doc/xml/manual/extensions.xml: Remove demangle exception
description and include.
* doc/html/manual/ext_demangling.html: Regenerate.

Align a 'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others

On 2023-06-14T11:42:22+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 14.06.23 10:09, Thomas Schwinge wrote:
>> Let me know if I should also adjust the new 'target { ! offload_device }'
>> diagnostic "[...] MANDATORY but only the host device is available" to
>> include a comma before 'but', for consistency with the other existing
>> diagnostics (cited above)?
>
> I think it makes sense to be consistent. Thus: Yes, please add the commas.

Fix-up for recent commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".

libgomp/
* target.c (resolve_device): Align a
'OMP_TARGET_OFFLOAD=mandatory' diagnostic with others.
* testsuite/libgomp.c/target-51.c: Adjust.

driver: Forward '-lgfortran', '-lm' to offloading compilation

..., so that users don't manually need to specify
'-foffload-options=-lgfortran', '-foffload-options=-lm' in addition to
'-lgfortran', '-lm' (specified manually, or implicitly by the driver).

gcc/
* gcc.cc (driver_handle_option): Forward host '-lgfortran', '-lm'
to offloading compilation.
* config/gcn/mkoffload.cc (main): Adjust.
* config/nvptx/mkoffload.cc (main): Likewise.
* doc/invoke.texi (foffload-options): Update example.
libgomp/
* testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Don't
set.
* testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags):
Likewise.
* testsuite/libgomp.c/simd-math-1.c: Remove
'-foffload-options=-lm'.
* testsuite/libgomp.fortran/fortran-torture_execute_math.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90:
Likewise.

Add 'libgomp.{,oacc-}fortran/fortran-torture_execute_math.f90'

..., via 'include'ing the existing 'gfortran.fortran-torture/execute/math.f90',
which therefore is enhanced for optional OpenACC 'serial', OpenMP 'target'
usage.

gcc/testsuite/
* gfortran.fortran-torture/execute/math.f90: Enhance for optional
OpenACC 'serial', OpenMP 'target' usage.
libgomp/
* testsuite/libgomp.fortran/fortran-torture_execute_math.f90: New.
* testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90:
Likewise.

Tighten 'dg-warning' alternatives in 'c-c++-common/Wfree-nonheap-object{,-2,-3}.c'

..., added in commit fe7f75cf16783589eedbab597e6d0b8d35d7e470
"Correct/improve maybe_emit_free_warning (PR middle-end/98166, PR c++/57111, PR middle-end/98160)".

These use alternatives like, for example, "AB|CDE|FG", but what really must've
been meant is "A(B|C)D(E|F)G". The former variant also does "work": it matches
any of "AB", or "CDE", or "FG", which are components of the latter variant.
(That means, the former variant matches too loosely.)

gcc/testsuite/
* c-c++-common/Wfree-nonheap-object-2.c: Tighten 'dg-warning'
alternatives.
* c-c++-common/Wfree-nonheap-object-3.c: Likewise.
* c-c++-common/Wfree-nonheap-object.c: Likewise.

Remove 'gcc/testsuite/g++.dg/warn/Wfree-nonheap-object.s'

..., which, presumably, was added by mistake in
commit dce6c58db87ebf7f4477bd3126228e73e4eeee97
"Add support for detecting mismatched allocation/deallocation calls".

gcc/testsuite/
* g++.dg/warn/Wfree-nonheap-object.s: Remove.

Fix typo in 'libgomp.c/target-51.c'

..., and therefore, given 'target offload_device':

    PASS: libgomp.c/target-51.c (test for excess errors)
    PASS: libgomp.c/target-51.c execution test
    [-FAIL:-]{+PASS:+} libgomp.c/target-51.c output pattern test

Fix-up for recent commit 18c8b56c7d67a9e37acf28822587786f0fc0efbc
"OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory".

libgomp/
* testsuite/libgomp.c/target-51.c: Fix typo.

Use x instead of v for alternative 2 (v, BH) in mov<mode>_internal.

Since there's no evex version for vpcmpeq ymm, ymm, ymm.

gcc/ChangeLog:

PR target/110227
* config/i386/sse.md (mov<mode>_internal>): Use x instead of v
for alternative 2 since there's no evex version for vpcmpeqd
ymm, ymm, ymm.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110227.c: New test.

OpenMP: Set default-device-var with OMP_TARGET_OFFLOAD=mandatory

OMP_TARGET_OFFLOAD=mandatory handling was before inconsistent. Hence, in
OpenMP 5.2 it was clarified/extended by having implications on the
default-device-var; additionally, omp_initial_device and omp_invalid_device
enum values/PARAMETERs were added; support for it was added
in r13-1066-g1158fe43407568 including aborting for omp_invalid_device and
non-conforming device numbers. Only the mandatory handling was missing.

Namely, while the default-device-var is usually initialized to value 0,
with 'mandatory' it must have the value 'omp_invalid_device' if and only if
zero non-host devices are available. (The OMP_DEFAULT_DEVICE env var
overrides this as it comes semantically after the initialization.)

To achieve this, default-device-var is now initialized to MIN_INT. If
there is no 'mandatory', it is set to 0 directly after env var parsing.
Otherwise, it is updated in gomp_target_init to either 0 or
omp_invalid_device. To ensure INT_MIN is never seen by the user, both
the omp_get_default_device API routine and omp_display_env (user call
and OMP_DISPLAY_ENV env var) call gomp_init_targets_once() in that case.

libgomp/ChangeLog:

* env.c (gomp_default_icv_values): Init default_device_var to
an nonconforming value - INT_MIN.
(initialize_env): After env-var parsing, set default_device_var to
device 0 unless OMP_TARGET_OFFLOAD=mandatory.
(omp_display_env): If default_device_var is INT_MIN, call
gomp_init_targets_once.
* icv-device.c (omp_get_default_device): Likewise.
* libgomp.texi (OMP_DEFAULT_DEVICE): Update init description.
(OpenMP 5.2 Impl. Status): Mark OMP_TARGET_OFFLOAD=mandatory as 'Y'.
* target.c (resolve_device): Improve error message device-num < 0
with 'mandatory' and no no-host devices available.
(gomp_target_init): Set default-device-var if INT_MIN.
* testsuite/libgomp.c/target-48.c: New test.
* testsuite/libgomp.c/target-49.c: New test.
* testsuite/libgomp.c/target-50.c: New test.
* testsuite/libgomp.c/target-50a.c: New test.
* testsuite/libgomp.c/target-51.c: New test.
* testsuite/libgomp.c/target-52.c: New test.
* testsuite/libgomp.c/target-53.c: New test.
* testsuite/libgomp.c/target-54.c: New test.

Daily bump.

modula2 Fixes to the error format specifications

This patch contains a python3 script to check the meta format error
specifications. It also includes about 20 fixes to M2Quads.mod format
specifications.

gcc/m2/ChangeLog:

* Make-lang.in (check-format-error): New rule.
* gm2-compiler/M2MetaError.mod (op): Add calls InternalError if
digits are detected.
* gm2-compiler/M2Quads.mod (BuildForToByDo): Bugfix to format
specifier.
(BuildLengthFunction): Bugfix to format specifiers.
(BuildOddFunction): Bugfix to format specifiers.
(BuildAbsFunction): Bugfix to format specifiers.
(BuildCapFunction): Bugfix to format specifiers.
(BuildChrFunction): Bugfix to format specifiers.
(BuildOrdFunction): Bugfix to format specifiers.
(BuildMakeAdrFunction): Bugfix to format specifiers.
(BuildSizeFunction): Bugfix to format specifiers.
(BuildBitSizeFunction): Bugfix to format specifiers.
* tools-src/checkmeta.py: New file.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

c/c++: use positive tone in missing header notes [PR84890]

Quoting "How a computer should talk to people" (as quoted
in "Concepts Error Messages for Humans"):

"Various negative tones or actions are unfriendly: being manipulative,
not giving a second chance, talking down, using fashionable slang,
blaming. We must not seem to blame the person. We should avoid suggesting
that the person is inadequate. Phrases like "you forgot" may seem
harmless, but what if a computer said this to you four or five times in
two minutes? Anyway, the person may disagree, so why risk offense?"

gcc/c-family/ChangeLog:
PR c/84890
* known-headers.cc
(suggest_missing_header::~suggest_missing_header): Reword note to
avoid negative tone of "forgetting".

gcc/cp/ChangeLog:
PR c/84890
* name-lookup.cc (missing_std_header::~missing_std_header): Reword
note to avoid negative tone of "forgetting".

gcc/testsuite/ChangeLog:
PR c/84890
* g++.dg/cpp2a/srcloc3.C: Update expected message.
* g++.dg/lookup/missing-std-include-2.C: Likewise.
* g++.dg/lookup/missing-std-include-3.C: Likewise.
* g++.dg/lookup/missing-std-include-6.C: Likewise.
* g++.dg/lookup/missing-std-include.C: Likewise.
* g++.dg/spellcheck-inttypes.C: Likewise.
* g++.dg/spellcheck-stdint.C: Likewise.
* g++.dg/spellcheck-stdlib.C: Likewise.
* gcc.dg/spellcheck-inttypes.c: Likewise.
* gcc.dg/spellcheck-stdbool.c: Likewise.
* gcc.dg/spellcheck-stdint.c: Likewise.
* gcc.dg/spellcheck-stdlib.c: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

c++: Fix templated convertion operator demangling

Instantiations of templated conversion operators failed to demangle
for cases such as 'operator X<int>', but worked for 'operator X<int>
&', due to thinking the template instantiation of X was the
instantiation of the conversion operator itself.

libiberty/
* cp-demangle.c (d_print_conversion): Remove incorrect
template instantiation handling.
* testsuite/demangle-expected: Add testcases.

Fortran: add DATA statement testcase

gcc/testsuite/
* gfortran.dg/data_array_7.f90: New test.

Fortran: fix passing of zero-sized array arguments to procedures [PR86277]

gcc/fortran/ChangeLog:

PR fortran/86277
* trans-array.cc (gfc_trans_allocate_array_storage): When passing a
zero-sized array with fixed (= non-dynamic) size, allocate temporary
by the caller, not by the callee.

gcc/testsuite/ChangeLog:

PR fortran/86277
* gfortran.dg/zero_sized_14.f90: New test.
* gfortran.dg/zero_sized_15.f90: New test.

Co-authored-by: Mikael Morin <mikael@gcc.gnu.org>

Remove a couple mudflap remnants

I happened to be digging into the specs to understand a build
failure and spotted mflib and mfwrap. Those were used by the
mudflap system which we ripped out years ago and we just missed
these.

I verified x86 still bootstraps after removing these bits.

Pushed to the trunk as obvious,
gcc/
* gcc.cc (LINK_COMMAND_SPEC): Remove mudflap spec handling.

Remove sh5media divtab code

Spurred by Akari Takahashi's patch to config/sh/divtab.cc, this removes
divtab.cc completely.

divtab.cc was used to calculate a division table for the sh5 media
processor. GCC dropped support for that (unmanufactured) chip back
in 2016 and this file simply got missed AFAICT.

gcc/
* config/sh/divtab.cc: Remove.

i386: Fix up whitespace in assembly

I've noticed that standard_sse_constant_opcode emits some spurious
whitespace around tab, that isn't something which is done for
any other instruction and looks wrong.

2023-06-13 Jakub Jelinek <jakub@redhat.com>

* config/i386/i386.cc (standard_sse_constant_opcode): Remove
superfluous spaces around \t for vpcmpeqd.

Avoid duplicate vector initializations during RTL expansion.

This middle-end patch avoids some redundant RTL for vector initialization
during RTL expansion.  For the simple test case:

typedef __int128 v1ti __attribute__ ((__vector_size__ (16)));
__int128 key;

v1ti foo() {
    return (v1ti){key};
}

the middle-end currently expands:

(set (reg:V1TI 85) (const_vector:V1TI [ (const_int 0) ]))

(set (reg:V1TI 85) (mem/c:V1TI (symbol_ref:DI ("key"))))

where we create a dead instruction that initializes the vector to zero,
immediately followed by a set of the entire vector.  This patch skips
this zeroing instruction when the vector has only a single element.
It also updates the code to indicate when we've cleared the vector,
so that we don't need to initialize zero elements.

2023-06-13  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* expr.cc (store_constructor) <case VECTOR_TYPE>: Don't bother
clearing vectors with only a single element.  Set CLEARED if the
vector was initialized to zero.

RISC-V: Remove duplicate `#include "riscv-vector-switch.def"`

Hi,

This patch remove the duplicate `#include "riscv-vector-switch.def"` statement
and add #undef for ENTRY and TUPLE_ENTRY macros later.

Best,
Lehua

gcc/ChangeLog:

* config/riscv/riscv-v.cc (struct mode_vtype_group): Remove duplicate
#include.
(ENTRY): Undef.
(TUPLE_ENTRY): Undef.

RISC-V: Add comments of some functions

gcc/ChangeLog:

* config/riscv/riscv-v.cc (rvv_builder::single_step_npatterns_p): Add comment.
(shuffle_generic_patterns): Ditto.
(expand_vec_perm_const_1): Ditto.

RISC-V: Add more SLP tests

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-15.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-15.c: New test.

RISC-V: Fix bug of VLA SLP auto-vectorization

Sorry for producing bugs in the previous VLA SLP patch.

Consider this following permutation:
_85 = VEC_PERM_EXPR <{ 99, 17, ... }, { 11, 80, ... }, { 0, POLY_INT_CST [4, 4], 1, POLY_INT_CST [5, 4], 2, POLY_INT_CST [6, 4], ... }>;

The correct result should be:
_85 = { 99, 11, 17, 80, ... }

However, I did wrong in the previous patch.

Code sequence before this patch:

set mask = { 0, 1, 0, 1, ... }
set v0 = { 99, 17, 99, 17, ... }
set v1 = { 11, 80, 11, 80, ... }
set index = viota (mask) = { 0, 0, 1, 1, 2, 2, ... }
set result = vrgather_mu (v0, v1, index, mask) = { 99, 11, 99, 80 }
The result is incorrect.

After this patch:

set mask = { 0, 1, 0, 1, ... }
set index = viota (mask) = { 0, 0, 1, 1, 2, 2, ... }
set v0 = vrgather ({ 99, 17, 99, 17, ... }, index) = { 99, 99, 17, 17, ... }
set v1 = { 11, 80, 11, 80, ... }
set result = vrgather_mu (v0, v1, index, mask) = { 99, 11, 17, 80 }
The result is what we expected.

This issue was discovered in the test I appended in this patch with --param=riscv-autovec-lmul=2.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vlmax_decompress_insn): Fix bug.
(shuffle_decompress_patterns): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-12.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-12.c: New test.

Fix memory leak in loop header copying

* tree-ssa-loop-ch.cc (ch_base::copy_headers): Free loop BBs.

c++: mutable temps in rodata

If the type of a temporary has mutable members, we can't set TREE_READONLY
on the VAR_DECL; this is parallel to the check in
cp_apply_type_quals_to_decl.

gcc/cp/ChangeLog:

* tree.cc (build_target_expr): Check TYPE_HAS_MUTABLE_P.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/initlist-opt6.C: New test.

RISC-V: Add vector psabi checking.

This patch adds support to check function's argument or return is vector type
and throw warning if yes.

There're two exceptions,
- The vector_size attribute.
- The intrinsic functions.

Some cases that need to add -Wno-psabi to ignore the warning.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_init_cumulative_args): Set
warning flag if func is not builtin
* config/riscv/riscv.cc
(riscv_scalable_vector_type_p): Determine whether the type is scalable vector.
(riscv_arg_has_vector): Determine whether the arg is vector type.
(riscv_pass_in_vector_p): Check the vector type param is passed by value.
(riscv_init_cumulative_args): The same as header.
(riscv_get_arg_info): Add the checking.
(riscv_function_value): Check the func return and set warning flag
* config/riscv/riscv.h (INIT_CUMULATIVE_ARGS): Add a flag to
determine whether warning psabi or not.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr109244.C: Add the -Wno-psabi.
* g++.target/riscv/rvv/base/pr109535.C: Same
* gcc.target/riscv/rvv/base/binop_vx_constraint-120.c: Same
* gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/mask_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Same
* gcc.target/riscv/rvv/base/pr110109-2.c: Same
* gcc.target/riscv/rvv/base/scalar_move-9.c: Same
* gcc.target/riscv/rvv/base/spill-10.c: Same
* gcc.target/riscv/rvv/base/spill-11.c: Same
* gcc.target/riscv/rvv/base/spill-9.c: Same
* gcc.target/riscv/rvv/base/vlmul_ext-1.c: Same
* gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: Same
* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Same
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Same
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Same
* gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Same
* gcc.target/riscv/vector-abi-1.c: New test.
* gcc.target/riscv/vector-abi-2.c: New test.
* gcc.target/riscv/vector-abi-3.c: New test.
* gcc.target/riscv/vector-abi-4.c: New test.
* gcc.target/riscv/vector-abi-5.c: New test.
* gcc.target/riscv/vector-abi-6.c: New test.

Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>

libgomp/testsuite: Add requires-unified-addr-1.{c,f90} [PR109837]

Add a testcase for 'omp requires unified_address' that is currently supported
by all devices but was not tested for.

libgomp/

PR libgomp/109837
* testsuite/libgomp.c-c++-common/requires-unified-addr-1.c: New test.
* testsuite/libgomp.fortran/requires-unified-addr-1.f90: New test.

arm: Extend -mtp= arguments

After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
There are actually 3 system registers that can be accessed for the thread pointer
in aarch32: tpidrurw, tpidruro, tpidrprw. They are all read through the CP15 co-processor
mechanism. The current -mtp=cp15 option reads the tpidruro register.
This patch extends -mtp to allow for the above three explicit tpidr names and
keeps -mtp=cp15 as an alias of -mtp=tpidruro for backwards compatibility.

Bootstrapped and tested on arm-none-linux-gnueabihf.

gcc/ChangeLog:

* config/arm/arm-opts.h (enum arm_tp_type): Remove TP_CP15.
Add TP_TPIDRURW, TP_TPIDRURO, TP_TPIDRPRW values.
* config/arm/arm-protos.h (arm_output_load_tpidr): Declare prototype.
* config/arm/arm.cc (arm_option_reconfigure_globals): Replace TP_CP15
with TP_TPIDRURO.
(arm_output_load_tpidr): Define.
* config/arm/arm.h (TARGET_HARD_TP): Define in terms of TARGET_SOFT_TP.
* config/arm/arm.md (load_tp_hard): Call arm_output_load_tpidr to output
assembly.
(reload_tp_hard): Likewise.
* config/arm/arm.opt (tpidrurw, tpidruro, tpidrprw): New values for
arm_tp_type.
* doc/invoke.texi (Arm Options, mtp): Document new values.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mtp.c: New test.
* gcc.target/arm/mtp_1.c: New test.
* gcc.target/arm/mtp_2.c: New test.
* gcc.target/arm/mtp_3.c: New test.
* gcc.target/arm/mtp_4.c: New test.

aarch64: Extend -mtp= arguments

After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
First of all, there is another TPIDR register that can be used to read the thread pointer:
TPIDRRO_EL0 (which can also be accessed by AArch32 under another name) so it makes sense
to add -mtp=tpidrr0_el0. This makes the existing arguments el0, el1, el2, el3 somewhat
inconsistent in their naming so this patch introduces the more "full" names
tpidr_el0, tpidr_el1, tpidr_el2, tpidr_el3 and makes the above short names alias of these new ones.
Long story short, we preserve backwards compatibility and add a new TPIDR register to access through
-mtp that wasn't available previously.
There is more relevant discussion of the options at https://reviews.llvm.org/D152433 if you're interested.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

PR target/108779
* config/aarch64/aarch64-opts.h (enum aarch64_tp_reg): Add
AARCH64_TPIDRRO_EL0 value.
* config/aarch64/aarch64.cc (aarch64_output_load_tp): Define.
* config/aarch64/aarch64.opt (tpidr_el0, tpidr_el1, tpidr_el2,
tpidr_el3, tpidrro_el3): New accepted values to -mtp=.
* doc/invoke.texi (AArch64 Options): Document new -mtp= options.

gcc/testsuite/ChangeLog:

PR target/108779
* gcc.target/aarch64/mtp_5.c: New test.
* gcc.target/aarch64/mtp_6.c: New test.
* gcc.target/aarch64/mtp_7.c: New test.
* gcc.target/aarch64/mtp_8.c: New test.
* gcc.target/aarch64/mtp_9.c: New test.

fix frange_nextafter odr violation

C++ requires inline functions to be declared inline and defined in
every translation unit that uses them.  frange_nextafter is used in
gimple-range-op.cc but it's only defined as inline in
range-op-float.cc.  Drop the extraneous inline specifier.

Other non-static inline functions in range-op-float.cc are not
referenced elsewhere, so I'm making them static.

for  gcc/ChangeLog

* range-op-float.cc (frange_nextafter): Drop inline.
(frelop_early_resolve): Add static.
(frange_float): Likewise.

middle-end/110232 - fix native interpret of vector <signed-boolean:1>

The following fixes native interpretation of a buffer as boolean
vector with bit-precision elements such as AVX512 vectors. The
check whether the buffer covers the whole vector was broken for
bit-precision elements and the following instead implements it
based on the vector type size.

PR middle-end/110232
* fold-const.cc (native_interpret_vector): Use TYPE_SIZE_UNIT
to check whether the buffer covers the whole vector.

* gcc.target/i386/pr110232.c: New testcase.

Fix disambiguation against .MASK_LOAD

Alias analysis was treating .MASK_LOAD as storing a full vector
which means we disambiguate against decls of smaller than vector size.
This complements the previous patch handling .MASK_STORE and fixes
runtime execution FAILs of gfortran.dg/matmul_3.f90 and
gfortran.dg/inline_sum_2.f90 when using AVX512 with full masked loop
vectorization on Zen4.

* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): For
.MASK_LOAD and friends set the size of the access to unknown.