Andrew Pinski [Mon, 29 Apr 2024 03:21:02 +0000 (20:21 -0700)]
PHIOPT: Value-replacement check undef
While moving value replacement part of PHIOPT over
to use match-and-simplify, I ran into the case where
we would have an undef use that was conditional become
unconditional. This prevents that. I can't remember at this
point what the testcase was though.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* tree-ssa-phiopt.cc (value_replacement): Reject undef variables
so they don't become unconditionally used.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Sun, 28 Apr 2024 01:54:45 +0000 (18:54 -0700)]
PHI-OPT: speed up value_replacement slightly
This adds a few early outs to value_replacement that I noticed
while rewriting this to use match-and-simplify but could be committed
separately.
* virtual operands won't change so return early for them
* special case `A ? B : B` as that is already just `B`
Also moves the check for NE/EQ earlier as calculating empty_or_with_defined_p
is an IR walk for a BB and that might be big.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* tree-ssa-phiopt.cc (value_replacement): Move check for
NE/EQ earlier.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Sun, 28 Apr 2024 01:54:44 +0000 (18:54 -0700)]
MATCH: change single_non_singleton_phi_for_edges for singleton phis
I noticed that single_non_singleton_phi_for_edges could
return a phi whose entries are all the same for the edges.
This happens only if there was a single phi in the first place.
Also, gimple_seq_singleton_p walks the sequence to see if it has
exactly one element, so removing that check actually
reduces the number of pointer walks needed.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
* tree-ssa-phiopt.cc (single_non_singleton_phi_for_edges):
Remove the special case of gimple_seq_singleton_p.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Wed, 17 Apr 2024 21:30:06 +0000 (14:30 -0700)]
Remove support for nontemporal stores with ssa_names on lhs [PR112976]
When cfgexpand was changed to support expanding from tuple gimple
(r0-95521-g28ed065ef9f345), the code was added to support
doing nontemporal stores with LHS of a SSA_NAME but that will
never be a nontemporal store.
This patch removes that and asserts that expanding with a LHS
of a SSA_NAME is not a nontemporal store.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
PR middle-end/112976
* cfgexpand.cc (expand_gimple_stmt_1): Remove
support for expanding nontemporal "moves" with
ssa names on the LHS.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Wed, 17 Apr 2024 21:12:17 +0000 (14:12 -0700)]
Add verification of gimple_assign_nontemporal_move_p [PR112976]
Currently the middle-end only knows how to support nontemporal stores
(the undocumented storent optab) so let's verify that the only time
we set nontemporal_move on an assign is if the lhs is not a
gimple reg.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR middle-end/112976
* tree-cfg.cc (verify_gimple_assign): Verify that
nontemporal moves are stores.
* gimple.h (struct gimple): Note that only
nontemporal stores are supported.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
This patch is primarily meant to improve the code we generate for FP rounding
such as ceil/floor. It also addresses some unnecessary sign extensions in the
same areas.
RISC-V's FP conversions have a bit of undesirable behavior that makes them
unsuitable as-is for ceil/floor and other related functions. These
deficiencies are addressed in the Zfa extension, but there's no reason not to
pick up a nice improvement when we can.
Basically we can still use the basic FP conversions for floor/ceil and friends
when we don't care about inexact exceptions by checking for the special cases
first, then emitting the conversion when the special cases don't apply. That's
still much faster than calling into glibc.
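The "check the special cases first, then convert" idea can be sketched in portable C++. This is only an illustration, not the RTL the patch emits: floor_via_conversion is a hypothetical name, and the truncating cast plus fix-up stands in for a hardware conversion with a chosen rounding mode. It assumes IEEE double and ignores the inexact exception, as the text above allows.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not GCC code: floor() built from an FP -> integer
// conversion, guarded by the cases the conversion cannot handle.
double floor_via_conversion(double x)
{
  // 2^52: from this magnitude on, every double is already integral.
  // NaN and infinities also fail the comparison and are returned as-is.
  const double limit = 4503599627370496.0;
  if (!(std::fabs(x) < limit))
    return x;

  // The conversion itself; it truncates toward zero and cannot overflow here.
  long long i = static_cast<long long>(x);
  double r = static_cast<double>(i);

  // Truncation rounds negative non-integers up, so step down by one.
  if (r > x)
    r -= 1.0;

  // Restore the sign so that -0.0 stays -0.0.
  return std::copysign(r, x);
}
```

For example, floor_via_conversion(-2.7) takes the conversion path, while floor_via_conversion(1e300) returns its argument through the early exit.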
The redundant sign extensions are eliminated using the same trick Jivan added
last year, just in a few more places ;-)
This eliminates roughly 10% of the dynamic instruction count for imagick. But
more importantly it's about a 17% performance improvement for that workload
within spec.
This has been bootstrapped as well as regression tested in a cross environment.
It's also successfully built & run specint/specfp correctly.
Pushing to the trunk and the coordination branch momentarily.
gcc/
* config/riscv/iterators.md (fix_ops, fix_uns): New iterators.
(RINT, rint_pattern, rint_rm): Remove unused iterators.
* config/riscv/riscv-protos.h (get_fp_rounding_coefficient): Prototype.
* config/riscv/riscv-v.cc (get_fp_rounding_coefficient): Give
external linkage.
* config/riscv/riscv.md (UNSPEC_LROUND): Remove.
(fix_trunc<ANYF:mode><GPR:mode>2): Replace with ...
(<fix_uns>_trunc<ANYF:mode>si2): New expander & associated insn.
(<fix_uns>_trunc<ANYF:mode>si2_ext): New insn.
(<fix_uns>_trunc<ANYF:mode>di2): Likewise.
(l<rint_pattern><ANYF:mode><GPR:mode>2): Replace with ...
(lrint<ANYF:mode>si2): New expander and associated insn.
(lrint<ANYF:mode>si2_ext, lrint<ANYF:mode>di2): New insns.
(<round_pattern><ANYF:mode>2): Replace with ...
(l<round_pattern><ANYF:mode>si2): New expander and associated insn.
(l<round_pattern><ANYF:mode>si2_sext): New insn.
(l<round_pattern><ANYF:mode>di2): Likewise.
(<round_pattern><ANYF:mode>2): New expander.
gcc/testsuite/
* gcc.target/riscv/fix.c: New test.
* gcc.target/riscv/round.c: New test.
* gcc.target/riscv/round_32.c: New test.
* gcc.target/riscv/round_64.c: New test.
In my previous change I mistakenly changed Value_Range to
int_range<2>. The former has "infinite" precision for integer ranges,
whereas int_range<2> has two sub-ranges. This should have been
int_range_max.
A large number of gm2 tests are timing out even on current Solaris/SPARC
systems. As detailed in the PR, the problem is that the gm2 testsuite
artificially lowers many timeouts way below the DejaGnu default of 300
seconds, often as short as 10 seconds. The problem lies both in the
values (they may be appropriate for some targets, but too low for
others, especially under high load) and the fact that it uses absolute
values, overriding e.g. settings from a build-wide site.exp.
Therefore this patch removes all those overrides, restoring the
defaults.
Tested on sparc-sun-solaris2.11 (where all the previous timeouts are
gone) and i386-pc-solaris2.11.
Richard Biener [Wed, 17 Apr 2024 09:22:00 +0000 (11:22 +0200)]
Support {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR and EXACT_DIV_EXPR in GIMPLE FE
The following adds support for the various division and modulo operators
to the GIMPLE frontend via __{CEIL,FLOOR,ROUND}_{DIV,MOD} and
__EXACT_DIV operators.
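For readers unfamiliar with these tree codes, the rounding directions of the FLOOR and CEIL variants can be sketched in plain C++, whose built-in division truncates toward zero. The helper names are hypothetical and this is not GIMPLE FE syntax; the ROUND variants are left out, and EXACT_DIV_EXPR is simply a division whose remainder is known to be zero.

```cpp
#include <cassert>

// Hypothetical helpers showing the rounding direction of
// FLOOR_DIV_EXPR / CEIL_DIV_EXPR relative to truncating division.
long floor_div(long a, long b)
{
  assert(b != 0);
  long q = a / b, r = a % b;
  // Inexact and the true quotient is negative: truncation rounded up.
  if (r != 0 && ((r < 0) != (b < 0)))
    --q;
  return q;
}

long ceil_div(long a, long b)
{
  assert(b != 0);
  long q = a / b, r = a % b;
  // Inexact and the true quotient is positive: truncation rounded down.
  if (r != 0 && ((r < 0) == (b < 0)))
    ++q;
  return q;
}

// The matching remainders follow from a == b * q + remainder.
long floor_mod(long a, long b) { return a - b * floor_div(a, b); }
long ceil_mod(long a, long b) { return a - b * ceil_div(a, b); }
```

With a = -7 and b = 2, truncating division gives -3, the floor variant gives -4 and the ceil variant gives -3.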
gcc/c/
* gimple-parser.cc (c_parser_gimple_binary_expression):
Parse __{CEIL,FLOOR,ROUND}_{DIV,MOD} and __EXACT_DIV.
gcc/testsuite/
* gcc.dg/gimplefe-53.c: New testcase.
Richard Biener [Tue, 16 Apr 2024 12:05:35 +0000 (14:05 +0200)]
middle-end/13421 - -ftrapv vs. POINTER_DIFF_EXPR
Currently we expand POINTER_DIFF_EXPR using subv_optab with -ftrapv
(but -fsanitize=undefined does nothing). That's not consistent
with the behavior of POINTER_PLUS_EXPR which never uses addv_optab
with -ftrapv. Both are because of the way we select whether to use
the trapping or the non-trapping optab: we look at the result type
of the expression and check whether overflow traps for that type.
The bug report correctly complains that -ftrapv affects pointer
subtraction (there's no -ftrapv-pointer). Now that we have
POINTER_DIFF_EXPR we can honor that appropriately.
The patch moves both POINTER_DIFF_EXPR and POINTER_PLUS_EXPR
handling so they will never consider trapping (or saturating)
optabs.
PR middle-end/13421
* optabs-tree.cc (optab_for_tree_code): Do not consider
{add,sub}v or {us,ss}{add,sub} optabs for POINTER_DIFF_EXPR
or POINTER_PLUS_EXPR.
Jakub Jelinek [Tue, 30 Apr 2024 09:22:32 +0000 (11:22 +0200)]
gimple-ssa-sprintf: Use [0, 1] range for %lc with (wint_t) 0 argument [PR114876]
Seems when Martin S. implemented this, he coded a strict reading
of the standard, which said that %lc with (wint_t) 0 argument is handled
as wchar_t[2] temp = { arg, 0 }; %ls with temp arg and so shouldn't print
any values. But most of the libc implementations actually handled that
case like %c with '\0' argument, adding a single NUL character; the only
known exception is musl.
Recently, C23 changed this in response to GB-141 and POSIX in
https://austingroupbugs.net/view.php?id=1647
so that it should have the same behavior as %c with '\0'.
Because there is implementation divergence, the following patch uses
a range rather than hardcoding it to all 1s (i.e. the %c behavior),
though the likely case is still 1 (forward looking plus most of
implementations).
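The divergence is easy to observe from user code. The following is a small probe, not part of the patch; lc_zero_length is a hypothetical name. It reports how many bytes the C library says "%lc" produces for a zero argument, which per the discussion above is expected to be 1 on most implementations and 0 on musl.

```cpp
#include <cassert>
#include <cstdio>
#include <cwchar>

// Hypothetical probe: the byte count snprintf reports for "%lc" with a
// (wint_t) 0 argument. The [0, 1] range in the patch covers both outcomes.
int lc_zero_length()
{
  char buf[8];
  return std::snprintf(buf, sizeof buf, "%lc", static_cast<std::wint_t>(0));
}
```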
The res.knownrange = true; assignment removed is redundant due to
the same assignment done unconditionally before the if statement,
rest is formatting fixes.
I don't think the min >= 0 && min < 128 case is right either, I'd think
it should be min >= 0 && max < 128, otherwise it is just some possible
inputs are (maybe) ASCII and there can be others, but this code is a total
mess anyway, with the min, max, likely (somewhere in [min, max]?) and then
unlikely possibly larger than max, dunno, perhaps for at least some chars
in the ASCII range the likely case could be for the ascii case; so perhaps
just the one_2_one_ascii shouldn't set max to 1 and mayfail should be true
for max >= 128. Anyway, didn't feel I should touch that right now.
2024-04-30 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114876
* gimple-ssa-sprintf.cc (format_character): For min == 0 && max == 0,
set max, likely and unlikely members to 1 rather than 0. Remove
useless res.knownrange = true;. Formatting fixes.
* gcc.dg/pr114876.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust expected
diagnostics.
Christophe Lyon [Thu, 25 Jan 2024 15:43:56 +0000 (15:43 +0000)]
Fix pretty printers regexp for GDB output
GDB emits end of lines as \r\n, but we currently match any >0 number of
either \n or \r, possibly leading to mismatches under racy conditions.
I noticed this while running the GCC testsuite using the equivalent of
GDB's READ1 feature [1] which helps detect buffering issues.
We try to match
\n$1 = empty std::tuple\r
against {^(type|\$([0-9]+)) = ([^\n\r]*)[\n\r]+} which fails because
of the leading \n (which was left in the buffer after the previous
"skipping" pattern matched the preceding \r).
This patch accepts any number of leading \n and/or \r in the "got" clause.
Also take this opportunity to quote \r and \n in the logs, to make
debugging such issues easier.
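The failure mode can be reproduced outside of Tcl. The sketch below uses std::regex as a stand-in for the expect pattern; the function names are hypothetical, and std::regex_match anchors at both ends, whereas the testsuite pattern is only anchored at the start, which makes no difference for these inputs.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical stand-in for the pattern before the change: nothing may
// precede "type" or "$N".
bool matches_old(const std::string &buf)
{
  static const std::regex re("(type|\\$([0-9]+)) = ([^\n\r]*)[\n\r]+");
  return std::regex_match(buf, re);
}

// After the change: any number of leading \n and/or \r is accepted.
bool matches_new(const std::string &buf)
{
  static const std::regex re("[\n\r]*(type|\\$([0-9]+)) = ([^\n\r]*)[\n\r]+");
  return std::regex_match(buf, re);
}
```

Feeding both functions the buffer "\n$1 = empty std::tuple\r" shows the difference: only the second pattern tolerates the leftover \n.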
The assertion doesn't allow IFN_COND_MIN/IFN_COND_MAX, which are
commutative conditional binary operations like ADD/MUL/AND/IOR/XOR,
and can be handled just fine.
In particular, we emit
vminpd %zmm3, %zmm5, %zmm0{%k2}
vminpd %zmm0, %zmm3, %zmm5{%k1}
and
vmaxpd %zmm3, %zmm5, %zmm0{%k2}
vmaxpd %zmm0, %zmm3, %zmm5{%k1}
in the vectorized loops of the first and second subroutine.
2024-04-30 Jakub Jelinek <jakub@redhat.com>
Hongtao Liu <hongtao.liu@intel.com>
PR tree-optimization/114883
* tree-vect-loop.cc (vect_transform_reduction): Allow IFN_COND_MIN and
IFN_COND_MAX in the assert.
Jakub Jelinek [Tue, 30 Apr 2024 07:00:07 +0000 (09:00 +0200)]
libgcc: Do use weakrefs for glibc 2.34 on GNU Hurd
On Mon, Apr 29, 2024 at 01:44:24PM +0000, Joseph Myers wrote:
> > glibc 2.34 and later doesn't have separate libpthread (libpthread.so.0 is a
> > dummy shared library with just some symbol versions for compatibility, but
> > all the pthread_* APIs are in libc.so.6).
>
> I suspect this has caused link failures in the glibc testsuite for Hurd,
> which still has separate libpthread.
>
> https://sourceware.org/pipermail/libc-testresults/2024q2/012556.html
So like this then?
2024-04-30 Jakub Jelinek <jakub@redhat.com>
* gthr.h (GTHREAD_USE_WEAK): Don't redefine to 0 for glibc 2.34+
on GNU Hurd.
Jakub Jelinek [Tue, 30 Apr 2024 06:58:39 +0000 (08:58 +0200)]
libcpp: Adjust __STDC_VERSION__ for C23
While the C23 standard isn't officially released yet,
in 2011 we changed the __STDC_VERSION__ value for C11 already
in the month in which the new __STDC_VERSION__ value was
finalized, so do we want to change this now or wait
until we implement all the C23 features?
Note, seems Clang up to 17 also used 202000L for -std=c2x but
Clang 18+ uses 202311L as specified in the latest C23 drafts.
2024-04-30 Jakub Jelinek <jakub@redhat.com>
* init.cc (cpp_init_builtins): Change __STDC_VERSION__
for C23 from 202000L to 202311L.
* doc/cpp.texi (__STDC_VERSION__): Document 202311L value
for -std=c23/-std=gnu23.
Jakub Jelinek [Tue, 30 Apr 2024 06:57:15 +0000 (08:57 +0200)]
c++: Implement C++26 P0609R3 - Attributes for Structured Bindings [PR114456]
The following patch implements the P0609R3 paper; we build the
VAR_DECLs for the structured binding identifiers early, so all we need
IMHO is just to parse the attributed identifier list and pass the attributes
to the VAR_DECL creation.
The paper mentions maybe_unused and gnu::nonstring attributes as examples
where they can be useful. Not sure about either of them.
For maybe_unused, the thing is that both GCC and clang already don't
diagnose maybe unused for the structured binding identifiers, because it
would be a false positive too often; and there is no easy way to find out
if a structured binding has been written with the P0609R3 paper in mind or
not (maybe we could turn it on if in the structured binding is any
attribute, even if just [[]] and record that as a flag on the whole
underlying decl, so that we'd diagnose
auto [a, b, c[[]]] = d;
// use a, c but not b
but not
auto [e, f, g] = d;
// use e, g but not f
). For gnu::nonstring, the issue is that we currently don't allow the
attribute on references to char * or references to char[], just on
char */char[]. I've filed a PR for that.
The first testcase in the patch tests it on [[]] and [[maybe_unused]],
just whether it is parsed properly, the second on gnu::deprecated, which
works. I haven't used the standard deprecated attribute because the paper
said that attribute is for further investigation.
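From the user's side, the feature can be guarded with the feature-test macro whose value this commit bumps. A minimal sketch, with a hypothetical function name; on compilers that predate P0609R3 the guarded line is skipped and the plain declaration is used instead.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical example: attribute on a single structured binding, used
// only when the compiler advertises P0609R3 support.
int sum_first_and_last(const std::tuple<int, int, int> &d)
{
#if defined(__cpp_structured_bindings) && __cpp_structured_bindings >= 202403L
  auto [a, b [[maybe_unused]], c] = d;
#else
  auto [a, b, c] = d;
#endif
  return a + c;
}
```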
2024-04-30 Jakub Jelinek <jakub@redhat.com>
PR c++/114456
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_structured_bindings for C++26 to 202403L rather than
201606L.
gcc/cp/
* parser.cc (cp_parser_decomposition_declaration): Implement C++26
P0609R3 - Attributes for Structured Bindings. Parse attributed
identifier lists for structured binding declarations, pass the
attributes to start_decl.
gcc/testsuite/
* g++.dg/cpp26/decomp1.C: New test.
* g++.dg/cpp26/decomp2.C: New test.
* g++.dg/cpp26/feat-cxx26.C (__cpp_structured_bindings): Expect
202403 rather than 201606.
Richard Biener [Fri, 26 Apr 2024 13:47:13 +0000 (15:47 +0200)]
middle-end/114734 - wrong code with expand_call_mem_ref
When expand_call_mem_ref looks at the definition of the address
argument to eventually expand a &TARGET_MEM_REF argument together
with a masked load it fails to honor constraints imposed by SSA
coalescing decisions. The following fixes this.
PR middle-end/114734
* internal-fn.cc (expand_call_mem_ref): Use
get_gimple_for_ssa_name to get at the def stmt of the address
argument to honor SSA coalescing constraints.
c++: Fix instantiation of imported temploid friends [PR114275]
This patch fixes a number of issues with the handling of temploid friend
declarations.
The primary issue is that instantiations of friend declarations should
attach the declaration to the same module as the befriending class, by
[module.unit] p7.1 and [temp.friend] p2; this could be a different
module from the current TU, and so needs special handling.
The other main issue here is that we can't assume that just because name
lookup didn't find a definition for a hidden class template, that it
doesn't exist at all: it could be a non-exported entity that we've
nevertheless streamed in from an imported module. We need to ensure
that when instantiating template friend classes that we return the same
TEMPLATE_DECL that we got from our imports, otherwise we will get later
issues with 'duplicate_decls' (rightfully) complaining that they're
different when trying to merge.
This doesn't appear necessary for function templates due to the existing
name lookup handling already finding these hidden declarations.
PR c++/105320
PR c++/114275
gcc/cp/ChangeLog:
* cp-tree.h (propagate_defining_module): Declare.
(lookup_imported_hidden_friend): Declare.
* decl.cc (duplicate_decls): Also check if hidden decls can be
redeclared in this module.
* module.cc (imported_temploid_friends): New.
(init_modules): Initialize it.
(trees_out::decl_value): Write it; don't consider imported
temploid friends as attached to a module.
(trees_in::decl_value): Read it.
(get_originating_module_decl): Follow the owning decl for an
imported temploid friend.
(propagate_defining_module): New.
* name-lookup.cc (get_mergeable_namespace_binding): New.
(lookup_imported_hidden_friend): New.
* pt.cc (tsubst_friend_function): Propagate defining module for
new friend functions.
(tsubst_friend_class): Lookup imported hidden friends. Check
for valid module attachment of existing names. Propagate
defining module for new classes.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-friend-10_a.C: New test.
* g++.dg/modules/tpl-friend-10_b.C: New test.
* g++.dg/modules/tpl-friend-10_c.C: New test.
* g++.dg/modules/tpl-friend-10_d.C: New test.
* g++.dg/modules/tpl-friend-11_a.C: New test.
* g++.dg/modules/tpl-friend-11_b.C: New test.
* g++.dg/modules/tpl-friend-12_a.C: New test.
* g++.dg/modules/tpl-friend-12_b.C: New test.
* g++.dg/modules/tpl-friend-12_c.C: New test.
* g++.dg/modules/tpl-friend-12_d.C: New test.
* g++.dg/modules/tpl-friend-12_e.C: New test.
* g++.dg/modules/tpl-friend-12_f.C: New test.
* g++.dg/modules/tpl-friend-13_a.C: New test.
* g++.dg/modules/tpl-friend-13_b.C: New test.
* g++.dg/modules/tpl-friend-13_c.C: New test.
* g++.dg/modules/tpl-friend-13_d.C: New test.
* g++.dg/modules/tpl-friend-13_e.C: New test.
* g++.dg/modules/tpl-friend-13_f.C: New test.
* g++.dg/modules/tpl-friend-13_g.C: New test.
* g++.dg/modules/tpl-friend-14_a.C: New test.
* g++.dg/modules/tpl-friend-14_b.C: New test.
* g++.dg/modules/tpl-friend-14_c.C: New test.
* g++.dg/modules/tpl-friend-14_d.C: New test.
* g++.dg/modules/tpl-friend-9.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
Currently different places calling 'module_may_redeclare' all emit very
similar but slightly different error messages, and handle different
kinds of declarations differently. This patch makes the function
perform its own error messages so that they're all in one place, and
prepares it for use with temploid friends.
Patrick Palka [Tue, 30 Apr 2024 01:27:59 +0000 (21:27 -0400)]
c++/modules: imported spec befriending class tmpl [PR114889]
When adding to CLASSTYPE_BEFRIENDING_CLASSES as part of installing an
imported class definition, we need to look through TEMPLATE_DECL like
make_friend_class does.
Otherwise in the below testcase we won't add _Hashtable<int, int> to
CLASSTYPE_BEFRIENDING_CLASSES of _Map_base, which leads to a bogus
access check failure for _M_hash_code.
PR c++/114889
gcc/cp/ChangeLog:
* module.cc (trees_in::read_class_def): Look through
TEMPLATE_DECL when adding to CLASSTYPE_BEFRIENDING_CLASSES.
gcc/testsuite/ChangeLog:
* g++.dg/modules/friend-8_a.H: New test.
* g++.dg/modules/friend-8_b.C: New test.
gcc/fortran/ChangeLog:
* expr.cc (check_transformational): Add SELECTED_LOGICAL_KIND
to allowed functions for Fortran 2023.
* gfortran.h (GFC_ISYM_SL_KIND): New.
* gfortran.texi: Mention SELECTED_LOGICAL_KIND.
* intrinsic.cc (add_functions): Add SELECTED_LOGICAL_KIND.
(gfc_intrinsic_func_interface): Allow it in initialization
expressions.
* intrinsic.h (gfc_simplify_selected_logical_kind): New proto.
* intrinsic.texi: Add SELECTED_LOGICAL_KIND.
* simplify.cc (gfc_simplify_selected_logical_kind): New
function.
* trans-decl.cc (gfc_build_intrinsic_function_decls): Initialize
gfor_fndecl_sl_kind.
* trans-intrinsic.cc (gfc_conv_intrinsic_sl_kind): New function.
(gfc_conv_intrinsic_function): Call it for GFC_ISYM_SL_KIND.
* trans.h (gfor_fndecl_sl_kind): New symbol.
gcc/testsuite/ChangeLog:
* gfortran.dg/selected_logical_kind_1.f90: New test.
* gfortran.dg/selected_logical_kind_2.f90: New test.
* gfortran.dg/selected_logical_kind_3.f90: New test.
* gfortran.dg/selected_logical_kind_4.f90: New test.
This patch updates the Solaris baselines for the GLIBCXX_3.4.33 version
added in GCC 14.0.
Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (32 and 64-bit
each), together with the GLIBCXX_3.4.32 update, on both gcc-14 branch
and trunk.
This patch updates the Solaris baselines for the GLIBCXX_3.4.32 version
added in GCC 13.2.
Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (32 and 64-bit
each) on the gcc-13 branch and (together with the GLIBCXX_3.4.33 update)
on both gcc-14 branch and trunk.
Paul Thomas [Mon, 29 Apr 2024 10:52:11 +0000 (11:52 +0100)]
Fortran: Fix regression caused by r14-9752 [PR114959]
2024-04-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/114959
* trans-expr.cc (gfc_trans_class_init_assign): Return NULL_TREE
if the default initializer has all NULL fields. Guard this
by a requirement that the code not be EXEC_INIT_ASSIGN and that
the object be an INTENT_OUT dummy.
* trans-stmt.cc (gfc_trans_allocate): Change the initializer
code for allocate with mold to EXEC_ALLOCATE to allow an
initializer with all NULL fields.
gcc/testsuite/
PR fortran/114959
* gfortran.dg/pr114959.f90: New test.
internal compiler error: in extract_insn, at recog.cc:2812.
This patch would like to take care of the above rtl. Given the value of
const_poly_int can hardly exceed the max of int64, we can simply
consider the highest 8 bytes of TImode to be zero and then set the dest
to (const_int 0).
The below test cases are fixed by this PATCH.
C:
FAIL: gcc.dg/graphite/pr111878.c (internal compiler error: in
extract_insn, at recog.cc:2812)
FAIL: gcc.dg/graphite/pr111878.c (test for excess errors)
Fortran:
FAIL: gfortran.dg/graphite/vect-pr40979.f90 -O (internal compiler
error: in extract_insn, at recog.cc:2812)
FAIL: gfortran.dg/graphite/pr29832.f90 -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions (internal
compiler error: in extract_insn, at recog.cc:2812)
FAIL: gfortran.dg/graphite/pr29581.f90 -O3 -g (test for excess
errors)
FAIL: gfortran.dg/graphite/pr14741.f90 -O (test for excess errors)
FAIL: gfortran.dg/graphite/pr29581.f90 -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions (test for
excess errors)
FAIL: gfortran.dg/graphite/vect-pr40979.f90 -O (test for excess
errors)
FAIL: gfortran.dg/graphite/id-27.f90 -O (internal compiler error: in
extract_insn, at recog.cc:2812)
FAIL: gfortran.dg/graphite/pr29832.f90 -O3 -g (internal compiler
error: in extract_insn, at recog.cc:2812)
FAIL: gfortran.dg/graphite/pr29832.f90 -O3 -g (test for excess
errors)
FAIL: gfortran.dg/graphite/id-27.f90 -O (test for excess errors)
FAIL: gfortran.dg/graphite/pr29832.f90 -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions (test for
excess errors)
FAIL: gfortran.dg/graphite/pr29581.f90 -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions (internal
compiler error: in extract_insn, at recog.cc:2812)
FAIL: gfortran.dg/graphite/pr14741.f90 -O (internal compiler error:
in extract_insn, at recog.cc:2812)
FAIL: gfortran.dg/graphite/pr29581.f90 -O3 -g (internal compiler
error: in extract_insn, at recog.cc:2812)
The below test suites are passed for this patch:
* The rv64gcv full regression test.
* The rv64gc full regression test.
I tried to write some RTL code for a test, but it did not work well with
the existing test cases. Thus, take the above as test cases. Please note
that graphite requires GCC to be built with isl.
PR target/114885
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_legitimize_subreg_const_poly_move): New
function to take care of (const_poly_int:TI 8).
(riscv_legitimize_move): Handle subreg is const_poly_int,
The extension parsing table entries for a range of Zic* extensions
does not match the mask definition in riscv.opt.
This results in broken TARGET_ZIC* macros, because the values of
riscv_zi_subext and riscv_zicmo_subext are set wrong.
This patch fixes this by moving Zic64b into riscv_zicmo_subext
and all other affected Zic* extensions to riscv_zi_subext.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif,
zicclsm, and ziccrse into riscv_zi_subext.
* config/riscv/riscv.opt: Define MASK_ZIC64B for
riscv_zicmo_subext.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Jie Mei [Sun, 28 Apr 2024 08:57:31 +0000 (16:57 +0800)]
MIPS: Add MIN/MAX.fmt instructions support for MIPS R6
This patch adds the smin/smax RTL patterns for the
min/max.fmt instructions.
Also, since the min/max.fmt instructions apply to the
IEEE 754-2008 "minNum" and "maxNum" operations, this
patch also provides the new "fmin<mode>3" and
"fmax<mode>3" patterns.
gcc/ChangeLog:
* config/mips/i6400.md (i6400_fpu_minmax): New
define_insn_reservation.
* config/mips/mips.h (ISA_HAS_FMIN_FMAX): Define new macro.
* config/mips/mips.md (UNSPEC_FMIN): New unspec.
(UNSPEC_FMAX): Same as above.
(type): Add fminmax.
(smin<mode>3): Generates MIN.fmt instructions.
(smax<mode>3): Generates MAX.fmt instructions.
(fmin<mode>3): Generates MIN.fmt instructions.
(fmax<mode>3): Generates MAX.fmt instructions.
* config/mips/p6600.md (p6600_fpu_fabs): Include fminmax
type.
gcc/testsuite/ChangeLog:
* gcc.target/mips/mips-minmax1.c: New test for MIPS R6.
* gcc.target/mips/mips-minmax2.c: Same as above.
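As background for the MIN/MAX.fmt commit above: "minNum" treats a quiet NaN operand as missing data and returns the other operand, which is also what the C library's fmin does. A comparison-based minimum has no such guarantee. The sketch below contrasts the two on the host; the helper names are hypothetical and nothing here is MIPS-specific.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// minNum-style minimum: a quiet NaN operand is ignored.
double min_num(double a, double b) { return std::fmin(a, b); }

// Comparison-based minimum: with a NaN operand the result depends on
// operand order, because every comparison with NaN is false.
double compare_min(double a, double b) { return a < b ? a : b; }
```

For instance, min_num(1.0, NaN) is 1.0, while compare_min(1.0, NaN) yields NaN because the comparison fails and the second operand is returned.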
Callers of irange_bitmask must normalize value/mask pairs.
As per the documentation, irange_bitmask must have the unknown bits in
the mask set to 0 in the value field. Even though we say we must have
normalized value/mask bits, we don't enforce it, opting to normalize
on the fly in union and intersect. Avoiding this lazy enforcement, as
well as the extra saving/restoring involved in returning the changed
status, gives us a performance increase of 1.25% for VRP and 1.51% for
ipa-CP.
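The normalization rule itself is small. The following is a hypothetical miniature using plain 64-bit integers, not the irange_bitmask class: a set bit in the mask means "unknown", and a normalized pair has every such bit cleared in the value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of a value/mask pair; not GCC's irange_bitmask.
// A set bit in `mask` means the bit is unknown; `value` holds known bits.
struct bitmask_sketch
{
  std::uint64_t value;
  std::uint64_t mask;
};

// What a caller is now expected to do before handing the pair over:
// clear the value bits that the mask marks as unknown.
bitmask_sketch normalize(bitmask_sketch b)
{
  b.value &= ~b.mask;
  return b;
}

// The kind of invariant a verifier can check cheaply.
bool is_normalized(const bitmask_sketch &b)
{
  return (b.value & b.mask) == 0;
}
```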
gcc/ChangeLog:
* tree-ssa-ccp.cc (ccp_finalize): Normalize before calling
set_bitmask.
* value-range.cc (irange::intersect_bitmask): Calculate changed
irange_bitmask bits on our own.
(irange::union_bitmask): Same.
(irange_bitmask::verify_mask): Verify that bits are normalized.
* value-range.h (irange_bitmask::union_): Do not normalize.
Remove return value.
(irange_bitmask::intersect): Same.
Aldy Hernandez [Wed, 20 Mar 2024 04:51:55 +0000 (05:51 +0100)]
Remove range_zero and range_nonzero.
Remove legacy range_zero and range_nonzero as they return by value,
which makes them unusable in a world with separate irange and prange
classes. Also, we already have set_zero and set_nonzero methods in vrange.
gcc/ChangeLog:
* range-op-ptr.cc (pointer_plus_operator::wi_fold): Use method
range setters instead of out of line functions.
(pointer_min_max_operator::wi_fold): Same.
(pointer_and_operator::wi_fold): Same.
(pointer_or_operator::wi_fold): Same.
* range-op.cc (operator_negate::fold_range): Same.
(operator_addr_expr::fold_range): Same.
(range_op_cast_tests): Same.
* range.cc (range_zero): Remove.
(range_nonzero): Remove.
* range.h (range_zero): Remove.
(range_nonzero): Remove.
* value-range.cc (range_tests_misc): Use method instead of out of
line function.
Aldy Hernandez [Tue, 19 Mar 2024 17:22:08 +0000 (18:22 +0100)]
Make some integer specific ranges generic Value_Range's.
There are some irange uses that should be Value_Range, because they
can be either integers or pointers. This will become a problem when
prange comes live.
gcc/ChangeLog:
* tree-ssa-loop-split.cc (split_at_bb_p): Make int_range a Value_Range.
* tree-ssa-strlen.cc (get_range): Same.
* value-query.cc (range_query::get_tree_range): Handle both
integers and pointers.
* vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Make
r0 and r1 Value_Range's.
Aldy Hernandez [Tue, 19 Mar 2024 16:17:53 +0000 (17:17 +0100)]
Accept a vrange in get_legacy_range.
In preparation for prange, make get_legacy_range take a generic
vrange, not just an irange.
gcc/ChangeLog:
* value-range.cc (get_legacy_range): Make static and add another
version of get_legacy_range that takes a vrange.
* value-range.h (class irange): Remove unnecessary friendship with
get_legacy_range.
(get_legacy_range): Accept a vrange.
Aldy Hernandez [Tue, 19 Mar 2024 15:35:41 +0000 (16:35 +0100)]
Verify that reading back from vrange_storage doesn't drop bits.
We have a sanity check in the irange storage code to make sure that
reading back a cache entry we have just written to yields exactly the
same range. There's no need to do this only for integers. This patch
moves the code to a more generic place.
However, doing so tickles a latent bug in the frange code where a
range is being pessimized from [0.0, 1.0] to [-0.0, 1.0]. Exclude
checking frange's until this bug is fixed.
Aldy Hernandez [Wed, 7 Feb 2024 10:27:29 +0000 (11:27 +0100)]
Make fold_cond_with_ops use a boolean type for range_true/range_false.
Conditional operators are always boolean, regardless of their
operands. Getting the type wrong is not currently a problem, but will
be when prange's can no longer store an integer.
gcc/ChangeLog:
* vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Remove
type from range_true and range_false.
Aldy Hernandez [Wed, 21 Feb 2024 08:34:29 +0000 (09:34 +0100)]
Remove GTY support for vrange and derived classes.
Now that we have a vrange storage class to save ranges in long-term
memory, there is no need for GTY markers for any of the vrange
classes, since they should never live in GC.
Aldy Hernandez [Thu, 22 Feb 2024 08:18:46 +0000 (09:18 +0100)]
Move bitmask routines to vrange base class.
Any range can theoretically have a bitmask of set bits. This patch
moves the bitmask accessors to the base class. This cleans up some
users in IPA*, and will provide a cleaner interface when prange is in
place.
gcc/ChangeLog:
* ipa-cp.cc (propagate_bits_across_jump_function): Access bitmask
through base class.
(ipcp_store_vr_results): Same.
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same.
(ipcp_get_parm_bits): Same.
(ipcp_update_vr): Same.
* range-op-mixed.h (update_known_bitmask): Change argument to vrange.
* range-op.cc (update_known_bitmask): Same.
* value-range.cc (vrange::update_bitmask): New.
(irange::set_nonzero_bits): Move to vrange class.
(irange::get_nonzero_bits): Same.
* value-range.h (class vrange): Add update_bitmask, get_bitmask,
get_nonzero_bits, and set_nonzero_bits.
(class irange): Make bitmask methods virtual overrides.
(class Value_Range): Add get_bitmask and update_bitmask.
Add tree versions of lower and upper bounds to vrange.
This patch adds vrange::lbound() and vrange::ubound() that return
trees. These can be used in generic code that is type agnostic, and
avoids special casing for pointers and integers in places where we
handle both. It also cleans up a wart in the Value_Range class.
Aldy Hernandez [Wed, 21 Feb 2024 08:33:19 +0000 (09:33 +0100)]
Add a virtual vrange destructor.
Richi mentioned in PR113476 that it would be cleaner to move the
destructor from int_range to the base class. Although this isn't
strictly necessary, as there are no users, it is good to future proof
things, and the overall impact is minuscule.
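What the future proofing buys can be shown with a toy hierarchy. The classes below are hypothetical and unrelated to the real vrange layout: with a virtual destructor in the base, deleting through a base pointer still runs the derived destructor.

```cpp
#include <cassert>

// Hypothetical base class with the virtual destructor in the base.
struct range_base
{
  virtual ~range_base() {}
};

// Hypothetical derived class that tracks how many instances are alive.
struct counted_range : range_base
{
  explicit counted_range(int &live) : m_live(live) { ++m_live; }
  ~counted_range() final override { --m_live; }
  int &m_live;
};

// Returns the live count after deleting a derived object via the base.
int live_after_delete_through_base()
{
  int live = 0;
  range_base *p = new counted_range(live);
  assert(live == 1);
  delete p; // Dispatches to ~counted_range because the base dtor is virtual.
  return live;
}
```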
gcc/ChangeLog:
* value-range.h (vrange::~vrange): New.
(int_range::~int_range): Make final override.
Ian Lance Taylor [Sun, 28 Apr 2024 18:14:17 +0000 (11:14 -0700)]
libbacktrace: load Windows modules
Patch from Björn Schäpers <bjoern@hazardy.de>.
* configure.ac: Check for tlhelp32.h
* pecoff.c: Include <tlhelp32.h> if available.
(backtrace_initialize): Use tlhelp32 api for a snapshot to
detect loaded modules.
(coff_add): New argument for the module handle of the file,
to get the base address.
* configure, config.h.in: Regenerate.
Adjust alternative *k to ?k for avx512 mask in zero_extend patterns
So when both source operand and dest operand require avx512 MASK_REGS, RA
can allocate a MASK_REGS register instead of a GPR to avoid reloading it from
GPR to MASK_REGS.
gcc/ChangeLog:
* config/i386/i386.md: (zero_extendsidi2): Adjust
alternative *k to ?k.
(zero_extend<mode>di2): Ditto.
(*zero_extend<mode>si2): Ditto.
(*zero_extendqihi2): Ditto.
[testsuite] require sqrt_insn effective target where needed
Some tests fail on ppc and ppc64 when testing a compiler [with options]
for a CPU [emulator] that doesn't support the sqrt insn.
The gcc.dg/cdce3.c is one in which the expected shrink-wrap
optimization only takes place when the target CPU supports a sqrt
insn.
The gcc.target/powerpc/pr46728-1[0-4].c tests use -mpowerpc-gpopt and
call sqrt(), which involves the sqrt insn that the target CPU under
test may not support.
Require a sqrt_insn effective target for all the affected tests.
gcc.dg/torture/pr91323.c tests that a compare with NaNf doesn't set an
exception using builtin compare intrinsics, and that it does when
using regular compare operators.
That doesn't seem to be expected to work on powerpc targets. It fails
on GNU/Linux, it's marked to be skipped on AIX, and a similar test,
gcc.dg/torture/pr93133.c, has the execution test xfailed for all of
powerpc*-*-*.
In this test, the functions that use intrinsics for the compare end up
with the same code as the one that uses compare operators, using
fcmpu, a floating compare that, unlike fcmpo, does not set the invalid
operand exception for quiet NaN. I couldn't find any evidence that
the rs6000 backend ever outputs fcmpo. Therefore, I'm adding the same
execution xfail marker to this test.
for gcc/testsuite/ChangeLog
PR target/58684
* gcc.dg/torture/pr91323.c: Expect execution fail on
powerpc*-*-*.
vec-mul is an execution test, but it only requires a powerpc_vsx_ok
effective target, which is enough only for compile tests. In order to
check for runtime and execution environment support, we need to
require vsx_hw. Make that a condition for execution, but still
perform a compile test if the condition is not satisfied.
for gcc/testsuite/ChangeLog
* gcc.target/powerpc/vec-mul.c: Run on target vsx_hw, just
compile otherwise.
disable ldist for test, to restore vectorizing-candidate loop
The loop we're supposed to try to vectorize in
gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c is turned into a memset
before the vectorizer runs.
Various other tests in this set have already run into this, and the
solution has been to disable this loop distribution transformation,
enabled at -O2, so that the vectorizer gets a chance to transform the
loop and, in this testcase, fail to do so.
Request check for hw support in ppc run tests with -maltivec/-mvsx
for gcc/testsuite/ChangeLog
* gcc.target/powerpc/swaps-p8-20.c: Change powerpc_altivec_ok
require-effective-target test into vmx_hw.
* gcc.target/powerpc/vsx-vector-5.c: Change powerpc_vsx_ok
require-effective-target test into vsx_hw.
These ppc lp64 tests check for errors or warnings on -mno-powerpc64.
On powerpc64-*-vxworks* we get the same errors as on most other
covered platforms, but the tests did not mark them as expected for
this target. On powerpc-*-vxworks*, the tests are skipped because
lp64 is not satisfied, so I'm naming powerpc*-*-vxworks* rather than
something more specific.
When vect.exp finds our configuration disables altivec by default, it
disables the execution of vectorization tests, assuming the test
hardware doesn't support it.
Tests become just compile tests, but compile tests won't work
correctly when additional sources are named, e.g. pr95401.cc, because
GCC refuses to compile multiple files into the same asm output.
With this patch, the default for when execution is not possible
becomes link.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_vect_support_and_set_flags):
Decay to link rather than compile.
Andrew Pinski [Mon, 12 Feb 2024 23:48:48 +0000 (15:48 -0800)]
aarch64: Use vec_perm_indices::new_shrunk_vector in aarch64_evpc_reencode
While working on PERM related stuff, I came across the fact that
aarch64_evpc_reencode was manually figuring out whether we can shrink the
perm indices instead of using vec_perm_indices::new_shrunk_vector;
new_shrunk_vector was added after reencode was added.
Built and tested for aarch64-linux-gnu with no regressions.
gcc/ChangeLog:
PR target/113822
* config/aarch64/aarch64.cc (aarch64_evpc_reencode): Use
vec_perm_indices::new_shrunk_vector instead of manually
going through the indices.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
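The idea of shrinking permutation indices can be sketched as follows. This is a simplified model under stated assumptions, not the implementation of vec_perm_indices::new_shrunk_vector: the function name shrink_indices and the plain std::vector representation are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model: try to re-express a permutation of narrow elements as a
// permutation of elements that are `factor` times wider.  On success the
// shrunk indices are stored in `shrunk` and true is returned; false means
// some group of `factor` indices does not select one whole, aligned wide
// element.
bool shrink_indices (const std::vector<unsigned> &indices, unsigned factor,
                     std::vector<unsigned> &shrunk)
{
  assert (factor > 0);
  shrunk.clear ();
  if (indices.size () % factor != 0)
    return false;

  for (std::size_t i = 0; i < indices.size (); i += factor)
    {
      // The group must start on a wide-element boundary...
      if (indices[i] % factor != 0)
        return false;
      // ...and continue with consecutive narrow elements.
      for (unsigned j = 1; j < factor; ++j)
        if (indices[i + j] != indices[i] + j)
          return false;
      shrunk.push_back (indices[i] / factor);
    }
  return true;
}
```

With a factor of 2, the indices {2, 3, 0, 1} shrink to {1, 0}, while {1, 2, 3, 0} cannot be shrunk because the first pair is not aligned.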
Fangrui Song [Sat, 27 Apr 2024 01:14:33 +0000 (18:14 -0700)]
RISC-V: Add -X to link spec
--discard-locals (-X) instructs the linker to remove local .L* symbols,
which occur a lot due to label differences for linker relaxation. The
arm port has a similar need and passes -X to ld.
In contrast, the RISC-V port does not pass -X to ld and relies on the
default --discard-locals in GNU ld's riscv port. The arm way is more
conventional (compiler driver instead of the linker customizes the
default linker behavior) and works with lld.
Wilco Dijkstra [Wed, 21 Feb 2024 23:34:37 +0000 (23:34 +0000)]
AArch64: Cleanup memset expansion
Cleanup memset implementation. Similar to memcpy/memmove, use an offset and
bytes throughout. Simplify the complex calculations when optimizing for size
by using a fixed limit.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (MAX_SET_SIZE): New define.
(aarch64_progress_pointer): Remove function.
(aarch64_set_one_block_and_progress_pointer): Simplify and clean up.
(aarch64_expand_setmem): Clean up implementation, use byte offsets,
simplify size calculation.
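Tracking "an offset and bytes throughout" can be pictured with a rough sketch like the one below. It is only a model under stated assumptions: store_op, plan_memset and max_store are hypothetical names, and the real expansion may choose its stores differently (for example, overlapping ones).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One planned store: `size` bytes written at byte `offset` from the start.
struct store_op
{
  std::size_t offset;
  std::size_t size;
};

// Simplified sketch: cover `bytes` bytes with power-of-two sized stores,
// largest first, tracking only a running offset and a remaining byte count.
// `max_store` is a hypothetical widest store size and must be a power of two.
std::vector<store_op> plan_memset (std::size_t bytes, std::size_t max_store)
{
  assert (max_store != 0 && (max_store & (max_store - 1)) == 0);
  std::vector<store_op> plan;
  std::size_t offset = 0;
  while (bytes > 0)
    {
      // Pick the widest store that still fits in the remaining bytes.
      std::size_t chunk = max_store;
      while (chunk > bytes)
        chunk /= 2;
      plan.push_back ({offset, chunk});
      offset += chunk;
      bytes -= chunk;
    }
  return plan;
}
```

For 23 bytes and a 16-byte maximum, this plans stores of 16, 4, 2 and 1 bytes at offsets 0, 16, 20 and 22.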
Remove the tune AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS since it is only
used by an old core and doesn't properly support -Os. SPECINT_2017
shows that removing it has no performance difference, while codesize
is reduced by 0.07%.
Jonathan Wakely [Fri, 26 Apr 2024 10:42:26 +0000 (11:42 +0100)]
libstdc++: Do not apply localized formatting to NaN and inf [PR114863]
We don't want to add grouping to strings like "-inf", and there is no
radix character to replace either.
libstdc++-v3/ChangeLog:
PR libstdc++/114863
* include/std/format (__formatter_fp::format): Only use
_M_localized for finite values.
* testsuite/std/format/functions/format.cc: Check localized
formatting of NaN and infinity.
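The rule "only localize finite values" can be sketched as below. This is not libstdc++'s code: it uses snprintf and manual digit grouping rather than std::format and a locale, and the helper name group_if_finite and its separator parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format `value` with no fractional digits and insert
// `sep` between groups of three integer digits, but only for finite values.
std::string group_if_finite (double value, char sep)
{
  char buf[512];
  int len = std::snprintf (buf, sizeof buf, "%.0f", value);
  assert (len >= 0 && static_cast<std::size_t> (len) < sizeof buf);
  std::string text (buf, static_cast<std::size_t> (len));

  // Strings such as "inf", "-inf" and "nan" contain no digits to group,
  // so they are returned unchanged.
  if (!std::isfinite (value))
    return text;

  // Walk backwards from the end, inserting a separator every three digits
  // and never in front of the first digit or the sign.
  std::size_t first_digit = (text[0] == '-') ? 1 : 0;
  for (std::size_t pos = text.size (); pos > first_digit + 3; )
    {
      pos -= 3;
      text.insert (pos, 1, sep);
    }
  return text;
}
```

Without the finiteness check, a string like "-inf" would be a candidate for digit grouping even though it has no digits and no radix character.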
Patrick Palka [Fri, 26 Apr 2024 11:44:25 +0000 (07:44 -0400)]
c++: fix source printing for "required from here" message
It seems the diagnostic machinery's source line printing respects
the pretty printer prefix, but this is undesirable for the call to
diagnostic_show_locus in print_instantiation_partial_context_line
(added in r14-4388-g1c45319b66edc9) since the prefix may have been
set when issuing an earlier, unrelated diagnostic and we just want
to print an unprefixed source line.
This patch naively fixes this by clearing the prefix before calling
diagnostic_show_locus.
Before this patch, for error60a.C below we'd print
gcc/testsuite/g++.dg/template/error60a.C: In function ‘void usage()’:
gcc/testsuite/g++.dg/template/error60a.C:24:3: error: ‘unrelated_error’ was not declared in this scope
24 | unrelated_error; // { dg-error "not declared" }
| ^~~~~~~~~~~~~~~
gcc/testsuite/g++.dg/template/error60a.C: In instantiation of ‘void test(Foo) [with Foo = int]’:
gcc/testsuite/g++.dg/template/error60a.C:25:13: required from here
gcc/testsuite/g++.dg/template/error60a.C:24:3: error: 25 | test<int> (42); // { dg-message " required from here" }
gcc/testsuite/g++.dg/template/error60a.C:24:3: error: | ~~~~~~~~~~^~~~
gcc/testsuite/g++.dg/template/error60a.C:19:24: error: invalid conversion from ‘int’ to ‘int*’ [-fpermissive]
19 | my_pointer<Foo> ptr (val); // { dg-error "invalid conversion from 'int' to 'int\\*'" }
| ^~~
| |
| int
gcc/testsuite/g++.dg/template/error60a.C:9:20: note: initializing argument 1 of ‘my_pointer<Foo>::my_pointer(Foo*) [with Foo = int]’
9 | my_pointer (Foo *ptr) // { dg-message " initializing argument 1" }
| ~~~~~^~~
and afterward we print
gcc/testsuite/g++.dg/template/error60a.C: In function ‘void usage()’:
gcc/testsuite/g++.dg/template/error60a.C:24:3: error: ‘unrelated_error’ was not declared in this scope
24 | unrelated_error; // { dg-error "not declared" }
| ^~~~~~~~~~~~~~~
gcc/testsuite/g++.dg/template/error60a.C: In instantiation of ‘void test(Foo) [with Foo = int]’:
gcc/testsuite/g++.dg/template/error60a.C:25:13: required from here
25 | test<int> (42); // { dg-message " required from here" }
| ~~~~~~~~~~^~~~
gcc/testsuite/g++.dg/template/error60a.C:19:24: error: invalid conversion from ‘int’ to ‘int*’ [-fpermissive]
19 | my_pointer<Foo> ptr (val); // { dg-error "invalid conversion from 'int' to 'int\\*'" }
| ^~~
| |
| int
gcc/testsuite/g++.dg/template/error60a.C:9:20: note: initializing argument 1 of ‘my_pointer<Foo>::my_pointer(Foo*) [with Foo = int]’
9 | my_pointer (Foo *ptr) // { dg-message " initializing argument 1" }
| ~~~~~^~~
gcc/cp/ChangeLog:
* error.cc (print_instantiation_partial_context_line): Clear the
pretty printer prefix around the call to diagnostic_show_locus.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/diagnostic2.C: Expect source line printed
for the "required from here" message.
* g++.dg/template/error60a.C: New test.
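The effect of clearing the prefix around the source-line printing can be sketched with a toy printer. Everything here is hypothetical (toy_printer, emit_line, show_source_line); GCC's pretty printer has a different interface, and restoring the prefix by swapping is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature printer: every emitted line starts with `prefix`.
struct toy_printer
{
  std::string prefix;
  std::string output;

  void emit_line (const std::string &text)
  {
    output += prefix + text + "\n";
  }
};

// Print a quoted source line without whatever prefix an earlier diagnostic
// left behind, then restore that prefix for later output.
void show_source_line (toy_printer &pp, const std::string &source)
{
  std::string saved;
  saved.swap (pp.prefix);   // pp.prefix is now empty
  assert (pp.prefix.empty ());
  pp.emit_line (source);
  pp.prefix.swap (saved);   // put the old prefix back
}
```

Calling emit_line directly would repeat the stale "error:" prefix in front of the source line, which is the shape of the bad output quoted above.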
Jakub Jelinek [Fri, 26 Apr 2024 10:18:29 +0000 (12:18 +0200)]
Update crontab and git_update_version.py
2024-04-26 Jakub Jelinek <jakub@redhat.com>
maintainer-scripts/
* crontab: Snapshots from trunk are now GCC 15 related.
Add GCC 14 snapshots from the respective branch.
contrib/
* gcc-changelog/git_update_version.py (active_refs): Add
releases/gcc-14.
Frederik Harwath [Wed, 24 Apr 2024 18:29:14 +0000 (20:29 +0200)]
amdgcn: Add gfx90c target
Add support for gfx90c GCN5 APU integrated graphics devices.
The LLVM AMDGPU documentation does not list those devices as supported
by rocm-amdhsa, but it passes most libgomp offloading tests.
Although they are constrained compared to dGPUs, they might be
interesting for learning, experimentation, and testing.
Jakub Jelinek [Thu, 25 Apr 2024 18:45:04 +0000 (20:45 +0200)]
c++: Fix constexpr evaluation of parameters passed by invisible reference [PR111284]
My r9-6136 changes to make a copy of constexpr function bodies before
genericization modifies it broke the constant evaluation of non-POD
arguments passed by value.
In the callers such arguments are passed as reference to usually a
TARGET_EXPR, but on the callee side until genericization they are just
direct uses of a PARM_DECL with some class type.
In cxx_bind_parameters_in_call I've used convert_from_reference to
pretend it is passed by value and then cxx_eval_constant_expression
is called there and evaluates that as an rvalue, followed by
adjust_temp_type if the types don't match exactly (e.g. const Foo
argument and passing to it reference to Foo TARGET_EXPR).
The reason this doesn't work is that when the TARGET_EXPR in the caller
is constant initialized, this for it is the address of the TARGET_EXPR_SLOT,
but if the code later on pretends the PARM_DECL is just initialized to the
rvalue of the constant evaluation of the TARGET_EXPR, it is as if there
is a bitwise copy of the TARGET_EXPR to the callee, so this in the callee
is then address of the PARM_DECL in the callee.
The following patch attempts to fix that by constexpr evaluation of such
arguments in the caller as an lvalue instead of rvalue, and on the callee
side when seeing such a PARM_DECL, if we want an lvalue, lookup the value
(lvalue) saved in ctx->globals (if any), and if wanting an rvalue,
recursing with vc_prvalue on the looked up value (because it is there
as an lvalue, not rvalue).
adjust_temp_type doesn't work for lvalues of non-scalarish types, for
such types it relies on changing the type of a CONSTRUCTOR, but on the
other side we know what we pass to the argument is addressable, so
the patch on type mismatch takes address of the argument value, casts
to reference to the desired type and dereferences it.
2024-04-25 Jakub Jelinek <jakub@redhat.com>
PR c++/111284
* constexpr.cc (cxx_bind_parameters_in_call): For PARM_DECLs with
TREE_ADDRESSABLE types use vc_glvalue rather than vc_prvalue for
cxx_eval_constant_expression and if it doesn't have the same
type as it should, cast the reference type to reference to type
before convert_from_reference and instead of adjust_temp_type
take address of the arg, cast to reference to type and then
convert_from_reference.
(cxx_eval_constant_expression) <case PARM_DECL>: For lval case
on parameters with TREE_ADDRESSABLE types lookup result in
ctx->globals if possible. Otherwise if lookup in ctx->globals
was successful for parameter with TREE_ADDRESSABLE type,
recurse with vc_prvalue on the returned value.
* g++.dg/cpp1z/constexpr-111284.C: New test.
* g++.dg/cpp1y/constexpr-lifetime7.C: Expect one error on a different
line.
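The kind of class affected can be sketched as follows: an object that records its own address and is passed by value. The names self_aware and check are hypothetical and this is not the new testcase. The sketch only exercises the property at run time, whereas the bug concerned evaluating such a call in a constant expression.

```cpp
#include <cassert>

// A non-trivially-copyable class whose objects remember their own address.
struct self_aware
{
  constexpr self_aware () : self (this) {}
  constexpr self_aware (const self_aware &) : self (this) {}
  constexpr bool intact () const { return self == this; }
  const self_aware *self;
};

// Taking the parameter by value means it is passed by invisible reference,
// because the class has a user-provided copy constructor.
constexpr bool check (self_aware arg) { return arg.intact (); }
```

If the callee saw a bitwise copy of the caller's object, the stored pointer would still refer to the caller's temporary and intact () would return false.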
Jakub Jelinek [Thu, 25 Apr 2024 18:43:13 +0000 (20:43 +0200)]
libgcc: Don't use weakrefs for glibc 2.34
glibc 2.34 and later doesn't have separate libpthread (libpthread.so.0 is a
dummy shared library with just some symbol versions for compatibility, but
all the pthread_* APIs are in libc.so.6).
So, we don't need to do the .weakref dances to check whether a program
has been linked with -lpthread or not, in dynamically linked apps those
will be always true anyway.
In -static linking, this fixes various issues people had when only linking
some parts of libpthread.a and getting weird crashes. A hack for that was
what e.g. some Fedora glibcs used, where libpthread.a was a library
containing just one giant *.o file which had all the normal libpthread.a
*.o files linked with -r together.
libstdc++-v3 actually does something like this already since r10-10928,
the following patch is meant to fix it even for libgfortran, libobjc and
whatever else uses gthr.h.
2024-04-25 Jakub Jelinek <jakub@redhat.com>
* gthr.h (GTHREAD_USE_WEAK): Redefine to 0 for GLIBC 2.34 or later.
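A version check of this kind can be sketched with glibc's __GLIBC_PREREQ macro. HYPOTHETICAL_USE_WEAK is an illustrative name rather than the macro in gthr.h, and the exact form of the real check is an assumption.

```cpp
#include <cassert>
#include <cstdlib>   // on glibc this is expected to define __GLIBC_PREREQ

// HYPOTHETICAL_USE_WEAK is an illustrative name, not the macro from gthr.h.
#define HYPOTHETICAL_USE_WEAK 1

// The nested #if matters: __GLIBC_PREREQ (2, 34) would not parse on a
// non-glibc system where the macro is undefined.
#ifdef __GLIBC_PREREQ
# if __GLIBC_PREREQ (2, 34)
// glibc 2.34 and later: the pthread_* functions live in libc.so.6 itself.
#  undef HYPOTHETICAL_USE_WEAK
#  define HYPOTHETICAL_USE_WEAK 0
# endif
#endif

// Reports whether this build would still probe for libpthread via weakrefs.
bool uses_weak_probe ()
{
  return HYPOTHETICAL_USE_WEAK != 0;
}
```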
Jakub Jelinek [Thu, 25 Apr 2024 18:37:10 +0000 (20:37 +0200)]
c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
When expand_or_defer_fn is called at_eof time, it calls import_export_decl
and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a
couple of places to try to optimize cdtors which are known to have the
same body by making the complete cdtor an alias to base cdtor (and in
that case also uses *[CD]5* as comdat group name instead of the normal
comdat group names specific to each mangled name).
Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN,
maybe_clone_body and can_alias_cdtor use:
if (DECL_ONE_ONLY (fn))
cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone));
...
bool can_alias = can_alias_cdtor (fn);
...
/* Tell cgraph if both ctors or both dtors are known to have
the same body. */
if (can_alias
&& fns[0]
&& idx == 1
&& cgraph_node::get_create (fns[0])->create_same_body_alias
(clone, fns[0]))
{
alias = true;
if (DECL_ONE_ONLY (fns[0]))
{
/* For comdat base and complete cdtors put them
into the same, *[CD]5* comdat group instead of
*[CD][12]*. */
comdat_group = cdtor_comdat_group (fns[1], fns[0]);
cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group);
if (symtab_node::get (clone)->same_comdat_group)
symtab_node::get (clone)->remove_from_same_comdat_group ();
symtab_node::get (clone)->add_to_same_comdat_group
(symtab_node::get (fns[0]));
}
}
and
/* Don't use aliases for weak/linkonce definitions unless we can put both
symbols in the same COMDAT group. */
return (DECL_INTERFACE_KNOWN (fn)
&& (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
&& (!DECL_ONE_ONLY (fn)
|| (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
The following testcase regressed with Marek's r14-5979 change,
when pr113208_0.C is compiled where the ctor is marked constexpr,
we no longer perform this optimization, where
_ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
_ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
_ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
_ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
_ZN6vectorI12QualityValueEC1ERKS1_ comdat group.
Now, the linker seems to somehow cope with that, even though it
probably keeps both copies of the ctor, but seems LTO can't cope
with that and Honza doesn't know what it should do in that case
(linker decides that the prevailing symbol is
_ZN6vectorI12QualityValueEC2ERKS1_ (from the
_ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and
_ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU,
from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)).
Note, the case where some constructor is marked constexpr in one
TU and not in another one happens pretty often in libstdc++ when
one mixes -std= flags used to compile different compilation units.
The reason the optimization doesn't trigger when the constructor is
constexpr is that expand_or_defer_fn is called in that case much earlier
than when it is not constexpr; in the former case it is called when we
try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN
is false in that case and comdat_linkage hasn't been called either
(I think it is desirable, because comdat group is stored in the cgraph
node and am not sure it is a good idea to create cgraph nodes for
something that might not be needed later on at all), so maybe_clone_body
clones the bodies, but doesn't make them as aliases.
The following patch is an attempt to redo this optimization when
import_export_decl is called at_eof time on the base/complete cdtor
(or deleting dtor). It will not do anything if maybe_clone_body
hasn't been called yet (the TREE_ASM_WRITTEN check on the
DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete
cdtors have been lowered already, or when maybe_clone_body called
maybe_thunk_body and it was successful. Otherwise it retries the
can_alias_cdtor check and makes the complete cdtor alias to the
base cdtor with adjustments to the comdat group.
2024-04-25 Jakub Jelinek <jakub@redhat.com>
PR lto/113208
* cp-tree.h (maybe_optimize_cdtor): Declare.
* decl2.cc (import_export_decl): Call it for cloned cdtors.
* optimize.cc (maybe_optimize_cdtor): New function.
* g++.dg/abi/comdat2.C: New test.
* g++.dg/abi/comdat5.C: New test.
* g++.dg/lto/pr113208_0.C: New test.
* g++.dg/lto/pr113208_1.C: New file.
* g++.dg/lto/pr113208.h: New file.
David Faust [Thu, 25 Apr 2024 16:31:14 +0000 (09:31 -0700)]
bpf: avoid issues with CO-RE and -gtoggle
Compiling a BPF program with CO-RE relocations (and BTF) while also
passing -gtoggle led to an inconsistent state where CO-RE support was
enabled but BTF would not be generated, and this was not caught by the
existing option parsing. This led to an ICE when generating the CO-RE
relocation info, since BTF is required for CO-RE.
Update bpf_option_override to avoid this case, and add a few tests for
the interactions of these options.
gcc/
* config/bpf/bpf.cc (bpf_option_override): Improve handling of CO-RE
options to avoid issues with -gtoggle.
Jakub Jelinek [Thu, 25 Apr 2024 18:09:35 +0000 (20:09 +0200)]
openmp: Copy DECL_LANG_SPECIFIC and DECL_LANG_FLAG_? to tree-nested decl copy [PR114825]
tree-nested.cc creates in 2 spots artificial VAR_DECLs, one of them is used
both for debug info and OpenMP/OpenACC lowering purposes, the other solely for
OpenMP/OpenACC lowering purposes.
When the decls are used in OpenMP/OpenACC lowering, the OMP langhooks (mostly
Fortran, C just a little and C++ doesn't have nested functions) then inspect
the flags on the vars and based on that decide how to lower the corresponding
clauses.
Unfortunately we weren't copying DECL_LANG_SPECIFIC and DECL_LANG_FLAG_?, so
the langhooks made decisions on the default flags on those instead.
As the original decl isn't necessarily a VAR_DECL, could be e.g. PARM_DECL,
using copy_node wouldn't work properly, so this patch just copies those
flags in addition to other flags it was copying already. And I've removed
code duplication by introducing a helper function which does copying common
to both uses.
2024-04-25 Jakub Jelinek <jakub@redhat.com>
PR fortran/114825
* tree-nested.cc (get_debug_decl): New function.
(get_nonlocal_debug_decl): Use it.
(get_local_debug_decl): Likewise.
Jose E. Marchesi [Thu, 25 Apr 2024 14:53:49 +0000 (16:53 +0200)]
bpf: default to using pseudo-C assembly syntax by default
At this point the kernel headers that almost all BPF programs use
contain pseudo-C inline assembly and having the GNU toolchain using
the conventional assembly syntax by default would force users to
specify the command-line option explicitly almost all of the time,
which is very inconvenient.
This patch changes GCC in order to recognize and generate the pseudo-C
assembly syntax of BPF by default. The ASM_SPEC is adapted
accordingly, and in a way that the current release of the BPF
assembler (which still expects conventional assembler syntax by
default) does the right thing.
Tested in bpf-unknown-none-bpf target and x86_64-linux-gnu host.
No regressions.
gcc/ChangeLog
* config/bpf/bpf.opt: Use ASM_PSEUDOC for the default value of
-masm.
* config/bpf/bpf.h (ASM_SPEC): Adapt accordingly.
* doc/invoke.texi (eBPF Options): Update.
Richard Ball [Thu, 25 Apr 2024 14:30:42 +0000 (15:30 +0100)]
arm: Zero/Sign extends for CMSE security
Co-Authored by: Andre Simoes Dias Vieira <Andre.SimoesDiasVieira@arm.com>
This patch makes the following changes:
1) When calling a secure function from non-secure code then any arguments
smaller than 32-bits that are passed in registers are zero- or sign-extended.
2) After a non-secure function returns into secure code then any return value
smaller than 32-bits that is passed in a register is zero- or sign-extended.
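What zero- and sign-extension of a sub-32-bit value means can be sketched for the 8-bit case. The function names are hypothetical, and the sketch models a register as a plain 32-bit integer instead of showing the instructions the compiler emits.

```cpp
#include <cassert>
#include <cstdint>

// Treat `reg` as a 32-bit register whose low 8 bits hold the real value and
// whose upper 24 bits are untrusted.  Zero-extension clears the upper bits.
std::uint32_t zero_extend_8 (std::uint32_t reg)
{
  return reg & 0xFFu;
}

// Sign-extension replicates bit 7 of the low byte into the upper 24 bits.
std::uint32_t sign_extend_8 (std::uint32_t reg)
{
  std::uint32_t low = reg & 0xFFu;
  // Flipping bit 7 and subtracting 0x80 wraps around for negative bytes.
  return (low ^ 0x80u) - 0x80u;
}
```

For example, a register holding 0xABCDEFFF becomes 0x000000FF when treated as an unsigned char and 0xFFFFFFFF when treated as a signed char, so the garbage in the upper bits cannot influence later code.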
Gaius Mulley [Thu, 25 Apr 2024 14:19:30 +0000 (15:19 +0100)]
modula2: issue the parameter incompatibility error message based on dialect
This tiny patch improves the parameter incompatibility error message by
having a different message for the dialect chosen mentioning the specific
violation. PIM uses assignment rules for pass by value and expression
rules for pass by reference. ISO uses expression type checking for
pass by value and pass by reference.
gcc/m2/ChangeLog:
* gm2-compiler/M2FileName.def (CalculateFileName): Remove
quoted string in comment.
* gm2-compiler/M2Range.mod (FoldTypeParam): Generate dialect
specific parameter incompatibility error message.
Richard Biener [Thu, 25 Apr 2024 06:08:24 +0000 (08:08 +0200)]
tree-optimization/114792 - order loops to unloops in CH
When we use unloop_loops we have to make sure to have loops ordered
inner to outer as otherwise we can wreck inner loop structure where
unlooping relies on that being intact. The following re-sorts the
vector of to unloop loops after copy-header as that adds to the
vector in two places and the wrong order.
PR tree-optimization/114792
* tree-ssa-loop-ch.cc (ch_order_loops): New function.
(ch_base::copy_headers): Sort loops to unloop inner-to-outer.
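Ordering a work list inner-to-outer can be sketched as a sort on nesting depth. This is an assumption about one way to get that order, not a copy of ch_order_loops; toy_loop and order_inner_to_outer are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a loop: an id plus its nesting depth
// (1 = outermost).
struct toy_loop
{
  int id;
  int depth;
};

// Order the work list so deeper (inner) loops come before shallower (outer)
// ones; stable_sort keeps the original order among loops of equal depth.
void order_inner_to_outer (std::vector<toy_loop> &to_unloop)
{
  std::stable_sort (to_unloop.begin (), to_unloop.end (),
                    [] (const toy_loop &a, const toy_loop &b)
                    { return a.depth > b.depth; });
}
```

Processing the sorted list front to back then handles every inner loop before the loop that contains it.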
Eric Botcazou [Thu, 25 Apr 2024 10:44:14 +0000 (12:44 +0200)]
Fix calling convention incompatibility with vendor compiler
For the 20th anniversary of https://gcc.gnu.org/gcc-3.4/sparc-abi.html,
a new calling convention incompatibility with the vendor compiler (and
the ABI) has been discovered in 64-bit mode, affecting small structures
containing arrays of floating-point components. The decision has been
made to fix it on Solaris only at this point.
gcc/
PR target/114416
* config/sparc/sparc.h (SUN_V9_ABI_COMPATIBILITY): New macro.
* config/sparc/sol2.h (SUN_V9_ABI_COMPATIBILITY): Redefine it.
* config/sparc/sparc.cc (fp_type_for_abi): New predicate.
(traverse_record_type): Use it to spot floating-point types.
(compute_fp_layout): Also deal with array types.
Pan Li [Thu, 25 Apr 2024 07:04:02 +0000 (15:04 +0800)]
RISC-V: Add test cases for insn does not satisfy its constraints [PR114714]
We have one ICE when RVV register overlap is enabled. We reverted this
feature as it is in stage 4 and there is not much time to figure out a better
solution for this. Thus, for now add the related test cases which will
trigger the ICE when register overlap is enabled.
This will gate the RVV register overlap support in GCC-15.
PR target/114714
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/pr114714-1.C: New test.
* g++.target/riscv/rvv/base/pr114714-2.C: New test.
Signed-off-by: Pan Li <pan2.li@intel.com> Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>
Pan Li [Thu, 25 Apr 2024 00:55:08 +0000 (08:55 +0800)]
RISC-V: Add early clobber to the dest of vwsll
We missed the existing early clobber for the dest operand of the vwsll
pattern when resolving the conflicts from reverting register overlap. Thus
add it back to the pattern. Unfortunately, we have no test to cover
this part and will improve this after GCC-15 opens.
The below tests are passed for this patch:
* The rv64gcv full regression test with isl build.
gcc/ChangeLog:
* config/riscv/vector-crypto.md: Add early clobber to the
dest operand of vwsll.
Paul Thomas [Thu, 25 Apr 2024 05:56:10 +0000 (06:56 +0100)]
Fortran: Fix ICE in gfc_trans_create_temp_array from bad type [PR93678]
2024-04-25 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/93678
* trans-expr.cc (gfc_conv_procedure_call): Use the interface,
where possible, to obtain the type of character procedure
pointers of class entities.
gcc/testsuite/
PR fortran/93678
* gfortran.dg/pr93678.f90: New test.