[frange] Remove special casing from unordered operators.
In coming up with testcases for the unordered folders, I realized that
we were already handling them correctly, even in the absence of my
work in this area lately.
All of the unordered fold_range() methods try to fold with the ordered
variants first, and if they return TRUE, we are guaranteed to be able
to fold, even in the presence of NANs. For example:
  if (x_5 <= y_8)
    if (x_5 __UNLE y_8)
On the true side of the first conditional we know that x_5 <= y_8 holds
and that neither operand is a NAN. Since UNLE_EXPR returns true whenever
LE_EXPR does, we can fold as true.
This is handled in the fold_range() methods as follows:
  if (!range_op_handler (LE_EXPR).fold_range (r, type, op1_no_nan,
                                               op2_no_nan, trio))
    return false;
  // The result is the same as the ordered version when the
  // comparison is true or when the operands cannot be NANs.
  if (!maybe_isnan (op1, op2) || r == range_true (type))
    return true;
This code has been there since the last release, and makes the special
casing I am deleting obsolete. I have added tests to make sure we
keep track of this behavior.
Jakub Jelinek [Wed, 20 Sep 2023 16:37:29 +0000 (18:37 +0200)]
c, c++: Accept __builtin_classify_type (typename)
As mentioned in my stdckdint.h mail, __builtin_classify_type has
a problem that argument promotion (the argument is passed to ...
prototyped builtin function) means that certain type classes will
simply never appear.
I think it is too late to change how it behaves, lots of code in the
wild might rely on the current behavior.
So, the following patch adds an option to use a typename rather than an
expression as the operand to the builtin, making it behave similarly
to sizeof, typeof or, say, the clang _Generic extension, where the
first argument can be not just an expression but also a typename.
I think we have other prior art here, e.g. __builtin_va_arg also
expects a typename.
I've added this to both C and C++, because it would be weird if it
supported it only in C and not in C++.
2023-09-20 Jakub Jelinek <jakub@redhat.com>
gcc/
* builtins.h (type_to_class): Declare.
* builtins.cc (type_to_class): No longer static. Return
int rather than enum.
* doc/extend.texi (__builtin_classify_type): Document.
gcc/c/
* c-parser.cc (c_parser_postfix_expression_after_primary): Parse
__builtin_classify_type call with typename as argument.
gcc/cp/
* parser.cc (cp_parser_postfix_expression): Parse
__builtin_classify_type call with typename as argument.
* pt.cc (tsubst_copy_and_build): Handle __builtin_classify_type
with dependent typename as argument.
gcc/testsuite/
* c-c++-common/builtin-classify-type-1.c: New test.
* g++.dg/ext/builtin-classify-type-1.C: New test.
* g++.dg/ext/builtin-classify-type-2.C: New test.
* gcc.dg/builtin-classify-type-1.c: New test.
Patrick Palka [Wed, 20 Sep 2023 16:09:36 +0000 (12:09 -0400)]
c++: improve class NTTP object pretty printing [PR111471]
1. Move class NTTP object pretty printing to a more general spot in
the pretty printer, so that we always print its value instead of
its (mangled) name even when it appears outside of a template
argument list.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
value, like dump_expr would have done.
3. Don't print const VIEW_CONVERT_EXPR wrappers for class NTTPs.
PR c++/111471
gcc/cp/ChangeLog:
* cxx-pretty-print.cc (cxx_pretty_printer::expression)
<case VAR_DECL>: Handle class NTTP objects by printing
their type and value.
<case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle class NTTP
objects here.
Patrick Palka [Wed, 20 Sep 2023 16:07:15 +0000 (12:07 -0400)]
c++: further optimize tsubst_template_decl
This patch makes tsubst_template_decl use use_spec_table=false also in
the non-class non-function template case, to avoid computing 'argvec' and
doing a hash table lookup from tsubst_decl (when partially instantiating
a member variable/alias template).
This change reveals that for function templates, tsubst_template_decl
registers the partially instantiated TEMPLATE_DECL, whereas for other
non-class templates it registers the corresponding DECL_TEMPLATE_RESULT,
which is an interesting inconsistency that I decided to preserve for now.
Trying to consistently register the TEMPLATE_DECL (or DECL_TEMPLATE_RESULT)
causes modules ICEs which I didn't look into.
In passing, in tsubst_function_decl I noticed 'argvec' is unused
when 'lambda_fntype' is set (since lambdas aren't recorded in the
specializations table), so we can avoid computing it in that case.
gcc/cp/ChangeLog:
* pt.cc (tsubst_function_decl): Don't bother computing 'argvec'
when 'lambda_fntype' is set.
(tsubst_template_decl): Make sure we return a TEMPLATE_DECL
during specialization lookup. In the non-class non-function
template case, use tsubst_decl directly with use_spec_table=false,
update DECL_TI_ARGS and call register_specialization like
tsubst_decl would have done if use_spec_table=true.
OpenMP: Add ME support for 'omp allocate' stack variables
Call GOMP_alloc/free for 'omp allocate' allocated variables. This is
for C only as C++ and Fortran show a sorry already in the FE. Note that
this only applies to stack variables as the C FE shows a sorry for
static variables.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Call GOMP_alloc/free for
'omp allocate' variables; move stack cleanup after other
cleanup.
(omp_notice_variable): Process original decl when decl
of the value-expression for an 'omp allocate' variable is passed.
* omp-low.cc (scan_omp_1_op): Handle 'omp allocate' variables.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1 Impl.): Mark 'omp allocate' as
implemented for C only.
* testsuite/libgomp.c/allocate-4.c: New test.
* testsuite/libgomp.c/allocate-5.c: New test.
* testsuite/libgomp.c/allocate-6.c: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-11.c: Remove C-only dg-message
for 'sorry, unimplemented'.
* c-c++-common/gomp/allocate-12.c: Likewise.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Likewise.
* c-c++-common/gomp/allocate-10.c: New test.
* c-c++-common/gomp/allocate-17.c: New test.
Darwin: Move checking of the 'shared' driver spec.
This avoids a bunch of irrelevant diagnostics if the user passes '-shared' to
gnatmake. Currently, we push '-dynamiclib' back onto the command line (since
that is the Darwin spelling of 'shared') but this is not handled by gnat1,
leading to a diagnostic for every character after the '-d'.
'-shared' has no effect on gnatmake (it needs to be passed to gnatbind).
This moves the handling of '-shared' to leaf specs so that we do not need to
push 'dynamiclib' onto the command line.
gcc/ChangeLog:
* config/darwin.h:
(SUBTARGET_DRIVER_SELF_SPECS): Move handling of 'shared' into the same
specs as 'dynamiclib'. (STARTFILE_SPEC): Handle 'shared'.
Richard Biener [Wed, 20 Sep 2023 06:40:34 +0000 (08:40 +0200)]
tree-optimization/111489 - turn uninit limits to params
The following turns MAX_NUM_CHAINS and MAX_CHAIN_LEN into params, which
allows experimenting with raising them. For the testcase in PR111489,
raising MAX_CHAIN_LEN from 5 to 8 avoids the bogus diagnostics
at -O2; at -O3 we need a MAX_CHAIN_LEN of 6.
PR tree-optimization/111489
* doc/invoke.texi (--param uninit-max-chain-len): Document.
(--param uninit-max-num-chains): Likewise.
* params.opt (-param=uninit-max-chain-len=): New.
(-param=uninit-max-num-chains=): Likewise.
* gimple-predicate-analysis.cc (MAX_NUM_CHAINS): Define to
param_uninit_max_num_chains.
(MAX_CHAIN_LEN): Define to param_uninit_max_chain_len.
(uninit_analysis::init_use_preds): Avoid VLA.
(uninit_analysis::init_from_phi_def): Likewise.
(compute_control_dep_chain): Avoid using MAX_CHAIN_LEN in
template parameter.
RISC-V: Reorganize and rename combine patterns in autovec-opt.md
This patch reorganizes and renames the combine patterns in autovec-opt.md
by category. There shouldn't be any functional changes.
The current classification includes the following categories:
- Combine op + vmerge to cond_op
- Combine binop + trunc to narrow_binop
- Combine extend + binop to widen_binop
- Combine extend + ternop to widen_ternop
- Misc combine patterns
Jakub Jelinek [Wed, 20 Sep 2023 06:43:02 +0000 (08:43 +0200)]
openmp: Add omp::decl attribute support [PR111392]
This patch adds support for the omp::decl attribute (so far C++ only). For
declare simd and declare variant directives it is essentially another
spelling of omp::directive, except that, per discussions, it is not allowed
inside of an omp::sequence attribute. For threadprivate, declare target,
allocate and (later) groupprivate directives it should appertain to variable
declarations (or, for declare target, also to function definitions and
declarations). Where the normal syntax specifies a list of variables (or
variables and functions), either as an argument of the directive or as a
clause argument, that argument is omitted and implied to be the entity the
attribute applies to.
2023-09-20 Jakub Jelinek <jakub@redhat.com>
PR c++/111392
gcc/
* attribs.cc (decl_attributes): Don't warn on omp::directive attribute
on vars or function decls if -fopenmp or -fopenmp-simd.
gcc/c-family/
* c-omp.cc (c_omp_directives): Add commented out groupprivate
directive entry.
gcc/cp/
* parser.h (struct cp_lexer): Add in_omp_decl_attribute member.
* cp-tree.h (cp_maybe_parse_omp_decl): Declare.
* parser.cc (cp_parser_handle_statement_omp_attributes): Diagnose
omp::decl attribute on statements. Adjust diagnostic wording for
omp::decl.
(cp_parser_omp_directive_args): Add DECL_P argument, set TREE_PUBLIC
to it on the DEFERRED_PARSE tree.
(cp_parser_omp_sequence_args): Adjust caller.
(cp_parser_std_attribute): Handle omp::decl attribute.
(cp_parser_omp_var_list): If parser->lexer->in_omp_decl_attribute
don't expect any arguments, instead create clause or TREE_LIST for
that decl.
(cp_parser_late_parsing_omp_declare_simd): Adjust diagnostic wording
for omp::decl.
(cp_maybe_parse_omp_decl): New function.
(cp_parser_omp_declare_target): If
parser->lexer->in_omp_decl_attribute and first token isn't name or
comma invoke cp_parser_omp_var_list.
* decl2.cc (cplus_decl_attributes): Adjust diagnostic wording for
omp::decl. Handle omp::decl on declarations.
* name-lookup.cc (finish_using_directive): Adjust diagnostic wording
for omp::decl.
gcc/testsuite/
* g++.dg/gomp/attrs-19.C: New test.
* g++.dg/gomp/attrs-20.C: New test.
* g++.dg/gomp/attrs-21.C: New test.
libgomp/
* libgomp.texi: Mark decl attribute was added to the C++ attribute
syntax as implemented.
debug/111409 - don't generate COMDAT macro sections for split DWARF
Split DWARF files aren't processed by the linker, so DW_MACRO_import
offsets aren't relocated and the .debug_macro.dwo sections aren't
deduplicated and merged. There's no clear way for this to work for
split DWARF, so disable it.
gcc/ChangeLog:
PR debug/111409
* dwarf2out.cc (output_macinfo): Don't call optimize_macinfo_range if
dwarf_split_debug_info.
The pr92301.c failure is a latent bug in the middle-end GIMPLE fold.
We are just lucky that the test passes with this patch, which no longer
triggers the GIMPLE fold bug.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (can_find_related_mode_p): New function.
(vectorize_related_mode): Add VLS related modes.
* config/riscv/vector-iterators.md: Extend VLS modes.
ira: Consider save/restore costs of callee-save registers [PR110071]
In improve_allocation() routine, IRA checks for each allocno if spilling
any conflicting allocnos can improve the allocation of this allocno.
This routine computes the cost improvement for usage of each profitable
hard register for a given allocno. The existing code in
improve_allocation() does not consider the save/restore costs of
callee-save registers while computing the cost improvement.
This can result in a callee-save register being assigned to a pseudo
that is live in the entire function and across a call, overriding a
non-callee-save register assigned to the pseudo by graph coloring. So
the entry basic block requires a prolog, thereby causing shrink wrap to
fail.
Some assemblers (GNU as for LoongArch) generate relocations for leb128
symbol arithmetic to support relaxation, so we need to disable relaxation
when probing for leb128 support.
gcc/ChangeLog:
* configure: Regenerate.
* configure.ac: Check assembler for -mno-relax support.
Disable relaxation when probing leb128 support.
LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.
gcc/ChangeLog:
* config.in: Regenerate.
* config/loongarch/genopts/loongarch.opt.in: Add compilation option
mrelax, and set the initial value of explicit-relocs according to the
detection status.
* config/loongarch/gnu-user.h: When compiling with -mno-relax, pass the
--no-relax option to the linker.
* config/loongarch/loongarch-driver.h (ASM_SPEC): When compiling with
-mno-relax, pass the -mno-relax option to the assembler.
* config/loongarch/loongarch-opts.h (HAVE_AS_MRELAX_OPTION): Define macro.
* config/loongarch/loongarch.opt: Regenerate.
* configure: Regenerate.
* configure.ac: Add detection of support for binutils relax function.
Ben Boeckel [Fri, 1 Sep 2023 13:04:04 +0000 (09:04 -0400)]
c++modules: report module mapper files as a dependency
It affects the build, and if used as a static file, can reliably be
tracked using the `-MF` mechanism.
gcc/cp/:
* mapper-client.cc, mapper-client.h (open_module_client): Accept
dependency tracking and track module mapper files as
dependencies.
* module.cc (make_mapper, get_mapper): Pass the dependency
tracking class down.
gcc/testsuite/:
* g++.dg/modules/depreport-2.modmap: New test.
* g++.dg/modules/depreport-2_a.C: New test.
* g++.dg/modules/depreport-2_b.C: New test.
* g++.dg/modules/test-depfile.py: Support `:|` syntax output
when generating modules.
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Ben Boeckel [Fri, 1 Sep 2023 13:04:03 +0000 (09:04 -0400)]
c++modules: report imported CMI files as dependencies
They affect the build, so report them via `-MF` mechanisms.
gcc/cp/
* module.cc (do_import): Report imported CMI files as
dependencies.
gcc/testsuite/
* g++.dg/modules/depreport-1_a.C: New test.
* g++.dg/modules/depreport-1_b.C: New test.
* g++.dg/modules/test-depfile.py: New tool for validating depfile
information.
* lib/modules.exp: Support for validating depfile contents.
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Ben Boeckel [Fri, 1 Sep 2023 13:04:02 +0000 (09:04 -0400)]
p1689r5: initial support
This patch implements support for P1689R5 to communicate C++20 module
dependencies to build systems so that they may build `.gcm` files in the
proper order.
Support is communicated through the following three new flags:
- `-fdeps-format=` specifies the format for the output. Currently named
`p1689r5`.
- `-fdeps-file=` specifies the path to the file to write the format to.
- `-fdeps-target=` specifies the `.o` that will be written for the TU
that is scanned. This is required so that the build system can
correlate the dependency output with the actual compilation that will
occur.
CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.
Future work may include using this format for Fortran module
dependencies as well; however, this is still pending work.
Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.
- non-utf8 paths
The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously representable in UTF-8" case).
- figure out why junk gets placed at the end of the file
Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.
libcpp/
* include/cpplib.h: Add cpp_fdeps_format enum.
(cpp_options): Add fdeps_format field.
(cpp_finish): Add structured dependency fdeps_stream parameter.
* include/mkdeps.h (deps_add_module_target): Add flag for
whether a module is exported or not.
(fdeps_add_target): Add function.
(deps_write_p1689r5): Add function.
* init.cc (cpp_finish): Add new preprocessor parameter used for C++
module tracking.
* mkdeps.cc (mkdeps): Implement P1689R5 output.
gcc/
* doc/invoke.texi: Document -fdeps-format=, -fdeps-file=, and
-fdeps-target= flags.
* gcc.cc: Add defaults for -fdeps-target= and -fdeps-file= when
only -fdeps-format= is specified.
* json.h: Add a TODO item to refactor out to share with
`libcpp/mkdeps.cc`.
gcc/c-family/
* c-opts.cc (c_common_handle_option): Add fdeps_file variable and
-fdeps-format=, -fdeps-file=, and -fdeps-target= parsing.
* c.opt: Add -fdeps-format=, -fdeps-file=, and -fdeps-target=
flags.
gcc/cp/
* module.cc (preprocessed_module): Pass whether the module is
exported to dependency tracking.
gcc/testsuite/
* g++.dg/modules/depflags-f-MD.C: New test.
* g++.dg/modules/depflags-f.C: New test.
* g++.dg/modules/depflags-fi.C: New test.
* g++.dg/modules/depflags-fj-MD.C: New test.
* g++.dg/modules/depflags-fj.C: New test.
* g++.dg/modules/depflags-fjo-MD.C: New test.
* g++.dg/modules/depflags-fjo.C: New test.
* g++.dg/modules/depflags-fo-MD.C: New test.
* g++.dg/modules/depflags-fo.C: New test.
* g++.dg/modules/depflags-j-MD.C: New test.
* g++.dg/modules/depflags-j.C: New test.
* g++.dg/modules/depflags-jo-MD.C: New test.
* g++.dg/modules/depflags-jo.C: New test.
* g++.dg/modules/depflags-o-MD.C: New test.
* g++.dg/modules/depflags-o.C: New test.
* g++.dg/modules/p1689-1.C: New test.
* g++.dg/modules/p1689-1.exp.ddi: New test expectation.
* g++.dg/modules/p1689-2.C: New test.
* g++.dg/modules/p1689-2.exp.ddi: New test expectation.
* g++.dg/modules/p1689-3.C: New test.
* g++.dg/modules/p1689-3.exp.ddi: New test expectation.
* g++.dg/modules/p1689-4.C: New test.
* g++.dg/modules/p1689-4.exp.ddi: New test expectation.
* g++.dg/modules/p1689-5.C: New test.
* g++.dg/modules/p1689-5.exp.ddi: New test expectation.
* g++.dg/modules/modules.exp: Load new P1689 library routines.
* g++.dg/modules/test-p1689.py: New tool for validating P1689 output.
* lib/modules.exp: Support for validating P1689 outputs.
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Ben Boeckel [Fri, 1 Sep 2023 13:04:01 +0000 (09:04 -0400)]
spec: add a spec function to join arguments
When passing `-o` flags to other options, the typical `-o foo` spelling
leaves a leading whitespace when replacing elsewhere. This ends up
creating flags spelled as `-some-option-with-arg= foo.ext` which doesn't
parse properly. When attempting to make a spec function to just remove
the leading whitespace, the argument splitting ends up masking the
whitespace. However, the intended extension *also* ends up being its own
argument. To perform the desired behavior, the arguments need to be
concatenated together.
gcc/:
* gcc.cc (join_spec_func): Add a spec function to join all
arguments.
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Co-authored-by: Jason Merrill <jason@redhat.com>
Patrick O'Neill [Tue, 19 Sep 2023 17:03:35 +0000 (10:03 -0700)]
RISC-V: Fix --enable-checking=rtl ICE on rv32gc bootstrap
Resolves PR 111461.
during RTL pass: expand
offtime.c: In function '__offtime':
offtime.c:79:6: internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 'w' (rtx const_int) in riscv_legitimize_const_move, at config/riscv/riscv.cc:2176
79 | ip = __mon_yday[__isleap(y)];
Tested on rv32gc glibc with --enable-checking=rtl.
2023-09-19 Juzhe Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_legitimize_const_move): Eliminate
src_op_0 var to avoid rtl check error.
[frange] Clean up floating point relational folding.
The following patch removes all the special casing from the floating
point relational folding code. Now all the code relating to folding
of relationals is in frelop_early_resolve() and in
operator_not_equal::fold_range() which requires a small tweak.
I have written new relational tests, and moved them to
gcc.dg/tree-ssa/vrp-float-relations-* for easy reference. In the
tests it's easy to see the type of things we need to handle:
(a)
    if (x != y)
      if (x == y)
        link_error ();
(b)
    if (a != b)
      if (a != b) // Foldable as true.
(c)
    /* We can thread BB2->BB4->BB5 even though we have no knowledge
       of the NANness of either x_1 or a_5. */
    __BB(4):
      x_1 = __PHI (__BB2: a_5(D), __BB3: b_4(D));
      if (x_1 __UNEQ a_5(D))
(d)
    /* Even though x_1 and a_4 are equivalent on the BB2->BB4 path,
       we cannot fold the conditional because of possible NANs: */
    __BB(4):
      # x_1 = __PHI (__BB2: a_4(D), __BB3: 8.0e+0(3));
      if (x_1 == a_4(D))
(e)
    if (cond)
      x = a;
    else
      x = 8.0;
    /* We can fold this as false on the path coming out of cond==1,
       regardless of NANs on either "x" or "a". */
    if (x < a)
      stuff ();
[etc, etc]
We can implement everything without either special casing,
get_identity_relation(), or adding new unordered relationals.
The basic idea is that if we accurately reflect NANs in op[12]_range,
this information gets propagated to the relevant edges, and there's no
need for unordered relations (VREL_UN*), because the information is in
the range itself. This information is then used in
frelop_early_resolve() to fold certain combinations.
I don't mean this patch as a hard-no against implementing the
unordered relations Jakub preferred, but seeing that it's looking
cleaner and trivially simple without the added burden of more enums,
I'd like to flesh it out completely and then discuss if we still think
new codes are needed.
More testcases or corner cases are highly welcome.
In follow-up patches I will finish up unordered relation folding, and
come up with suitable tests.
gcc/ChangeLog:
* range-op-float.cc (frelop_early_resolve): Clean-up and remove
special casing.
(operator_not_equal::fold_range): Handle VREL_EQ.
(operator_lt::fold_range): Remove special casing for VREL_EQ.
(operator_gt::fold_range): Same.
(foperator_unordered_equal::fold_range): Same.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/vrp-float-12.c: Moved to...
* gcc.dg/tree-ssa/vrp-float-relations-1.c: ...here.
* gcc.dg/tree-ssa/vrp-float-relations-2.c: New test.
* gcc.dg/tree-ssa/vrp-float-relations-3.c: New test.
* gcc.dg/tree-ssa/vrp-float-relations-4.c: New test.
Javier Martinez [Wed, 23 Aug 2023 13:02:40 +0000 (15:02 +0200)]
c++: extend cold, hot attributes to classes
Most code is cold. This patch extends support for attribute ((cold)) to C++
classes, unions, and structs (RECORD_TYPEs and UNION_TYPEs) so that it can
benefit from encapsulation, reducing the verbosity of applying the attribute
where it is deserved. The ((hot)) attribute is extended in the same way
because of its semantic relation to ((cold)).
gcc/c-family/ChangeLog:
* c-attribs.cc (handle_hot_attribute): Remove warning on
RECORD_TYPE and UNION_TYPE when in c_dialect_xx.
(handle_cold_attribute): Likewise.
gcc/cp/ChangeLog:
* class.cc (propagate_class_warmth_attribute): New function.
(check_bases_and_members): Propagate hot and cold attributes
to all FUNCTION_DECL when the record is marked hot or cold.
* cp-tree.h (maybe_propagate_warmth_attributes): New function.
* decl2.cc (maybe_propagate_warmth_attributes): New function.
* method.cc (lazily_declare_fn): Propagate hot and cold
attributes to lazily declared functions when the record is
marked hot or cold.
gcc/ChangeLog:
* doc/extend.texi: Document attributes hot, cold on C++ types.
gcc/testsuite/ChangeLog:
* g++.dg/ext/attr-hotness.C: New test.
Signed-off-by: Javier Martinez <javier.martinez.bugzilla@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Patrick Palka [Tue, 19 Sep 2023 18:38:10 +0000 (14:38 -0400)]
c++: fix cxx_print_type's template-info dumping
Unlike DECL_TEMPLATE_INFO which is stored in DECL_LANG_SPECIFIC,
TYPE_TEMPLATE_INFO isn't stored in TYPE_LANG_SPECIFIC, so we don't
need to check for both in cxx_print_type. This fixes dumping the
template-info of ENUMERAL_TYPE and BOUND_TEMPLATE_TEMPLATE_PARM,
which seem to never have TYPE_LANG_SPECIFIC.
gcc/cp/ChangeLog:
* ptree.cc (cxx_print_type): Remove TYPE_LANG_SPECIFIC
test guarding TYPE_TEMPLATE_INFO.
Pat Haugen [Tue, 19 Sep 2023 18:19:59 +0000 (13:19 -0500)]
Disable generation of scalar modulo instructions.
It was recently discovered that the scalar modulo instructions can suffer
noticeable performance issues for certain input values. This patch disables
their generation since the equivalent div/mul/sub sequence does not suffer
the same problem.
gcc/
* config/rs6000/rs6000.cc (rs6000_rtx_costs): Check whether the
modulo instruction is disabled.
* config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New.
* config/rs6000/rs6000.md (mod<mode>3, *mod<mode>3): Check it.
(define_expand umod<mode>3): New.
(define_insn umod<mode>3): Rename to *umod<mode>3 and check if the modulo
instruction is disabled.
(umodti3, modti3): Check if the modulo instruction is disabled.
Gaius Mulley [Tue, 19 Sep 2023 18:23:03 +0000 (19:23 +0100)]
PR 108143/modula2 LONGREAL and powerpc64le-linux
This patch introduces configure support for LONGREAL as float128 when
targeting or hosting cc1gm2 on ppc64le. It fixes calls to builtins
and fixes the -fdebug-builtins option.
* Make-lang.in (host_mc_longreal): Detect hosting on powerpc64le
and if so use __float128 for longreal in mc.
(MC_ARGS): Append host_mc_longreal.
* config-make.in (TEST_TARGET_CPU_DEFAULT): New variable.
(TEST_HOST_CPU_DEFAULT): New variable.
* configure: Regenerate.
* configure.ac (M2C_LONGREAL_FLOAT128): New define set if target
is powerpc64le.
(M2C_LONGREAL_PPC64LE): New define set if target is powerpc64le.
* gm2-compiler/M2GCCDeclare.mod: Correct comment case.
* gm2-compiler/M2GenGCC.mod (MaybeDebugBuiltinAlloca): Call
SetLastFunction for the builtin function call.
(MaybeDebugBuiltinMemcpy): Call SetLastFunction for the builtin
function call.
(MaybeDebugBuiltinMemset): New procedure function.
(MakeCopyUse): Use GNU formatting.
(UseBuiltin): Rewrite to check BuiltinExists.
(CodeDirectCall): Rewrite to check BuiltinExists and call
SetLastFunction.
(CodeMakeAdr): Re-format.
* gm2-compiler/M2Options.def (SetDebugBuiltins): New procedure.
* gm2-compiler/M2Options.mod (SetUninitVariableChecking): Allow
"cond" to switch UninitVariableConditionalChecking separately.
(SetDebugBuiltins): New procedure.
* gm2-compiler/M2Quads.def (BuildFunctionCall): Add parameter
ConstExpr.
* gm2-compiler/M2Quads.mod (BuildRealProcedureCall): Add parameter
to BuildRealFuncProcCall.
(BuildRealFuncProcCall): Add ConstExpr parameter. Pass ConstExpr
to BuildFunctionCall.
(BuildFunctionCall): Add parameter ConstExpr. Pass ConstExpr to
BuildRealFunctionCall.
(BuildConstFunctionCall): Add parameter ConstExpr. Pass ConstExpr to
BuildFunctionCall.
(BuildRealFunctionCall): Add parameter ConstExpr. Pass ConstExpr to
BuildRealFuncProcCall.
* gm2-compiler/P3Build.bnf (SetOrDesignatorOrFunction): Pass FALSE
to BuildFunctionCall.
(AssignmentOrProcedureCall): Pass FALSE to BuildFunctionCall.
* gm2-compiler/SymbolTable.def (IsProcedureBuiltinAvailable): New
procedure function.
* gm2-compiler/SymbolTable.mod (CanUseBuiltin): New procedure
function.
(IsProcedureBuiltinAvailable): New procedure function.
* gm2-gcc/m2builtins.cc (DEBUGGING): Undef.
(bf_category): New enum type.
(struct builtin_function_entry): New field function_avail.
(m2builtins_BuiltInMemCopy): Rename to ...
(m2builtins_BuiltinMemCopy): ... this.
(DoBuiltinMemSet): New function.
(m2builtins_BuiltinMemSet): New function.
(do_target_support_exists): New function.
(target_support_exists): New function.
(m2builtins_BuiltinExists): Return true or false.
(m2builtins_BuildBuiltinTree): Rename local variables.
Replace long_double_type_node with GetM2LongRealType.
(m2builtins_init): Use GetM2LongRealType rather than
long_double_type_node.
* gm2-gcc/m2builtins.def (BuiltInMemCopy): Rename to ...
(BuiltinMemCopy): ... this.
(BuiltinMemSet): New procedure function.
* gm2-gcc/m2builtins.h (m2builtins_BuiltInMemCopy): Rename to ...
(m2builtins_BuiltinMemCopy): ... this.
(m2builtins_BuiltinMemSet): New procedure function.
* gm2-gcc/m2configure.cc (m2configure_M2CLongRealFloat128): New
procedure function.
(m2configure_M2CLongRealIBM128): New procedure function.
(m2configure_M2CLongRealLongDouble): New procedure function.
(m2configure_M2CLongRealLongDoublePPC64LE): New procedure function.
* gm2-gcc/m2configure.def (M2CLongRealFloat128): New procedure function.
(M2CLongRealIBM128): New procedure function.
(M2CLongRealLongDouble): New procedure function.
(M2CLongRealLongDoublePPC64LE): New procedure function.
* gm2-gcc/m2configure.h (m2configure_FullPathCPP): New procedure function.
(m2configure_M2CLongRealFloat128): New procedure function.
(m2configure_M2CLongRealIBM128): New procedure function.
(m2configure_M2CLongRealLongDouble): New procedure function.
(m2configure_M2CLongRealLongDoublePPC64LE): New procedure function.
* gm2-gcc/m2convert.cc (m2convert_BuildConvert): Use convert_loc.
* gm2-gcc/m2options.h (M2Options_SetDebugBuiltins): New function.
* gm2-gcc/m2statement.cc (m2statement_BuildAssignmentTree): Set
TREE_USED to true.
(m2statement_BuildGoto): Set TREE_USED to true.
(m2statement_BuildParam): Set TREE_USED to true.
(m2statement_BuildBuiltinCallTree): New function.
(m2statement_BuildFunctValue): Set TREE_USED to true.
* gm2-gcc/m2statement.def (BuildBuiltinCallTree): New procedure function.
* gm2-gcc/m2statement.h (m2statement_BuildBuiltinCallTree): New
procedure function.
* gm2-gcc/m2treelib.cc (m2treelib_DoCall0): Remove spacing.
(m2treelib_DoCall1): Remove spacing.
(m2treelib_DoCall2): Remove spacing.
(m2treelib_DoCall3): Remove spacing.
(add_stmt): Rename parameter.
* gm2-gcc/m2type.cc (build_set_type): Remove spacing.
(build_m2_specific_size_type): Remove spacing.
(finish_build_pointer_type): Remove spacing.
(m2type_BuildVariableArrayAndDeclare): Remove spacing.
(build_m2_short_real_node): Remove spacing.
(build_m2_real_node): Remove spacing.
(build_m2_long_real_node): Use float128_type_node if
M2CLongRealFloat128 is set.
(build_m2_ztype_node): Remove spacing.
(build_m2_long_int_node): Remove spacing.
(build_m2_long_card_node): Remove spacing.
(build_m2_short_int_node): Remove spacing.
(build_m2_short_card_node): Remove spacing.
(build_m2_iso_loc_node): Remove spacing.
(m2type_SameRealType): New function.
(m2type_InitBaseTypes): Create m2_c_type_node using
m2_long_complex_type_node.
(m2type_SetAlignment): Tidy up comment.
* gm2-gcc/m2type.def (SameRealType): New procedure function.
* gm2-gcc/m2type.h (m2type_SameRealType): New procedure function.
* gm2-lang.cc (gm2_langhook_type_for_mode): Build long complex
node from m2 language specific long double node.
* gm2-libs-log/RealConversions.mod (IsNan): New procedure
function.
(doPowerOfTen): Re-implement.
* gm2-libs/Builtins.mod: Add newline.
* gm2-libs/DynamicStrings.def (ReplaceChar): New procedure function.
* gm2-libs/DynamicStrings.mod (ReplaceChar): New procedure function.
* gm2config.aci.in (M2C_LONGREAL_FLOAT128): New config value.
(M2C_LONGREAL_PPC64LE): New config value.
* gm2spec.cc (lang_specific_driver): New local variable
need_default_mabi set to default value depending upon
M2C_LONGREAL_PPC64LE and M2C_LONGREAL_FLOAT128.
* lang.opt (Wcase-enum): Moved to correct section.
* m2pp.cc (m2pp_real_type): New function.
(m2pp_type): Call m2pp_real_type.
(m2pp_print_mode): New function.
(m2pp_simple_type): Call m2pp_simple_type.
(m2pp_float): New function.
(m2pp_expression): Call m2pp_float.
* mc-boot/GDynamicStrings.cc: Rebuild.
* mc-boot/GDynamicStrings.h: Rebuild.
* mc-boot/GFIO.cc: Rebuild.
* mc-boot/GFIO.h: Rebuild.
* mc-boot/GIO.cc: Rebuild.
* mc-boot/GRTint.cc: Rebuild.
* mc-boot/Gdecl.cc: Rebuild.
* mc-boot/GmcOptions.cc: Rebuild.
* mc-boot/GmcOptions.h: Rebuild.
* mc/decl.mod: Rebuild.
* mc/mcOptions.def (getCRealType): New procedure function.
(getCLongRealType): New procedure function.
(getCShortRealType): New procedure function.
* mc/mcOptions.mod (getCRealType): New procedure function.
(getCLongRealType): New procedure function.
(getCShortRealType): New procedure function.
libgm2/ChangeLog:
* Makefile.am (TARGET_LONGDOUBLE_ABI): New variable set to
-mabi=ieeelongdouble if the target is powerpc64le.
(AM_MAKEFLAGS): Append TARGET_LONGDOUBLE_ABI.
* Makefile.in: Rebuild.
* libm2cor/Makefile.am (AM_MAKEFLAGS): Add CFLAGS_LONGDOUBLE and
TARGET_LONGDOUBLE_ABI.
(libm2cor_la_CFLAGS): Add TARGET_LONGDOUBLE_ABI.
(libm2cor_la_M2FLAGS): Add TARGET_LONGDOUBLE_ABI.
* libm2cor/Makefile.in: Rebuild.
* libm2iso/Makefile.am (AM_MAKEFLAGS): Add CFLAGS_LONGDOUBLE and
TARGET_LONGDOUBLE_ABI.
(libm2iso_la_CFLAGS): Add TARGET_LONGDOUBLE_ABI.
(libm2iso_la_M2FLAGS): Add TARGET_LONGDOUBLE_ABI.
* libm2iso/Makefile.in: Rebuild.
* libm2log/Makefile.am (AM_MAKEFLAGS): Add CFLAGS_LONGDOUBLE and
TARGET_LONGDOUBLE_ABI.
(libm2log_la_CFLAGS): Add TARGET_LONGDOUBLE_ABI.
(libm2log_la_M2FLAGS): Add TARGET_LONGDOUBLE_ABI.
* libm2log/Makefile.in: Rebuild.
* libm2min/Makefile.am (AM_MAKEFLAGS): Add CFLAGS_LONGDOUBLE and
TARGET_LONGDOUBLE_ABI.
(libm2min_la_CFLAGS): Add TARGET_LONGDOUBLE_ABI.
(libm2min_la_M2FLAGS): Add TARGET_LONGDOUBLE_ABI.
* libm2min/Makefile.in: Rebuild.
* libm2pim/Makefile.am (AM_MAKEFLAGS): Add CFLAGS_LONGDOUBLE and
TARGET_LONGDOUBLE_ABI.
(libm2pim_la_CFLAGS): Add TARGET_LONGDOUBLE_ABI.
(libm2pim_la_M2FLAGS): Add TARGET_LONGDOUBLE_ABI.
* libm2pim/Makefile.in: Rebuild.
gcc/testsuite/ChangeLog:
* gm2/extensions/pass/libc.def: Add spacing.
* gm2/pimlib/logitech/run/pass/realconv.mod: Add debugging print.
* gm2/switches/uninit-variable-checking/cascade/fail/switches-uninit-variable-checking-cascade-fail.exp:
Add -fdebug-builtins flag.
* lib/gm2.exp (gm2_target_compile_default): Add
-mabi=ieeelongdouble if the target is powerpc.
(gm2_link_flags): Add
-mabi=ieeelongdouble if the target is powerpc.
* gm2/pim/intrinsic/run/pass/cstub.c: New test.
* gm2/pim/intrinsic/run/pass/cstub.def: New test.
* gm2/pim/intrinsic/run/pass/pim-intrinsic-run-pass.exp: New test.
* gm2/pim/intrinsic/run/pass/test.mod: New test.
* gm2/pim/run/pass/builtins.mod: New test.
* gm2/pim/run/pass/convert1.mod: New test.
* gm2/pim/run/pass/longint1.mod: New test.
* gm2/pim/run/pass/longint2.mod: New test.
* gm2/pim/run/pass/longint3.mod: New test.
* gm2/pim/run/pass/longint4.mod: New test.
* gm2/pim/run/pass/longint5.mod: New test.
* gm2/pim/run/pass/longint6.mod: New test.
* gm2/pim/run/pass/longint7.mod: New test.
* gm2/pim/run/pass/longint8.mod: New test.
Jeff Law [Tue, 19 Sep 2023 17:28:53 +0000 (11:28 -0600)]
Fix bogus operand predicate on iq2000
The iq2000-elf port regressed these tests recently:
> iq2000-sim: gcc.c-torture/execute/20040703-1.c -O2 (test for excess errors)
> iq2000-sim: gcc.c-torture/execute/20040703-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
> iq2000-sim: gcc.c-torture/execute/20040703-1.c -O3 -g (test for excess errors)
It turns out one of the patterns had an operand predicate that allowed REG,
SUBREG, CONST_INT (with a limited set of CONST_INTs). Yet the constraint only
allowed the limited set of immediates. This naturally triggered an LRA
constraint failure.
The fix is trivial: create an operand predicate that accurately reflects the
kinds of operands allowed by the instruction.
It turns out this was a long-standing bug -- fixing the pattern resolved 55
failing tests in the testsuite.
gcc/
* config/iq2000/predicates.md (uns_arith_constant): New predicate.
* config/iq2000/iq2000.md (rotrsi3): Use it.
Harald Anlauf [Mon, 18 Sep 2023 20:11:40 +0000 (22:11 +0200)]
fortran: fix checking of CHARACTER lengths in array constructors [PR70231]
gcc/fortran/ChangeLog:
PR fortran/70231
* trans-array.cc (trans_array_constructor): In absence of a typespec,
use string length determined by get_array_ctor_strlen() to reasonably
initialize auxiliary variable for bounds-checking.
gcc/testsuite/ChangeLog:
PR fortran/70231
* gfortran.dg/bounds_check_fail_7.f90: New test.
We're missing an op2_range entry for operator_not_equal so GORI can
calculate an outgoing edge. The false side of != is true and
guarantees we don't have a NAN, so it's important to get this right.
We eventually get it through an intersection of various ranges in
ranger, but it's best to get things correct as early as possible.
Jakub Jelinek [Tue, 19 Sep 2023 15:48:42 +0000 (17:48 +0200)]
testsuite work-around compound-assignment-1.c C++ failures on various targets [PR111377]
On Mon, Sep 11, 2023 at 11:11:30PM +0200, Jakub Jelinek via Gcc-patches wrote:
> I think the divergence is whether called_by_test_5b returns the struct
> in registers or in memory. If in memory (like in the x86_64 -m32 case), we have
> [compound-assignment-1.c:71:21] D.3191 = called_by_test_5b (); [return slot optimization]
> [compound-assignment-1.c:71:21 discrim 1] D.3191 ={v} {CLOBBER(eol)};
> [compound-assignment-1.c:72:1] return;
> in the IL, while if in registers (like x86_64 -m64 case), just
> [compound-assignment-1.c:71:21] D.3591 = called_by_test_5b ();
> [compound-assignment-1.c:72:1] return;
>
> If you just want to avoid the differences, putting } on the same line as the
> call might be a usable workaround for that.
Here is the workaround in patch form.
2023-09-19 Jakub Jelinek <jakub@redhat.com>
PR testsuite/111377
* c-c++-common/analyzer/compound-assignment-1.c (test_5b): Move
closing } to the same line as the call to work-around differences in
diagnostics line.
Jason Merrill [Tue, 19 Sep 2023 02:16:04 +0000 (22:16 -0400)]
c++: inherited default constructor [CWG2799]
In this testcase, it seems clear that B should be trivially
default-constructible, since the inherited default constructor is trivial
and there are no other subobjects to initialize. But we were saying no
because we don't define triviality of inherited constructors.
CWG discussion suggested that the solution is to implicitly declare a
default constructor when inheriting a default constructor; that makes sense
to me.
DR 2799
gcc/cp/ChangeLog:
* class.cc (add_implicit_default_ctor): Split out...
(add_implicitly_declared_members): ...from here.
Also call it when inheriting a default ctor.
Andrew MacLeod [Wed, 13 Sep 2023 15:52:15 +0000 (11:52 -0400)]
New early __builtin_unreachable processing.
In VRP passes before __builtin_unreachable MUST be removed, only remove it
if all exports affected by the unreachable can have global values updated, and
do not involve loads from memory.
PR tree-optimization/110080
PR tree-optimization/110249
gcc/
* tree-vrp.cc (remove_unreachable::final_p): New.
(remove_unreachable::maybe_register): Rename from
maybe_register_block and call early or final routine.
(fully_replaceable): New.
(remove_unreachable::handle_early): New.
(remove_unreachable::remove_and_update_globals): Remove
non-final processing.
(rvrp_folder::rvrp_folder): Add final flag to constructor.
(rvrp_folder::post_fold_bb): Remove unreachable registration.
(rvrp_folder::pre_fold_stmt): Move unreachable processing to here.
(execute_ranger_vrp): Adjust some call parameters.
Marek Polacek [Fri, 1 Sep 2023 00:11:50 +0000 (20:11 -0400)]
c++: Move consteval folding to cp_fold_r
In the review of P2564:
<https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628747.html>
it turned out that in order to correctly handle an example in the paper,
we should stop doing immediate evaluation in build_over_call and
bot_replace, and instead do it in cp_fold_r. This patch does that.
Another benefit is that this is a pretty significant simplification, at
least in my opinion. Also, this fixes the c++/110997 ICE (but the test
doesn't compile yet).
The main drawback seems to be that cp_fold_r doesn't process
uninstantiated templates. We still have to handle things like
"false ? foo () : 1". To that end, I've added cp_fold_immediate, called
on dead branches in cxx_eval_conditional_expression.
You'll see that I've reintroduced ADDR_EXPR_DENOTES_CALL_P here. This
is to detect
Richard Biener [Tue, 19 Sep 2023 11:18:51 +0000 (13:18 +0200)]
c/111468 - dump unordered compare operators in their GIMPLE form with -gimple
The following adjusts -gimple dumping to dump the unordered compare ops
and *h in their GIMPLE form. It also adds parsing for __LTGT which I
missed before.
Patrick Palka [Tue, 19 Sep 2023 12:29:39 +0000 (08:29 -0400)]
c++: overeager type completion in convert_to_void [PR111419]
Here convert_to_void always completes the type of an indirection or
id-expression, but according to [expr.context] an lvalue-to-rvalue
conversion is applied to a discarded-value expression only if "the
expression is a glvalue of volatile-qualified type". This patch
restricts convert_to_void's type completion to match.
PR c++/111419
gcc/cp/ChangeLog:
* cvt.cc (convert_to_void) <case INDIRECT_REF>: Only call
complete_type if the type is volatile.
<case VAR_DECL>: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-requires36.C: New test.
* g++.dg/expr/discarded1.C: New test.
* g++.dg/expr/discarded1a.C: New test.
Patrick Palka [Tue, 19 Sep 2023 12:21:05 +0000 (08:21 -0400)]
c++: constness of decltype of NTTP object [PR99631]
This corrects resolving decltype of a (class) NTTP object as per
[dcl.type.decltype]/1.2 and [temp.param]/6 in the type-dependent case.
Note that in the non-dependent case we resolve the decltype ahead of
time, in which case finish_decltype_type drops the const VIEW_CONVERT_EXPR
wrapper around the TEMPLATE_PARM_INDEX, and the latter has the desired
non-const type.
In the type-dependent case, at instantiation time tsubst drops the
VIEW_CONVERT_EXPR since the substituted NTTP is the already-const object
created by get_template_parm_object. So in this case finish_decltype_type
sees the const object, which this patch now adds special handling for.
PR c++/99631
gcc/cp/ChangeLog:
* semantics.cc (finish_decltype_type): For an NTTP object,
return its type modulo cv-quals.
Thomas Schwinge [Thu, 10 Aug 2023 13:23:37 +0000 (15:23 +0200)]
LTO: Get rid of 'lto_mode_identity_table'
This, in particular, resolves LTO ICEs with big 'machine_mode's, as for RISC-V.
('mode_table' in 'lto_file_decl_data' still is 'unsigned char'; changing that
is still to be done (for use in offloading compilation), but is not trivial.)
For now, get rid of 'lto_mode_identity_table' to resolve the RISC-V LTO ICEs;
we don't need an actual table for a 1-to-1 mapping.
After supporting the VLS mode conversion, the current case triggers a latent bug that we were
lucky not to have encountered.
This is a real bug in 'cprop_hardreg':
orig:RVVMF8BI,16,16
new:V32BI,32,0
during RTL pass: cprop_hardreg
auto.c: In function 'main':
auto.c:79:1: internal compiler error: in partial_subreg_p, at rtl.h:3186
79 | }
| ^
0x10979a7 partial_subreg_p(machine_mode, machine_mode)
../../../../gcc/gcc/rtl.h:3186
0x1723eda mode_change_ok
../../../../gcc/gcc/regcprop.cc:402
0x1724007 maybe_mode_change
../../../../gcc/gcc/regcprop.cc:436
0x172445d find_oldest_value_reg
../../../../gcc/gcc/regcprop.cc:489
0x172534d copyprop_hardreg_forward_1
../../../../gcc/gcc/regcprop.cc:808
0x1727017 cprop_hardreg_bb
../../../../gcc/gcc/regcprop.cc:1358
0x17272f7 execute
../../../../gcc/gcc/regcprop.cc:1425
The ICE occurs when trying to do reg copy propagation between RVVMF8BI (precision = 16,16)
and V32BI (precision = 32,0).
The assertion that fails is in partial_subreg_p:
gcc_checking_assert (ordered_p (outer_prec, inner_prec));
In regcprop.cc:
if (partial_subreg_p (orig_mode, new_mode))
return false;
If orig_mode (RVVMF8BI) is smaller than new_mode (V32BI), we don't do the hard reg propagation.
However, 'partial_subreg_p' itself causes the ICE because of gcc_checking_assert (ordered_p (outer_prec, inner_prec)).
Looking at aarch64.cc, it carefully blocks such mode changes in 'TARGET_CAN_CHANGE_MODE_CLASS'.
So it is reasonable to block regcprop when the old mode size is maybe_lt the new mode size, since we won't do the
copy propagation anyway.
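To see why the assertion fires, here is a minimal sketch of the ordering test. It assumes a simplified two-coefficient model of a poly_int, in which a precision (c0, c1) stands for c0 + c1 * x with a runtime value x >= 0. The names Poly, known_le and ordered_p are illustrative stand-ins written for this note, not the declarations from poly-int.h.

```cpp
#include <cstdint>

// Simplified stand-in for a two-coefficient poly_int: value = c0 + c1 * x,
// where x >= 0 is only known at run time.
struct Poly {
  std::int64_t c0;
  std::int64_t c1;
};

// a <= b holds for every x >= 0 exactly when both coefficients are <=.
constexpr bool known_le(Poly a, Poly b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Two values are ordered if one is known to be <= the other for all x.
constexpr bool ordered_p(Poly a, Poly b) {
  return known_le(a, b) || known_le(b, a);
}

// Precisions taken from the dump above.
constexpr Poly rvvmf8bi_prec{16, 16};
constexpr Poly v32bi_prec{32, 0};
```

Under this model ordered_p (rvvmf8bi_prec, v32bi_prec) is false: 16 + 16x is below 32 at x = 0 but above it once x exceeds 1, so neither precision is known to be the smaller one.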
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_can_change_mode_class): Block unordered VLA and VLS modes.
Richard Wai [Sun, 17 Sep 2023 15:00:00 +0000 (11:00 -0400)]
ada: TSS finalize address subprogram generation for constrained...
...subtypes of unconstrained synchronized private extensions should take
care to designate the corresponding record of the underlying concurrent
type.
When generating TSS finalize address subprograms for class-wide types of
constrained root types, it follows the parent chain looking for the
first "non-constrained" type. It is possible that such a type is a
private extension with the “synchronized” keyword, in which case the
underlying type is a concurrent type. When that happens, the designated
type of the finalize address subprogram should be the corresponding
record’s class-wide-type.
gcc/ada/ChangeLog:
* exp_ch3.adb (Expand_Freeze_Class_Wide_Type): Expanded comments
explaining why TSS Finalize_Address is not generated for
concurrent class-wide types.
* exp_ch7.adb (Make_Finalize_Address_Stmts): Handle cases where the
underlying non-constrained parent type is a concurrent type, and
adjust the designated type to be the corresponding record’s
class-wide type.
gcc/testsuite/ChangeLog:
* gnat.dg/sync_tag_finalize.adb: New test.
Signed-off-by: Richard Wai <richard@annexi-strayline.com>
Richard Wai [Wed, 9 Aug 2023 05:54:48 +0000 (01:54 -0400)]
ada: Private extensions with the keyword "synchronized" are always limited.
GNAT was relying on synchronized private type extensions deriving from a
concurrent interface to determine its limitedness. This does not cover the case
where such an extension derives from a limited interface. RM-7.6(6/2) makes it clear
that "synchronized" in a private extension implies the derived type is limited.
GNAT should explicitly check for the presence of "synchronized" in a private
extension declaration, and it should have the same effect as the presence of
“limited”.
gcc/ada/ChangeLog:
* sem_ch3.adb (Build_Derived_Record_Type): Treat presence of
keyword "synchronized" the same as "limited" when determining if a
private extension is limited.
gcc/testsuite/ChangeLog:
* gnat.dg/sync_tag_discriminals.adb: New test.
* gnat.dg/sync_tag_limited.adb: New test.
Signed-off-by: Richard Wai <richard@annexi-strayline.com>
Marc Poulhiès [Fri, 8 Sep 2023 15:15:48 +0000 (15:15 +0000)]
ada: Refine upper array bound for bit packed array
When using bit-packed arrays, the compiler creates new array subtypes of
1-bit component indexed by integers. The existing routine checks the
index subtype to find the min/max values. Bit-packed arrays being
indexed by integers, the routine gives up as returning the maximum
possible integer carries no useful information.
This change adds a simple max_value routine that can evaluate very
simple expressions by substituting variables by their min/max value.
Bit-packed array subtypes are currently declared as:
subtype bp_array is packed_bytes1 (0 .. integer((1 * Var + 7) / 8 - 1));
The simple max_value evaluator handles the bare minimum for this
expression pattern.
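As a rough illustration of the substitution, the sketch below evaluates the bound expression with Var replaced by its maximum. The helper name packed_upper_bound and its parameters are hypothetical; the real max_value routine works on GCC trees, not on plain integers.

```cpp
#include <cstdint>

// Hypothetical helper: upper bound of the packed-bytes index for a
// bit-packed array whose length Var is at most var_max, with
// component_bits bits per component (1 in the subtype shown above).
constexpr std::int64_t packed_upper_bound(std::int64_t var_max,
                                          std::int64_t component_bits = 1) {
  // Mirrors (component_bits * Var + 7) / 8 - 1 with Var at its maximum.
  return (component_bits * var_max + 7) / 8 - 1;
}
```

Substituting the maximum is sound here because the expression never decreases as Var grows. For example, a Var bounded by 64 gives an upper index of 7, i.e. eight bytes.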
gcc/ada/ChangeLog:
* gcc-interface/utils.cc (max_value): New.
* gcc-interface/gigi.h (max_value): New.
* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Array_Subtype>:
When computing gnu_min/gnu_max, try to use max_value if there is
an initial expression.
Richard Biener [Tue, 19 Sep 2023 10:36:04 +0000 (12:36 +0200)]
tree-optimization/111465 - bogus jump threading with no-copy src block
The following avoids forward threading a path with an EDGE_NO_COPY_SRC_BLOCK
block that became non-empty due to folding.
PR tree-optimization/111465
* tree-ssa-threadupdate.cc (fwd_jt_path_registry::thread_block_1):
Cancel the path when an EDGE_NO_COPY_SRC_BLOCK became non-empty.
Richard Biener [Tue, 19 Sep 2023 09:49:54 +0000 (11:49 +0200)]
c/111468 - add unordered compare and pointer diff to GIMPLE FE parsing
The following adds __UN{LT,LE,GT,GE,EQ}, __UNORDERED and __ORDERED
operator parsing support and support for parsing - as POINTER_DIFF_EXPR.
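For readers unfamiliar with these operators, here is a small sketch of what each one computes, written as ordinary C++ functions. The function names simply echo the GIMPLE spellings and are not part of any GCC interface.

```cpp
#include <cmath>

// Each unordered comparison is true when the ordered comparison is true
// or when at least one operand is a NaN.
inline bool unlt(double a, double b) { return std::isunordered(a, b) || a < b; }
inline bool unle(double a, double b) { return std::isunordered(a, b) || a <= b; }
inline bool ungt(double a, double b) { return std::isunordered(a, b) || a > b; }
inline bool unge(double a, double b) { return std::isunordered(a, b) || a >= b; }
inline bool uneq(double a, double b) { return std::isunordered(a, b) || a == b; }

// __UNORDERED and __ORDERED test only for the presence of a NaN.
inline bool unordered(double a, double b) { return std::isunordered(a, b); }
inline bool ordered(double a, double b) { return !std::isunordered(a, b); }
```

With a NaN operand every un* function above returns true, while the plain comparisons <, <=, >, >= and == all return false.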
PR c/111468
gcc/c/
* gimple-parser.cc (c_parser_gimple_binary_expression): Add
return type argument.
(c_parser_gimple_statement): Adjust.
(c_parser_gimple_paren_condition): Likewise.
(c_parser_gimple_binary_expression): Use passed in return type,
add support for - as POINTER_DIFF_EXPR, __UN{LT,LE,GT,GE,EQ},
__UNORDERED and __ORDERED.
gcc/testsuite/
* gcc.dg/gimplefe-50.c: New testcase.
* gcc.dg/gimplefe-51.c: Likewise.
* gcc.target/riscv/rvv/autovec/vls/def.h: Add FMS tests.
* gcc.target/riscv/rvv/autovec/vls/fma-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fms-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fms-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-7.c: New test.
Jakub Jelinek [Tue, 19 Sep 2023 08:44:54 +0000 (10:44 +0200)]
match.pd: Some build_nonstandard_integer_type tweaks
As discussed earlier, using build_nonstandard_integer_type blindly for all
INTEGRAL_TYPE_Ps is problematic now that we have BITINT_TYPE, because it
always creates an INTEGRAL_TYPE with some possibly very large precision.
The following patch attempts to deal with 3 such spots in match.pd, others
still need looking at.
In the first case, I think it is quite expensive/undesirable to create
a non-standard INTEGER_TYPE with possibly huge precision and then
immediately just see type_has_mode_precision_p being false for it, or even
worse introducing a cast to TImode or OImode or XImode INTEGER_TYPE which
nothing will be able to actually handle. 128-bit or 64-bit (on 32-bit
targets) types are the largest supported by the backend, so the following
patch avoids creating and matching conversions to larger types, it is
an optimization anyway and so should be used when it is cheap that way.
In the second hunk, I believe the uses of build_nonstandard_integer_type
aren't useful at all. It is when matching a ? -1 : 0 and trying to express
it as say -(type) (bool) a etc., but this is all GIMPLE only, where most of
integral types with same precision/signedness are compatible and we know
-1 is representable in that type, so I really don't see any reason not to
perform the negation of a [0, 1] valued expression in type, rather
than doing it in
build_nonstandard_integer_type (TYPE_PRECISION (type), TYPE_UNSIGNED (type))
(except that it breaks the BITINT_TYPEs). I don't think we need to do
something like range_check_type.
While in there, I've also noticed it was using a (with {
tree booltrue = constant_boolean_node (true, boolean_type_node);
} and removed that + replaced uses of booltrue with boolean_true_node
which the above function always returns.
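The equivalence behind the second hunk can be checked with a short sketch. It uses a 32-bit integer as a stand-in for the result type; select_form and negate_form are illustrative names, not match.pd patterns.

```cpp
#include <cstdint>

// Reference form: select between -1 and 0.
constexpr std::int32_t select_form(bool a) { return a ? -1 : 0; }

// Rewritten form: negate the [0, 1] valued conversion directly in the
// result type, with no intermediate non-standard integer type.
constexpr std::int32_t negate_form(bool a) {
  return -static_cast<std::int32_t>(a);
}

static_assert(select_form(true) == negate_form(true), "true case agrees");
static_assert(select_form(false) == negate_form(false), "false case agrees");
```

Because -1 is representable in the result type, the negation can be done there directly, which is the point made above about not needing a freshly built type.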
2023-09-19 Jakub Jelinek <jakub@redhat.com>
* match.pd ((x << c) >> c): Don't call build_nonstandard_integer_type
nor check type_has_mode_precision_p for width larger than [TD]Imode
precision.
(a ? CST1 : CST2): Don't use build_nonstandard_type, just convert
to type. Use boolean_true_node instead of
constant_boolean_node (true, boolean_type_node). Formatting fixes.
* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS FMA/FNMA test.
* gcc.target/riscv/rvv/autovec/vls/fma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-4.c: New test.
Jakub Jelinek [Tue, 19 Sep 2023 07:50:35 +0000 (09:50 +0200)]
small _BitInt tweaks
I think it is undesirable when being asked for signed_type_for
of unsigned _BitInt(1) (which is valid) to get signed _BitInt(1) (which is
invalid, the standard only allows signed _BitInt(2) and larger), so the
patch returns 1-bit signed INTEGER_TYPE for those cases.
Furthermore it asserts in build_bitint_type that nothing attempts to create
signed _BitInt(0), unsigned _BitInt(0) or signed _BitInt(1) types.
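The rule being asserted can be summarised by a small predicate. This is a sketch only: valid_bitint_precision is a hypothetical name, and the real check lives inside build_bitint_type.

```cpp
// Hypothetical predicate mirroring the assertion described above:
// precision 0 is never valid, and a signed _BitInt needs at least 2 bits.
constexpr bool valid_bitint_precision(unsigned precision, bool is_unsigned) {
  return precision != 0 && (is_unsigned || precision != 1);
}
```

So unsigned _BitInt(1) passes, while signed _BitInt(1) and either _BitInt(0) variant are rejected.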
2023-09-18 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.cc (build_bitint_type): Assert precision is not 0, or
for signed types 1.
(signed_or_unsigned_type_for): Return INTEGER_TYPE for signed variant
of unsigned _BitInt(1).
gcc/c-family/
* c-common.cc (c_common_signed_or_unsigned_type): Return INTEGER_TYPE
for signed variant of unsigned _BitInt(1).
Jakub Jelinek [Tue, 19 Sep 2023 07:26:35 +0000 (09:26 +0200)]
libgomp: Handle NULL environ like pointer to NULL pointer [PR111413]
clearenv function just sets environ to NULL (after sometimes freeing it),
rather than setting it to a pointer to NULL, and our code was assuming
it is always non-NULL.
Fixed thusly, the change seems to be large but actually is just
+ if (environ)
for (env = environ; *env != 0; env++)
plus reindentation. I've also noticed the block after this for loop
was badly indented (too much) and fixed that too.
No testcase added, as it needs clearenv + dlopen.
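The guarded loop can be sketched in isolation as follows. The function count_env_entries is a hypothetical helper that takes the environment block as a parameter so it can be exercised without touching the real environ.

```cpp
#include <cstddef>

// Counts the entries of an environment block. A null block (what
// clearenv leaves behind) is treated like a block holding only the
// terminating null pointer.
inline std::size_t count_env_entries(char **envp) {
  std::size_t n = 0;
  if (envp)
    for (char **env = envp; *env != nullptr; env++)
      ++n;
  return n;
}
```

Without the if (envp) guard, the loop condition would dereference a null pointer on the first iteration.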
2023-09-19 Jakub Jelinek <jakub@redhat.com>
PR libgomp/111413
* env.c (initialize_env): Don't dereference environ if it is NULL.
Reindent.
At present, FMA autovec's patterns do not fully use the corresponding pattern
in vector.md. The reason was that the merge operand of the pattern in
vector.md could not be VUNDEF. Now that it is allowed to be VUNDEF, the insn used for
the reload pass is reunified into vector.md, and the corresponding vlmax pattern in autovec.md
is used for combine. This patch also refactors the corresponding combine
pattern inside autovec-opt.md and removes the unused ones.
Tsukasa OI [Mon, 18 Sep 2023 09:23:41 +0000 (09:23 +0000)]
RISC-V: Add builtin .def file dependencies
riscv-builtins.cc includes riscv-cmo.def and riscv-scalar-crypto.def
(making dependencies) but their dependencies must be explicitly defined in
the configuration file, t-riscv.
They were the last two .def files without correct dependency information.
gcc/ChangeLog:
* config/riscv/t-riscv: Add dependencies for riscv-builtins.cc,
riscv-cmo.def and riscv-scalar-crypto.def.
Before this patch:
init_vl:
addi sp,sp,-16
vsetivli zero,2,e64,m1,ta,ma
vle64.v v1,0(a1)
vse64.v v1,0(sp)
slli a4,a2,32
srli a2,a4,29
add a2,sp,a2
slli a3,a3,32
srli a3,a3,32
sd a3,0(a2)
vle64.v v1,0(sp)
vse64.v v1,0(a0)
addi sp,sp,16
jr ra
After this patch:
init_vl:
vsetivli zero,2,e64,m1,ta,ma
vle64.v v1,0(a1)
slli a3,a3,32
srli a3,a3,32
addi a5,a2,1
vsetvli zero,a5,e64,m1,tu,ma
vmv.v.x v2,a3
vslideup.vx v1,v2,a2
vsetivli zero,2,e64,m1,ta,ma
vse64.v v1,0(a0)
ret
Please note this patch depends on the RVV SCALAR_MOVE_MERGED_OP bugfix.
gcc/ChangeLog:
* config/riscv/autovec.md: Extend to vls mode.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/def.h: New macros.
* gcc.target/riscv/rvv/autovec/vls/vec-set-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-13.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-14.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-15.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-16.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-17.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-18.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-19.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-20.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-21.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-22.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/vec-set-9.c: New test.
The higher 64 bits of 142[V2DI] are unknown here, and incorrect code was
generated when storing back to memory. This patch fixes this issue
by adding a new SCALAR_MOVE_MERGED_OP for vec_set.
Please note this patch doesn't enable VLS for vec_set, the underlying
patches will support this soon.
gcc/ChangeLog:
* config/riscv/autovec.md: Bugfix.
* config/riscv/riscv-protos.h (SCALAR_MOVE_MERGED_OP): New enum.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c: New test.
Andrew Pinski [Sun, 17 Sep 2023 18:20:36 +0000 (11:20 -0700)]
MATCH: Make zero_one_valued_p non-recursive fully
So it turns out VN can't handle any kind of recursion for match. In this
case we have `b = a & -1` and we try to match a as being zero_one_valued_p
and VN returns b as being the value and we just go into an infinite loop at
this point.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Note genmatch should warn (or error out) if this gets detected so I filed PR 111446
which I will be looking into next week or the week after so we don't run into
this issue again.
PR tree-optimization/111442
gcc/ChangeLog:
* match.pd (zero_one_valued_p): Have the bit_and match not be
recursive.
Andrew Pinski [Sat, 16 Sep 2023 22:19:58 +0000 (15:19 -0700)]
MATCH: Avoid recursive zero_one_valued_p for conversions
So when VN finds a name which has a nop conversion, it says
both names are equivalent to each other and the valueization
function for one will return the other. This normally does not
cause any issues as there are no recursive matches. But after r14-4038-gb975c0dc3be285, there was one added. So we would
do an infinite recursion on the match and never finish.
This fixes the issue (and adds a comment in match.pd) by
handling just one level for converts instead of always
being recursive.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Note the testcase was reduced from tree-ssa-loop-niter.cc and then
changed slightly into C rather than C++ but it still needs exceptions
turned on to get the IR in which VN would produce this equivalence
relationship. Also had to turn off early inlining to force put to be inlined later.
PR tree-optimization/111435
gcc/ChangeLog:
* match.pd (zero_one_valued_p): Don't do recursion
on converts.
Since the LHS of a qualified-id is a non-deduced context, it effectively
means we can't deduce from outer template arguments of a class template
specialization. And checking for equality between the TI_TEMPLATE of a
class specialization parm/arg already implies that the outer template
arguments are the same. Hence recursing into outer template arguments
during unification of class specializations is redundant, so this patch
makes unify recurse only into innermost arguments.
This incidentally fixes the testcase from PR89231 because there
more_specialized_partial_inst wrongly considers the two partial
specializations to be unordered ultimately because unify for identical
parm=arg=A<Ps...>::Collect<N...> gets confused when it recurses into
parm=arg={Ps...} since Ps is outside the (innermost) level of tparms
that we're actually deducing.
PR c++/89231
gcc/cp/ChangeLog:
* pt.cc (try_class_unification): Strengthen TI_TEMPLATE equality
test by not calling most_general_template. Only unify the
innermost levels of template arguments.
(unify) <case CLASS_TYPE>: Only unify the innermost levels of
template arguments, and only if the template is primary.
This patch makes us recognize and check non-dependent simple assignments
ahead of time, like we already do for compound assignments. This means
the templated representation of such assignments will now usually have
an implicit INDIRECT_REF (due to the reference return type), which the
-Wparentheses code needs to handle. As a drive-by improvement, this
patch also makes maybe_convert_cond issue -Wparentheses warnings ahead
of time, and removes a seemingly unnecessary suppress_warning call in
build_x_modify_expr.
On the libstdc++ side, some tests were attempting to modify a data
member from an uninstantiated const member function, which this patch
minimally fixes by making the data member mutable.
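A minimal sketch of the testsuite fix, assuming a seed sequence shaped like the ones in the affected tests (the member name called comes from the ChangeLog below; the rest is illustrative):

```cpp
// Illustrative seed sequence in the style of the affected tests.
struct seed_seq {
  // mutable lets the const member function below record that it ran.
  mutable bool called = false;

  template <typename It>
  void generate(It, It) const {
    called = true;  // would be ill-formed without mutable
  }
};
```

Since the assignment is non-dependent, it is now checked when the template is defined, so the missing mutable is diagnosed even if generate is never instantiated.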
PR c++/63198
PR c++/18474
gcc/cp/ChangeLog:
* semantics.cc (maybe_convert_cond): Look through implicit
INDIRECT_REF when deciding whether to issue a -Wparentheses
warning, and consider templated assignment expressions as well.
(finish_parenthesized_expr): Look through implicit INDIRECT_REF
when suppressing -Wparentheses warning.
* typeck.cc (build_x_modify_expr): Check simple assignments
ahead of time too, not just compound assignments. Give the second
operand of MODOP_EXPR a non-null type so that it's not considered
always instantiation-dependent. Don't call suppress_warning.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/static_assert15.C: Expect diagnostic for
non-constant static_assert condition.
* g++.dg/expr/unary2.C: Remove xfails.
* g++.dg/template/init7.C: Make initializer type-dependent to
preserve intent of test.
* g++.dg/template/recurse3.C: Likewise for the erroneous
statement.
* g++.dg/template/non-dependent26.C: New test.
* g++.dg/warn/Wparentheses-32.C: New test.
libstdc++-v3/ChangeLog:
* testsuite/26_numerics/random/discard_block_engine/cons/seed_seq2.cc:
Make data member seed_seq::called mutable.
* testsuite/26_numerics/random/independent_bits_engine/cons/seed_seq2.cc:
Likewise.
* testsuite/26_numerics/random/linear_congruential_engine/cons/seed_seq2.cc:
Likewise.
* testsuite/26_numerics/random/mersenne_twister_engine/cons/seed_seq2.cc:
Likewise.
* testsuite/26_numerics/random/shuffle_order_engine/cons/seed_seq2.cc:
Likewise.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/seed_seq2.cc:
Likewise.
* testsuite/ext/random/simd_fast_mersenne_twister_engine/cons/seed_seq2.cc:
Likewise.
Patrick Palka [Mon, 18 Sep 2023 18:41:07 +0000 (14:41 -0400)]
c++: unifying identical tmpls from current inst [PR108347]
Here more_specialized_partial_spec wrongly considers the two partial
specializations to be unordered ultimately because unify for identical
parm=arg=A<T>::C returns failure due to C being dependent.
This patch fixes this by relaxing unify's early-exit identity test to
also accept dependent decls; we can't deduce anything further from them
anyway. In passing this patch removes the CONST_DECL case of unify:
we should never see the CONST_DECL version of a template parameter here,
and for other CONST_DECLs (such as enumerators) it seems we can rely on
them to already have been folded to their DECL_INITIAL.
PR c++/108347
gcc/cp/ChangeLog:
* pt.cc (unify): Return unify_success for identical dependent
DECL_P 'arg' and 'parm'.
<case CONST_DECL>: Remove handling.
Patrick Palka [Mon, 18 Sep 2023 18:41:05 +0000 (14:41 -0400)]
c++: always check arity before deduction
This simple patch extends the r12-3271-gf1e73199569287 optimization
to happen for deduction without explicit template arguments as well.
The motivation for this is to accept testcases such as conv20.C and
ttp40.C below, which don't use explicit template arguments but for
which unnecessary template instantiation during deduction could be
avoided if we uniformly pruned overloads according to arity early.
This incidentally causes us to accept one reduced testcase from
PR c++/84075, but the underlying issue there remains at large.
As a nice side effect, this change causes the "candidate expects N
argument(s)" note during overload resolution failure to point to the
template candidate instead of the call site, which seems like an
improvement along the lines of r14-309-g14e881eb030509.
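As a rough illustration of pruning by arity (a minimal sketch, not one of
the testcases named above; pick and check_pick are hypothetical names): a
function template whose parameter count cannot match the call is never
viable, so it can be discarded before any deduction is attempted.

```cpp
#include <cassert>

// Hypothetical overload set; the names are illustrative only.
// Two-parameter template: it can never match a one-argument call.
template<class T>
int pick(T, T) { return 2; }

// One-parameter non-template overload.
int pick(long) { return 1; }

// Usage: the one-argument call can only select pick(long), whether the
// template is rejected before or after deduction is attempted.
void check_pick() {
  assert(pick(0) == 1);
  assert(pick(1, 2) == 2);
}
```

The observable result is the same either way; what the patch changes is
how early the mismatching template is dropped.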
gcc/cp/ChangeLog:
* call.cc (add_template_candidate_real): Check arity even
when there are no explicit template arguments. Combine the
two adjacent '!obj' tests into one.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/vt-57397-1.C: Expect "candidate expects ... N
argument(s)" at the declaration site instead of the call site.
* g++.dg/cpp0x/vt-57397-2.C: Likewise.
* g++.dg/overload/template5.C: Likewise.
* g++.dg/template/local6.C: Likewise.
* g++.dg/template/conv20.C: New test.
* g++.dg/template/ttp40.C: New test.
Darwin,debug : Switch to DWARF 3 or 4 when dsymutil supports it.
The main reason that Darwin has been using only DWARF2 for debug is that
earlier debug linkers (dsymutil) did not support any extensions to it,
so the default "non-strict" mode used in GCC would cause tool errors.
There are two sources for dsymutil, those based off a closed source base
"dwarfutils" and those based off LLVM.
For dsymutil versions based off LLVM-7+ we can use up to DWARF-4, and for
versions based on dwarfutils 121+ we can use DWARF-3.
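The version selection described above could be sketched as follows. This
is only an illustration of the stated thresholds; the enum, the function
name and the way the tool version is obtained are assumptions, not the
actual configure or driver code.

```cpp
#include <cassert>

// Hypothetical model of the two dsymutil families named above.
enum class DsymutilKind { Dwarfutils, Llvm };

// Highest DWARF version to emit by default for a given dsymutil.
// 'major' is the tool's major version number (an assumed input; how the
// version is actually probed is not shown here).
int max_dwarf_version(DsymutilKind kind, int major) {
  if (kind == DsymutilKind::Llvm && major >= 7)
    return 4;   // LLVM-7+ based: up to DWARF-4
  if (kind == DsymutilKind::Dwarfutils && major >= 121)
    return 3;   // dwarfutils 121+: DWARF-3
  return 2;     // older tools: stay on DWARF-2
}
```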
configure, Darwin: Adjust handling of stdlib option.
The intent of the configuration choices for -stdlib is that the default
setting should choose reasonable options for the target. This should
enable -stdlib= for Darwin targets where libc++ is the default on the
system (so that it is only necessary to provide the headers).
However, it seems that there are some cases where (external) config
scripts are using -stdlib (incorrectly) to determine if the compiler
in use is GCC or clang.
In order to allow for these cases, this patch refines the setting
like so:
--with-gxx-libcxx-include-dir= is used to configure the path containing
libc++ headers; it also controls the enabling of the -stdlib option.
We are adding a special value for the path:
If --with-gxx-libcxx-include-dir is 'no', we disable the stdlib option.
Otherwise, if --with-gxx-libcxx-include-dir is set, we use the path
provided and enable the stdlib option.
If --with-gxx-libcxx-include-dir is unset, we decide on the stdlib option
based on the OS type and revision being targeted. The path is set to a
fixed position relative to the compiler install (similar logic to that
used for libstdc++ headers).
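The three cases can be summarised by a small decision function. This is a
sketch under stated assumptions: StdlibConfig, configure_stdlib and both
extra parameters are hypothetical stand-ins, and the real logic lives in
the configure scripts rather than in C++.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: whether -stdlib= is enabled and where the
// libc++ headers are expected.
struct StdlibConfig {
  bool enabled;
  std::string include_dir;
};

// 'with_dir' models --with-gxx-libcxx-include-dir: nullptr means unset.
// 'os_defaults_to_libcxx' stands in for the OS type/revision check, and
// 'relative_default' for the fixed path relative to the install; both
// are placeholders, not the real configure logic.
StdlibConfig configure_stdlib(const char *with_dir,
                              bool os_defaults_to_libcxx,
                              const std::string &relative_default) {
  if (with_dir == nullptr)                        // option unset
    return {os_defaults_to_libcxx, relative_default};
  if (std::string(with_dir) == "no")              // special value
    return {false, ""};
  return {true, with_dir};                        // explicit path
}
```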
Patrick Palka [Mon, 18 Sep 2023 18:27:18 +0000 (14:27 -0400)]
c++: optimize tsubst_template_decl for function templates
r14-2655-g92d1425ca78040 made instantiate_template avoid redundantly
performing a specialization lookup when calling tsubst_decl. This patch
applies the same optimization to the analogous tsubst_template_decl when
(partially) instantiating a function template. This allows us to remove
an early exit test from register_specialization since we no longer try
to register the FUNCTION_DECL corresponding to a function template
partial instantiation.
gcc/cp/ChangeLog:
* pt.cc (register_specialization): Remove now-unnecessary
early exit for FUNCTION_DECL partial instantiation.
(tsubst_template_decl): Pass use_spec_table=false to
tsubst_function_decl. Set DECL_TI_ARGS of a non-lambda
FUNCTION_DECL specialization to the full set of arguments.
Simplify register_specialization call accordingly.
gcc/testsuite/ChangeLog:
* g++.dg/template/nontype12.C: Expect two instead of three
duplicate diagnostics for A<double>::bar() specialization.
Andrew Pinski [Sat, 16 Sep 2023 03:27:26 +0000 (03:27 +0000)]
MATCH: Add simplifications of `(a == CST) & a`
`(a == CST) & a` can be simplified to either `a == CST` or 0,
depending on the lowest bit of CST.
This is an extension of the existing `X & !X` pattern and allows
us to remove the 2 xfails on gcc.dg/binop-notand1a.c and gcc.dg/binop-notand4a.c.
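The identity can be checked by brute force. The sketch below is not the
match.pd pattern itself; the function names are hypothetical. It compares
the original expression with the simplified form, which is `a == CST`
when the low (first) bit of CST is set and 0 otherwise.

```cpp
#include <cassert>

// (a == CST) & a, written out directly.
template<unsigned CST>
unsigned original(unsigned a) { return (a == CST) & a; }

// Simplified form: a == CST when the low bit of CST is set, else 0.
template<unsigned CST>
unsigned simplified(unsigned a) { return (CST & 1u) ? (a == CST) : 0u; }

// Compare both forms over a small range of inputs.
template<unsigned CST>
bool forms_agree() {
  for (unsigned a = 0; a < 256; ++a)
    if (original<CST>(a) != simplified<CST>(a))
      return false;
  return true;
}
```

The reasoning is that the comparison yields 0 or 1, so the AND keeps only
bit 0 of a, and when a equals CST that bit is the low bit of CST.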
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/111431
gcc/ChangeLog:
* match.pd (`(a == CST) & a`): New pattern.
gcc/testsuite/ChangeLog:
* gcc.dg/binop-notand1a.c: Remove xfail.
* gcc.dg/binop-notand4a.c: Likewise.
* gcc.c-torture/execute/pr111431-1.c: New test.
* gcc.dg/binop-andeq1.c: New test.
* gcc.dg/binop-andeq2.c: New test.
* gcc.dg/binop-notand7.c: New test.
* gcc.dg/binop-notand7a.c: New test.
Thomas Schwinge [Mon, 18 Sep 2023 14:34:47 +0000 (16:34 +0200)]
Fix up 'g++.dg/abi/nvptx-ptrmem1.C'
..., which, shortly after its inception in
commit 44eba92d0a0594bda5b53fcb3c8f84f164c653b6 (Subversion r231628)
"[PTX] parameters and return values", was not updated for the next day's
commit 1f0659546bcf5b95c3263cdc73149f6c2a05ebe1 (Subversion r231663)
"[PTX] more register cleanups". Fix it up now, as obvious, for the current
state of things.
Currently, the VLS and VLA patterns are defined differently:
VLA is define_expand
VLS is define_insn_and_split
There is no reason for them to use different pattern formats.
Merge them into the same pattern (define_insn_and_split).
This can also help the future vv -> vx fwprop optimization.
This patch also removes the misleading comments in the testcases, since
folding min(int, poly) to a constant is supported by this patch
(https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629651.html).
As a result, csrr will not appear in the assembly code, even if some
VLS vector patterns are not supported.
Support immediate expansion of immediates which can be created from 2 MOVKs
and a shifted ORR or BIC instruction. Change aarch64_split_dimode_const_store
to apply if we save one instruction.
This reduces the number of 4-instruction immediates in SPECINT/FP by 5%.
gcc/ChangeLog:
PR target/105928
* config/aarch64/aarch64.cc (aarch64_internal_mov_immediate):
Add support for immediates using shifted ORR/BIC.
(aarch64_split_dimode_const_store): Apply if we save one instruction.
* config/aarch64/aarch64.md (<LOGICAL:optab>_<SHIFT:optab><mode>3):
Make pattern global.
List official cores first so that -mcpu=native does not show a codename with
-v or in errors/warnings.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (neoverse-n1): Place before ares.
(neoverse-v1): Place before zeus.
(neoverse-v2): Place before demeter.
* config/aarch64/aarch64-tune.md: Regenerate.
RISC-V: Add fixed PR111255 testcase by other patch
This patch adds the missing PR111255 testcase, which is fixed by this
committed patch (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628922.html).
PR target/111255
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr111255.c: New test.
Jonathan Wakely [Mon, 18 Sep 2023 11:14:15 +0000 (12:14 +0100)]
libstdc++: Minor update to installation docs
libstdc++-v3/ChangeLog:
* doc/xml/manual/intro.xml: Clarify that building libstdc++
separately from GCC is not supported.
* doc/xml/manual/prerequisites.xml: Note msgfmt prerequisite for
testing.
* doc/html/manual/setup.html: Regenerate.
There is an obvious fusion bug that is exposed by supporting more VLS patterns.
With more VLS modes supported, it causes the following FAILs:
FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c execution test
Demand 1: SEW = 64, LMUL = 1, RATIO = 64, demand SEW, demand GE_SEW.
Demand 2: SEW = 64, LMUL = 2, RATIO = 32, demand SEW, demand GE_SEW, demand RATIO.
Before this patch:
merge demand: SEW = 64, LMUL = 1, RATIO = 32, demand SEW, demand LMUL, demand GE_SEW.
The merged LMUL is obviously incorrect; the new LMUL should be (greatest SEW / demand 2 RATIO) = 64 / 32 = M2.
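A small sketch of the arithmetic, assuming the usual RVV relation
RATIO = SEW / LMUL. The helper below is hypothetical and is not the
vlmul_for_greatest_sew_second_ratio function added by the patch; it only
shows why SEW = 64 with RATIO = 32 implies LMUL = 2.

```cpp
#include <cassert>

// LMUL as a fraction, since RVV also allows fractional LMUL (1/2, 1/4, 1/8).
struct Lmul { int num; int den; };

// Recover LMUL from SEW and RATIO using RATIO = SEW / LMUL.
// Assumes both inputs are powers of two, as in the demands above.
Lmul lmul_from_sew_ratio(int sew, int ratio) {
  if (sew >= ratio)
    return {sew / ratio, 1};
  return {1, ratio / sew};
}
```

With the merged demand keeping RATIO = 32 and SEW = 64, LMUL = 1 would
give a ratio of 64, which contradicts the demanded RATIO.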
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (vlmul_for_greatest_sew_second_ratio): New function.
* config/riscv/riscv-vsetvl.def (DEF_SEW_LMUL_FUSE_RULE): Fix bug.
This revives an earlier patch, since the problematic code applying
extra costs to PHIs in copied blocks, which we couldn't make any sense
of, prevents a required threading in this case. Instead of coming up
with some other artificial costing, the following simply removes the
bits.
As with all threading changes, this requires a plethora of testsuite
adjustments, but only the last three are unfortunate, as is the
libgomp team.c adjustment, which is required to avoid a bogus -Werror
diagnostic during bootstrap.
PR tree-optimization/111294
gcc/
* tree-ssa-threadbackward.cc (back_threader_profitability::m_name):
Remove.
(back_threader::find_paths_to_names): Adjust.
(back_threader::maybe_thread_block): Likewise.
(back_threader_profitability::possibly_profitable_path_p): Remove
code applying extra costs to copied PHIs.
RISC-V: Support VLS modes vec_init auto-vectorization
There are multiple SLP dump FAILs in the vect testsuite.
After analysis, we confirmed that vec_init is missing for VLS modes.
This patch is not sufficient to fix those FAILs (more VLS patterns are needed and will be sent soon).
This patch is the prerequisite patch for fixing those SLP FAILs.
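For context, vec_init is the expander used when a vector is built from
individual scalar elements. The following is an illustrative case written
with the GNU vector extension, not one of the tests added by this patch;
the typedef and function name are hypothetical.

```cpp
#include <cassert>

// Fixed-length vector of four ints via the GNU vector extension.
typedef int v4si __attribute__((vector_size(16)));

// Building a vector from four scalars: the kind of construction that the
// vec_init expander is asked to handle (an illustrative case, not one of
// the tests added by the patch).
v4si make_v4si(int a, int b, int c, int d) {
  v4si v = {a, b, c, d};
  return v;
}
```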
* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS vec_init tests.
* gcc.target/riscv/rvv/autovec/vls/init-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/repeat-9.c: New test.
This can improve the current reduction vectorization codegen under the current COST model.
TYPE __attribute__ ((noinline, noclone)) \
reduc_plus_##TYPE (TYPE * __restrict a, int n) \
{ \
TYPE r = 0; \
for (int i = 0; i < n; ++i) \
r += a[i]; \
return r; \
}
* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS mode reduction case.
* gcc.target/riscv/rvv/autovec/vls/reduc-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-13.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-14.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-15.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-16.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-17.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-18.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-19.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-20.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-21.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/reduc-9.c: New test.