* tree-switch-conversion.cc (bit_test_cluster::emit):
Handle correctly remaining probability.
(switch_decision_tree::try_switch_expansion): Fix BB's count
where a cluster expansion happens.
(switch_decision_tree::emit_cmp_and_jump_insns): Fill up also
BB count.
(switch_decision_tree::do_jump_if_equal): Likewise.
(switch_decision_tree::emit_case_nodes): Handle special case
for BT expansion which can also fallback to a default BB.
* tree-switch-conversion.h (cluster::cluster): Add
m_default_prob probability.
Richard Biener [Wed, 30 Nov 2022 11:05:29 +0000 (12:05 +0100)]
tree-optimization/107919 - predicate simplification in uninit
The testcase from the PR at -O2 shows
((_277 == 2) AND (_79 == 0))
OR ((NOT (_277 == 0)) AND (NOT (_277 > 2)) AND (NOT (_277 == 2)) AND (_79 == 0))
OR ((NOT (pretmp_300 == 255)) AND (_277 == 0) AND (NOT (_277 > 2)) AND (NOT (_277 == 2)) AND (_79 == 0))
which we fail to simplify. The following patch makes us simplify
the relations on _277, producing
((_79 == 0) AND (_277 == 2))
OR ((_79 == 0) AND (_277 <= 1) AND (NOT (_277 == 0)))
OR ((_79 == 0) AND (_277 == 0) AND (NOT (pretmp_300 == 255)))
which might be an incremental step to resolve a bogus uninit
diagnostic at -O2. The patch uses maybe_fold_and_comparison for this.
PR tree-optimization/107919
* gimple-predicate-analysis.cc (simplify_1): Rename to ...
(simplify_1a): .. this.
(simplify_1b): New.
(predicate::simplify): Call both simplify_1a and simplify_1b.
((_145 != 0B) AND (_531 == 2) AND (_109 == 0))
OR ((NOT (_145 != 0B)) AND (_531 == 2) AND (_109 == 0))
OR ((NOT (_531 == 2)) AND (_109 == 0))
because the existing simplification of !A && B || A && B is implemented
too simplistic. The following re-implements that which fixes the
bogus uninit diagnostic when using -O1 but not yet at -O2.
PR tree-optimization/107919
* gimple-predicate-analysis.cc (predicate::simplify_2):
Handle predicates of arbitrary length.
* g++.dg/warn/Wuninitialized-pr107919-1.C: New testcase.
Jakub Jelinek [Wed, 30 Nov 2022 10:44:27 +0000 (11:44 +0100)]
tree-chrec: Fix up ICE on pointer multiplication [PR107835]
r13-254-gdd3c7873a61019e9 added an optimization for {a, +, a} (x-1),
but as can be seen on the following testcase, the way it is written
where chrec_fold_multiply is called with type doesn't work for pointers:
res = build_int_cst (TREE_TYPE (x), 1);
res = chrec_fold_plus (TREE_TYPE (x), x, res);
res = chrec_convert_rhs (type, res, NULL);
res = chrec_fold_multiply (type, chrecr, res);
while what we were doing before and what is still used if the condition
doesn't match is fine:
res = chrec_convert_rhs (TREE_TYPE (chrecr), x, NULL);
res = chrec_fold_multiply (TREE_TYPE (chrecr), chrecr, res);
res = chrec_fold_plus (type, CHREC_LEFT (chrec), res);
because it performs chrec_fold_multiply on TREE_TYPE (chrecr) and converts
only afterwards.
I think the easiest fix is to ignore the new path for pointer types.
2022-11-30 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/107835
* tree-chrec.cc (chrec_apply): Don't handle "{a, +, a} (x-1)"
as "a*x" if type is a pointer type.
* testsuite/libgomp.c/declare-variant-4-fiji.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx803.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx900.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx906.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx908.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx90a.c: New test.
* testsuite/libgomp.c/declare-variant-4.h: New header file.
Lulu Cheng [Tue, 29 Nov 2022 08:06:12 +0000 (16:06 +0800)]
LoongArch: Optimize the implementation of stack check.
The old stack check was performed before the stack was dropped,
which would cause the detection tool to report a memory leak.
The current stack check scheme is as follows:
'-fstack-clash-protection':
1. When the frame->total_size is smaller than the guard page size,
the stack is dropped according to the original scheme, and there
is no need to perform stack detection in the prologue.
2. When frame->total_size is greater than or equal to guard page size,
the first step to drop the stack is to drop the space required by
the caller-save registers. This space needs to save the caller-save
registers, so an implicit stack check is performed.
So just need to check the rest of the stack space.
'-fstack-check':
There is no one-time stack drop and then page-by-page detection as
described in the document. It is also the same as
'-fstack-clash-protection', which is detected immediately after page drop.
It is judged that when frame->total_size is not 0, only the size required
to save the s register is dropped for the first stack down.
The test cases are referenced from aarch64.
gcc/ChangeLog:
* config/loongarch/linux.h (STACK_CHECK_MOVING_SP):
Define this macro to 1.
* config/loongarch/loongarch.cc (STACK_CLASH_PROTECTION_GUARD_SIZE):
Size of guard page.
(loongarch_first_stack_step): Return the size of the first drop stack
according to whether stack checking is performed.
(loongarch_emit_probe_stack_range): Adjust the method of stack checking in prologue.
(loongarch_output_probe_stack_range): Delete useless code.
(loongarch_expand_prologue): Adjust the method of stack checking in prologue.
(loongarch_option_override_internal): Enforce that interval is the same
size as size so the mid-end does the right thing.
* config/loongarch/loongarch.h (STACK_CLASH_MAX_UNROLL_PAGES):
New macro decide whether to loop stack detection.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp:
* gcc.target/loongarch/stack-check-alloca-1.c: New test.
* gcc.target/loongarch/stack-check-alloca-2.c: New test.
* gcc.target/loongarch/stack-check-alloca-3.c: New test.
* gcc.target/loongarch/stack-check-alloca-4.c: New test.
* gcc.target/loongarch/stack-check-alloca-5.c: New test.
* gcc.target/loongarch/stack-check-alloca-6.c: New test.
* gcc.target/loongarch/stack-check-alloca.h: New test.
* gcc.target/loongarch/stack-check-cfa-1.c: New test.
* gcc.target/loongarch/stack-check-cfa-2.c: New test.
* gcc.target/loongarch/stack-check-prologue-1.c: New test.
* gcc.target/loongarch/stack-check-prologue-2.c: New test.
* gcc.target/loongarch/stack-check-prologue-3.c: New test.
* gcc.target/loongarch/stack-check-prologue-4.c: New test.
* gcc.target/loongarch/stack-check-prologue-5.c: New test.
* gcc.target/loongarch/stack-check-prologue-6.c: New test.
* gcc.target/loongarch/stack-check-prologue-7.c: New test.
* gcc.target/loongarch/stack-check-prologue.h: New test.
David Malcolm [Wed, 30 Nov 2022 00:56:27 +0000 (19:56 -0500)]
analyzer work on issues with flex-generated lexers [PR103546]
PR analyzer/103546 tracks various false positives seen on
flex-generated lexers.
Whilst investigating them, I noticed an ICE with
-fanalyzer-call-summaries due to attempting to store sm-state
for an UNKNOWN svalue, which this patch fixes.
This patch also provides known_function implementations of all of the
external functions called by the lexer, reducing the number of false
positives.
The patch doesn't eliminate all false positives, but adds integration
tests to try to establish a baseline from which the remaining false
positives can be fixed.
gcc/analyzer/ChangeLog:
PR analyzer/103546
* analyzer.h (register_known_file_functions): New decl.
* program-state.cc (sm_state_map::replay_call_summary): Rejct
attempts to store sm-state for caller_sval that can't have
associated state.
* region-model-impl-calls.cc (register_known_functions): Call
register_known_file_functions.
* sm-fd.cc (class kf_isatty): New.
(register_known_fd_functions): Register it.
* sm-file.cc (class kf_ferror): New.
(class kf_fileno): New.
(class kf_getc): New.
(register_known_file_functions): New.
gcc/ChangeLog:
PR analyzer/103546
* doc/invoke.texi (Static Analyzer Options): Add isatty, ferror,
fileno, and getc to the list of functions known to the analyzer.
gcc/testsuite/ChangeLog:
PR analyzer/103546
* gcc.dg/analyzer/ferror-1.c: New test.
* gcc.dg/analyzer/fileno-1.c: New test.
* gcc.dg/analyzer/flex-with-call-summaries.c: New test.
* gcc.dg/analyzer/flex-without-call-summaries.c: New test.
* gcc.dg/analyzer/getc-1.c: New test.
* gcc.dg/analyzer/isatty-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 30 Nov 2022 00:56:27 +0000 (19:56 -0500)]
analyzer: fix folding of '(PTR + 0) => PTR' [PR105784]
gcc/analyzer/ChangeLog:
PR analyzer/105784
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): For POINTER_PLUS_EXPR,
PLUS_EXPR and MINUS_EXPR, eliminate requirement that the final
type matches that of arg0 in favor of a cast.
gcc/testsuite/ChangeLog:
PR analyzer/105784
* gcc.dg/analyzer/torture/fold-ptr-arith-pr105784.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Patrick Palka [Wed, 30 Nov 2022 00:25:37 +0000 (19:25 -0500)]
c++: ICE with <=> of incompatible pointers [PR107542]
In a SFINAE context composite_pointer_type returns error_mark_node if
the given pointer types are incompatible. But the SPACESHIP_EXPR case
of cp_build_binary_op wasn't prepared for this error_mark_node result,
which led to an ICE (from spaceship_comp_cat) for the below testcase.
(In a non-SFINAE context composite_pointer_type issues a permerror and
returns cv void* in this case, so this ICE seems specific to SFINAE.)
PR c++/107542
gcc/cp/ChangeLog:
* typeck.cc (cp_build_binary_op): In the SPACESHIP_EXPR case,
handle an error_mark_node result type.
Harald Anlauf [Mon, 28 Nov 2022 19:43:02 +0000 (20:43 +0100)]
Fortran: intrinsic MERGE shall use all its arguments [PR107874]
gcc/fortran/ChangeLog:
PR fortran/107874
* simplify.cc (gfc_simplify_merge): When simplifying MERGE with a
constant scalar MASK, ensure that arguments TSOURCE and FSOURCE are
either constant or will be evaluated.
* trans-intrinsic.cc (gfc_conv_intrinsic_merge): Evaluate arguments
before generating conditional expression.
gcc/testsuite/ChangeLog:
PR fortran/107874
* gfortran.dg/merge_init_expr_2.f90: Adjust code to the corrected
simplification.
* gfortran.dg/merge_1.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
Jonathan Wakely [Tue, 29 Nov 2022 15:50:06 +0000 (15:50 +0000)]
libstdc++: Avoid bogus warning in std::vector::insert [PR107852]
GCC assumes that any global variable might be modified by operator new,
and so in the testcase for this PR all data members get reloaded after
allocating new storage. By making local copies of the _M_start and
_M_finish members we avoid that, and then the compiler has enough info
to remove the dead branches that trigger bogus -Warray-bounds warnings.
libstdc++-v3/ChangeLog:
PR libstdc++/107852
PR libstdc++/106199
PR libstdc++/100366
* include/bits/vector.tcc (vector::_M_fill_insert): Copy
_M_start and _M_finish members before allocating.
(vector::_M_default_append): Likewise.
(vector::_M_range_insert): Likewise.
Jonathan Wakely [Tue, 29 Nov 2022 15:19:33 +0000 (15:19 +0000)]
libstdc++: Remove unnecessary tag dispatching in std::vector
There's no need to call a _M_xxx_dispatch function with a
statically-known __false_type tag, we can just directly call the
function that should be dispatched to. This will compile a tiny bit
faster and save a function call with optimization or inlining turned
off.
Also add the always_inline attribute to the __iterator_category helper
used for dispatching on the iterator category.
Patrick Palka [Tue, 29 Nov 2022 14:55:21 +0000 (09:55 -0500)]
c++: explicit specialization and trailing requirements [PR107864]
Here we're crashing when using the explicit specialization of the
function template g with trailing requirements ultimately because
earlier decls_match (called indirectly from register_specialization) for
for the explicit specialization returned false since the template has
trailing requirements whereas the specialization doesn't.
In r12-2230-gddd25bd1a7c8f4, we fixed a similar issue concerning template
requirements instead of trailing requirements. We could extend that fix
to ignore trailing requirement mismatches for explicit specializations
as well, but it seems cleaner to just propagate constraints from the
specialized template to the specialization when declaring an explicit
specialization so that decls_match will naturally return true in this
case. And it looks like determine_specialization already does this,
albeit inconsistently (only when specializing a non-template member
function of a class template as in cpp2a/concepts-explicit-spec4.C).
So this patch makes determine_specialization consistently propagate
constraints from the specialized template to the specialization, which
in turn lets us get rid of the function_requirements_equivalent_p special
case added by r12-2230.
PR c++/107864
gcc/cp/ChangeLog:
* decl.cc (function_requirements_equivalent_p): Don't check
DECL_TEMPLATE_SPECIALIZATION.
* pt.cc (determine_specialization): Propagate constraints when
specializing a function template too. Simplify by using
add_outermost_template_args.
Richard Biener [Tue, 29 Nov 2022 09:41:36 +0000 (10:41 +0100)]
tree-optimization/106995 - if-conversion and vanishing loops
When we version loops for vectorization during if-conversion it
can happen that either loop vanishes because we run some VN and
CFG cleanup. If the to-be vectorized part vanishes we already
redirect the versioning condition to the original loop. The following
does the same in case the original loop vanishes as happened
for the testcase in the bug in the past (but no longer).
PR tree-optimization/106995
* tree-if-conv.cc (pass_if_conversion::execute): Also redirect the
versioning condition to the original loop if this very loop
vanished during CFG cleanup.
Richard Biener [Tue, 29 Nov 2022 08:03:46 +0000 (09:03 +0100)]
tree-optimization/107898 - ICE with -Walloca-larger-than
The following avoids ICEing with a mismatched prototype for alloca
and -Walloca-larger-than using irange for checks which doesn't
like mismatched types.
PR tree-optimization/107898
* gimple-ssa-warn-alloca.cc (alloca_call_type): Check
the type of the alloca argument is compatible with size_t
before querying ranges.
Richard Biener [Tue, 29 Nov 2022 07:43:55 +0000 (08:43 +0100)]
ipa/107897 - avoid property verification ICE after error
The target clone pass is the only small IPA pass that doesn't disable
itself after errors but has properties whose verification can fail
because we cut off build SSA passes after errors.
PR ipa/107897
* multiple_target.cc (pass_target_clone::gate): Disable
after errors.
Jason Merrill [Sat, 26 Nov 2022 16:13:55 +0000 (11:13 -0500)]
c++: simple-requirement starting with 'typename' [PR101733]
Usually a requirement starting with 'typename' is a type-requirement, but it
might be a simple-requirement such as a functional cast to a typename-type.
PR c++/101733
gcc/cp/ChangeLog:
* parser.cc (cp_parser_requirement): Parse tentatively for the
'typename' case.
Sinan [Mon, 28 Nov 2022 19:41:17 +0000 (12:41 -0700)]
riscv: improve cost model for loading 64bit constant in rv32
The motivation of this patch is to correct the wrong estimation of the number of instructions needed for loading a 64bit constant in rv32 in the current cost model(riscv_interger_cost). According to the current implementation, if a constant requires more than 3 instructions(riscv_const_insn and riscv_legitimate_constant_p), then the constant will be put into constant pool when expanding gimple to rtl(legitimate_constant_p hook and emit_move_insn). So the inaccurate cost model leads to the suboptimal codegen in rv32 and the wrong estimation part could be corrected through this fix.
e.g. the current codegen for loading 0x839290001 in rv32
RISC-V: Avoid redundant sign-extension for SImode SGE, SGEU, SLE, SLEU
We produce inefficient code for some synthesized SImode conditional set
operations (i.e. ones that are not directly implemented in hardware) on
RV64. For example a piece of C code like this:
int
sleu (unsigned int x, unsigned int y)
{
return x <= y;
}
This is because the middle end expands a SLEU operation missing from
RISC-V hardware into a sequence of a SImode SGTU operation followed by
an explicit SImode XORI operation with immediate 1. And while the SGTU
machine instruction (alias SLTU with the input operands swapped) gives a
properly sign-extended 32-bit result which is valid both as a SImode or
a DImode operand the middle end does not see that through a SImode XORI
operation, because we tell the middle end that the RISC-V target (unlike
MIPS) may hold values in DImode integer registers that are valid for
SImode operations even if not properly sign-extended.
However the RISC-V psABI requires that 32-bit function arguments and
results passed in 64-bit integer registers be properly sign-extended, so
this is explicitly done at the conclusion of the function.
Fix this by making the backend use a sequence of a DImode SGTU operation
followed by a SImode SEQZ operation instead. The latter operation is
known by the middle end to produce a properly sign-extended 32-bit
result and therefore combine gets rid of the sign-extension operation
that follows and actually folds it into the very same XORI machine
operation resulting in:
instead (although the SEQZ alias SLTIU against immediate 1 machine
instruction would equally do and is actually retained at `-O0'). This
is handled analogously for the remaining synthesized operations of this
kind, i.e. `SLE', `SGEU', and `SGE'.
gcc/
* config/riscv/riscv.cc (riscv_emit_int_order_test): Use EQ 0
rather that XOR 1 for LE and LEU operations.
gcc/testsuite/
* gcc.target/riscv/sge.c: New test.
* gcc.target/riscv/sgeu.c: New test.
* gcc.target/riscv/sle.c: New test.
* gcc.target/riscv/sleu.c: New test.
Harald Anlauf [Sun, 27 Nov 2022 20:10:18 +0000 (21:10 +0100)]
Fortran: ICE with elemental and dummy argument with VALUE attribute [PR107819]
gcc/fortran/ChangeLog:
PR fortran/107819
* trans-stmt.cc (gfc_conv_elemental_dependencies): In checking for
elemental dependencies, treat dummy argument with VALUE attribute
as implicitly having intent(in).
gcc/testsuite/ChangeLog:
PR fortran/107819
* gfortran.dg/elemental_dependency_7.f90: New test.
Jonathan Wakely [Mon, 28 Nov 2022 13:28:53 +0000 (13:28 +0000)]
libstdc++: Fix src/c++17/memory_resource for H8 targets [PR107801]
This fixes compilation failures for H8 multilibs. For the normal
multilib (ILP16L32?), the chunk struct does not have the expected size,
because uint32_t is type long and has alignment 4 (by default). This
forces sizeof(chunk) to be 12 instead of the expected 10. We can fix
that by using bitset::size_type instead of uint32_t, so that we only use
a 16-bit size when size_t and pointers are 16-bit types.
For the I32LP16 multilibs that use -mint32 int is wider than size_t
and so arithmetic expressions involving size_t promote to int. This
means we need some explicit casts back to size_t.
libstdc++-v3/ChangeLog:
PR libstdc++/107801
* src/c++17/memory_resource.cc (chunk::_M_bytes): Change type
from uint32_t to bitset::size_type. Adjust static assertion.
(__pool_resource::_Pool::replenish): Cast to size_t after
multiplication instead of before.
(__pool_resource::_M_alloc_pools): Ensure both arguments to
std::max have type size_t.
Jonathan Wakely [Mon, 28 Nov 2022 12:16:21 +0000 (12:16 +0000)]
libstdc++: Fix std::string_view for I32LP16 targets
For H8/300 with -msx -mn -mint32 the type of (_M_len - __pos) is int,
because int is wider than size_t so the operands are promoted.
libstdc++-v3/ChangeLog:
* include/std/string_view (basic_string_view::copy) Use explicit
template argument for call to std::min<size_t>.
(basic_string_view::substr): Likewise.
Jonathan Wakely [Mon, 28 Nov 2022 10:52:23 +0000 (10:52 +0000)]
libstdc++: Fix _Hash_bytes for I16LP32 targets [PR107885]
For H8/300 size_t is 32 bits wide, but (unsigned char)buf[2] << 16
promotes to int which is only 16 bits wide. The shift is then undefined.
This fixes it by converting to size_t before shifting.
libstdc++-v3/ChangeLog:
PR libstdc++/107885
* libsupc++/hash_bytes.cc (_Hash_bytes): Convert to size_t
instead of implicit integer promotion to 16 bits.
Fei Gao [Mon, 28 Nov 2022 16:31:09 +0000 (11:31 -0500)]
RISC-V: fix stack access before allocation.
In current riscv stack frame allocation, 2 steps are used. The first step allocates memories at least for callee saved GPRs and FPRs, and the second step allocates the rest if stack size is greater than signed 12-bit range. But it's observed in some cases, like gcc.target/riscv/stack_frame.c in my patch, callee saved FPRs fail to be included in the first step allocation, so we get generated instructions like this:
li t0,-16384
addi sp,sp,-48
addi t0,t0,752
...
fsw fs4,-4(sp) #issue here of accessing before allocation
...
add sp,sp,t0
"fsw fs4,-4(sp)" has issue here of accessing stack before allocation. Although "add sp,sp,t0" reserves later the memory for fs4, it exposes a risk when an interrupt comes in between "fsw fs4,-4(sp)" and "add sp,sp,t0", resulting in a corruption in the stack storing fs4 after interrupt context saving and a failure to get the correct value of fs4 later.
This patch fixes issue above, adapts testcases identified in regression tests, and add a new testcase for the change.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_first_stack_step): Fix computation
of MIN_FIRST_STEP to cover FP save area too.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr93304.c: Adapt testcase for the change, constrain
match to assembly instructions only.
* gcc.target/riscv/rvv/base/spill-11.c: Adapt testcase for the change.
* gcc.target/riscv/stack_frame.c: New test.
c++: Allow module name to be a single letter on Windows
On Windows, the ':' character is special and when the module name is
a single character, like 'A', then the flatname would be (for
example) 'A:Foo'. On Windows, 'A:Foo' is treated as an absolute
path by the module loader and is likely not found.
Without this patch, the test case pr98944_c.C fails with:
In module imported at /src/gcc/testsuite/g++.dg/modules/pr98944_b.C:7:1,
of module A:Foo, imported at /src/gcc/testsuite/g++.dg/modules/pr98944_c.C:7:
A:Internals: error: header module expected, module 'A:Internals' found
A:Internals: error: failed to read compiled module: Bad file data
A:Internals: note: compiled module file is 'gcm.cache/A-Internals.gcm'
In module imported at /src/gcc/testsuite/g++.dg/modules/pr98944_c.C:7:8:
A:Foo: error: failed to read compiled module: Bad import dependency
A:Foo: note: compiled module file is 'gcm.cache/A-Foo.gcm'
A:Foo: fatal error: returning to the gate for a mechanical issue
compilation terminated.
gcc/cp/ChangeLog:
* module.cc: On Windows, 'A:Foo' is supposed to be a module
and not a path.
Jonathan Wakely [Mon, 28 Nov 2022 09:44:52 +0000 (09:44 +0000)]
libstdc++: Make 16-bit std::subtract_with_carry_engine work [PR107466]
This implements the proposed resolution of LWG 3809, so that
std::subtract_with_carry_engine can be used with a 16-bit result_type.
Currently this produces a narrowing error when instantiating the
std::linear_congruential_engine to create the initial state. It also
truncates the default_seed constant when passing it as a result_type
argument.
Change the type of the constant to uint_least32_t and pass 0u when the
default_seed should be used.
libstdc++-v3/ChangeLog:
PR libstdc++/107466
* include/bits/random.h (subtract_with_carry_engine): Use 32-bit
type for default seed. Use 0u as default argument for
subtract_with_carry_engine(result_type) constructor and
seed(result_type) member function.
* include/bits/random.tcc (subtract_with_carry_engine): Use
32-bit type for default seed and engine used for initial state.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
New test.
Eric Botcazou [Sat, 26 Nov 2022 09:11:18 +0000 (10:11 +0100)]
ada: Adjust runtime library and User's Guide to PIE default on Linux
gcc/ada/
* libgnat/g-traceb.ads: Minor tweaks in the commentary.
(Executable_Load_Address): New function.
* doc/gnat_ugn/gnat_and_program_execution.rst (Non-Symbolic
Traceback): Adjust to PIE default on Linux.
(Symbolic Traceback): Likewise.
* doc/gnat_ugn/gnat_utility_programs.rst (gnatsymbolize): Likewise.
* gnat_ugn.texi: Regenerate.
Joel Brobecker [Fri, 25 Nov 2022 13:53:53 +0000 (17:53 +0400)]
ada: doc/share/conf.py: Switch the HTML documentation to using the RTD theme
This commit adjust the sphinx configuration to use the "Read The Docs"
theme, which has the advantage of allowing the navigation bar
(containing among other things a search bar, and the TOC) to stay
fixed while scrolling the contents of the page being read. This is
particularly useful to allow access to those features while reading
a long page, for instance.
gcc/ada/
* doc/share/conf.py (extensions): Add 'sphinx_rtd_theme'.
(html_theme): Set to 'sphinx_rtd_theme'.
Claire Dross [Fri, 25 Nov 2022 15:28:47 +0000 (16:28 +0100)]
ada: Annotate GNAT.Source_Info with an abstract state
So it can be used safely from SPARK code. The abstract state represents
the source code information that is accessed by the functions defined
in Source_Info. It is volatile as it is updated asyncronously when
moving in the code.
gcc/ada/
* libgnat/g-souinf.ads (Source_Code_Information): Add a new
volatile abstract state and add it in the global contract of all
functions defined in Source_Info.
Eric Botcazou [Fri, 25 Nov 2022 09:28:18 +0000 (10:28 +0100)]
ada: Fix internal error on conversion as in/out actual with -gnatVa
The problem is that the regular expansion of the conversion around the
call to the subprogram is disabled by the expansion of the validity check
around the same call, as documented in Expand_Actuals:
-- This case is given higher priority because the subsequent check
-- for type conversion may add an extra copy of the variable and
-- prevent proper value propagation back in the original object.
Now the two mechanisms need to cooperate in order for the code to compile.
gcc/ada/
* exp_ch6.adb (Expand_Actuals.Add_Call_By_Copy_Code): Deal with a
reference to a validation variable in the actual.
(Expand_Actuals.Add_Validation_Call_By_Copy_Code): Minor tweak.
(Expand_Actuals): Call Add_Validation_Call_By_Copy_Code directly
only if Add_Call_By_Copy_Code is not to be invoked.
Yannick Moy [Thu, 24 Nov 2022 15:27:37 +0000 (16:27 +0100)]
ada: Implement change to SPARK RM rule on state refinement
SPARK RM 7.1.4(4) does not mandate anymore that a package with abstract
states has a completing body, unless the package state is mentioned in
Part_Of specifications. Implement that change.
gcc/ada/
* sem_prag.adb (Check_Part_Of_Abstract_State): Add verification
related to use of Part_Of, so that constituents in private childs
that refer to state in a sibling or parent unit force that unit to
have a body.
* sem_util.adb (Check_State_Refinements): Drop the requirement to
have always a package body for state refinement, when the package
state is mentioned in no Part_Of specification.
* sem_ch3.adb (Analyze_Declarations): Refresh SPARK refs in comment.
* sem_ch7.adb (Analyze_Package_Declaration): Likewise.
Richard Biener [Mon, 28 Nov 2022 09:25:44 +0000 (10:25 +0100)]
tree-optimization/107493 - SCEV analysis with conversions
This shows another case where trying to validate conversions during
the CHREC SCC analysis fails because said analysis assumes we are
converting a complete SCC. Like the constraint on the initial
conversion seen restrict all conversions handled to sign-changes.
PR tree-optimization/107493
* tree-scalar-evolution.cc (scev_dfs::follow_ssa_edge_expr):
Only handle no-op and sign-changing conversions.
Tobias Burnus [Mon, 28 Nov 2022 10:11:43 +0000 (11:11 +0100)]
gcn: Fix __builtin_gcn_first_call_this_thread_p
Contrary naive expectation, unspec_volatile (via prologue_use) did not
prevent the cprop pass (at -O2) to remove the access to the s[0:1]
(PRIVATE_SEGMENT_BUFFER_ARG) register as the volatile got just put on
the preceeding pseudoregister. Solution: Use gen_rtx_USE instead.
Additionally, this patch removes (gen_)prologue_use_di as it is then no
longer used.
Finally, as we already do bit manipulation, instead of using the full
64bit side - and then just keeping the value of 's0', just move directly
to use only s1 of s[0:1] and do the bit manipulations there, generating
more readable assembly code and better matching the '#else' branch.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_expand_builtin_1): Work on s1 instead
of s[0:1] and use USE to prevent removal of setting that register.
* config/gcn/gcn.md (prologue_use_di): Remove.
Tobias Burnus [Mon, 28 Nov 2022 10:08:32 +0000 (11:08 +0100)]
OpenMP/Fortran: Permit end-clause on directive
gcc/fortran/ChangeLog:
* openmp.cc (OMP_DO_CLAUSES, OMP_SCOPE_CLAUSES,
OMP_SECTIONS_CLAUSES): Add 'nowait'.
(OMP_SINGLE_CLAUSES): Add 'nowait' and 'copyprivate'.
(gfc_match_omp_distribute_parallel_do,
gfc_match_omp_distribute_parallel_do_simd,
gfc_match_omp_parallel_do,
gfc_match_omp_parallel_do_simd,
gfc_match_omp_parallel_sections,
gfc_match_omp_teams_distribute_parallel_do,
gfc_match_omp_teams_distribute_parallel_do_simd): Disallow 'nowait'.
(gfc_match_omp_workshare): Match 'nowait' clause.
(gfc_match_omp_end_single): Use clause matcher for 'nowait'.
(resolve_omp_clauses): Reject 'nowait' + 'copyprivate'.
* parse.cc (decode_omp_directive): Break too long line.
(parse_omp_do, parse_omp_structured_block): Diagnose duplicated
'nowait' clause.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2): Mark end-directive as Y.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/copyprivate-1.f90: New test.
* gfortran.dg/gomp/copyprivate-2.f90: New test.
* gfortran.dg/gomp/nowait-2.f90: Move dg-error tests ...
* gfortran.dg/gomp/nowait-4.f90: ... to this new file.
* gfortran.dg/gomp/nowait-5.f90: New test.
* gfortran.dg/gomp/nowait-6.f90: New test.
* gfortran.dg/gomp/nowait-7.f90: New test.
* gfortran.dg/gomp/nowait-8.f90: New test.
Jakub Jelinek [Mon, 28 Nov 2022 09:13:43 +0000 (10:13 +0100)]
i386: Fix up ix86_abi handling [PR106875]
The following testcase fails since my changes to make also
opts_set saved/restored upon function target/optimization changes
(before it has been acting as "has this option be ever explicit
anywhere?").
The problem is that for ix86_abi we depend on the opts_set
value for it in ix86_option_override_internal:
SET_OPTION_IF_UNSET (opts, opts_set, ix86_abi, DEFAULT_ABI);
but as it is a TargetSave, the backend code is required to
save/restore it manually (it does that) and since gcc 11 also
to save/restore the opts_set bit for it (which isn't done).
We don't do that for various other TargetSave which
ix86_function_specific_{save,restore} saves/restores, but as long
as we never test opts_set for it, it doesn't really matter.
One possible fix would be to introduce some new TargetSave into
which ix86_function_specific_{save,restore} would save/restore a bitmask
of the opts_set bits. The following patch uses an easier fix, by
making it a TargetVariable instead the saving/restoring is handled
by the generated code.
The differences in options.h are just slight movements on where
*ix86_abi stuff appears in it, ditto for options.cc, the real
differences are just in options-save.cc, where cl_target_option_save
gets:
+ ptr->x_ix86_abi = opts->x_ix86_abi;
...
+ if (opts_set->x_ix86_abi) mask |= HOST_WIDE_INT_1U << 3;
(plus adjustments of following TargetVariables mask related stuff),
cl_target_option_restore gets:
+ opts->x_ix86_abi = ptr->x_ix86_abi;
...
+ opts_set->x_ix86_abi = static_cast<enum calling_abi>((mask & 1) != 0);
+ mask >>= 1;
plus the movements in other functions too. So, by it being a
TargetVariable, the only thing that changed is that we don't need to
handle it manually in ix86_function_specific_{save,restore} because it
is handled automatically including the opts_set stuff.
2022-11-28 Jakub Jelinek <jakub@redhat.com>
PR target/106875
* config/i386/i386.opt (x_ix86_abi): Remove TargetSave.
(ix86_abi): Replace it with TargetVariable.
* config/i386/i386-options.cc (ix86_function_specific_save,
ix86_function_specific_restore): Don't save and restore x_ix86_abi.
arm: Add integer vector overloading of vsubq_x instrinsic
In the past we had only defined the vsubq_x generic overload of the
vsubq_x_* intrinsics for float vector types. This would cause them
to fall back to the `__ARM_undef` failure state if they was called
through the generic version.
This patch simply adds these overloads.
gcc/ChangeLog:
* config/arm/arm_mve.h (__arm_vsubq_x FP): New overloads.
(__arm_vsubq_x Integer): New.
arm: Explicitly specify other float types for _Generic overloading [PR107515]
This patch adds explicit references to other float types
to __ARM_mve_typeid in arm_mve.h. Resolves PR 107515:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515
arm: propagate fixed overloading of MVE intrinsic scalar parameters
This is a mechanical patch that propagates the change proposed in
my previous patch for vaddq[_m]_n
across all other polymorphic MVE intrinsic overloads of scalar types.
The above commits changed the definitions of the intrinsics from using
`[u]int[8/16/32]_t` types for the scalar argument to using `int`. This
allowed `int` to be supported in user code through the overloaded
`#defines`, but seems to have broken the `[u]int[8/16/32]_t` types
The solution implemented by this patch is to explicitly use a new
_Generic mapping from all the `[u]int[8/16/32]_t` types for int. With this
change, both `int` and `[u]int[8/16/32]_t` parameters are supported from
user code and are handled by the overloading mechanism correctly.
Note that in these scalar cases it is safe to pass the raw p<n>, rather
than the typeof-ed __p<n>, because we are not at risk of the _Generics
being exponentially expanded on the `n` scalar argument to an `_n`
intrinsic. Using p<n> instead will give a more accurate error message
to the user, should something be wrong with that argument.
Richard Biener [Mon, 28 Nov 2022 08:19:33 +0000 (09:19 +0100)]
tree-optimization/107876 - unswitching of switch
The following shows a missed update of dominators when unswitching
removes unreachable edges from switch stmts it unswitches. Fixed
by wiping dominator info in that case.
PR tree-optimization/107876
* tree-ssa-loop-unswitch.cc (clean_up_after_unswitching): Wipe
dominator info if we removed an edge.
Lulu Cheng [Thu, 17 Nov 2022 09:08:36 +0000 (17:08 +0800)]
LoongArch: Optimize immediate load.
The immediate number is split in the Split pass, not in the expand pass.
Because loop2_invariant pass will extract the instructions that do not change
in the loop out of the loop, some instructions will not meet the extraction
conditions if the machine performs immediate decomposition while expand pass,
so the immediate decomposition will be transferred to the split process.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (enum loongarch_load_imm_method):
Remove the member METHOD_INSV that is not currently used.
(struct loongarch_integer_op): Define a new member curr_value,
that records the value of the number stored in the destination
register immediately after the current instruction has run.
(loongarch_build_integer): Assign a value to the curr_value member variable.
(loongarch_move_integer): Adds information for the immediate load instruction.
* config/loongarch/loongarch.md (*movdi_32bit): Redefine as define_insn_and_split.
(*movdi_64bit): Likewise.
(*movsi_internal): Likewise.
(*movhi_internal): Likewise.
* config/loongarch/predicates.md: Return true as long as it is CONST_INT, ensure
that the immediate number is not optimized by decomposition during expand
optimization loop.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/imm-load.c: New test.
* gcc.target/loongarch/imm-load1.c: New test.