Richard Biener [Mon, 28 Nov 2022 09:25:44 +0000 (10:25 +0100)]
tree-optimization/107493 - SCEV analysis with conversions
This shows another case where trying to validate conversions during
the CHREC SCC analysis fails because said analysis assumes we are
converting a complete SCC. Like the constraint on the initial
conversion seen restrict all conversions handled to sign-changes.
PR tree-optimization/107493
* tree-scalar-evolution.cc (scev_dfs::follow_ssa_edge_expr):
Only handle no-op and sign-changing conversions.
Tobias Burnus [Mon, 28 Nov 2022 10:11:43 +0000 (11:11 +0100)]
gcn: Fix __builtin_gcn_first_call_this_thread_p
Contrary naive expectation, unspec_volatile (via prologue_use) did not
prevent the cprop pass (at -O2) to remove the access to the s[0:1]
(PRIVATE_SEGMENT_BUFFER_ARG) register as the volatile got just put on
the preceeding pseudoregister. Solution: Use gen_rtx_USE instead.
Additionally, this patch removes (gen_)prologue_use_di as it is then no
longer used.
Finally, as we already do bit manipulation, instead of using the full
64bit side - and then just keeping the value of 's0', just move directly
to use only s1 of s[0:1] and do the bit manipulations there, generating
more readable assembly code and better matching the '#else' branch.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_expand_builtin_1): Work on s1 instead
of s[0:1] and use USE to prevent removal of setting that register.
* config/gcn/gcn.md (prologue_use_di): Remove.
Tobias Burnus [Mon, 28 Nov 2022 10:08:32 +0000 (11:08 +0100)]
OpenMP/Fortran: Permit end-clause on directive
gcc/fortran/ChangeLog:
* openmp.cc (OMP_DO_CLAUSES, OMP_SCOPE_CLAUSES,
OMP_SECTIONS_CLAUSES): Add 'nowait'.
(OMP_SINGLE_CLAUSES): Add 'nowait' and 'copyprivate'.
(gfc_match_omp_distribute_parallel_do,
gfc_match_omp_distribute_parallel_do_simd,
gfc_match_omp_parallel_do,
gfc_match_omp_parallel_do_simd,
gfc_match_omp_parallel_sections,
gfc_match_omp_teams_distribute_parallel_do,
gfc_match_omp_teams_distribute_parallel_do_simd): Disallow 'nowait'.
(gfc_match_omp_workshare): Match 'nowait' clause.
(gfc_match_omp_end_single): Use clause matcher for 'nowait'.
(resolve_omp_clauses): Reject 'nowait' + 'copyprivate'.
* parse.cc (decode_omp_directive): Break too long line.
(parse_omp_do, parse_omp_structured_block): Diagnose duplicated
'nowait' clause.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2): Mark end-directive as Y.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/copyprivate-1.f90: New test.
* gfortran.dg/gomp/copyprivate-2.f90: New test.
* gfortran.dg/gomp/nowait-2.f90: Move dg-error tests ...
* gfortran.dg/gomp/nowait-4.f90: ... to this new file.
* gfortran.dg/gomp/nowait-5.f90: New test.
* gfortran.dg/gomp/nowait-6.f90: New test.
* gfortran.dg/gomp/nowait-7.f90: New test.
* gfortran.dg/gomp/nowait-8.f90: New test.
Jakub Jelinek [Mon, 28 Nov 2022 09:13:43 +0000 (10:13 +0100)]
i386: Fix up ix86_abi handling [PR106875]
The following testcase fails since my changes to make also
opts_set saved/restored upon function target/optimization changes
(before it has been acting as "has this option be ever explicit
anywhere?").
The problem is that for ix86_abi we depend on the opts_set
value for it in ix86_option_override_internal:
SET_OPTION_IF_UNSET (opts, opts_set, ix86_abi, DEFAULT_ABI);
but as it is a TargetSave, the backend code is required to
save/restore it manually (it does that) and since gcc 11 also
to save/restore the opts_set bit for it (which isn't done).
We don't do that for various other TargetSave which
ix86_function_specific_{save,restore} saves/restores, but as long
as we never test opts_set for it, it doesn't really matter.
One possible fix would be to introduce some new TargetSave into
which ix86_function_specific_{save,restore} would save/restore a bitmask
of the opts_set bits. The following patch uses an easier fix, by
making it a TargetVariable instead the saving/restoring is handled
by the generated code.
The differences in options.h are just slight movements on where
*ix86_abi stuff appears in it, ditto for options.cc, the real
differences are just in options-save.cc, where cl_target_option_save
gets:
+ ptr->x_ix86_abi = opts->x_ix86_abi;
...
+ if (opts_set->x_ix86_abi) mask |= HOST_WIDE_INT_1U << 3;
(plus adjustments of following TargetVariables mask related stuff),
cl_target_option_restore gets:
+ opts->x_ix86_abi = ptr->x_ix86_abi;
...
+ opts_set->x_ix86_abi = static_cast<enum calling_abi>((mask & 1) != 0);
+ mask >>= 1;
plus the movements in other functions too. So, by it being a
TargetVariable, the only thing that changed is that we don't need to
handle it manually in ix86_function_specific_{save,restore} because it
is handled automatically including the opts_set stuff.
2022-11-28 Jakub Jelinek <jakub@redhat.com>
PR target/106875
* config/i386/i386.opt (x_ix86_abi): Remove TargetSave.
(ix86_abi): Replace it with TargetVariable.
* config/i386/i386-options.cc (ix86_function_specific_save,
ix86_function_specific_restore): Don't save and restore x_ix86_abi.
arm: Add integer vector overloading of vsubq_x instrinsic
In the past we had only defined the vsubq_x generic overload of the
vsubq_x_* intrinsics for float vector types. This would cause them
to fall back to the `__ARM_undef` failure state if they was called
through the generic version.
This patch simply adds these overloads.
gcc/ChangeLog:
* config/arm/arm_mve.h (__arm_vsubq_x FP): New overloads.
(__arm_vsubq_x Integer): New.
arm: Explicitly specify other float types for _Generic overloading [PR107515]
This patch adds explicit references to other float types
to __ARM_mve_typeid in arm_mve.h. Resolves PR 107515:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515
arm: propagate fixed overloading of MVE intrinsic scalar parameters
This is a mechanical patch that propagates the change proposed in
my previous patch for vaddq[_m]_n
across all other polymorphic MVE intrinsic overloads of scalar types.
The above commits changed the definitions of the intrinsics from using
`[u]int[8/16/32]_t` types for the scalar argument to using `int`. This
allowed `int` to be supported in user code through the overloaded
`#defines`, but seems to have broken the `[u]int[8/16/32]_t` types
The solution implemented by this patch is to explicitly use a new
_Generic mapping from all the `[u]int[8/16/32]_t` types for int. With this
change, both `int` and `[u]int[8/16/32]_t` parameters are supported from
user code and are handled by the overloading mechanism correctly.
Note that in these scalar cases it is safe to pass the raw p<n>, rather
than the typeof-ed __p<n>, because we are not at risk of the _Generics
being exponentially expanded on the `n` scalar argument to an `_n`
intrinsic. Using p<n> instead will give a more accurate error message
to the user, should something be wrong with that argument.
Richard Biener [Mon, 28 Nov 2022 08:19:33 +0000 (09:19 +0100)]
tree-optimization/107876 - unswitching of switch
The following shows a missed update of dominators when unswitching
removes unreachable edges from switch stmts it unswitches. Fixed
by wiping dominator info in that case.
PR tree-optimization/107876
* tree-ssa-loop-unswitch.cc (clean_up_after_unswitching): Wipe
dominator info if we removed an edge.
Lulu Cheng [Thu, 17 Nov 2022 09:08:36 +0000 (17:08 +0800)]
LoongArch: Optimize immediate load.
The immediate number is split in the Split pass, not in the expand pass.
Because loop2_invariant pass will extract the instructions that do not change
in the loop out of the loop, some instructions will not meet the extraction
conditions if the machine performs immediate decomposition while expand pass,
so the immediate decomposition will be transferred to the split process.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (enum loongarch_load_imm_method):
Remove the member METHOD_INSV that is not currently used.
(struct loongarch_integer_op): Define a new member curr_value,
that records the value of the number stored in the destination
register immediately after the current instruction has run.
(loongarch_build_integer): Assign a value to the curr_value member variable.
(loongarch_move_integer): Adds information for the immediate load instruction.
* config/loongarch/loongarch.md (*movdi_32bit): Redefine as define_insn_and_split.
(*movdi_64bit): Likewise.
(*movsi_internal): Likewise.
(*movhi_internal): Likewise.
* config/loongarch/predicates.md: Return true as long as it is CONST_INT, ensure
that the immediate number is not optimized by decomposition during expand
optimization loop.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/imm-load.c: New test.
* gcc.target/loongarch/imm-load1.c: New test.
Sandra Loosemore [Tue, 15 Nov 2022 03:40:12 +0000 (03:40 +0000)]
OpenMP: Generate SIMD clones for functions with "declare target"
This patch causes the IPA simdclone pass to generate clones for
functions with the "omp declare target" attribute as if they had
"omp declare simd", provided the function appears to be suitable for
SIMD execution. The filter is conservative, rejecting functions
that write memory or that call other functions not known to be safe.
A new option -fopenmp-target-simd-clone is added to control this
transformation; it's enabled for offload processing at -O2 and higher.
gcc/ChangeLog:
* common.opt (fopenmp-target-simd-clone): New option.
(target_simd_clone_device): New enum to go with it.
* doc/invoke.texi (-fopenmp-target-simd-clone): Document.
* flag-types.h (enum omp_target_simd_clone_device_kind): New.
* omp-simd-clone.cc (auto_simd_fail): New function.
(auto_simd_check_stmt): New function.
(plausible_type_for_simd_clone): New function.
(ok_for_auto_simd_clone): New function.
(simd_clone_create): Add force_local argument, make the symbol
have internal linkage if it is true.
(expand_simd_clones): Also check for cloneable functions with
"omp declare target". Pass explicit_p argument to
simd_clone.compute_vecsize_and_simdlen target hook.
* opts.cc (default_options_table): Add -fopenmp-target-simd-clone.
* target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN):
Add bool explicit_p argument.
* doc/tm.texi: Regenerated.
* config/aarch64/aarch64.cc
(aarch64_simd_clone_compute_vecsize_and_simdlen): Update.
* config/gcn/gcn.cc
(gcn_simd_clone_compute_vecsize_and_simdlen): Update.
* config/i386/i386.cc
(ix86_simd_clone_compute_vecsize_and_simdlen): Update.
ChangeLog:
* Makefile.def: Add libsframe as new module with its dependencies.
* Makefile.in: Regenerated.
* configure.ac: Add libsframe to host_libs.
* configure: Regenerated.
Jonathan Wakely [Fri, 25 Nov 2022 11:40:37 +0000 (11:40 +0000)]
libstdc++: Fix orphaned/nested output of configure checks
This moves two AC_MSG_RESULT lines for <uchar.h> features so that they
are only printed when the corresponding AC_MSG_CHECKING actually
happened. This fixes configure output like:
checking for uchar.h... no
no
checking for int64_t... yes
Also move the AC_MSG_CHECKING for libbacktrace support so it doesn't
come after AC_CHECK_HEADERS output. This fixes:
checking whether to build libbacktrace support... checking for sys/mman.h... (cached) yes
yes
libstdc++-v3/ChangeLog:
* acinclude.m4 (GLIBCXX_CHECK_UCHAR_H): Don't use AC_MSG_RESULT
unless the AC_MSG_CHECKING happened.
* configure: Regenerate.
Jonathan Wakely [Thu, 24 Nov 2022 21:09:03 +0000 (21:09 +0000)]
libstdc++: Call predicate with non-const values in std::erase_if [PR107850]
As specified in the standard, the predicate for std::erase_if has to be
invocable as non-const with a non-const lvalues argument. Restore
support for predicates that only accept non-const arguments.
It's not strictly nevessary to change it for the set and unordered_set
overloads, because they only give const access to the elements anyway.
I've done it for them too just to keep them all consistent.
Jonathan Wakely [Thu, 24 Nov 2022 21:16:41 +0000 (21:16 +0000)]
libstdc++: Do not define operator!= in <random> for C++20
These overloads are not needed in C++20 as they can be synthesized by
the compiler. Removing them means less code to compile when including
these headers.
libstdc++-v3/ChangeLog:
* include/bits/random.h [three_way_comparison] (operator!=):
Do not define inequality operators when C++20 three way
comparisons are supported.
* include/ext/random [three_way_comparison] (operator!=):
Likewise.
Jonathan Wakely [Thu, 24 Nov 2022 21:13:36 +0000 (21:13 +0000)]
libstdc++: Add always_inline to trivial iterator operations
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator_base_funcs.h (__distance):
Add always_inline attribute to overload for random
access iterators.
(advance, distance, next, prev): Add always_inline attribute to
inline functions that just forward to another function.
Tamar Christina [Fri, 25 Nov 2022 12:57:24 +0000 (12:57 +0000)]
sve2: Fix expansion of division [PR107830]
SVE has an actual division optab, and when using -Os we don't
optimize the division away. This means that we need to distinguish
between a div which we can optimize and one we cannot even during
expansion.
gcc/ChangeLog:
PR target/107830
* config/aarch64/aarch64.cc
(aarch64_vectorize_can_special_div_by_constant): Check validity during
codegen phase as well.
gcc/testsuite/ChangeLog:
PR target/107830
* gcc.target/aarch64/sve2/pr107830-1.c: New test.
* gcc.target/aarch64/sve2/pr107830-2.c: New test.
Tobias Burnus [Fri, 25 Nov 2022 12:48:17 +0000 (13:48 +0100)]
libgomp: Add no-target-region rev offload test + fix plugin-nvptx
OpenMP permits that a 'target device(ancestor:1)' is called without being
enclosed in a target region - using the current device (i.e. the host) in
that case. This commit adds a testcase for this.
In case of nvptx, the missing on-device 'GOMP_target_ext' call causes that
it and also the associated on-device GOMP_REV_OFFLOAD_VAR variable are not
linked in from nvptx's libgomp.a. Thus, handle the failing cuModuleGetGlobal
gracefully by disabling reverse offload and assuming that the failure is fine.
libgomp/ChangeLog:
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Use unsigned int
for 'i' to match 'fn_entries'; regard absent GOMP_REV_OFFLOAD_VAR
as valid and the code having no reverse-offload code.
* testsuite/libgomp.c-c++-common/reverse-offload-2.c: New test.
Eric Botcazou [Fri, 25 Nov 2022 09:49:20 +0000 (10:49 +0100)]
Fix thinko in operator_bitwise_xor::op1_range
There is a thinko in the op1_range method of ranger's operator_bitwise_xor
class in a boolean context: if the result is known to be true, it may infer
that a specific operand is false without any basis.
Richard Biener [Fri, 25 Nov 2022 07:27:42 +0000 (08:27 +0100)]
tree-optimization/107865 - ICE with outlining of loops
The following makes sure to clear loops number of iterations when
outlining them as part of a SESE region as can happen with
auto-parallelization. The referenced SSA names become stale otherwise.
PR tree-optimization/107865
* tree-cfg.cc (move_sese_region_to_fn): Free the number of
iterations of moved loops.
* gfortran.dg/graphite/pr107865.f90: New testcase.
Kewen.Lin [Fri, 25 Nov 2022 03:17:28 +0000 (21:17 -0600)]
Adjust the symbol for SECTION_LINK_ORDER linked_to section [PR99889]
As discussed in PR98125, -fpatchable-function-entry with
SECTION_LINK_ORDER support doesn't work well on powerpc64
ELFv1 because the filled "Symbol" in
.section name,"flags"o,@type,Symbol
sits in .opd section instead of in the function_section
like .text or named .text*.
Since we already generates one label LPFE* which sits in
function_section of current_function_decl, this patch is
to reuse it as the symbol for the linked_to section. It
avoids the above ABI specific issue when using the symbol
concluded from current_function_decl.
Besides, with this support some previous workarounds can
be reverted.
PR target/99889
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry):
Adjust to call function default_print_patchable_function_entry.
* targhooks.cc (default_print_patchable_function_entry_1): Remove and
move the flags preparation ...
(default_print_patchable_function_entry): ... here, adjust to use
current_function_funcdef_no for label no.
* targhooks.h (default_print_patchable_function_entry_1): Remove.
* varasm.cc (default_elf_asm_named_section): Adjust code for
__patchable_function_entries section support with LPFE label.
gcc/testsuite/ChangeLog:
* g++.dg/pr93195a.C: Remove the skip on powerpc*-*-* 64-bit.
* gcc.target/aarch64/pr92424-2.c: Adjust LPFE1 with LPFE0.
* gcc.target/aarch64/pr92424-3.c: Likewise.
* gcc.target/i386/pr93492-2.c: Likewise.
* gcc.target/i386/pr93492-3.c: Likewise.
* gcc.target/i386/pr93492-4.c: Likewise.
* gcc.target/i386/pr93492-5.c: Likewise.
Wilco Dijkstra [Wed, 23 Nov 2022 17:27:19 +0000 (17:27 +0000)]
AArch64: Add fma_reassoc_width [PR107413]
Add a reassocation width for FMA in per-CPU tuning structures. Keep
the existing setting of 1 for cores with 2 FMA pipes (this disables
reassociation), and use 4 for cores with 4 FMA pipes. This improves
SPECFP2017 on Neoverse V1 by ~1.5%.
gcc/
PR tree-optimization/107413
* config/aarch64/aarch64.cc (struct tune_params): Add
fma_reassoc_width to all CPU tuning structures.
(aarch64_reassociation_width): Use fma_reassoc_width.
* config/aarch64/aarch64-protos.h (struct tune_params): Add
fma_reassoc_width.
Jakub Jelinek [Thu, 24 Nov 2022 10:51:34 +0000 (11:51 +0100)]
c++: Further -fcontract* option description fixes
During testing I've missed my previous patch just changed:
-FAIL: compiler driver --help=c++ option(s): "^ +-.*[^:.]\$" absent from output: " -fcontract-build-level=[off|default|audit] Specify max contract level to generate runtime checks for"
+FAIL: compiler driver --help=c++ option(s): "^ +-.*[^:.]\$" absent from output: " -fcontract-role=<name>:<semantics> Specify the semantics for all levels in a role (default, review), or a custom contract role with given semantics (ex: opt:assume,assume,assume)"
rather than actually fixed it, the test only reports the first such problem.
This patch fixes the remaining ones.
2022-11-24 Jakub Jelinek <jakub@redhat.com>
* c.opt (fcontract-role=, fcontract-semantic=): Terminate descriptions
with a dot.
Jakub Jelinek [Thu, 24 Nov 2022 10:29:54 +0000 (11:29 +0100)]
asan: Fix up error recovery for too large frames [PR107317]
asan_emit_stack_protection and functions it calls have various asserts that
verify sanity of the stack protection instrumentation. But, that
verification can easily fail if we've diagnosed a frame offset overflow.
asan_emit_stack_protection just emits some extra code in the prologue,
if we've reported errors, we aren't producing assembly, so it doesn't
really matter if we don't include the protection code, compilation
is going to fail anyway.
2022-11-24 Jakub Jelinek <jakub@redhat.com>
PR middle-end/107317
* asan.cc: Include diagnostic-core.h.
(asan_emit_stack_protection): Return NULL early if seen_error ().
Eric Botcazou [Tue, 22 Nov 2022 12:03:00 +0000 (13:03 +0100)]
ada: Add assertion for the implementation of storage models
We cannot generate a call to memset for an aggregate with an Others choice
when the target of the assignment has a storage model with Copy_To routine.
gcc/ada/
* gcc-interface/trans.cc (gnat_to_gnu) <N_Assignment_Statement>: Add
assertion that memset is not supposed to be used when the target has
a storage model with Copy_To routine.
Justin Squirek [Wed, 23 Nov 2022 07:56:45 +0000 (07:56 +0000)]
ada: Spurious error on Lock_Free protected type with discriminants
This patch corrects an issue in the compiler whereby unprefixed discriminants
appearing in protected subprograms were unable to be properly resolved -
leading to spurious resolution errors.
gcc/ada/
* sem_ch8.adb
(Find_Direct_Name): Remove bypass to reanalyze incorrectly
analyzed discriminals.
(Set_Entity_Or_Discriminal): Avoid resetting the entity field of a
discriminant reference to be the internally generated renaming
when we are in strict preanalysis mode.
Florian Weimer [Thu, 24 Nov 2022 10:00:54 +0000 (11:00 +0100)]
c: Propagate erroneous types to declaration specifiers [PR107805]
Without this change, finish_declspecs cannot tell that whether there
was an erroneous type specified, or no type at all. This may result
in additional diagnostics for implicit ints, or missing diagnostics
for multiple types.
PR c/107805
gcc/c/
* c-decl.cc (declspecs_add_type): Propagate error_mark_bode
from type to specs.
gcc/testsuite/
* gcc.dg/pr107805-1.c: New test.
* gcc.dg/pr107805-2.c: Likewise.
Jakub Jelinek [Thu, 24 Nov 2022 09:38:42 +0000 (10:38 +0100)]
libstdc++: Another merge from fast_float upstream [PR107468]
Upstream fast_float came up with a cheaper test for
fegetround () == FE_TONEAREST using one float addition, one subtraction
and one comparison. If we know we are rounding to nearest, we can use
fast path in more cases as before.
The following patch merges those changes into libstdc++.
2022-11-24 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/107468
* src/c++17/fast_float/MERGE: Adjust for merge from upstream.
* src/c++17/fast_float/fast_float.h: Merge from fast_float 2ef9abbcf6a11958b6fa685a89d0150022e82e78 commit.
Jakub Jelinek [Thu, 24 Nov 2022 09:37:50 +0000 (10:37 +0100)]
libstdc++: Workaround buggy printf on Solaris in to_chars/float128_c++23.cc test [PR107815]
As mentioned in the PR, Solaris apparently can handle right
printf ("%.0Lf\n", 1e+202L * __DBL_MAX__);
which prints 511 chars long number, but can't handle
printf ("%.0Lf\n", 1e+203L * __DBL_MAX__);
nor
printf ("%.0Lf\n", __LDBL_MAX__);
properly, instead of printing 512 chars long number for the former and
4933 chars long number for the second, it handles them as
if user asked for "%.0Le\n" in those cases.
The following patch disables the single problematic value that fails
in the test, and also fixes commented out debugging printouts.
2022-11-24 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/107815
* testsuite/20_util/to_chars/float128_c++23.cc (test): Disable
__FLT128_MAX__ test on Solaris. Fix up commented out debugging
printouts.
Martin Liska [Thu, 24 Nov 2022 07:40:10 +0000 (08:40 +0100)]
analyzer: fix Clang warnings
Fixes the following warnings:
gcc/analyzer/varargs.cc:655:8: warning: 'matches_call_types_p' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/analyzer/varargs.cc:707:50: warning: unused parameter 'cd' [-Wunused-parameter]
gcc/analyzer/varargs.cc:707:8: warning: 'matches_call_types_p' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
Aldy Hernandez [Tue, 22 Nov 2022 09:44:18 +0000 (10:44 +0100)]
Remove use_equiv_p in vr-values.cc
With no equivalences, the use_equiv_p argument in various methods in
simplify_using_ranges is always false. This means we can remove all
calls to compare_names, along with the function.
gcc/ChangeLog:
* vr-values.cc (simplify_using_ranges::compare_names): Remove.
(vrp_evaluate_conditional_warnv_with_ops): Remove call to
compare_names.
(simplify_using_ranges::vrp_visit_cond_stmt): Remove use_equiv_p
argument to vrp_evaluate_conditional_warnv_with_ops.
* vr-values.h (class simplify_using_ranges): Remove
compare_names.
Remove use_equiv_p to vrp_evaluate_conditional_warnv_with_ops.
Aldy Hernandez [Sat, 19 Nov 2022 10:11:25 +0000 (11:11 +0100)]
Remove unused legacy VRP code.
Removes unused legacy VRP code. The legacy mode in value_range's is
still around, as it can't be trivially deleted.
With this patch vr-values.cc melts away to simplify_using_ranges, but
I have avoided any renaming of actual files, since we have plans for
consolidation of other folding with ranges for the next release.
The issue is that at the write to *argc we don't know if argc could
point within *args, and so we conservatiely set *args to be unknown.
At the write "(*args)[0] = __builtin_malloc(42)" we have the result of
the allocation written through an unknown pointer, so we mark the
heap_allocated_region as having escaped.
Unfortunately, within store::canonicalize we overzealously purge the
heap allocated region, losing the information that it has escaped, and
thus errnoeously report a leak.
The first part of the fix is to update store::canonicalize so that it
doesn't purge heap_allocated_regions that are marked as escaping.
Doing so fixes the leak false positive, but leads to various state
explosions relating to anywhere we have a malloc/free pair in a loop,
where the analysis of the iteration appears to only have been reaching
a fixed state due to a bug in the state merger code that was erroneously
merging state about the region allocated in one iteration with that
of another. On touching that, the analyzer fails to reach a fixed state
on any loops containing a malloc/free pair, since each analysis of a
malloc was creating a new heap_allocated_region instance.
Hence the second part of the fix is to revamp how heap_allocated_regions
are managed within the analyzer. Rather than create a new one at each
analysis of a malloc call, instead we reuse them across the analysis,
only creating a new one if the current path's state is referencing all
of the existing ones. Hence the heap_allocated_region instances get
used in a fixed order along every analysis path, so e.g. at:
if (flag)
p = malloc (4096);
else
p = malloc (1024);
both paths now use the same heap_allocated_region for their malloc
calls - but we still end up with two enodes after the CFG merger, by
rejecting merger of states with non-equal dynamic extents.
gcc/analyzer/ChangeLog:
PR analyzer/106473
* call-summary.cc
(call_summary_replay::convert_region_from_summary_1): Update for
change to creation of heap-allocated regions.
* program-state.cc (test_program_state_1): Likewise.
(test_program_state_merging): Likewise.
* region-model-impl-calls.cc (kf_calloc::impl_call_pre): Likewise.
(kf_malloc::impl_call_pre): Likewise.
(kf_operator_new::impl_call_pre): Likewise.
(kf_realloc::impl_call_postsuccess_with_move::update_model): Likewise.
* region-model-manager.cc
(region_model_manager::create_region_for_heap_alloc): Convert
to...
(region_model_manager::get_or_create_region_for_heap_alloc):
...this, reusing an existing region if it's unreferenced in the
client state.
* region-model-manager.h (region_model_manager::get_num_regions): New.
(region_model_manager::create_region_for_heap_alloc): Convert to...
(region_model_manager::get_or_create_region_for_heap_alloc): ...this.
* region-model.cc (region_to_value_map::can_merge_with_p): Reject
merger when the values are different.
(region_model::create_region_for_heap_alloc): Convert to...
(region_model::get_or_create_region_for_heap_alloc): ...this.
(region_model::get_referenced_base_regions): New.
(selftest::test_state_merging): Update for change to creation of
heap-allocated regions.
(selftest::test_malloc_constraints): Likewise.
(selftest::test_malloc): Likewise.
* region-model.h: Include "sbitmap.h".
(region_model::create_region_for_heap_alloc): Convert to...
(region_model::get_or_create_region_for_heap_alloc): ...this.
(region_model::get_referenced_base_regions): New decl.
* store.cc (store::canonicalize): Don't purge a heap-allocated region
that's been marked as escaping.
gcc/testsuite/ChangeLog:
PR analyzer/106473
* gcc.dg/analyzer/aliasing-pr106473.c: New test.
* gcc.dg/analyzer/allocation-size-2.c: Add
-fanalyzer-fine-grained".
* gcc.dg/analyzer/allocation-size-3.c: Likewise.
* gcc.dg/analyzer/explode-1.c: Mark leak with XFAIL.
* gcc.dg/analyzer/explode-3.c: New test.
* gcc.dg/analyzer/malloc-reuse.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Hongyu Wang [Sat, 19 Nov 2022 01:38:00 +0000 (09:38 +0800)]
i386: Only enable small loop unrolling in backend [PR 107692]
Followed by the discussion in pr107692, -munroll-only-small-loops
Does not turns on/off -funroll-loops, and current check in
pass_rtl_unroll_loops::gate would cause -fno-unroll-loops do not take
effect. Revert the change about targetm.loop_unroll_adjust and apply
the backend option change to strictly follow the rule that
-funroll-loops takes full control of loop unrolling, and
munroll-only-small-loops just change its behavior to unroll small size
loops.
gcc/ChangeLog:
PR target/107692
* common/config/i386/i386-common.cc (ix86_optimization_table):
Enable loop unroll O2, disable -fweb and -frename-registers
by default.
* config/i386/i386-options.cc
(ix86_override_options_after_change):
Disable small loop unroll when funroll-loops enabled, reset
cunroll_grow_size when it is not explicitly enabled.
(ix86_option_override_internal): Call
ix86_override_options_after_change instead of calling
ix86_recompute_optlev_based_flags and ix86_default_align
separately.
* config/i386/i386.cc (ix86_loop_unroll_adjust): Adjust unroll
factor if -munroll-only-small-loops enabled.
* loop-init.cc (pass_rtl_unroll_loops::gate): Do not enable
loop unrolling for -O2-speed.
(pass_rtl_unroll_loops::execute): Rmove
targetm.loop_unroll_adjust check.
Rainer Orth [Wed, 23 Nov 2022 20:54:26 +0000 (21:54 +0100)]
analyzer: Use __builtin_alloca in gcc.dg/analyzer/call-summaries-2.c
gcc.dg/analyzer/call-summaries-2.c currently FAILs on Solaris:
FAIL: gcc.dg/analyzer/call-summaries-2.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c:468:12:
warning: implicit declaration of function 'alloca' [-Wimplicit-function-declaration]
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c:468:12:
warning: incompatible implicit declaration of built-in function 'alloca' [-Wbuiltin-declaration-mismatch]
alloca is only declared in <alloca.h>, which isn't included indirectly
anywhere. To avoid this, I switched the test to use __builtin_alloca
instead, following the vast majority of analyzer tests that use alloca.
Tested no i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.
Jakub Jelinek [Wed, 23 Nov 2022 18:09:31 +0000 (19:09 +0100)]
c: Fix compile time hog in c_genericize [PR107127]
The complex multiplications result in deeply nested set of many SAVE_EXPRs,
which takes even on fast machines over 5 minutes to walk.
This patch fixes that by using walk_tree_without_duplicates where it is
instant.
2022-11-23 Andrew Pinski <apinski@marvell.com>
Jakub Jelinek <jakub@redhat.com>
PR c/107127
* c-gimplify.cc (c_genericize): Use walk_tree_without_duplicates
instead of walk_tree for c_genericize_control_r.
Jakub Jelinek [Wed, 23 Nov 2022 13:43:48 +0000 (14:43 +0100)]
diagnostics: Fix selftest ICE in certain locales [PR107722]
As reported in the PR, since special_fname_builtin () call has been
introduced, the diagnostics code compares filename against _("<built-in>")
rather than "<built-in>", which means that if self tests are performed
with the string being translated, one self-test fails.
The following patch fixes that.
2022-11-23 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/107722
* diagnostic.cc (test_diagnostic_get_location_text): Test
special_fname_builtin () rather than "<built-in>" and expect
special_fname_builtin () concatenated with ":" for it.
Jakub Jelinek [Wed, 23 Nov 2022 10:53:54 +0000 (11:53 +0100)]
libstdc++: Fix libstdc++ build on some targets [PR107811]
fast_float library relies on size_t being 32-bit or larger and float/double
being IEEE single/double. Otherwise we only use strtod/strtof.
In 3 spots I've used fast_float namespace stuff unconditionally in one
function, which breaks the build if fast_float is disabled.
2022-11-23 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/107811
* src/c++17/floating_from_chars.cc (__floating_from_chars_hex): Guard
fast_float uses with #if USE_LIB_FAST_FLOAT and for mantissa_bits and
exponent_bits provide a fallback.
Jakub Jelinek [Wed, 23 Nov 2022 08:29:04 +0000 (09:29 +0100)]
c++: Fix up -fcontract* options
I've noticed
+FAIL: compiler driver --help=c++ option(s): "^ +-.*[^:.]\$" absent from output: " -fcontract-build-level=[off|default|audit] Specify max contract level to generate runtime checks for"
error, this is due to missing dot at the end of the description.
The second part of the first hunk should fix that, but while at it,
I find it weird that some options don't have RejectNegative, yet
for options that accept an argument a negative option looks weird
and isn't really handled.
Though, shall we have those [on|off] options at all?
Those are inconsistent with all other boolean options gcc has.
Every other boolean option is -fwhatever for it being on
and -fno-whatever for it being off, shouldn't the options be
without arguments and accept negatives (-fcontract-assumption-mode
vs. -fno-contract-assumption-mode etc.)?
* config/loongarch/constraints.md (ZD): New constraint.
* config/loongarch/loongarch-def.c: Initial number of parallel prefetch.
* config/loongarch/loongarch-tune.h (struct loongarch_cache):
Define number of parallel prefetch.
* config/loongarch/loongarch.cc (loongarch_option_override_internal):
Set up parameters to be used in prefetching algorithm.
* config/loongarch/loongarch.md (prefetch): New template.