gcc.gnu.org Git - gcc.git/log

rtl-optimization/54052 - RTL SSA PHI insertion compile-time hog

The following tries to address the PHI insertion compile-time hog in
RTL fwprop observed with the PR54052 testcase where the loop computing
the "unfiltered" set of variables possibly needing PHI nodes for each
block exhibits quadratic compile-time and memory-use.

It does so by pruning the local DEFs with LR_OUT of the block, removing
regs that can never be LR_IN (defined by this block) in the dominance
frontier.
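
As a rough illustration of the pruning described above (this is not the
code in rtl-ssa/blocks.cc), the sketch below models register sets as
64-bit masks.  The names regset, reg_bit and prune_local_defs are
hypothetical, and the 64-register limit is an assumption made only to
keep the example small.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per register, at most 64 registers.  */
typedef uint64_t regset;

/* Return the set containing only register REGNO.  */
static regset
reg_bit (unsigned regno)
{
  assert (regno < 64);
  return (regset) 1 << regno;
}

/* Filter the registers defined in a block by the block's LR_OUT set.
   A register that is not live out of the block cannot become live on
   entry to a dominance-frontier block through this definition, so it
   is dropped from the PHI candidates.  */
static regset
prune_local_defs (regset local_defs, regset lr_out)
{
  return local_defs & lr_out;
}
```

With defs {1, 2, 7} and LR_OUT {2, 9}, only register 2 survives as a
candidate, so the per-block candidate sets shrink before the dominance
frontier is walked.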

PR rtl-optimization/54052
* rtl-ssa/blocks.cc (function_info::place_phis): Filter
local defs by LR_OUT.

(cherry picked from commit c7151283dc747769d4ac4f216d8f519bda2569b5)

Daily bump.

diagnostics: fix corrupt json/SARIF on stderr [PR114348]

Various values of -fdiagnostics-format= request machine-readable output
on stderr, using JSON, but in various places we use fnotice to write
free-form text to stderr, such as "compilation terminated", leading to
corrupt JSON.

Fix by having fnotice skip the output for such cases.
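
A minimal sketch of the idea, assuming a file-scope variable that
records the requested format.  The enum, the variable and the function
name notice are hypothetical stand-ins, not the declarations in
diagnostic.cc; the check against stderr is likewise an assumption about
how such a guard could look.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the requested diagnostic output format.  */
enum output_format_kind
{
  FORMAT_TEXT,
  FORMAT_JSON_STDERR,
  FORMAT_SARIF_STDERR
};

static enum output_format_kind output_format = FORMAT_TEXT;

/* Write a free-form notice to FILE.  If a machine-readable format owns
   stderr, skip notices aimed at stderr so the JSON stays well-formed.
   Returns the number of characters written, or 0 if skipped.  */
static int
notice (FILE *file, const char *fmt, ...)
{
  assert (file != NULL && fmt != NULL);
  if (file == stderr && output_format != FORMAT_TEXT)
    return 0;

  va_list ap;
  va_start (ap, fmt);
  int n = vfprintf (file, fmt, ap);
  va_end (ap);
  return n;
}
```

In this sketch a "compilation terminated." notice is silently dropped
once output_format is one of the stderr JSON variants, while notices to
other streams are unaffected.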

Backported from r14-9554-g0bf99b1b7eda2f (using a variable rather
than a vfunc of class diagnostic_output_format, since the latter
was added in gcc 14)

gcc/ChangeLog:
PR middle-end/114348
* diagnostic.cc (output_format): New variable.
(fnotice): Bail out if the user requested one of the
machine-readable diagnostic output formats on stderr.
(diagnostic_output_format_init): Set output_format.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Fix ICE in -fdiagnostics-generate-patch [PR112684]

Backported from r14-8255-ge254d1224df306.

gcc/ChangeLog:
PR middle-end/112684
* toplev.cc (toplev::main): Don't ICE in
-fdiagnostics-generate-patch when exiting after options,
since no edit context will have been created.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

diagnostics: fix ICE on sarif output when source file is unreadable [PR111700]

Backported from r14-4474-g94caa6a6b4bd73.

gcc/ChangeLog:
PR driver/111700
* input.cc (file_cache::add_file): Update leading comment to
clarify that it can fail.
(file_cache::lookup_or_add_file): Likewise.
(get_source_file_content): Gracefully handle lookup_or_add_file
failing.

gcc/testsuite/ChangeLog:
PR driver/111700
* c-c++-common/diagnostic-format-sarif-file-pr111700.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix ICE and false positive with -Wanalyzer-deref-before-check [PR114408]

Backported from commit r14-9646-g80a0cb37456c49 (moving testcase to gcc.dg
and handling conflict in kf.cc)

gcc/analyzer/ChangeLog:
PR analyzer/114408
* engine.cc (impl_run_checkers): Free up any dominance info that
we may have created.
* kf.cc (class kf_ubsan_handler): New.
(register_sanitizer_builtins): New.
(register_known_functions): Call register_sanitizer_builtins.

gcc/testsuite/ChangeLog:
PR analyzer/114408
* gcc.dg/analyzer/deref-before-check-pr114408.c: New test.
* c-c++-common/ubsan/analyzer-ice-pr114408.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix ICE due to type mismatch when replaying call summary [PR114473]

gcc/analyzer/ChangeLog:
PR analyzer/114473
* call-summary.cc
(call_summary_replay::convert_svalue_from_summary): Assert that
the types match.
(call_summary_replay::convert_region_from_summary): Likewise.
(call_summary_replay::convert_region_from_summary_1): Add missing
cast for the deref of RK_SYMBOLIC case.

gcc/testsuite/ChangeLog:
PR analyzer/114473
* gcc.dg/analyzer/call-summaries-pr114473.c: New test.

(cherry picked from commit r14-9697-gfdd59818e2abf6)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix -Wanalyzer-deref-before-check false positive seen in loop header macro [PR109251]

Backported from commit r14-9586-g9093f275e0a343 (moving tests from
c-c++-common to gcc.dg)

gcc/analyzer/ChangeLog:
PR analyzer/109251
* sm-malloc.cc (deref_before_check::emit): Reject cases where the
check is in a loop header within a macro expansion.
(deref_before_check::loop_header_p): New.

gcc/testsuite/ChangeLog:
PR analyzer/109251
* gcc.dg/analyzer/deref-before-check-pr109251-1.c: New test.
* gcc.dg/analyzer/deref-before-check-pr109251-2.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix -Wanalyzer-va-arg-type-mismatch false +ve on int types [PR111289]

Backported from commit r14-9076-g5651ad62b08096 (moving new tests from
c-c++-common to gcc.dg).

gcc/analyzer/ChangeLog:
PR analyzer/111289
* varargs.cc (representable_in_integral_type_p): New.
(va_arg_compatible_types_p): Add "arg_sval" param. Handle integer
types.
(kf_va_arg::impl_call_pre): Pass arg_sval to
va_arg_compatible_types_p.

gcc/testsuite/ChangeLog:
PR analyzer/111289
* gcc.dg/analyzer/stdarg-pr111289-int.c: New test.
* gcc.dg/analyzer/stdarg-pr111289-ptr.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix skipping of debug stmts [PR113253]

PR analyzer/113253 reports a case where the analyzer output varied
with and without -g enabled.

The root cause was that debug stmts were in the
FOR_EACH_IMM_USE_FAST list for SSA names, leading to the analyzer's
state purging logic differing between the -g and non-debugging cases,
and thus leading to differences in the exploration of the user's code.

Fix by skipping such stmts in the state-purging logic, and removing
debug stmts when constructing the supergraph.

gcc/analyzer/ChangeLog:
PR analyzer/113253
* region-model.cc (region_model::on_stmt_pre): Add gcc_unreachable
for debug statements.
* state-purge.cc
(state_purge_per_ssa_name::state_purge_per_ssa_name): Skip any
debug stmts in the FOR_EACH_IMM_USE_FAST list.
* supergraph.cc (supergraph::supergraph): Don't add debug stmts
to the supernodes.

gcc/testsuite/ChangeLog:
PR analyzer/113253
* gcc.dg/analyzer/deref-before-check-pr113253.c: New test.

(cherry picked from commit r14-8670-gcc7aebff74d896)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix defaults in compound assignments from non-zero offsets [PR112969]

Confusion in binding_cluster::maybe_get_compound_binding about whether
offsets are relative to the start of the region or to the start of the
cluster was leading to incorrect handling of default values, leading
to false positives from -Wanalyzer-use-of-uninitialized-value, from
-Wanalyzer-exposure-through-uninit-copy, and other logic errors.
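
To make the two frames of reference concrete, here is a small sketch
under stated assumptions: struct bit_range and relative_to_region are
hypothetical simplifications, not the analyzer's own types, and offsets
are plain bit counts.

```c
#include <assert.h>

/* Hypothetical simplified bit range.  */
struct bit_range
{
  long start_bit;
  long size_in_bits;
};

/* RANGE is expressed relative to the start of the cluster (the base
   region).  Re-express it relative to a sub-region that starts
   REG_START_BIT bits into the cluster.  */
static struct bit_range
relative_to_region (struct bit_range range, long reg_start_bit)
{
  assert (range.size_in_bits > 0);
  struct bit_range result
    = { range.start_bit - reg_start_bit, range.size_in_bits };
  return result;
}
```

For a sub-region starting at bit 64 of its cluster, a default covering
cluster bits 96-127 is bits 32-63 of the sub-region; mixing up the two
frames puts the default value in the wrong place.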

Fixed thusly.

Backported from commit r14-8428-g6426d466779fa8 (keeping tests
in gcc.dg, rather than c-c++-common).

gcc/analyzer/ChangeLog:
PR analyzer/112969
* store.cc (binding_cluster::maybe_get_compound_binding): When
populating default_map, express the bit-range of the default key
for REG relative to REG, rather than to the base region.

gcc/testsuite/ChangeLog:
PR analyzer/112969
* gcc.dg/analyzer/compound-assignment-5.c (test_3): Remove
xfails, reorder tests.
* gcc.dg/analyzer/compound-assignment-pr112969.c: New test.
* gcc.dg/plugin/infoleak-pr112969.c: New test.
* gcc.dg/plugin/plugin.exp: Add infoleak-pr112969.c to
analyzer_kernel_plugin.c tests.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: casting all zeroes should give all zeroes [PR113333]

In particular, accessing the result of *calloc (1, SZ) (if non-NULL)
should be known to be all zeroes.
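
A small illustration of the kind of code this is about.  The function
name is hypothetical and this is not one of the tests added to
calloc-1.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Read an int out of a freshly calloc'ed buffer.  Returns 0 both when
   the allocation fails and when the read succeeds.  */
int
read_after_calloc (void)
{
  int *p = calloc (1, sizeof (int));
  if (p == NULL)
    return 0;

  int value = *p;
  /* calloc zero-fills the allocation, and an int whose bits are all
     zero has the value 0.  */
  assert (value == 0);
  free (p);
  return value;
}
```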

(backported from commit r14-7265-gd235bf2e807c5f)

gcc/analyzer/ChangeLog:
PR analyzer/113333
* region-model-manager.cc
(region_model_manager::maybe_fold_unaryop): Casting all zeroes
should give all zeroes.

gcc/testsuite/ChangeLog:
PR analyzer/113333
* gcc.dg/analyzer/calloc-1.c: Add tests.
* gcc.dg/analyzer/data-model-9.c: Update expected results.
* gcc.dg/analyzer/pr96639.c: Update expected results.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix deref-before-check false positives due to inlining [PR112790]

Backported from commit r14-6918-g5743e1899d5964 (moving testcase from
c-c++-common to gcc.dg).

gcc/analyzer/ChangeLog:
PR analyzer/112790
* checker-event.cc (class inlining_info): Move to...
* inlining-iterator.h (class inlining_info): ...here.
* sm-malloc.cc: Include "analyzer/inlining-iterator.h".
(maybe_complain_about_deref_before_check): Reject stmts that were
inlined from another function.

gcc/testsuite/ChangeLog:
PR analyzer/112790
* gcc.dg/analyzer/deref-before-check-pr112790.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix ICE for 2 bits before the start of base region [PR112889]

Concrete bindings were using -1 and -2 in the offset field to signify
deleted and empty hash slots, but these are valid values, leading to
assertion failures inside hash_map::put on a debug build, and probable
bugs in a release build.

(gdb) call k.dump(true)
start: -2, size: 1, next: -1

(gdb) p k.is_empty()
$6 = true

Fix by using the size field rather than the offset.
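
The idea can be sketched as follows.  struct binding_key and the helper
names are hypothetical simplifications of concrete_binding, and the
sentinel values -1 and -2 in the size field are an assumption for the
example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified key: a range of bits.  */
struct binding_key
{
  long start_bit;     /* Any value, including negative ones.  */
  long size_in_bits;  /* Positive for every real key.  */
};

static struct binding_key
make_key (long start_bit, long size_in_bits)
{
  assert (size_in_bits > 0);
  struct binding_key k = { start_bit, size_in_bits };
  return k;
}

/* Real keys always have a positive size, so non-positive sizes are
   free to act as hash-table sentinels.  */
static void mark_deleted (struct binding_key *k) { k->size_in_bits = -1; }
static void mark_empty (struct binding_key *k) { k->size_in_bits = -2; }

static bool
is_deleted (const struct binding_key *k)
{
  return k->size_in_bits == -1;
}

static bool
is_empty (const struct binding_key *k)
{
  return k->size_in_bits == -2;
}
```

With this layout a key such as "start: -2, size: 1" is no longer
mistaken for an empty slot.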

Backported from commit r14-6297-g775aeabcb870b7 (moving the testcase
from c-c++-common to gcc.dg).

gcc/analyzer/ChangeLog:
PR analyzer/112889
* store.h (concrete_binding::concrete_binding): Strengthen
assertion to require size to be positive, rather than just
non-zero.
(concrete_binding::mark_deleted): Use size rather than start bit
offset.
(concrete_binding::mark_empty): Likewise.
(concrete_binding::is_deleted): Likewise.
(concrete_binding::is_empty): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/112889
* gcc.dg/analyzer/ice-pr112889.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

jit: dump string literal initializers correctly

gcc/jit/ChangeLog:
* jit-recording.cc (recording::global::write_to_dump): Fix
dump of string literal initializers.

(cherry picked from commit r14-4923-gac66744d94226a)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

testsuite, analyzer: add test case [PR108171]

The ICE in PR analyzer/108171 appears to be a dup of the recently fixed
PR analyzer/110882 and is likewise fixed by it; adding this test case.

gcc/testsuite/ChangeLog:
PR analyzer/108171
* gcc.dg/analyzer/pr108171.c: New test.

(cherry picked from commit r14-2957-gf80efa49b7a163)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix ICE on zero-sized arrays [PR110882]

gcc/analyzer/ChangeLog:
PR analyzer/110882
* region.cc (int_size_in_bits): Fail on zero-sized types.

gcc/testsuite/ChangeLog:
PR analyzer/110882
* gcc.dg/analyzer/pr110882.c: New test.

(cherry picked from commit r14-2955-gc62f93d1e0383d)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: fix ICE on division of tainted floating-point values [PR110700]

gcc/analyzer/ChangeLog:
PR analyzer/110700
* region-model-manager.cc
(region_model_manager::get_or_create_int_cst): Assert that we have
an integral or pointer type.
* sm-taint.cc (taint_state_machine::check_for_tainted_divisor):
Don't check non-integral types.

gcc/testsuite/ChangeLog:
PR analyzer/110700
* gcc.dg/analyzer/taint-divisor-2.c: New test.

(cherry picked from commit r14-2658-gb86c0fe327a519)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

jit.exp: handle dwarf version mismatch in jit-check-debug-info [PR110466]

gcc/testsuite/ChangeLog:
PR jit/110466
* jit.dg/jit.exp (jit-check-debug-info): Gracefully handle too
early versions of gdb that don't support our dwarf version, via
"unsupported".

(cherry picked from commit r14-2223-gc3c0ba5436170e)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

jit: avoid using __vector in testcase [PR110466]

r13-4531-gd2e782cb99c311 added test coverage to libgccjit's vector
support, but used __vector, which doesn't work on Power. Additionally
the size param to gcc_jit_type_get_vector was wrong.

Fixed thusly.

gcc/testsuite/ChangeLog:
PR jit/110466
* jit.dg/test-expressions.c (run_test_of_comparison): Fix size
param to gcc_jit_type_get_vector.
(verify_comparisons): Use a typedef rather than __vector.

(cherry picked from commit r14-2222-g6735d660839533)

Co-authored-by: Marek Polacek <polacek@redhat.com>
Signed-off-by: David Malcolm <dmalcolm@redhat.com>

testsuite: Add more allocation size tests for conjured svalues [PR110014]

This patch adds the reproducers reported in PR 110014 as test cases. The
false positives in those cases are already fixed with PR 109577.

2023-06-09 Tim Lange <mail@tim-lange.me>

PR analyzer/110014

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/realloc-pr110014.c: New tests.

(cherry picked from commit r14-1685-g39adc5eebd61fd276f3f1ef9d7228756a35bd0cb)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: Fix allocation size false positive on conjured svalue [PR109577]

Currently, the analyzer tries to prove that the allocation size is a
multiple of the pointee's type size.  This patch reverses the behavior
to try to prove that the expression is not a multiple of the pointee's
type size.  With this change, each unhandled case should be gracefully
considered as correct.  This fixes the bug reported in PR 109577 by
Paul Eggert.
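
The reversed logic can be sketched like this.  The types and the
function should_warn are hypothetical; the real size_visitor walks
svalue trees rather than a two-case enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified view of an allocation size.  */
enum size_kind { SIZE_CONSTANT, SIZE_UNKNOWN };

struct alloc_size
{
  enum size_kind kind;
  unsigned long value;  /* Only meaningful for SIZE_CONSTANT.  */
};

/* Warn only when the size is proven NOT to be a multiple of
   ELEM_SIZE.  Anything that is not explicitly handled is treated as
   correct.  */
static bool
should_warn (struct alloc_size sz, unsigned long elem_size)
{
  assert (elem_size > 0);
  if (sz.kind != SIZE_CONSTANT)
    return false;
  return sz.value % elem_size != 0;
}
```

Under the previous approach the unknown case would have needed a proof
of divisibility to stay quiet; here it is quiet by default.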

Regression-tested on Linux x86-64 with -m32 and -m64.

2023-06-09  Tim Lange  <mail@tim-lange.me>

PR analyzer/109577

gcc/analyzer/ChangeLog:

* constraint-manager.cc (class sval_finder): Visitor to find
children in svalue trees.
(constraint_manager::sval_constrained_p): Add new function to
check whether a sval might be part of a constraint.
* constraint-manager.h: Add sval_constrained_p function.
* region-model.cc (class size_visitor): Reverse behavior to not
emit a warning on not explicitly considered cases.
(region_model::check_region_size):
Adapt to size_visitor changes.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/allocation-size-2.c: Change expected output
and add new test case.
* gcc.dg/analyzer/pr109577.c: New test.

(cherry picked from commit r14-1684-g1d57a2232575913ad1085bac0ba5e22b58185179)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: add caching to globals with initializers [PR110112]

PR analyzer/110112 notes that -fanalyzer is extremely slow on a source
file with large read-only static arrays, repeatedly building the
same compound_svalue representing the full initializer, and repeatedly
building svalues representing parts of the full initializer.

This patch adds caches for both of these; together they reduce the time
taken by -fanalyzer -O2 on the testcase in the bug for an optimized
build:
  91.2s : no caches (status quo)
  32.4s : cache in decl_region::get_svalue_for_constructor
   3.7s : cache in region::get_initial_value_at_main
   3.1s : both caches (this patch)
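
The caching pattern is ordinary memoization: compute once, store the
result on the object, and return the stored result afterwards.  A
minimal sketch, with a hypothetical struct region and a constant
standing in for the expensive computation:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a region with a cached initial value.  */
struct region
{
  const long *cached_init;  /* NULL until first computed.  */
  long storage;             /* Backing store for the cached value.  */
  unsigned calc_calls;      /* How often the slow path ran.  */
};

/* Slow path: stands in for rebuilding the value from the
   initializer.  */
static long
calc_initial_value (struct region *reg)
{
  reg->calc_calls++;
  return 42;
}

/* Fast path: run the slow path only on the first request.  */
static long
get_initial_value (struct region *reg)
{
  assert (reg != NULL);
  if (reg->cached_init == NULL)
    {
      reg->storage = calc_initial_value (reg);
      reg->cached_init = &reg->storage;
    }
  return *reg->cached_init;
}
```

Calling get_initial_value twice on the same region runs the slow path
once.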

gcc/analyzer/ChangeLog:
PR analyzer/110112
* region-model.cc (region_model::get_initial_value_for_global):
Move code to region::calc_initial_value_at_main.
* region.cc (region::get_initial_value_at_main): New function.
(region::calc_initial_value_at_main): New function, based on code
in region_model::get_initial_value_for_global.
(region::region): Initialize m_cached_init_sval_at_main.
(decl_region::get_svalue_for_constructor): Add a cache, splitting
out body to...
(decl_region::calc_svalue_for_constructor): ...this new function.
* region.h (region::get_initial_value_at_main): New decl.
(region::calc_initial_value_at_main): New decl.
(region::m_cached_init_sval_at_main): New field.
(decl_region::decl_region): Initialize m_ctor_svalue.
(decl_region::calc_svalue_for_constructor): New decl.
(decl_region::m_ctor_svalue): New field.

(cherry picked from commit r14-1664-gfe9771b59f576f)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

[PR114415][scheduler]: Fixing wrong code generation

For the test case, the insn scheduler (working for live range
shrinkage) moves insns modifying stack memory before an insn reserving
the stack memory. Comments in the patch contain more details about
the problem and its solution.

gcc/ChangeLog:

PR rtl-optimization/114415
* sched-deps.cc (add_insn_mem_dependence): Add memory check for mem argument.
(sched_analyze_1): Treat stack pointer modification as memory read.
(sched_analyze_2, sched_analyze_insn): Add memory guard for processing pending_read_mems.
* sched-int.h (deps_desc): Add comment to pending_read_mems.

gcc/testsuite/ChangeLog:

PR rtl-optimization/114415
* gcc.target/i386/pr114415.c: New test.

Fix range-ops operator_addr.

Lack of symbolic information prevents op1_range from being able to draw
the same conclusions as fold_range can.

PR tree-optimization/111009
gcc/
* range-op.cc (operator_addr_expr::op1_range): Be more restrictive.
* value-range.h (contains_zero_p): New.

gcc/testsuite/
* gcc.dg/pr111009.c: New.

Daily bump.

testsuite: Fix up vector-subaccess-1.C test for ia32 [PR89224]

The test FAILs on i686-linux due to
.../gcc/testsuite/g++.dg/torture/vector-subaccess-1.C:16:6: warning: SSE vector argument without SSE enabled changes the ABI [-Wpsabi]
excess warnings.

This fixes it by adding -Wno-psabi, like commonly done in other tests.

2024-05-09 Jakub Jelinek <jakub@redhat.com>

PR c++/89224
* g++.dg/torture/vector-subaccess-1.C: Add -Wno-psabi as additional
options.

(cherry picked from commit 8fb65ec816ff8f0d529b6d30821abace4328c9a2)

AVR: target/114981 - Support __builtin_powi[l] / __powidf2.

This supports __powidf2 by means of a double wrapper for already
existing f7_powi (renamed to __f7_powi by f7-renames.h).
It tweaks the implementation so that it does not perform trivial
multiplications with 1.0 any more, but instead uses a move.
It also fixes the last statement of f7_powi, which was wrong.
Notice that f7_powi was unused until now.

PR target/114981
libgcc/config/avr/libf7/
* libf7-common.mk (F7_ASM_PARTS): Add D_powi
* libf7-asm.sx (F7MOD_D_powi_, __powidf2): New module and function.
* libf7.c (f7_powi): Fix last (wrong) statement.
Tweak trivial multiplications with 1.0.

gcc/testsuite/
* gcc.target/avr/pr114981-powil.c: New test.

(cherry picked from commit de4eea7d7ea86e54843507c68d6672eca9d8c7bb)

reassoc: Fix up optimize_range_tests_to_bit_test [PR114965]

The optimize_range_tests_to_bit_test optimization normally emits a range
test first:
          if (entry_test_needed)
            {
              tem = build_range_check (loc, optype, unshare_expr (exp),
                                       false, lowi, high);
              if (tem == NULL_TREE || is_gimple_val (tem))
                continue;
            }
so during the bit test we already know that exp is in the [lowi, high]
range, but skips it if we have range info which tells us this isn't
necessary.
Also, normally it emits shifts by exp - lowi counter, but has an
optimization to use just exp counter if the mask isn't a more expensive
constant in that case and lowi is > 0 and high is smaller than prec.

The following testcase is miscompiled because the two abnormal cases
are triggered.  The range of exp is [43, 43][48, 48][95, 95], so we on
64-bit arch decide we don't need the entry test, because 95 - 43 < 64.
And we also decide to use just exp as counter, because the range test
tests just for exp == 43 || exp == 48, so high is smaller than 64 too.
Because 95 is in the exp range, we can't do that, we'd either need to
do a range test first, i.e.
if (exp - 43U <= 48U - 43U) if ((1UL << exp) & mask1)
or need to subtract lowi from the shift counter, i.e.
if ((1UL << (exp - 43)) & mask2)
but can't do both unless r.upper_bound () is < prec.

The following patch ensures that.
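
The two valid forms can be checked against the original range tests
for the values in the range above.  This is a standalone sketch, not
code generated by the pass; the function names are hypothetical, and
unsigned long long is used instead of 1UL so that the masks are 64 bits
wide on every host.

```c
#include <assert.h>
#include <stdbool.h>

/* Reference: the original range tests.  */
static bool
reference (unsigned exp)
{
  return exp == 43 || exp == 48;
}

/* Form 1: keep the entry range test and shift by exp itself.  The
   shift is only reached for exp in [43, 48].  */
static bool
with_entry_test (unsigned exp)
{
  const unsigned long long mask1 = (1ULL << 43) | (1ULL << 48);
  if (exp - 43U <= 48U - 43U)
    return ((1ULL << exp) & mask1) != 0;
  return false;
}

/* Form 2: no entry test, shift by exp - lowi.  Valid here because
   the largest value in the range gives 95 - 43 = 52, which is
   below 64.  */
static bool
with_subtraction (unsigned exp)
{
  const unsigned long long mask2 = (1ULL << 0) | (1ULL << 5);
  assert (exp >= 43 && exp - 43U < 64);
  return ((1ULL << (exp - 43U)) & mask2) != 0;
}
```

Dropping both the entry test and the subtraction would shift by 95 for
the last value in the range, which is out of bounds for a 64-bit
operand.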

2024-05-08  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/114965
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Don't try to
optimize away exp - lowi subtraction from shift count unless entry
test is emitted or unless r.upper_bound () is smaller than prec.

* gcc.c-torture/execute/pr114965.c: New test.

(cherry picked from commit 9adec2d91e62a479474ae79df5b455fd4b8463ba)

expansion: Use __trunchfbf2 calls rather than __extendhfbf2 [PR114907]

The HF and BF modes have the same size/precision and neither is
a subset nor superset of the other.
So, using either __extendhfbf2 or __trunchfbf2 is weird.
The expansion apparently emits __extendhfbf2, but on the libgcc side
we apparently have __trunchfbf2 implemented.

I think it is easier to switch to using what is available rather than
adding new entrypoints to libgcc, even alias, because this is backportable.

2024-05-07 Jakub Jelinek <jakub@redhat.com>

PR middle-end/114907
* expr.cc (convert_mode_scalar): Use trunc_optab rather than
sext_optab for HF->BF conversions.
* optabs-libfuncs.cc (gen_trunc_conv_libfunc): Likewise.

* gcc.dg/pr114907.c: New test.

(cherry picked from commit 28ee13db2e9d995bd3728c4ff3a3545e24b39cd2)

tree-inline: Remove .ASAN_MARK calls when inlining functions into no_sanitize callers [PR114956]

In r9-5742 we've started allowing to inline always_inline functions into
functions which have disabled e.g. address sanitization even when the
always_inline function is implicitly from command line options sanitized.

This mostly works fine because most of the asan instrumentation is done only
late after ipa, but as the following testcase shows, the .ASAN_MARK ifn calls
the gimplifier adds can result in ICEs.

Fixed by dropping those during inlining, similarly to how we drop
.TSAN_FUNC_EXIT calls.

2024-05-07 Jakub Jelinek <jakub@redhat.com>

PR sanitizer/114956
* tree-inline.cc: Include asan.h.
(copy_bb): Remove also .ASAN_MARK calls if id->dst_fn has asan/hwasan
sanitization disabled.

* gcc.dg/asan/pr114956.c: New test.

(cherry picked from commit d4e25cf4f7c1f51a8824cc62bbb85a81a41b829a)

gimple-ssa-sprintf: Use [0, 1] range for %lc with (wint_t) 0 argument [PR114876]

Seems when Martin S. implemented this, he coded a strict reading
of the standard, which said that %lc with (wint_t) 0 argument is handled
as wchar_t[2] temp = { arg, 0 }; %ls with temp arg and so shouldn't print
any values.  But, most of the libc implementations actually handled that
case like %c with '\0' argument, adding a single NUL character, the only
known exception is musl.
Recently, C23 changed this in response to GB-141 and POSIX in
https://austingroupbugs.net/view.php?id=1647
so that it should have the same behavior as %c with '\0'.

Because there is implementation divergence, the following patch uses
a range rather than hardcoding it to all 1s (i.e. the %c behavior),
though the likely case is still 1 (forward looking plus most of
implementations).
The res.knownrange = true; assignment removed is redundant due to
the same assignment done unconditionally before the if statement,
rest is formatting fixes.
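
The divergence can be observed directly.  The helper below is a
hypothetical probe, not part of the patch or its tests; which of the
two results it returns depends on the C library in use.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Number of bytes "%lc" produces for a (wint_t) 0 argument with the
   C library this is linked against.  */
int
lc_zero_length (void)
{
  char buf[8];
  int n = snprintf (buf, sizeof buf, "%lc", (wint_t) 0);
  /* Either nothing is written (strict reading of the older wording)
     or a single NUL is written (the %c-like behavior).  */
  assert (n == 0 || n == 1);
  return n;
}
```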

I don't think the min >= 0 && min < 128 case is right either, I'd think
it should be min >= 0 && max < 128, otherwise it is just some possible
inputs are (maybe) ASCII and there can be others, but this code is a total
mess anyway, with the min, max, likely (somewhere in [min, max]?) and then
unlikely possibly larger than max, dunno, perhaps for at least some chars
in the ASCII range the likely case could be for the ascii case; so perhaps
just the one_2_one_ascii shouldn't set max to 1 and mayfail should be true
for max >= 128.  Anyway, didn't feel I should touch that right now.

2024-04-30  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/114876
* gimple-ssa-sprintf.cc (format_character): For min == 0 && max == 0,
set max, likely and unlikely members to 1 rather than 0.  Remove
useless res.knownrange = true;.  Formatting fixes.

* gcc.dg/pr114876.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust expected
diagnostics.

(cherry picked from commit 6c6b70f07208ca14ba783933988c04c6fc2fff42)

c++: Fix constexpr evaluation of parameters passed by invisible reference [PR111284]

My r9-6136 changes to make a copy of constexpr function bodies before
genericization modifies it broke the constant evaluation of non-POD
arguments passed by value.
In the callers such arguments are passed as reference to usually a
TARGET_EXPR, but on the callee side until genericization they are just
direct uses of a PARM_DECL with some class type.
In cxx_bind_parameters_in_call I've used convert_from_reference to
pretend it is passed by value and then cxx_eval_constant_expression
is called there and evaluates that as an rvalue, followed by
adjust_temp_type if the types don't match exactly (e.g. const Foo
argument and passing to it reference to Foo TARGET_EXPR).

The reason this doesn't work is that when the TARGET_EXPR in the caller
is constant initialized, this for it is the address of the TARGET_EXPR_SLOT,
but if the code later on pretends the PARM_DECL is just initialized to the
rvalue of the constant evaluation of the TARGET_EXPR, it is as if there
is a bitwise copy of the TARGET_EXPR to the callee, so this in the callee
is then address of the PARM_DECL in the callee.

The following patch attempts to fix that by constexpr evaluation of such
arguments in the caller as an lvalue instead of rvalue, and on the callee
side when seeing such a PARM_DECL, if we want an lvalue, lookup the value
(lvalue) saved in ctx->globals (if any), and if wanting an rvalue,
recursing with vc_prvalue on the looked up value (because it is there
as an lvalue, not rvalue).

adjust_temp_type doesn't work for lvalues of non-scalarish types, for
such types it relies on changing the type of a CONSTRUCTOR, but on the
other side we know what we pass to the argument is addressable, so
the patch on type mismatch takes address of the argument value, casts
to reference to the desired type and dereferences it.

2024-04-25 Jakub Jelinek <jakub@redhat.com>

PR c++/111284
* constexpr.cc (cxx_bind_parameters_in_call): For PARM_DECLs with
TREE_ADDRESSABLE types use vc_glvalue rather than vc_prvalue for
cxx_eval_constant_expression and if it doesn't have the same
type as it should, cast the reference type to reference to type
before convert_from_reference and instead of adjust_temp_type
take address of the arg, cast to reference to type and then
convert_from_reference.
(cxx_eval_constant_expression) <case PARM_DECL>: For lval case
on parameters with TREE_ADDRESSABLE types lookup result in
ctx->globals if possible. Otherwise if lookup in ctx->globals
was successful for parameter with TREE_ADDRESSABLE type,
recurse with vc_prvalue on the returned value.

* g++.dg/cpp1z/constexpr-111284.C: New test.

(cherry picked from commit f541757ba4632e204169dd08a5f10c782199af42)

openmp: Copy DECL_LANG_SPECIFIC and DECL_LANG_FLAG_? to tree-nested decl copy [PR114825]

tree-nested.cc creates in 2 spots artificial VAR_DECLs, one of them is used
both for debug info and OpenMP/OpenACC lowering purposes, the other solely for
OpenMP/OpenACC lowering purposes.
When the decls are used in OpenMP/OpenACC lowering, the OMP langhooks (mostly
Fortran, C just a little and C++ doesn't have nested functions) then inspect
the flags on the vars and based on that decide how to lower the corresponding
clauses.

Unfortunately we weren't copying DECL_LANG_SPECIFIC and DECL_LANG_FLAG_?, so
the langhooks made decisions on the default flags on those instead.
As the original decl isn't necessarily a VAR_DECL, could be e.g. PARM_DECL,
using copy_node wouldn't work properly, so this patch just copies those
flags in addition to other flags it was copying already. And I've removed
code duplication by introducing a helper function which does copying common
to both uses.

2024-04-25 Jakub Jelinek <jakub@redhat.com>

PR fortran/114825
* tree-nested.cc (get_debug_decl): New function.
(get_nonlocal_debug_decl): Use it.
(get_local_debug_decl): Likewise.

* gfortran.dg/gomp/pr114825.f90: New test.

(cherry picked from commit 14d48516e588ad2b35e2007b3970bdcb1b3f145c)

libstdc++: Workaround kernel-headers on s390x-linux

We see
FAIL: 17_intro/headers/c++1998/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2011/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2014/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2017/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2020/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/names.cc  -std=gnu++17 (test for excess errors)
on s390x-linux.
The first 5 are due to kernel-headers not using uglified attribute names,
where <asm/types.h> contains
__attribute__((packed, aligned(4)))
I've filed a downstream bugreport for this in
https://bugzilla.redhat.com/show_bug.cgi?id=2276084
(not really sure where to report kernel-headers issues upstream), while the
last one is due to <sys/ucontext.h> from glibc containing:
  #ifdef __USE_MISC
  # define __ctx(fld) fld
  #else
  # define __ctx(fld) __ ## fld
  #endif
  ...
  typedef union
    {
      double  __ctx(d);
      float   __ctx(f);
    } fpreg_t;
and g++ predefining -D_GNU_SOURCE which implies define __USE_MISC.

The following patch adds a workaround for this on the libstdc++ testsuite
side.

2024-04-22  Jakub Jelinek  <jakub@redhat.com>

* testsuite/17_intro/names.cc (d, f): Undefine on s390*-linux*.
* testsuite/17_intro/headers/c++1998/all_attributes.cc (packed): Don't
define on s390.
* testsuite/17_intro/headers/c++2011/all_attributes.cc (packed):
Likewise.
* testsuite/17_intro/headers/c++2014/all_attributes.cc (packed):
Likewise.
* testsuite/17_intro/headers/c++2017/all_attributes.cc (packed):
Likewise.
* testsuite/17_intro/headers/c++2020/all_attributes.cc (packed):
Likewise.

(cherry picked from commit cf5f7791056b3ed993bc8024be767a86157514a9)

Fix PR 110066: crash with -pg -static on riscv

The problem is that -fasynchronous-unwind-tables is on by default for riscv linux.
We need to turn it off for crt*.o because it would make __EH_FRAME_BEGIN__ point
to .eh_frame data from crtbeginT.o instead of the user-defined object
during static linking.

This turns it off.

OK?

libgcc/ChangeLog:

* config.host (riscv*-*-linux*): Add t-crtstuff to tmake_file.
(riscv*-*-freebsd*): Likewise.
* config/riscv/t-crtstuff: New file.

(cherry picked from commit bbc1a102735c72e3c5a4dede8ab382813d12b058)

tree-optimization/114375 - disallow SLP discovery of permuted mask loads

We cannot currently handle permutations of mask loads in code generation
or permute optimization. But we simply drop any permutation on the
floor, so the following instead rejects the SLP build rather than
producing wrong-code. I've also made sure to reject them in
vectorizable_load for completeness.

PR tree-optimization/114375
* tree-vect-slp.cc (vect_build_slp_tree_2): Compute the
load permutation for masked loads but reject it when any
such is necessary.
* tree-vect-stmts.cc (vectorizable_load): Reject masked
VMAT_ELEMENTWISE and VMAT_STRIDED_SLP as those are not
supported.

* gcc.dg/vect/vect-pr114375.c: New testcase.

(cherry picked from commit 94c3508c5a14d1948fe3bffa9e16c6f3d9c2836a)

cfgrtl: Fix MEM_EXPR update in duplicate_insn_chain [PR114924]

The PR shows that when cfgrtl.cc:duplicate_insn_chain attempts to
update the MR_DEPENDENCE_CLIQUE information for a MEM_EXPR we can end up
accidentally dropping (e.g.) an ARRAY_REF from the MEM_EXPR and end up
replacing it with the underlying MEM_REF. This leads to an
inconsistency in the MEM_EXPR information, and could lead to wrong code.

While the walk down to the MEM_REF is necessary to update
MR_DEPENDENCE_CLIQUE, we should use the outer tree expression for the
MEM_EXPR. This patch does that.

gcc/ChangeLog:

PR rtl-optimization/114924
* cfgrtl.cc (duplicate_insn_chain): When updating MEM_EXPRs,
don't strip (e.g.) ARRAY_REFs from the final MEM_EXPR.

(cherry picked from commit fe40d525619eee9c2821126390df75068df4773a)

middle-end: Fix ICE in poly-int.h due to SLP.

Adds a check to ensure that the input vector arguments
to a function are not variable length. Previously, only the
output vector of a function was checked.

The ICE in question is within the neon-sve-bridge.c test,
and is related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111268

gcc/ChangeLog:
PR tree-optimization/111268
* tree-vect-slp.cc (vectorizable_slp_permutation_1):
Add variable-length check for vector input arguments
to a function.

(cherry picked from commit 4571b4d413a4ba5f1e2d429a2623180ad1c73c0f)

[Committed] Avoid FAIL of gcc.target/i386/pr110792.c

My apologies (again), I managed to mess up the 64-bit version of the
test case for PR 110792.  Unlike the 32-bit version, the 64-bit case
contains exactly the same load instructions, just in a different order
making the correct and incorrect behaviours impossible to distinguish
with a scan-assembler-not.  Somewhere between checking that this test
failed in a clean tree without the patch, and getting the escaping
correct, I'd failed to notice that this also FAILs in the patched tree.
Doh!  Instead of removing the test completely, I've left it as a
compilation test.

The original fix is tested by the 32-bit test case.

Committed to mainline as obvious.  Sorry for the inconvenience.

2023-08-06  Roger Sayle  <roger@nextmovesoftware.com>

gcc/testsuite/ChangeLog
PR target/110792
* gcc.target/i386/pr110792.c: Remove dg-final scan-assembler-not.

(cherry picked from commit 529909f9e92dd3b0ed0383f45a44d2b5f8a58958)

PR target/110792: Early clobber issues with rot32di2_doubleword on i386.

This patch is a conservative fix for PR target/110792, a wrong-code
regression affecting doubleword rotations by BITS_PER_WORD, which
effectively swaps the highpart and lowpart words, when the source to be
rotated resides in memory. The issue is that if the register used to
hold the lowpart of the destination is mentioned in the address of
the memory operand, the current define_insn_and_split unintentionally
clobbers it before reading the highpart.

Hence, for the testcase, the incorrectly generated code looks like:

        salq    $4, %rdi // calculate address
        movq    WHIRL_S+8(%rdi), %rdi // accidentally clobber addr
        movq    WHIRL_S(%rdi), %rbp // load (wrong) lowpart

Traditionally, the textbook way to fix this would be to add an
explicit early clobber to the instruction's constraints.

(define_insn_and_split "<insn>32di2_doubleword"
- [(set (match_operand:DI 0 "register_operand" "=r,r,r")
+ [(set (match_operand:DI 0 "register_operand" "=r,r,&r")
        (any_rotate:DI (match_operand:DI 1 "nonimmediate_operand" "0,r,o")
                       (const_int 32)))]

but unfortunately this currently generates significantly worse code,
due to a strange choice of reloads (effectively memcpy), which ends up
looking like:

        salq    $4, %rdi // calculate address
        movdqa  WHIRL_S(%rdi), %xmm0 // load the double word in SSE reg.
        movaps  %xmm0, -16(%rsp) // store the SSE reg back to the stack
        movq    -8(%rsp), %rdi // load highpart
        movq    -16(%rsp), %rbp // load lowpart

Note that reload's "&" doesn't distinguish between the memory being
early clobbered, vs the registers used in an addressing mode being
early clobbered.

The fix proposed in this patch is to remove the third alternative, which
allowed offsettable memory as an operand, forcing reload to place the
operand into a register before the rotation.  This results in:

        salq    $4, %rdi
        movq    WHIRL_S(%rdi), %rax
        movq    WHIRL_S+8(%rdi), %rdi
        movq    %rax, %rbp

I believe there's a more advanced solution, by swapping the order of
the loads (if first destination register is mentioned in the address),
or inserting a lea insn (if both destination registers are mentioned
in the address), but this fix is a minimal "safe" solution, that
should hopefully be suitable for backporting.
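
As a reference for the operation being split, a doubleword rotation by
BITS_PER_WORD is the same as swapping the highpart and lowpart words.
The following is only a sketch of that equivalence for a 64-bit value
with 32-bit words (the DImode pattern quoted above); rotl32 and
swap_halves are illustrative names, not GCC functions.

```cpp
#include <cstdint>

// Rotate a 64-bit value by 32 bits, i.e. by one word on a 32-bit target.
constexpr std::uint64_t rotl32(std::uint64_t x) {
    return (x << 32) | (x >> 32);
}

// The same result written as two word moves: the old lowpart becomes the
// new highpart and the old highpart becomes the new lowpart.
constexpr std::uint64_t swap_halves(std::uint64_t x) {
    const std::uint32_t lo = static_cast<std::uint32_t>(x);
    const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
    return (static_cast<std::uint64_t>(lo) << 32) | hi;
}
```

Because the split form is just two independent word loads, the order of
those loads matters whenever one destination register also appears in
the address, which is the hazard described above.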

2023-08-03  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR target/110792
* config/i386/i386.md (<any_rotate>ti3): For rotations by 64 bits
place operand in a register before gen_<insn>64ti2_doubleword.
(<any_rotate>di3): Likewise, for rotations by 32 bits, place
operand in a register before gen_<insn>32di2_doubleword.
(<any_rotate>32di2_doubleword): Constrain operand to be in register.
(<any_rotate>64ti2_doubleword): Likewise.

gcc/testsuite/ChangeLog
PR target/110792
* g++.target/i386/pr110792.C: New 32-bit C++ test case.
* gcc.target/i386/pr110792.c: New 64-bit C test case.

(cherry picked from commit 790c1f60a5662b16eb19eb4b81922995863c7571)

c++: Add testcase for this PR [PR97990]

This testcase was fixed by r14-5934-gf26d68d5d128c8 but we should add
one to make sure it does not regress again.

Committed as obvious after a quick test on the testcase.

PR c++/97990

gcc/testsuite/ChangeLog:

* g++.dg/torture/vector-struct-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 5f1438db419c9eb8901d1d1d7f98fb69082aec8e)

middle-end/112732 - stray TYPE_ALIAS_SET in type variant

The following fixes a stray TYPE_ALIAS_SET in a type variant built
by build_opaque_vector_type which is diagnosed by type checking
enabled with -flto.

PR middle-end/112732
* tree.cc (build_opaque_vector_type): Reset TYPE_ALIAS_SET
of the newly built type.

(cherry picked from commit f26d68d5d128c86faaceeb81b1e8f22254ad53df)

tree-optimization/112281 - loop distribution and zero dependence distances

The following fixes an omission in dependence testing for loop
distribution. When the overall dependence distance is not zero but
the dependence direction in the innermost common loop is =, there is
a conflict between the partitions and we have to merge them.

PR tree-optimization/112281
* tree-loop-distribution.cc
(loop_distribution::pg_add_dependence_edges): For = in the
innermost common loop record a partition conflict.

* gcc.dg/torture/pr112281-1.c: New testcase.
* gcc.dg/torture/pr112281-2.c: Likewise.

(cherry picked from commit 3b34902417259031823bff7f853f615a60464bbd)

tree-optimization/112991 - re-do PR112961 fix

The following does away with the fake edge adding as in the original
PR112961 fix and instead exposes handling of entry PHIs as additional
parameter of the region VN run.

PR tree-optimization/112991
PR tree-optimization/112961
* tree-ssa-sccvn.h (do_rpo_vn): Add skip_entry_phis argument.
* tree-ssa-sccvn.cc (do_rpo_vn): Likewise.
(do_rpo_vn_1): Likewise, merge with auto-processing.
(run_rpo_vn): Adjust.
(pass_fre::execute): Likewise.
* tree-if-conv.cc (tree_if_conversion): Revert last change.
Value-number latch block but disable value-numbering of
entry PHIs.
* tree-ssa-uninit.cc (execute_early_warn_uninitialized): Adjust.

* gcc.dg/torture/pr112991.c: New testcase.
* g++.dg/vect/pr112961.cc: Likewise.

(cherry picked from commit 93db32a4146afd2a6d90410691351a56768167c9)

middle-end/113396 - int128 array index and value-ranges

The following fixes bogus truncation of a value-range for an int128
array index when computing the maximum extent for a variable array
reference. Instead of possibly slowing things down by using
widest_int the following makes sure the range bounds fit within
the constraints offset_int were designed for.

PR middle-end/113396
* tree-dfa.cc (get_ref_base_and_extent): Use index range
bounds only if they fit within the address-range constraints
of offset_int.

* gcc.dg/torture/pr113396.c: New testcase.

(cherry picked from commit 6a55e39bdb1fdb570730c08413ebbe744e493411)

Fortran: Generate new charlens for shared symbol typespecs [PR89462]

2024-04-25 Paul Thomas <pault@gcc.gnu.org>
Jakub Jelinek <jakub@gcc.gnu.org>

gcc/fortran
PR fortran/89462
* decl.cc (build_sym): Add an extra argument 'elem'. If 'elem'
is greater than 1, gfc_new_charlen is called to generate a new
charlen, registered in the symbol namespace.
(variable_decl, enumerator_decl): Set the new argument in the
calls to build_sym.

gcc/testsuite/
PR fortran/89462
* gfortran.dg/pr89462.f90: New test.

(cherry picked from commit 1fd5a07444776d76cdd6a2eee7df0478201197a5)

Fortran: Fix ICE in gfc_trans_create_temp_array from bad type [PR93678]

2024-04-25 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/93678
* trans-expr.cc (gfc_conv_procedure_call): Use the interface,
where possible, to obtain the type of character procedure
pointers of class entities.

gcc/testsuite/
PR fortran/93678
* gfortran.dg/pr93678.f90: New test.

(cherry picked from commit c058105bc47a0701e157d1028e60f48554561f9f)

Fortran: Fix ICE in gfc_trans_pointer_assignment [PR113956]

2024-04-09 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/113956
* trans-expr.cc (gfc_trans_pointer_assignment): Remove assert
causing the ICE since it was unnecessary.

gcc/testsuite/
PR fortran/113956
* gfortran.dg/pr113956.f90: New test.

(cherry picked from commit 88aea122a7ee639230bf17a9eda4bf8a5eb7e282)

Fortran: Fix ICE in trans-stmt.cc(gfc_trans_call) [PR114535]

2024-04-09 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/114535
* resolve.cc (resolve_symbol): Remove last chunk that checked
for finalization of unreferenced symbols.

gcc/testsuite/
PR fortran/114535
* gfortran.dg/pr114535d.f90: New test.
* gfortran.dg/pr114535iv.f90: Additional source.

(cherry picked from commit de82b0cf981e49a0bda957c0ac31146b17407e23)

c++/c-common: Fix convert_vector_to_array_for_subscript for qualified vector types [PR89224]

After r7-987-gf17a223de829cb, the access for the elements of a vector type would lose the qualifiers.
So if we had `constvector[0]`, the type of the element of the array would not have const on it.
This was due to a missing build_qualified_type for the inner type of the vector when building the array type.
We need to add back the call to build_qualified_type and now the access has the correct qualifiers. So
overload resolution, and whether the result is an lvalue or an rvalue, is now handled correctly.

Note we now correctly reject the testcase gcc.dg/pr83415.c which was incorrectly accepted after r7-987-gf17a223de829cb.

Built and tested for aarch64-linux-gnu.

PR c++/89224

gcc/c-family/ChangeLog:

* c-common.cc (convert_vector_to_array_for_subscript): Call build_qualified_type
for the inner type.

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_array_reference): Compare main variants
for the vector/array types instead of the types directly.

gcc/testsuite/ChangeLog:

* g++.dg/torture/vector-subaccess-1.C: New test.
* gcc.dg/pr83415.c: Change warning to error.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 4421d35167b3083e0f2e4c84c91fded09a30cf22)

libstdc++: Fix conversion of simd to vector builtin

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/114803
* include/experimental/bits/simd_builtin.h
(_SimdBase2::operator __vector_type_t): There is no __builtin()
function in _SimdWrapper, instead use its conversion operator.
* testsuite/experimental/simd/pr114803_vecbuiltin_cvt.cc: New
test.

(cherry picked from commit 7ef139146a8923a8719873ca3fdae175668e8d63)

libstdc++: Silence irrelevant warnings in <experimental/simd>

Avoid
-Wnarrowing in C code;
-Wtautological-compare in unconditional static_assert (necessary for
faking a dependency on a template parameter)

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Ignore -Wnarrowing for
arm_neon.h.
(__int_for_sizeof): Replace tautological compare with checking
for invalid template parameter value.
* include/experimental/bits/simd_builtin.h (__extract_part):
Remove tautological compare by combining two static_assert.

(cherry picked from commit e7a3ad29c9c832b6ae999cbfb0af89e121959030)

libstdc++: Add include guard to simd-internal header

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/numeric_traits.h: Add include guard.

(cherry picked from commit 3cfe94ad28102618c14a91c0a83d9e5cc7df69d7)

libstdc++: Avoid ill-formed types on ARM

This resolves failing tests in check-simd.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/114750
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_load, _S_store): Fall back to copying
scalars if the memory type cannot be vectorized for the target.

(cherry picked from commit 0fc7f3c6adc8543f55ec35b309016d9d9c4ddd35)

libstdc++: Add masked ++/-- implementation for sizeof < 16

This resolves further failures (-Wreturn-type warnings) and test
failures for where-* tests targeting AVX-512.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h (_S_masked_unary):
Cast inputs < 16 bytes to 16 byte vectors before calling the
right subtraction builtin. Before returning, truncate to the
return vector type.

(cherry picked from commit a6c630c314b099f64d79055964d88b257459cf13)

libstdc++: Fix call signature of builtins from masked ++/--

This resolves failures in the "expensive" where-* test of check-simd
when targeting AVX-512.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h (_S_masked_unary): Call
the 4- and 8-byte variants of __builtin_ia32_subp[ds] without
rounding direction argument.

(cherry picked from commit 0ac2c0f0687b321ab54de271d788b4e0a287b4e2)

libstdc++: Avoid vector casts while still avoiding PR90424

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109822
* include/experimental/bits/simd_builtin.h (_S_store): Rewrite
to avoid casts to other vector types. Implement store as
succession of power-of-2 sized memcpy to avoid PR90424.
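
The "succession of power-of-2 sized memcpy" idea can be sketched as
follows. This is not the libstdc++ implementation; store_pow2_chunks is
a hypothetical helper that only illustrates splitting an arbitrary byte
count into power-of-2 sized copies, largest first.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (an assumption for illustration): copy n bytes using
// only memcpy calls whose size is a power of two.
inline void store_pow2_chunks(void* dst, const void* src, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    // Find the largest power of two that does not exceed n.
    std::size_t chunk = 1;
    while (chunk <= n / 2)
        chunk *= 2;
    // Walk down the powers of two, copying one chunk per set bit of n.
    while (n != 0) {
        if (chunk <= n) {
            std::memcpy(d, s, chunk);
            d += chunk;
            s += chunk;
            n -= chunk;
        }
        chunk /= 2;
    }
}
```

For a 12-byte store this issues one 8-byte and one 4-byte memcpy.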

(cherry picked from commit 9165ede56ababd6471e7a2ce4eab30f3d5129e14)

libstdc++: Replace use of incorrect non-temporal store

The call to the base implementation sometimes didn't find a matching
signature because the _Abi parameter of _SimdImpl* was "wrong" after
conversion. It has to call into <new ABI tag>::_SimdImpl instead of the
current ABI tag's _SimdImpl. This also reduces the number of possible
template instantiations.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/110054
* include/experimental/bits/simd_builtin.h (_S_masked_store):
Call into deduced ABI's SimdImpl after conversion.
* include/experimental/bits/simd_x86.h (_S_masked_store_nocvt):
Don't use _mm_maskmoveu_si128. Use the generic fall-back
implementation. Also fix masked stores without SSE2, which
were not doing anything before.

(cherry picked from commit 27e45b7597d6fb1a71927d658a0294797b720c0a)

libstdc++: Protect against macros

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (__bit_cast): Use
__gnu__::__vector_size__ instead of gnu::vector_size.

(cherry picked from commit ce2188e4320cbb46d6246bd3f478ba20440c62f3)

libstdc++: Fix condition for supported SIMD types on ARMv8

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/110050
* include/experimental/bits/simd.h (__vectorized_sizeof): With
__have_neon_a32 only single-precision float works (in addition
to integers).

(cherry picked from commit 2fbbaa77c8468ed2bdf2cfa1a5890991e4e98eef)

tree-optimization/114121 - wrong VN with context sensitive range info

When VN ends up exploiting range-info specifying the ao_ref offset
and max_size we have to make sure to reflect this in the hashtable
entry for the recorded expression. The PR113831 fix handled the
case where we can encode this in the operands themselves but this
bug shows the issue is more widespread.

So instead of altering the operands the following instead records
this extra info that's possibly used, only throwing it away when
the value-numbering didn't come up with a non-VARYING value which
is an important detail to preserve CSE as opposed to constant
folding which is where all cases currently known popped up.

With this the original PR113831 fix can be reverted.

PR tree-optimization/113831
PR tree-optimization/114121
* tree-ssa-sccvn.h (vn_reference_s::offset,
vn_reference_s::max_size): New fields.
(vn_reference_insert_pieces): Adjust prototype.
* tree-ssa-pre.cc (phi_translate_1): Preserve offset/max_size.
* tree-ssa-sccvn.cc (vn_reference_eq): Compare offset and
size, allow using "don't know" state.
(vn_walk_cb_data::finish): Pass along offset/max_size.
(vn_reference_lookup_or_insert_for_pieces): Take offset and
max_size as argument and use it.
(vn_reference_lookup_3): Properly adjust offset and max_size
according to the adjusted ao_ref.
(vn_reference_lookup_pieces): Initialize offset and max_size.
(vn_reference_lookup): Likewise.
(vn_reference_lookup_call): Likewise.
(vn_reference_insert): Likewise.
(visit_reference_op_call): Likewise.
(vn_reference_insert_pieces): Take offset and max_size
as argument and use it.

* gcc.dg/torture/pr113831.c: New testcase.

(cherry picked from commit c841144a94363ff26e40ab3f26b14702c32987a8)

RISC-V: Fix vsetvli local eliminate [PR114747]

The vsetvli local elimination only considers the current demand instead of
the full demand, and it uses that incomplete info to remove vsetvli insns.

Given the following example from PR114747:

vsetvli a5,a1,e8,m4,ta,mu       # 57, ratio=2, sew=8, lmul=4
vsetvli zero,a5,e16,m8,ta,ma    # 58, ratio=2, sew=16, lmul=8
vle8.v  v8,0(a0)        # 13, demand ratio=2
vzext.vf2       v24,v8  # 14, demand sew=16 and lmul=8

Insn #58 will be removed because #57 has satisfied the demand of #13, but
that does not consider #14.

It should do more demand analysis, but this bug is only present on the GCC 13
branch, and we should not change too much on this release branch, so the best
way is to make the check more conservative: remove only if the target
vsetvl_discard_result has the same SEW and LMUL as the source vsetvli.
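
The difference between the ratio-only view and the conservative check
can be sketched like this. The VType struct and the function names are
assumptions made for illustration (integer LMUL only); they do not
mirror the data structures in riscv-vsetvl.cc.

```cpp
// Hypothetical model of a vsetvli configuration.
struct VType {
    int sew;   // selected element width in bits
    int lmul;  // register group multiplier (integer values only here)
};

// SEW/LMUL ratio as annotated in the example above.
constexpr int ratio(VType v) { return v.sew / v.lmul; }

// Looking at the ratio alone is enough for insn #13 but not for insn #14.
constexpr bool same_ratio(VType a, VType b) { return ratio(a) == ratio(b); }

// The conservative condition: both SEW and LMUL must match.
constexpr bool same_sew_and_lmul(VType a, VType b) {
    return a.sew == b.sew && a.lmul == b.lmul;
}
```

With insn #57 as {8, 4} and insn #58 as {16, 8}, the ratios agree but
the conservative check fails, so #58 is kept.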

gcc/ChangeLog:

PR target/114747
* config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn):
Check that target vsetvl_discard_result and source vsetvli have the
same SEW and LMUL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr114747.c: New.

Daily bump.

AVR: ipa/92606 - Don't optimize PROGMEM data against non-PROGMEM.

ipa/92606: Inter-procedural analysis optimizes data across
address-spaces and PROGMEM. As of v14, the PROGMEM part is
still not fixed (and there is still no target hook as proposed
in PR92932). Just disable respective bogus optimization.

PR ipa/92606
gcc/
* config/avr/avr.cc (avr_option_override): Set
flag_ipa_icf_variables = 0.
gcc/testsuite/
* gcc.target/avr/torture/pr92606.c: New test.

(cherry picked from commit 08e752e72363ae7fd5a5fcb70913a0f7b240387b)

tree-optimization/114799 - SLP and patterns

The following plugs a hole with computing whether a SLP node has any
pattern stmts which is important to know when we want to replace it
by a CTOR from external defs.

PR tree-optimization/114799
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Properly
update ->any_pattern when swapping operands.

* gcc.dg/vect/bb-slp-pr114799.c: New testcase.

(cherry picked from commit 18e8e55487238237f37f621668fdee316624981a)

tree-optimization/114787 - more careful loop update with CFG cleanup

When CFG cleanup removes a backedge we have to be more careful with
loop update. In particular we need to clear niter info and estimates
and if we remove the last backedge of a loop we have to also mark
it for removal to prevent a following basic block merging to associate
loop info with an unrelated header.

PR tree-optimization/114787
* tree-cfg.cc (remove_edge_and_dominated_blocks): When
removing a loop backedge clear niter info and when removing
the last backedge of a loop mark that loop for removal.

* gcc.dg/torture/pr114787.c: New testcase.

(cherry picked from commit cc48418cfc2e555d837ae9138cbfac23acb3cdf9)

RISC-V: Add testcase for pr114734

gcc/testsuite/ChangeLog:

PR middle-end/114734

* gcc.target/riscv/rvv/autovec/pr114734.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
(cherry picked from commit ff4dc8b10a421cdb0c56f7f8c238609de4f9fbe2)

middle-end/114734 - wrong code with expand_call_mem_ref

When expand_call_mem_ref looks at the definition of the address
argument to eventually expand a &TARGET_MEM_REF argument together
with a masked load it fails to honor constraints imposed by SSA
coalescing decisions. The following fixes this.

PR middle-end/114734
* internal-fn.cc (expand_call_mem_ref): Use
get_gimple_for_ssa_name to get at the def stmt of the address
argument to honor SSA coalescing constraints.

(cherry picked from commit 4d3a5618de5a949c61605f545f90e81bc0000502)

tree-optimization/114246 - invalid call argument from DSE

The following makes sure to strip type conversions added by
build_fold_addr_expr before placing the result in a call argument.

PR tree-optimization/114246
* tree-ssa-dse.cc (increment_start_addr): Strip useless
type conversions from the adjusted address.

* gcc.dg/torture/pr114246.c: New testcase.

(cherry picked from commit 0249744a9fe0775c2c895727aeebec4c59fd5f95)

tree-optimization/113630 - invalid code hoisting

The following avoids code hoisting (but also PRE insertion) of
expressions that got value-numbered to another one that are not
a valid replacement (but still compute the same value). This time
because the access path ends in a structure with different size,
meaning we consider a related access as not trapping because of the
size of the base of the access.

PR tree-optimization/113630
* tree-ssa-pre.cc (compute_avail): Avoid registering a
reference with a representation with not matching base
access size.

* gcc.dg/torture/pr113630.c: New testcase.

(cherry picked from commit 724b64304ff5c8ac08a913509afd6fde38d7b767)

Fortran: Add error for subroutine passed to a variable dummy [PR106999]

2024-04-02 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/106999
* interface.cc (gfc_compare_interfaces): Add error for a
subroutine proc pointer passed to a variable formal.
(compare_parameter): If a procedure pointer is being passed to
a non-procedure formal arg, and there is an interface, use
gfc_compare_interfaces to check and provide a more useful error
message.

gcc/testsuite/
PR fortran/106999
* gfortran.dg/pr106999.f90: New test.

(cherry picked from commit a7aa9455a8b9cb080649a7357b7360f2d99bcbf1)

Fortran: Fix wrong recursive errors and class initialization [PR112407]

2024-04-02 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/112407
* resolve.cc (resolve_procedure_expression): Change the test for
recursion in the case of hidden procedures from modules.
(resolve_typebound_static): Add warning for possible recursive
calls to typebound procedures.
* trans-expr.cc (gfc_trans_class_init_assign): Do not apply
default initializer to class dummy where component initializers
are all null.

gcc/testsuite/
PR fortran/112407
* gfortran.dg/pr112407a.f90: New test.
* gfortran.dg/pr112407b.f90: New test.

(cherry picked from commit 35408b3669fac104cd380582b32e32c64a603d8b)

Fortran: Fix a gimplifier ICE/wrong result with finalization [PR36337]

2024-03-29 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/36337
PR fortran/110987
PR fortran/113885
* trans-expr.cc (gfc_trans_assignment_1): Place finalization
block before rhs post block for elemental rhs.
* trans.cc (gfc_finalize_tree_expr): Check directly if a type
has no components, rather than the zero components attribute.
Treat elemental zero component expressions in the same way as
scalars.

gcc/testsuite/
PR fortran/113885
* gfortran.dg/finalize_54.f90: New test.
* gfortran.dg/finalize_55.f90: New test.

gcc/testsuite/
PR fortran/110987
* gfortran.dg/finalize_56.f90: New test.

(cherry picked from commit 3c793f0361bc66d2a6bf0b3e1fb3234fc511e2a6)

Fortran: Fix ICE and clear incorrect error messages [PR114739]

2024-05-06 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/114739
* primary.cc (gfc_match_varspec): Check for default type before
checking for derived types with the right component name.

gcc/testsuite/
PR fortran/114739
* gfortran.dg/pr114739.f90: New test.
* gfortran.dg/derived_comp_array_ref_8.f90: Add 'implicit none'
for consistency with expected error message.
* gfortran.dg/nullify_4.f90: ditto
* gfortran.dg/pointer_init_6.f90: ditto
* gfortran.dg/pr107397.f90: ditto
* gfortran.dg/pr88138.f90: ditto

Daily bump.

Objective-C, NeXT, v2: Correct a regression in code-gen.

There have been several changes in the ABI of Objective-C which
depend on the OS version targeted. In this case Protocols and
LabelProtocols should be made weak/hidden/extern from macOS 10.7;
however, there was a mistake in the code causing this to occur
from macOS 10.6. Fixed thus.

gcc/objc/ChangeLog:

* objc-next-runtime-abi-02.cc (WEAK_PROTOCOLS_AFTER): New.
(next_runtime_abi_02_protocol_decl): Use WEAK_PROTOCOLS_AFTER
to determine this ABI change.
(build_v2_protocol_list_address_table): Likewise.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 9b5c0be59d0f94df0517820f00b4520b5abddd8c)

Daily bump.

tree-optimization/114749 - reset partial vector decision for no-SLP retry

The following makes sure to reset LOOP_VINFO_USING_PARTIAL_VECTORS_P
to its default of false when re-trying without SLP as otherwise
analysis may run into bogus asserts.

PR tree-optimization/114749
* tree-vect-loop.cc (vect_analyze_loop_2): Reset
LOOP_VINFO_USING_PARTIAL_VECTORS_P when re-trying without SLP.

(cherry picked from commit bf2b5231312e1cea45732cb8df6ffa2b2c9115b6)

tree-optimization/114736 - SLP DFS walk issue

The following fixes a DFS walk issue when identifying latch edges that
are to be ignored. We have (bogus) SLP_TREE_REPRESENTATIVEs for VEC_PERM
nodes so those have to be explicitly ignored as possibly being PHIs.

PR tree-optimization/114736
* tree-vect-slp.cc (vect_optimize_slp_pass::is_cfg_latch_edge):
Do not consider VEC_PERM_EXPRs as PHI use.

* gfortran.dg/vect/pr114736.f90: New testcase.

(cherry picked from commit f949481a1f7ab973608a4ffcc0e342ab5a74e8e4)

gcov-profile/114715 - missing coverage for switch

The following avoids missing coverage for the line of a switch statement
which happens when gimplification emits a BIND_EXPR wrapping the switch
as that prevents us from setting locations on the containing statements
via annotate_all_with_location. Instead set the location of the GIMPLE
switch directly.

PR gcov-profile/114715
* gimplify.cc (gimplify_switch_expr): Set the location of the
GIMPLE switch.

* gcc.misc-tests/gcov-24.c: New testcase.

(cherry picked from commit 9d573f71e80e9f6f4aac912fc8fc128aa2697e3a)

lto/114655 - -flto=4 at link time doesn't override -flto=auto at compile time

The following adjusts -flto option processing in lto-wrapper to have
link-time -flto override any compile time setting.

PR lto/114655
* lto-wrapper.cc (merge_flto_options): Add force argument.
(merge_and_complain): Do not force here.
(run_gcc): But here to make the link-time -flto option override
any compile-time one.

(cherry picked from commit 32fb04adae90a0ea68e64e8fc3cb04b613b2e9f3)

tree-optimization/114733 - neg induction fails for 1 element vectors

The neg induction vectorization code isn't prepared to deal with
single element vectors.
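
For readers unfamiliar with the term, a neg induction is understood here
as a loop variable whose update is a negation rather than an addition.
The loop below is a sketch of that shape only; it is not the testcase
from the PR, and alternating_sum is an illustrative name.

```cpp
#include <cstddef>

// Computes a[0] - a[1] + a[2] - ... using a sign that flips every iteration.
inline int alternating_sum(const int* a, std::size_t n) {
    int sign = 1;
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += sign * a[i];
        sign = -sign;  // the induction step is a negation, not an increment
    }
    return sum;
}
```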

PR tree-optimization/114733
* tree-vect-loop.cc (vectorizable_nonlinear_induction): Reject
neg induction vectorization of single element vectors.

* gcc.dg/vect/pr114733.c: New testcase.

(cherry picked from commit 45a41ace55d0ffb1097e374868242329788ec82a)

tree-optimization/114485 - neg induction with partial vectors

We can't use vect_update_ivs_after_vectorizer for partial vectors,
the following fixes vect_can_peel_nonlinear_iv_p accordingly.

PR tree-optimization/114485
* tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p):
vect_step_op_neg isn't OK for partial vectors but only
for unknown niter.

* gcc.dg/vect/pr114485.c: New testcase.

(cherry picked from commit 85621f98d245004a6c9787dde21e0acc17ab2c50)

ifcvt: Don't lower bitfields with non-constant offsets [PR 111882]

This patch stops lowering of bitfields by ifcvt when they have non-constant
offsets as we are not likely to be able to do anything useful with those during
vectorization. That also fixes the issue reported in PR 111882, which was
being caused by an offset with a side-effect being lowered, but constants have
no side-effects so we will no longer run into that problem.

gcc/ChangeLog:

PR tree-optimization/111882
* tree-if-conv.cc (get_bitfield_rep): Return NULL_TREE for bitfields
with non-constant offsets.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr111882.c: New test.

(cherry picked from commit 24cf1f600b8ad34c68a51f48884e72d01f729893)

Daily bump.

tree-optimization/114672 - WIDEN_MULT_PLUS_EXPR type mismatch

The following makes sure to restrict WIDEN_MULT*_EXPR to a mode
precision final compute type as the mode is used to find the optab
and type checking chokes when seeing bit-precisions later which
would likely also not be properly expanded to RTL.
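
As a rough illustration of the source shape that a widening
multiply-and-add corresponds to, consider the sketch below. It uses
full-width 32-bit and 64-bit types, i.e. a mode-precision result, which
is the case the patch continues to allow; widen_mult_plus is an
illustrative name, not a GCC function.

```cpp
#include <cstdint>

// acc + a * b, with the product computed in a type twice as wide as the
// multiplication operands.
inline std::int64_t widen_mult_plus(std::int32_t a, std::int32_t b,
                                    std::int64_t acc) {
    return acc + static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
}
```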

PR tree-optimization/114672
* tree-ssa-math-opts.cc (convert_plusminus_to_widen): Only
allow mode-precision results.

* gcc.dg/torture/pr114672.c: New testcase.

(cherry picked from commit 912753cc5f18d786e334dd425469fa7f93155661)

libstdc++: Fix infinite loop in std::istream::ignore(n, delim) [PR93672]

A negative delim value passed to std::istream::ignore can never match
any character in the stream, because the comparison is done using
traits_type::eq_int_type(sb->sgetc(), delim) and sgetc() never returns
negative values (except at EOF). The optimized version of ignore for the
std::istream specialization uses traits_type::find to locate the delim
character in the streambuf, which _can_ match a negative delim on
platforms where char is signed, but then we do another comparison using
eq_int_type which fails. The code then keeps looping forever, with
traits_type::find locating the character and traits_type::eq_int_type
saying it's not a match, so traits_type::find is used again and finds
the same character again.

A possible fix would be to check with eq_int_type after a successful
find, to see whether we really have a match. However, that would be
suboptimal since we know that a negative delimiter will never match
using eq_int_type. So a better fix is to adjust the check at the top of
the function that handles delim==eof(), so that we treat all negative
delim values as equivalent to EOF. That way we don't bother using find
to search for something that will never match with eq_int_type.

The version of ignore in the primary template doesn't need a change,
because it doesn't use traits_type::find, instead characters are
extracted one-by-one and always matched using eq_int_type. That avoids
the inconsistency between find and eq_int_type. The specialization for
std::wistream does use traits_type::find, but traits_type::to_int_type
is equivalent to an implicit conversion from wchar_t to wint_t, so
passing a wchar_t directly to ignore without using to_int_type works.
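
The mismatch between a negative delimiter and eq_int_type can be seen in
isolation. A minimal sketch, assuming 8-bit char; delim_matches is a
hypothetical helper standing in for the comparison described above, not
a libstdc++ function.

```cpp
#include <string>

using traits = std::char_traits<char>;

// Hypothetical helper: compare a character taken from the stream with the
// delimiter value passed by the caller, the way eq_int_type is used above.
inline bool delim_matches(char from_stream, traits::int_type delim) {
    return traits::eq_int_type(traits::to_int_type(from_stream), delim);
}
```

to_int_type converts through unsigned char, so the character '\x80' is
seen as 128 and can never compare equal to a delimiter of -128. Callers
that pass traits::to_int_type(c) as the delimiter are unaffected.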

libstdc++-v3/ChangeLog:

PR libstdc++/93672
* src/c++98/istream.cc (istream::ignore(streamsize, int_type)):
Treat all negative delimiter values as eof().
* testsuite/27_io/basic_istream/ignore/char/93672.cc: New test.
* testsuite/27_io/basic_istream/ignore/wchar_t/93672.cc: New
test.

(cherry picked from commit 2d694414ada8e3b58f504c1b175d31088529632e)

libstdc++: Reverse arguments in constraint for std::optional's <=> [PR104606]

This is a workaround for a possible compiler bug that causes constraint
recursion in the operator<=>(const optional<T>&, const U&) overload.

libstdc++-v3/ChangeLog:

PR libstdc++/104606
* include/std/optional (operator<=>(const optional<T>&, const U&)):
Reverse order of three_way_comparable_with template arguments.
* testsuite/20_util/optional/relops/104606.cc: New test.

(cherry picked from commit 7f65d8267fbfd19cf21a3dc71d27e989e75044a3)

rs6000: Add OPTION_MASK_POWER8 [PR101865]

The bug in PR101865 is that the _ARCH_PWR8 predefined macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

2024-04-12  Will Schmidt  <will_schmidt@linux.ibm.com>
    Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mpower8-internal): New.

gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefine-p7-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefine-p8-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefine-p9-novsx.c: New test.

(cherry picked from commit aa57af93ba22865be747f926e4e5f219e7f8758a)
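
The following is a toy model of the problem described above, not GCC's actual code. The bit values, the power8_flags helper and the two predicate functions are all hypothetical; the real masks are generated from rs6000.opt. The model assumes that direct move depends on VSX, which in turn depends on Altivec, and shows why keying the predefine on a CPU-level bit keeps it stable when the vector features are switched off.

```cpp
#include <cstdint>

// Hypothetical bit assignments, for illustration only.
constexpr std::uint64_t MASK_ALTIVEC     = 1u << 0;
constexpr std::uint64_t MASK_VSX         = 1u << 1;
constexpr std::uint64_t MASK_DIRECT_MOVE = 1u << 2;
constexpr std::uint64_t MASK_POWER8      = 1u << 3;

// Flags that a -mcpu=power8 compile ends up with in this model.
// Assumption: disabling Altivec also disables VSX, and disabling VSX
// also disables direct move.
constexpr std::uint64_t power8_flags(bool altivec, bool vsx)
{
  std::uint64_t flags = MASK_POWER8;
  if (altivec)
    flags |= MASK_ALTIVEC;
  if (altivec && vsx)
    flags |= MASK_VSX | MASK_DIRECT_MOVE;
  return flags;
}

// Condition before the fix: depends on a vector feature bit.
constexpr bool arch_pwr8_old(std::uint64_t flags)
{
  return (flags & MASK_DIRECT_MOVE) != 0;
}

// Condition after the fix: depends only on the CPU-level bit.
constexpr bool arch_pwr8_new(std::uint64_t flags)
{
  return (flags & MASK_POWER8) != 0;
}
```

In this model arch_pwr8_old is false for a power8 compile without VSX, while arch_pwr8_new stays true for every combination of the two vector switches.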

rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

This is a cleanup patch in preparation for fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR
and delete the now dead mask.

2024-04-09  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_P8_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mdirect-move): Remove Mask and Var.

(cherry picked from commit 7924e352523b37155ed9d76dc426701de9d11a22)

Daily bump.

c++: problematic assert in reference_binding [PR113141]

r14-9946 / r14-9947 fixed this PR properly for GCC 14.

For GCC 13, let's just remove the problematic assert.

PR c++/113141

gcc/cp/ChangeLog:

* call.cc (reference_binding): Remove badness criteria sanity
check in the recursive case.

gcc/testsuite/ChangeLog:

* g++.dg/conversion/ref12.C: New test.
* g++.dg/cpp0x/initlist-ref1.C: New test.

c++: alias CTAD and template template parm [PR114377]

To match all the other places that pull a _TEMPLATE_PARM out of a
_DECL (get_template_parm_index, etc.).

This change is too small to be legally significant for copyright.

PR c++/114377

gcc/cp/ChangeLog:

* pt.cc (find_template_parameter_info::found): Use TREE_TYPE for
TEMPLATE_DECL instead of DECL_INITIAL.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias19.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 801e82acd6b4f0cf863529875947e394899ea7b9)

c++: binding reference to comma expr [PR114561]

We represent a reference binding where the referent type is more qualified
by a ck_ref_bind around a ck_qual. We performed the ck_qual and then tried
to undo it with STRIP_NOPS, but that doesn't work if the conversion is
buried in COMPOUND_EXPR. So instead let's avoid performing that fake
conversion in the first place.

PR c++/114561
PR c++/114562

gcc/cp/ChangeLog:

* call.cc (convert_like_internal): Avoid adding qualification
conversion in direct reference binding.

gcc/testsuite/ChangeLog:

* g++.dg/conversion/ref10.C: New test.
* g++.dg/conversion/ref11.C: New test.

(cherry picked from commit 5d7e9a35024f065b25f61747859c7cb7a770c92b)

c++: __is_constructible ref binding [PR100667]

The requirement that a type argument be complete is excessive in the case of
direct reference binding to the same type, which does not rely on any
properties of the type. This is LWG 2939.

PR c++/100667

gcc/cp/ChangeLog:

* semantics.cc (same_type_ref_bind_p): New.
(finish_trait_expr): Use it.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_constructible8.C: New test.

(cherry picked from commit 8bb3ef3f6e335e8794590fb712a2661d11d21973)

Daily bump.

libstdc++: Do not apply localized formatting to NaN and inf [PR114863]

We don't want to add grouping to strings like "-inf", and there is no
radix character to replace either.

libstdc++-v3/ChangeLog:

PR libstdc++/114863
* include/std/format (__formatter_fp::format): Only use
_M_localized for finite values.
* testsuite/std/format/functions/format.cc: Check localized
formatting of NaN and infinity.

(cherry picked from commit 7501c0a397fcf609a1ff5f083746b6330b89ee11)

match.pd: Only merge truncation with conversion for -fno-signed-zeros

This optimisation does not honour signed zeros, so should not be
enabled except with -fno-signed-zeros.

gcc/ChangeLog:

* match.pd: Fix truncation pattern for -fno-signed-zeros.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/no_merge_trunc_signed_zero.c: New test.

(cherry picked from commit 7dd3b2b09cbeb6712ec680a0445cb0ad41070423)
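
A short sketch of why such a rewrite is unsafe when signed zeros matter. It assumes the pattern in question replaces a float-to-int-to-float round trip with a single truncation; the message above does not spell that out. The function names via_int and via_trunc are hypothetical. For inputs strictly between -1 and 0 the round trip yields +0.0, whereas trunc yields -0.0, so the two forms only agree when the sign of zero is ignored.

```cpp
#include <cmath>

// Round trip through int: the integer 0 has no sign, so the result
// for -0.5f is +0.0f.
float via_int(float x)
{
  return static_cast<float>(static_cast<int>(x));
}

// Direct truncation: trunc preserves the sign of its argument, so the
// result for -0.5f is -0.0f.
float via_trunc(float x)
{
  return std::trunc(x);
}
```

The two results compare equal with ==, so the difference is only observable through std::signbit or operations such as division by the result.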