gcc.gnu.org Git - gcc.git/log
11 months ago Fix avx512ne2ps2bf16 wrong code [PR 111127]
Hongyu Wang [Thu, 24 Aug 2023 06:41:42 +0000 (14:41 +0800)]
Fix avx512ne2ps2bf16 wrong code [PR 111127]

Correct the parameter order for avx512ne2ps2bf16_maskz expander

gcc/ChangeLog:

PR target/111127
* config/i386/sse.md (avx512f_cvtne2ps2bf16_<mode>_maskz):
Adjust parameter order.

gcc/testsuite/ChangeLog:

PR target/111127
* gcc.target/i386/pr111127.c: New test.

11 months ago Daily bump.
GCC Administrator [Fri, 25 Aug 2023 00:18:19 +0000 (00:18 +0000)]
Daily bump.

11 months ago i386: Optimize pinsrq of 0 with index 1 into movq [PR94866]
Uros Bizjak [Thu, 24 Aug 2023 20:23:52 +0000 (22:23 +0200)]
i386: Optimize pinsrq of 0 with index 1 into movq [PR94866]

Add new pattern involving vec_merge RTX that is produced by combine from the
combination of sse4_1_pinsrq and *movdi_internal:

    7: r86:DI=0
    8: r85:V2DI=vec_merge(vec_duplicate(r86:DI),r87:V2DI,0x2)
      REG_DEAD r87:V2DI
      REG_DEAD r86:DI
Successfully matched this instruction:
(set (reg:V2DI 85 [ a ])
    (vec_merge:V2DI (reg:V2DI 87)
        (const_vector:V2DI [
                (const_int 0 [0]) repeated x2
            ])
        (const_int 1 [0x1])))
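
As a rough illustration (not part of the patch), the source-level effect behind this pattern can be written with the GCC/Clang generic vector extension: inserting 0 at index 1 keeps the low 64-bit lane and clears the high one. The type name `v2di` and the helper `zero_upper_lane` are hypothetical, and no claim is made here about which instruction a given compiler emits for it.

```c
#include <assert.h>

/* Two 64-bit lanes in one 128-bit vector (GCC/Clang vector extension).  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Hypothetical helper: keep lane 0 and clear lane 1, which is the
   effect of inserting the value 0 at index 1.  */
v2di
zero_upper_lane (v2di a)
{
  a[1] = 0;
  return a;
}

/* Small self-check of the lane semantics.  */
void
check_zero_upper_lane (void)
{
  v2di r = zero_upper_lane ((v2di) { 7, 9 });
  assert (r[0] == 7 && r[1] == 0);
}
```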

PR target/94866

gcc/ChangeLog:

* config/i386/sse.md (*sse2_movq128_<mode>_1): New insn pattern.

gcc/testsuite/ChangeLog:

* g++.target/i386/pr94866.C: New test.

11 months ago Fix tests for PR 106537.
Jose E. Marchesi [Thu, 24 Aug 2023 15:10:52 +0000 (17:10 +0200)]
Fix tests for PR 106537.

This patch fixes the tests for PR 106537 (support for
-W[no]-compare-distinct-pointer-types) which were expecting the
warning when checking for equality/inequality of void pointers with
non-function pointers.
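
A minimal sketch of the case the tests now treat as valid: an equality comparison between a void pointer and an object pointer, for which no distinct-pointer-types warning is expected. The function name `same_object` is hypothetical and is not taken from the testsuite files.

```c
#include <assert.h>

/* Comparing a void pointer with a pointer to an object type is valid C,
   so no -Wcompare-distinct-pointer-types diagnostic is expected here.  */
int
same_object (void *p, int *q)
{
  return p == q;
}

/* Small self-check.  */
void
check_same_object (void)
{
  int x = 0;
  assert (same_object (&x, &x));
}
```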

gcc/testsuite/ChangeLog:

PR c/106537
* gcc.c-torture/compile/pr106537-1.c: Comparing void pointers to
non-function pointers is legit.
* gcc.c-torture/compile/pr106537-2.c: Likewise.

11 months ago analyzer: implement kf_strcat [PR105899]
David Malcolm [Thu, 24 Aug 2023 14:24:40 +0000 (10:24 -0400)]
analyzer: implement kf_strcat [PR105899]

gcc/analyzer/ChangeLog:
PR analyzer/105899
* call-details.cc
(call_details::check_for_null_terminated_string_arg): Split into
overloads, one taking just an arg_idx, the other a new
"include_terminator" param.
* call-details.h: Likewise.
* kf.cc (class kf_strcat): New.
(kf_strcpy::impl_call_pre): Update for change to
check_for_null_terminated_string_arg.
(register_known_functions): Register kf_strcat.
* region-model.cc
(region_model::check_for_null_terminated_string_arg): Split into
overloads, one taking just an arg_idx, the other a new
"include_terminator" param.  When returning an svalue, handle
"include_terminator" being false by subtracting one.
* region-model.h
(region_model::check_for_null_terminated_string_arg): Split into
overloads, one taking just an arg_idx, the other a new
"include_terminator" param.

gcc/ChangeLog:
PR analyzer/105899
* doc/invoke.texi (Static Analyzer Options): Add "strcat" to the
list of functions known to the analyzer.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/strcat-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: handle strlen(BITS_WITHIN) [PR105899]
David Malcolm [Thu, 24 Aug 2023 14:24:40 +0000 (10:24 -0400)]
analyzer: handle strlen(BITS_WITHIN) [PR105899]

gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (fragment::has_null_terminator): Handle
SK_BITS_WITHIN.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: handle INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) [PR105899]
David Malcolm [Thu, 24 Aug 2023 14:24:40 +0000 (10:24 -0400)]
analyzer: handle INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) [PR105899]

gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model-manager.cc
(region_model_manager::get_or_create_initial_value): Simplify
INIT_VAL(ELEMENT_REG(STRING_REG), CONSTANT_SVAL) to
CONSTANT_SVAL(STRING[N]).

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: handle strlen(INIT_VAL(STRING_REG)) [PR105899]
David Malcolm [Thu, 24 Aug 2023 14:24:39 +0000 (10:24 -0400)]
analyzer: handle strlen(INIT_VAL(STRING_REG)) [PR105899]

gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (fragment::has_null_terminator): Move STRING_CST
handling to fragment::string_cst_has_null_terminator; also use it to
handle INIT_VAL(STRING_REG).
(fragment::string_cst_has_null_terminator): New, from above.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/strcpy-3.c (test_2): New.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: reimplement kf_memcpy_memmove
David Malcolm [Thu, 24 Aug 2023 14:24:39 +0000 (10:24 -0400)]
analyzer: reimplement kf_memcpy_memmove

gcc/analyzer/ChangeLog:
* kf.cc (kf_memcpy_memmove::impl_call_pre): Reimplement using
region_model::copy_bytes.
* region-model.cc (region_model::read_bytes): New.
(region_model::copy_bytes): New.
* region-model.h (region_model::read_bytes): New decl.
(region_model::copy_bytes): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: eliminate region_model::get_string_size [PR105899]
David Malcolm [Thu, 24 Aug 2023 14:24:38 +0000 (10:24 -0400)]
analyzer: eliminate region_model::get_string_size [PR105899]

gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (region_model::get_string_size): Delete both.
* region-model.h (region_model::get_string_size): Delete both
decls.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: reimplement kf_strcpy [PR105899]
David Malcolm [Thu, 24 Aug 2023 14:24:38 +0000 (10:24 -0400)]
analyzer: reimplement kf_strcpy [PR105899]

This patch reimplements the analyzer's implementation of strcpy using
the region_model::scan_for_null_terminator infrastructure, so that e.g.
it can complain about out-of-bounds reads/writes, unterminated strings,
etc.
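
To make the bug classes named above concrete, here is a sketch of the runtime conditions involved: an unterminated source would make strcpy read out of bounds, and a too-small destination would make it write out of bounds. The helpers `is_terminated` and `checked_copy` are hypothetical; they are not analyzer code and do not show analyzer output.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: nonzero if BUF holds a NUL within its first
   SIZE bytes.  */
int
is_terminated (const char *buf, size_t size)
{
  return memchr (buf, '\0', size) != NULL;
}

/* Hypothetical helper: copy SRC (an array of SRC_SIZE bytes) into DST
   only when the copy is safe.  Returns 1 on success, 0 otherwise.  */
int
checked_copy (char *dst, size_t dst_size, const char *src, size_t src_size)
{
  if (!is_terminated (src, src_size))
    return 0;	/* strcpy would read past the end of SRC.  */
  if (strlen (src) + 1 > dst_size)
    return 0;	/* strcpy would write past the end of DST.  */
  strcpy (dst, src);
  return 1;
}

/* Small self-check.  */
void
check_checked_copy (void)
{
  char dst[4];
  assert (checked_copy (dst, sizeof dst, "ab", 3));
}
```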

gcc/analyzer/ChangeLog:
PR analyzer/105899
* kf.cc (kf_strcpy::impl_call_pre): Reimplement using
check_for_null_terminated_string_arg.
* region-model.cc (region_model::get_store_bytes): Shortcut
reading all of a string_region.
(region_model::scan_for_null_terminator): Use get_store_value for
the bytes rather than "unknown" when returning an unknown length.
(region_model::write_bytes): New.
* region-model.h (region_model::write_bytes): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/out-of-bounds-diagram-16.c: New test.
* gcc.dg/analyzer/strcpy-1.c: Add test coverage.
* gcc.dg/analyzer/strcpy-3.c: Likewise.
* gcc.dg/analyzer/strcpy-4.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: handle symbolic bindings in scan_for_null_terminator [PR105899]
David Malcolm [Thu, 24 Aug 2023 14:24:38 +0000 (10:24 -0400)]
analyzer: handle symbolic bindings in scan_for_null_terminator [PR105899]

gcc/analyzer/ChangeLog:
PR analyzer/105899
* region-model.cc (iterable_cluster::iterable_cluster): Add
symbolic binding keys to m_symbolic_bindings.
(iterable_cluster::has_symbolic_bindings_p): New.
(iterable_cluster::m_symbolic_bindings): New field.
(region_model::scan_for_null_terminator): Treat clusters with
symbolic bindings as having unknown strlen.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/sprintf-1.c: Include "analyzer-decls.h".
(test_strlen_1): New.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago analyzer: add logging to impl_path_context
David Malcolm [Thu, 24 Aug 2023 14:24:38 +0000 (10:24 -0400)]
analyzer: add logging to impl_path_context

gcc/analyzer/ChangeLog:
* engine.cc (impl_path_context::impl_path_context): Add logger
param.
(impl_path_context::bifurcate): Add log message.
(impl_path_context::terminate_path): Likewise.
(impl_path_context::m_logger): New field.
(exploded_graph::process_node): Pass logger to path_ctxt ctor.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months ago tree-optimization/111123 - indirect clobbers thrown away too early
Richard Biener [Thu, 24 Aug 2023 07:32:54 +0000 (09:32 +0200)]
tree-optimization/111123 - indirect clobbers thrown away too early

The testcase in the PR shows that late uninit diagnostic relies
on indirect clobbers in CTORs but we throw those away in the fab
pass which is too early.  The reasoning was they were supposed
to keep SSA names live but that's no longer the case since DCE
doesn't treat them as keeping SSA uses live.

The following instead removes them before out-of-SSA coalescing
which is the thing that's still affected by them.

PR tree-optimization/111123
* tree-ssa-ccp.cc (pass_fold_builtins::execute): Do not
remove indirect clobbers here ...
* tree-outof-ssa.cc (rewrite_out_of_ssa): ... but here.
(remove_indirect_clobbers): New function.

* g++.dg/warn/Wuninitialized-pr111123-1.C: New testcase.

11 months ago Check that passes do not forget to define profile
Jan Hubicka [Thu, 24 Aug 2023 13:10:46 +0000 (15:10 +0200)]
Check that passes do not forget to define profile

This patch extends the verifier to check that all probabilities and counts are
initialized if a profile is supposed to be present.  This is a bit complicated
by the possibility that we inline a !flag_guess_branch_probability function
into a function with a profile defined, in which case we need to stop
verification.  For this reason I added a flag to the cfg structure to track this.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

* cfg.h (struct control_flow_graph): New field full_profile.
* auto-profile.cc (afdo_annotate_cfg): Set full_profile to true.
* cfg.cc (init_flow): Set full_profile to false.
* graphite.cc (graphite_transform_loops): Set full_profile to false.
* lto-streamer-in.cc (input_cfg): Initialize full_profile flag.
* predict.cc (pass_profile::execute): Set full_profile to true.
* symtab-thunks.cc (expand_thunk): Set full_profile to true.
* tree-cfg.cc (gimple_verify_flow_info): Verify that profile is full
if full_profile is set.
* tree-inline.cc (initialize_cfun): Initialize full_profile.
(expand_call_inline): Combine full_profile.

11 months ago libstdc++: Add test for illegal pointer arithmetic in format [PR111102]
Paul Dreik [Thu, 24 Aug 2023 10:43:43 +0000 (11:43 +0100)]
libstdc++: Add test for illegal pointer arithmetic in format [PR111102]

libstdc++-v3/ChangeLog:

PR libstdc++/111102
* testsuite/std/format/string.cc: Check wide character format
strings with out-of-range widths.

11 months ago libstdc++: fix illegal pointer arithmetic in format [PR111102]
Paul Dreik [Thu, 24 Aug 2023 10:43:43 +0000 (11:43 +0100)]
libstdc++: fix illegal pointer arithmetic in format [PR111102]

When parsing a format string, the width is parsed into an unsigned short
but the result is not checked in the case the format string is not a
char string (such as a wide string). In case the parse fails, a null
pointer is returned which is used for pointer arithmetic which is
undefined behaviour.
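
The failure mode can be sketched in C as follows. This is an analogue with hypothetical names (`parse_width`, `consumed`), not the libstdc++ code, which is C++ and lives in `__format::__parse_integer`. The point is only that the null check has to happen before the returned pointer takes part in any arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical parser: read decimal digits from [first, last) into *out.
   Returns a pointer past the digits, or NULL if there are no digits or
   the value does not fit in an unsigned short.  */
const char *
parse_width (const char *first, const char *last, unsigned short *out)
{
  unsigned long value = 0;
  const char *p = first;
  while (p != last && *p >= '0' && *p <= '9')
    {
      value = value * 10 + (unsigned long) (*p - '0');
      if (value > USHRT_MAX)
	return NULL;
      ++p;
    }
  if (p == first)
    return NULL;
  *out = (unsigned short) value;
  return p;
}

/* Number of characters consumed, or -1 on failure.  The NULL check
   comes before the pointer subtraction.  */
long
consumed (const char *first, const char *last, unsigned short *out)
{
  const char *end = parse_width (first, last, out);
  if (end == NULL)
    return -1;
  return (long) (end - first);
}

/* Small self-check.  */
void
check_consumed (void)
{
  unsigned short w = 0;
  const char s[] = "7";
  assert (consumed (s, s + 1, &w) == 1 && w == 7);
}
```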

Signed-off-by: Paul Dreik <gccpatches@pauldreik.se>
libstdc++-v3/ChangeLog:

PR libstdc++/111102
* include/std/format (__format::__parse_integer): Check for
non-null pointer.

11 months ago libstdc++: Fix -Wunused-but-set-variable in std::format_to test
Jonathan Wakely [Thu, 24 Aug 2023 10:42:17 +0000 (11:42 +0100)]
libstdc++: Fix -Wunused-but-set-variable in std::format_to test

libstdc++-v3/ChangeLog:

* testsuite/std/format/functions/format_to.cc: Avoid warning for
unused variables.

11 months ago libstdc++: Tweak some preprocessor conditions for feature tests
Jonathan Wakely [Thu, 17 Aug 2023 23:13:51 +0000 (00:13 +0100)]
libstdc++: Tweak some preprocessor conditions for feature tests

Update a preprocessor condition using __cplusplus and _GLIBCXX_HOSTED
to use the relevant feature test macro for <syncstream>.

Also add comments to some conditions saying which C++ standard revision
the check corresponds to.

libstdc++-v3/ChangeLog:

* include/std/atomic: Add comment to #ifdef and fix indentation.
* include/std/ostream: Check __glibcxx_syncbuf instead of
__cplusplus and _GLIBCXX_HOSTED.
* include/std/thread: Add comment to #ifdef.

11 months ago libstdc++: Implement new SI prefixes in <ratio> for C++23 (P2734R0)
Jonathan Wakely [Wed, 23 Aug 2023 14:51:49 +0000 (15:51 +0100)]
libstdc++: Implement new SI prefixes in <ratio> for C++23 (P2734R0)

This is a no-op for libstdc++, because our intmax_t is a 64-bit type and
so is incapable of representing the largest and smallest ratios from
C++11, let alone the new ones. I've added them to the file anyway (and
defined the feature test macro) so that if somebody ports libstdc++ to a
target with 128-bit intmax_t then they'll be present.
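
A quick way to see why this is a no-op: the sketch below computes the largest power of ten that fits in intmax_t. With a 64-bit intmax_t (an assumption about the target, as stated above) the answer is 10^18, which is below zetta (10^21) and yotta (10^24) from C++11 and far below ronna (10^27) and quetta (10^30). The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest n such that 10^n is representable in intmax_t.  */
int
max_pow10_exponent (void)
{
  intmax_t value = 1;
  int n = 0;
  while (value <= INTMAX_MAX / 10)
    {
      value *= 10;
      ++n;
    }
  return n;
}

/* Small self-check: every intmax_t is at least 64 bits wide, so at
   least 10^18 must fit.  */
void
check_max_pow10_exponent (void)
{
  assert (max_pow10_exponent () >= 18);
}
```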

libstdc++-v3/ChangeLog:

* include/bits/version.def (__cpp_lib_ratio): Define.
* include/bits/version.h: Regenerate.
* include/std/ratio (quecto, ronto, yocto, zepto)
(zetta, yotta, ronna, quetta): Define.
* testsuite/20_util/ratio/operations/ops_overflow_neg.cc: Adjust
dg-error line numbers.

11 months ago Fix confusion about load_p in vect_build_slp_tree_1
Richard Biener [Thu, 24 Aug 2023 11:46:12 +0000 (13:46 +0200)]
Fix confusion about load_p in vect_build_slp_tree_1

load_p is set and used as to whether the stmt is a memory operation,
not whether it is only a load.  The following renames it to ldst_p
to avoid this confusion.  It also replaces checking for a VUSE
with checking STMT_VINFO_DATA_REF since VUSE checking doesn't
work for pattern matched stores where no virtual operands are
present.  Where we want to distinguish between loads and stores
we then check DR_IS_READ/WRITE.

I've made a classification mistake with .MASK_STORE support and
this hits other complications when dealing with single-lane SLP.

* tree-vect-slp.cc (vect_build_slp_tree_1): Rename
load_p to ldst_p, fix mistakes and rely on
STMT_VINFO_DATA_REF.

11 months ago libstdc++: Add pretty printer for std::locale
Jonathan Wakely [Wed, 23 Aug 2023 11:10:16 +0000 (12:10 +0100)]
libstdc++: Add pretty printer for std::locale

Print the locale's name, except when it uses the same named C locale for
all categories except one, in which case print something like:
std::locale = "en_GB.UTF-8" with "LC_CTYPE=en_US.UTF-8"

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdLocalePrinter): New
printer class.
* testsuite/libstdc++-prettyprinters/locale.cc: New test.

11 months ago libstdc++: Declutter std::optional and std::variant pretty printers [PR110944]
Jonathan Wakely [Tue, 22 Aug 2023 13:26:51 +0000 (14:26 +0100)]
libstdc++: Declutter std::optional and std::variant pretty printers [PR110944]

As the PR says, including the template arguments in the GDB output of
these class templates can result in very long names, especially for
std::variant. You can use 'whatis' or other GDB commands to get details
of the type; we don't need to include it in the value.

We could consider including the type if it's not too long, but I think
consistency is better (and we already omit the template arguments for
std::vector and other class templates).

libstdc++-v3/ChangeLog:

PR libstdc++/110944
* python/libstdcxx/v6/printers.py (StdExpOptionalPrinter): Do
not show template arguments.
(StdVariantPrinter): Likewise.
* testsuite/libstdc++-prettyprinters/compat.cc: Adjust expected
output.
* testsuite/libstdc++-prettyprinters/cxx17.cc: Likewise.
* testsuite/libstdc++-prettyprinters/libfundts.cc: Likewise.

11 months ago Fix profile update in gimple-harden-conditionals.cc
Jan Hubicka [Thu, 24 Aug 2023 11:46:10 +0000 (13:46 +0200)]
Fix profile update in gimple-harden-conditionals.cc

gcc/ChangeLog:

* gimple-harden-conditionals.cc (insert_check_and_trap): Set count
of newly built trap bb.

11 months ago RISC-V: Add COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS testcases
Juzhe-Zhong [Wed, 16 Aug 2023 13:20:10 +0000 (21:20 +0800)]
RISC-V: Add COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS testcases

This patch depends on the middle-end patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627621.html

We already had COND_LEN_FNMA/COND_LEN_FMS/COND_FNMS patterns.

Remove TARGET_PREFERRED_ELSE_VALUE since it forbids the COND_LEN_FMS/COND_LEN_FNMS STMT fold.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_preferred_else_value): Remove it since
it forbids COND_LEN_FMS/COND_LEN_FNMS STMT fold.
(TARGET_PREFERRED_ELSE_VALUE): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Adapt test.
* gcc.target/riscv/rvv/autovec/binop/vadd-rv64gcv-nofm.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-10.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-11.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-12.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-4.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-5.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-6.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-7.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-8.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-9.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-11.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-12.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm_run-9.c: New test.

11 months ago RISC-V: Enable pressure-aware scheduling by default.
Robin Dapp [Fri, 18 Aug 2023 13:57:16 +0000 (15:57 +0200)]
RISC-V: Enable pressure-aware scheduling by default.

This patch enables pressure-aware scheduling for riscv.  There have been
various requests for it so I figured I'd just go ahead and send
the patch.

There is some slight regression in code quality for a number of
vector tests where we spill more due to a different instruction order.
The ones I looked at were a mix of bad luck and/or brittle tests.
Comparing the size of the generated assembly or the number of vsetvls
for SPECint also didn't show any immediate benefit but that's obviously
not a very fine-grained analysis.

As cost and scheduling models mature I expect the situation to improve
and for now I think it's generally favorable to enable pressure-aware
scheduling so we can work with it rather than trying to find every
possible problem in advance.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add -fsched-pressure.
* config/riscv/riscv.cc (riscv_option_override): Set sched
pressure algorithm.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/narrow_constraint-1.c: Add
-fno-sched-pressure.
* gcc.target/riscv/rvv/base/narrow_constraint-17.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-18.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-19.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-20.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-21.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-22.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-23.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-24.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-25.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-26.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-27.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-28.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-29.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-30.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-31.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-4.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-5.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-8.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-9.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-9.c: Ditto.

11 months ago RISC-V: Allow const 17-31 for vector shift.
Robin Dapp [Fri, 18 Aug 2023 14:16:54 +0000 (16:16 +0200)]
RISC-V: Allow const 17-31 for vector shift.

This patch adds a missing constraint in order to be able to print (and
not ICE) vector immediates 17-31 for vector shifts.

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand): Allow vk operand.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/shift-immediate.c: New test.

11 months ago RISC-V: Add missing conversion tests.
Robin Dapp [Wed, 12 Jul 2023 11:55:51 +0000 (13:55 +0200)]
RISC-V: Add missing conversion tests.

This adds some missing tests for vf[nw]cvt.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-run.c:
Add tests.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv32gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv64gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-template.h:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv32gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-rv64gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-template.h:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfncvt-itof-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-rv32gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-rv64gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-template.h:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-ftoi-zvfh-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-rv32gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-rv64gcv.c:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-template.h:
Ditto.
* gcc.target/riscv/rvv/autovec/conversions/vfwcvt-itof-zvfh-run.c:
Ditto.

11 months ago RISC-V: Fix reduc_strict_run-1 test case.
Robin Dapp [Tue, 15 Aug 2023 15:15:58 +0000 (17:15 +0200)]
RISC-V: Fix reduc_strict_run-1 test case.

This patch fixes the reduc_strict_run-1 testcase by introducing
a variable that holds the reference result.  This is necessary
because in presence of _Float16 emulation an intermediate
result used in a comparison is computed in higher precision.
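
The effect can be shown by analogy with float and double instead of _Float16 and its emulation type. This is an assumption made so the sketch runs on any target; the function names are hypothetical. A sum kept in the wider type differs from the same sum after it has been stored in a variable of the narrower type, so a reference result has to be stored the same way before it is compared.

```c
#include <assert.h>

/* The sum, kept in the wider type.  */
double
sum_wide (float a, float b)
{
  return (double) a + (double) b;
}

/* The same sum, stored in a float variable, which rounds it to float
   precision.  */
float
sum_stored (float a, float b)
{
  float result = (float) ((double) a + (double) b);
  return result;
}

/* Small self-check: 2^24 + 1 is exact in double but not in float.  */
void
check_sum_precision (void)
{
  assert (sum_wide (16777216.0f, 1.0f) != (double) sum_stored (16777216.0f, 1.0f));
}
```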

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c:
Add variable to hold reference result.

11 months ago tree-optimization/111125 - avoid BB vectorization in novector loops
Richard Biener [Thu, 24 Aug 2023 09:10:43 +0000 (11:10 +0200)]
tree-optimization/111125 - avoid BB vectorization in novector loops

When a loop is marked with

  #pragma GCC novector

the following makes sure to also skip BB vectorization for contained
blocks.  That avoids gcc.dg/vect/bb-slp-29.c failing on aarch64
because of extra BB vectorization therein.  I'm not specifically
dealing with sub-loops of novector loops; the desired semantics
aren't documented.
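
For reference, a loop marked this way looks like the sketch below. The function name is hypothetical. Compilers that do not know the pragma are expected to ignore it, possibly with an unknown-pragma warning, and the computed result is the same either way.

```c
#include <assert.h>

/* Sum N elements of A in a loop that asks GCC not to vectorize it.  */
int
sum_no_vector (const int *a, int n)
{
  int sum = 0;
#pragma GCC novector
  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* Small self-check.  */
void
check_sum_no_vector (void)
{
  int a[] = { 1, 2, 3 };
  assert (sum_no_vector (a, 3) == 6);
}
```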

PR tree-optimization/111125
* tree-vect-slp.cc (vect_slp_function): Split at novector
loop entry, do not push blocks in novector loops.

11 months ago c: Add support for [[__extension__ ...]]
Richard Sandiford [Thu, 24 Aug 2023 10:49:58 +0000 (11:49 +0100)]
c: Add support for [[__extension__ ...]]

[[]] attributes are a recent addition to C, but as a GNU extension,
GCC allows them to be used in C11 and earlier.  Normally this use
would trigger a pedwarn (for -pedantic, -Wc11-c2x-compat, etc.).

This patch allows the pedwarn to be suppressed by starting the
attribute-list with __extension__.

Also, :: is not a single lexing token prior to C2X, so it wasn't
possible to use scoped attributes in C11, even as a GNU extension.
The patch allows two colons to be used in place of :: when
__extension__ is used.  No attempt is made to check whether the
two colons are immediately adjacent.
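
A hedged usage sketch follows. It assumes the construct is available in GCC 14 and later (inferred from the commit date, not stated in the message) and falls back to the older `__attribute__` spelling elsewhere. The macro `ATTR_UNUSED` and the function names are hypothetical.

```c
#include <assert.h>

/* Assumption: [[__extension__ ...]] is accepted by GCC 14 and later.
   Clang and older GCC releases take the fallback spelling.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 14
# define ATTR_UNUSED [[__extension__ gnu::unused]]
#else
# define ATTR_UNUSED __attribute__ ((unused))
#endif

/* A scoped attribute in pre-C2X code, without the pedwarn.  */
ATTR_UNUSED static int
helper (int x)
{
  return x + 1;
}

int
call_helper (int x)
{
  return helper (x);
}

/* Small self-check.  */
void
check_call_helper (void)
{
  assert (call_helper (0) == 1);
}
```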

gcc/
* doc/extend.texi: Document the C [[__extension__ ...]] construct.

gcc/c/
* c-parser.cc (c_parser_std_attribute): Conditionally allow
two colons to be used in place of ::.
(c_parser_std_attribute_list): New function, split out from...
(c_parser_std_attribute_specifier): ...here.  Allow the attribute-list
to start with __extension__.  When it does, also allow two colons
to be used in place of ::.

gcc/testsuite/
* gcc.dg/c2x-attr-syntax-6.c: New test.
* gcc.dg/c2x-attr-syntax-7.c: Likewise.

11 months ago gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold
Juzhe-Zhong [Tue, 22 Aug 2023 01:58:34 +0000 (09:58 +0800)]
gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

Hi, Richard and Richi.

Currently, GCC supports COND_LEN_FMA for floating-point without -ffast-math.
It's supported in tree-ssa-math-opts.cc. However, GCC fails to support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.

Consider the following case:
  __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst,            \
      TYPE *__restrict a,              \
      TYPE *__restrict b, int n)       \
  {                                                                            \
    for (int i = 0; i < n; i++)                                                \
      dst[i] -= a[i] * b[i];                                           \
  }

  TEST_TYPE (float)                                                            \

TEST_ALL ()

Gimple IR for RVV:

...
_39 = -vect__8.14_26;
vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, vect__4.8_34, vect__4.8_34, _46, 0);
...

This is because of the following piece of code in tree-ssa-math-opts.cc:

      if (len)
        fma_stmt
          = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2,
                                        addop, else_value, len, bias);
      else if (cond)
        fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
                                               op2, addop, else_value);
      else
        fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
      gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
      gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
                                                                   use_stmt));
      gsi_replace (&gsi, fma_stmt, true);
      /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
         regardless of where the negation occurs.  */
      gimple *orig_stmt = gsi_stmt (gsi);
      if (fold_stmt (&gsi, follow_all_ssa_edges))
        {
          if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
            gcc_unreachable ();
          update_stmt (gsi_stmt (gsi));
        }

'fold_stmt' fails to fold NEGATE_EXPR + COND_LEN_FMA into COND_LEN_FNMA.

This patch supports folding the statement into:

vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, vect__4.8_34, { 0.0, ... }, _46, 0);

Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments.
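
The identity behind the fold can be checked with a scalar model. The sketch below is an illustration under assumed semantics, not GCC code: lane i is treated as active when i < len + bias and cond[i] is set, inactive lanes take the else value, and FNMA computes -(a * b) + c. The function names and the extra `out` and `lanes` parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Scalar model of the seven-operand conditional FMA: a * b + c on
   active lanes, else_value on the others.  */
void
cond_len_fma (const bool *cond, const float *a, const float *b,
	      const float *c, const float *else_value, int len, int bias,
	      float *out, int lanes)
{
  for (int i = 0; i < lanes; i++)
    out[i] = (i < len + bias && cond[i]) ? a[i] * b[i] + c[i] : else_value[i];
}

/* Scalar model of the conditional FNMA: -(a * b) + c on active lanes.  */
void
cond_len_fnma (const bool *cond, const float *a, const float *b,
	       const float *c, const float *else_value, int len, int bias,
	       float *out, int lanes)
{
  for (int i = 0; i < lanes; i++)
    out[i] = (i < len + bias && cond[i]) ? -(a[i] * b[i]) + c[i] : else_value[i];
}

/* Self-check: FMA with a negated multiplicand matches FNMA.  */
void
check_fnma_fold (void)
{
  bool cond[2] = { true, true };
  float a[2] = { 1, 2 }, b[2] = { 3, 4 }, nb[2] = { -3, -4 };
  float c[2] = { 10, 10 }, e[2] = { 0, 0 }, r1[2], r2[2];
  cond_len_fma (cond, a, nb, c, e, 2, 0, r1, 2);
  cond_len_fnma (cond, a, b, c, e, 2, 0, r2, 2);
  assert (r1[0] == r2[0] && r1[1] == r2[1]);
}
```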

Extend maximum num ops:
-  static const unsigned int MAX_NUM_OPS = 5;
+  static const unsigned int MAX_NUM_OPS = 7;

Bootstrap and Regtest on X86 passed.
Tested on aarch64 Qemu.

Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend.

gcc/ChangeLog:

* genmatch.cc (decision_tree::gen): Support
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
* gimple-match-exports.cc (gimple_simplify): Ditto.
(gimple_resimplify6): New function.
(gimple_resimplify7): New function.
(gimple_match_op::resimplify): Support
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
(convert_conditional_op): Ditto.
(build_call_internal): Ditto.
(try_conditional_simplification): Ditto.
(gimple_extract): Ditto.
* gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto.
* internal-fn.cc (CASE): Ditto.

11 months ago tree-optimization/111115 - SLP of masked stores
Richard Biener [Wed, 23 Aug 2023 12:28:26 +0000 (14:28 +0200)]
tree-optimization/111115 - SLP of masked stores

The following adds the capability to do SLP on .MASK_STORE, I do not
plan to add interleaving support.

PR tree-optimization/111115
gcc/
* tree-vectorizer.h (vect_slp_child_index_for_operand): New.
* tree-vect-data-refs.cc (can_group_stmts_p): Also group
.MASK_STORE.
* tree-vect-slp.cc (arg3_arg2_map): New.
(vect_get_operand_map): Handle IFN_MASK_STORE.
(vect_slp_child_index_for_operand): New function.
(vect_build_slp_tree_1): Handle statements with no LHS,
masked store ifns.
(vect_remove_slp_scalar_calls): Likewise.
* tree-vect-stmts.cc (vect_check_store_rhs): Lookup the
SLP child corresponding to the ifn value index.
(vectorizable_store): Likewise for the mask index.  Support
masked stores.
(vectorizable_load): Lookup the SLP child corresponding to the
ifn mask index.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_masked_store):
Supported with check_avx_available.
* gcc.dg/vect/slp-mask-store-1.c: New testcase.

11 months ago tree-optimization/111125 - properly cost BB reduction remain stmt handling
Richard Biener [Thu, 24 Aug 2023 08:30:12 +0000 (10:30 +0200)]
tree-optimization/111125 - properly cost BB reduction remain stmt handling

We assume that all root stmts which compose the total reduction chain
are vectorized but fail to account for the cost of adding back the
scalar defs we are not vectorizing.  The following rectifies this,
fixing the gcc.dg/tree-ssa/slsr-11.c FAIL on aarch64.

PR tree-optimization/111125
* tree-vect-slp.cc (vectorizable_bb_reduc_epilogue): Account
for the remain_defs processing.

11 months ago aarch64: Account for different Advanced SIMD fusing options
Richard Sandiford [Thu, 24 Aug 2023 09:18:05 +0000 (10:18 +0100)]
aarch64: Account for different Advanced SIMD fusing options

The scalar FNMADD/FNMSUB and SVE FNMLA/FNMLS instructions mean
that either side of a subtraction can start an accumulator chain.
However, Advanced SIMD doesn't have an equivalent instruction.
This means that, for Advanced SIMD, a subtraction can only be
fused if the second operand is a multiplication.

Also, if both sides of a subtraction are multiplications,
and if the second operand is used multiple times, such as:

     c * d - a * b
     e * f - a * b

then the first rather than second multiplication operand will tend
to be fused.  On Advanced SIMD, this leads to:

     tmp1 = a * b
     tmp2 = -tmp1
      ... = tmp2 + c * d   // FMLA
      ... = tmp2 + e * f   // FMLA

where one of the FMLAs also requires a MOV.
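
The two shapes can be written out in scalar C to show that they compute the same values and differ only in which multiplication ends up next to the add or subtract. This is only a sketch with hypothetical function names; it says nothing about which instructions a compiler selects.

```c
#include <assert.h>

/* First multiplication fused: the shared product a * b is computed and
   negated once, then used as the accumulator (the FMLA shape).  */
void
fuse_first (double a, double b, double c, double d, double e, double f,
	    double *r1, double *r2)
{
  double tmp1 = a * b;
  double tmp2 = -tmp1;
  *r1 = tmp2 + c * d;
  *r2 = tmp2 + e * f;
}

/* Second multiplication fused: c * d and e * f are computed first and
   a * b is subtracted from each accumulator.  */
void
fuse_second (double a, double b, double c, double d, double e, double f,
	     double *r1, double *r2)
{
  double cd = c * d;
  double ef = e * f;
  *r1 = cd - a * b;
  *r2 = ef - a * b;
}

/* Small self-check with exactly representable values.  */
void
check_fuse (void)
{
  double p, q, r, s;
  fuse_first (2, 3, 4, 5, 6, 7, &p, &q);
  fuse_second (2, 3, 4, 5, 6, 7, &r, &s);
  assert (p == r && q == s);
}
```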

This patch tries to account for this in the vector cost model.
It improves roms performance by 2-3% on Neoverse V1.  It's also
needed to avoid a regression in fotonik for Neoverse N2 and
Neoverse V2 with the patch for PR110625.

gcc/
* config/aarch64/aarch64.cc: Include ssa.h.
(aarch64_multiply_add_p): Require the second operand of an
Advanced SIMD subtraction to be a multiplication.  Assume that
such an operation won't be fused if the second operand is used
multiple times and if the first operand is also a multiplication.

gcc/testsuite/
* gcc.target/aarch64/neoverse_v1_2.c: New test.
* gcc.target/aarch64/neoverse_v1_3.c: Likewise.

11 months ago VECT: Apply LEN_FOLD_EXTRACT_LAST into loop vectorizer
Juzhe-Zhong [Thu, 24 Aug 2023 02:08:36 +0000 (10:08 +0800)]
VECT: Apply LEN_FOLD_EXTRACT_LAST into loop vectorizer

Hi.

This patch applies LEN_FOLD_EXTRACT_LAST in the loop vectorizer.

Consider the following case:

/* Simple condition reduction.  */

int __attribute__ ((noinline, noclone))
condition_reduction (int *a, int min_v)
{
  int last = 66; /* High start value.  */

  for (int i = 0; i < N; i++)
    if (a[i] < min_v)
      last = i;

  return last;
}

With this patch, we can generate the following IR:

  _44 = .SELECT_VL (ivtmp_42, POLY_INT_CST [4, 4]);
  _34 = vect_vec_iv_.5_33 + { POLY_INT_CST [4, 4], ... };
  ivtmp_36 = _44 * 4;
  vect__4.8_39 = .MASK_LEN_LOAD (vectp_a.6_37, 32B, { -1, ... }, _44, 0);

  mask__11.9_41 = vect__4.8_39 < vect_cst__40;
  last_5 = .LEN_FOLD_EXTRACT_LAST (last_14, mask__11.9_41, vect_vec_iv_.5_33, _44, 0);
  ...
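
For readers unfamiliar with the internal function, here is a scalar model of what one .LEN_FOLD_EXTRACT_LAST call is understood to compute.  It is a sketch under assumptions, not GCC code: the name len_fold_extract_last_model is hypothetical and the trailing bias operand is taken to be 0, as in the IR above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar model (an assumption, not GCC code): return the last element of
   VEC whose MASK lane is set among the first LEN lanes, or ELSE_VAL when
   no such lane exists.  The bias operand is assumed to be 0.  */
int
len_fold_extract_last_model (int else_val, const bool *mask,
                             const int *vec, size_t len)
{
  int result = else_val;
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      result = vec[i];
  return result;
}
```

In the condition reduction above, else_val plays the role of the running value of last, mask is the a[i] < min_v comparison and vec holds the induction variable values.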

gcc/ChangeLog:

* tree-vect-loop.cc (vectorizable_reduction): Apply
LEN_FOLD_EXTRACT_LAST.
* tree-vect-stmts.cc (vectorizable_condition): Ditto.

11 months agotree-optimization/111128 - fix shift pattern recog
Richard Biener [Thu, 24 Aug 2023 08:00:20 +0000 (10:00 +0200)]
tree-optimization/111128 - fix shift pattern recog

The following fixes placement of shift operand sanitization with
MIN when the original shift operand was external but the actual
one is not.

PR tree-optimization/111128
* tree-vect-patterns.cc (vect_recog_over_widening_pattern):
Emit external shift operand inline if we promoted it with
another pattern stmt.

* gcc.dg/torture/pr111128.c: New testcase.

11 months agotestsuite/111125 - disable BB vectorization for the test
Richard Biener [Thu, 24 Aug 2023 08:55:06 +0000 (10:55 +0200)]
testsuite/111125 - disable BB vectorization for the test

The test is for loop vectorization producing non-canonical
multiplications.  We can now BB vectorize the whole function
when the target supports .REDUC_PLUS for V2SImode but we don't
have a dejagnu selector for that.  Disable BB vectorization
like we disabled epilogue vectorization.

PR testsuite/111125
* gcc.dg/vect/pr53773.c: Disable BB vectorization.

11 months agoRISC-V: Fix one typo in autovec.md pattern comment
Pan Li [Thu, 24 Aug 2023 08:05:55 +0000 (16:05 +0800)]
RISC-V: Fix one typo in autovec.md pattern comment

vfmsac => vfnmacc
vfmsub => vfnmadd

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/autovec.md: Fix typo.

11 months agoRISC-V: Refactor RVV class by frm_op_type template arg
Pan Li [Fri, 18 Aug 2023 02:43:13 +0000 (10:43 +0800)]
RISC-V: Refactor RVV class by frm_op_type template arg

As suggested by kito, add a new frm_op_type template arg
to the op class to avoid duplicating the expand functions.

Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc
(class binop_frm): Removed.
(class reverse_binop_frm): Ditto.
(class widen_binop_frm): Ditto.
(class vfmacc_frm): Ditto.
(class vfnmacc_frm): Ditto.
(class vfmsac_frm): Ditto.
(class vfnmsac_frm): Ditto.
(class vfmadd_frm): Ditto.
(class vfnmadd_frm): Ditto.
(class vfmsub_frm): Ditto.
(class vfnmsub_frm): Ditto.
(class vfwmacc_frm): Ditto.
(class vfwnmacc_frm): Ditto.
(class vfwmsac_frm): Ditto.
(class vfwnmsac_frm): Ditto.
(class unop_frm): Ditto.
(class vfrec7_frm): Ditto.
(class binop): Add frm_op_type template arg.
(class unop): Ditto.
(class widen_binop): Ditto.
(class widen_binop_fp): Ditto.
(class reverse_binop): Ditto.
(class vfmacc): Ditto.
(class vfnmsac): Ditto.
(class vfmadd): Ditto.
(class vfnmsub): Ditto.
(class vfnmacc): Ditto.
(class vfmsac): Ditto.
(class vfnmadd): Ditto.
(class vfmsub): Ditto.
(class vfwmacc): Ditto.
(class vfwnmacc): Ditto.
(class vfwmsac): Ditto.
(class vfwnmsac): Ditto.
(class float_misc): Ditto.

11 months agoMATCH: [PR111109] Fix bit_ior(cond,cond) when comparisons are fp
Andrew Pinski [Wed, 23 Aug 2023 16:46:10 +0000 (16:46 +0000)]
MATCH: [PR111109] Fix bit_ior(cond,cond) when comparisons are fp

The patterns that were added in r13-4620-g4d9db4bdd458 missed that
(a > b) and (a <= b) are not inverses of each other for floating-point
comparisons (if NaNs are supported).  Even though there was a check for
integral types, it was only for the result of the cond rather than for the
type of what is being compared.  The fix is to check whether cmp and
icmp are inverses of each other by using the invert_tree_comparison function.
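
A small C illustration of the underlying fact (not the new testcase; the function name gt_or_le is made up for this sketch): with a NaN operand both comparisons are false, so their "or" is not always true.

```c
#include <math.h>
#include <stdbool.h>

/* With a NaN operand every ordered comparison is false, so this "or" of
   two apparently complementary tests is not always true.  */
bool
gt_or_le (double a, double b)
{
  return (a > b) | (a <= b);
}
```

For ordinary operands gt_or_le returns true; for gt_or_le (NAN, 1.0) it returns false, assuming the code is compiled with NaNs honored (no -ffast-math or similar).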

OK for trunk and GCC 13 branch? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

I added the testcase to execute/ieee as it requires support for NAN.

PR tree-optimization/111109

gcc/ChangeLog:

* match.pd (ior(cond,cond), ior(vec_cond,vec_cond)):
Add check to make sure cmp and icmp are inverse.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/ieee/fp-cmp-cond-1.c: New test.

11 months agoMATCH: remove negate for 1bit types
Andrew Pinski [Wed, 23 Aug 2023 01:41:56 +0000 (18:41 -0700)]
MATCH: remove negate for 1bit types

For 1bit types, negate is either undefined or doesn't change the value.
In either case we want to remove it.
This patch adds a match pattern to do that.
Also, when converting to a 1bit type we can remove the negate just like we already do
for `&1`, so this patch adds that too.
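
The value-preserving case can be checked with a short C sketch (names are hypothetical, and this is not one of the new testcases): negating an unsigned 1-bit field, or masking a negated value with 1, gives back the original low bit.

```c
struct flag
{
  unsigned int bit : 1;   /* holds only 0 or 1 */
};

/* Negating a 1-bit unsigned field and storing it back keeps its value:
   -0 is 0, and -1 reduced modulo 2 is 1.  */
unsigned int
negate_1bit (unsigned int v)
{
  struct flag f;
  f.bit = v & 1u;
  f.bit = -f.bit;
  return f.bit;
}

/* The already-handled `&1` form: the low bit of -v equals the low bit
   of v for unsigned arithmetic.  */
unsigned int
neg_low_bit (unsigned int v)
{
  return (-v) & 1u;
}
```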

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Notes on the testcases:
This patch is the last part to fix PR 95929; cond-bool-2.c testcase.
bit1neg-1.c is a 1bit-field testcase where we could remove the assignment
all the way in one case (which happened on the RTL level for some targets but not all).
cond-bool-2.c is the reduced testcase of PR 95929.

PR tree-optimization/95929

gcc/ChangeLog:

* match.pd (convert?(-a)): New pattern
for 1bit integer types.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bit1neg-1.c: New test.
* gcc.dg/tree-ssa/cond-bool-1.c: New test.
* gcc.dg/tree-ssa/cond-bool-2.c: New test.

11 months agoRevert "Initial support for AVX10.1"
Haochen Jiang [Thu, 24 Aug 2023 06:38:38 +0000 (14:38 +0800)]
Revert "Initial support for AVX10.1"

This reverts commit 11ad44da01dd1c91c96e45802fd8b1c50e88703f.

11 months agoRevert "Emit a warning when disabling AVX512 with AVX10 enabled or disabling AVX10...
Haochen Jiang [Thu, 24 Aug 2023 06:38:18 +0000 (14:38 +0800)]
Revert "Emit a warning when disabling AVX512 with AVX10 enabled or disabling AVX10 with AVX512 enabled"

This reverts commit 0288ab14732a16b3787546cdd159941eb7306cf3.

11 months agoRevert "Emit a warning when AVX10 options conflict in vector width"
Haochen Jiang [Thu, 24 Aug 2023 06:38:01 +0000 (14:38 +0800)]
Revert "Emit a warning when AVX10 options conflict in vector width"

This reverts commit 26a820dc136b00b4dc37609429576b6a914cb572.

11 months agoRevert "Support AVX10.1 for AVX512DQ+AVX512VL intrins"
Haochen Jiang [Thu, 24 Aug 2023 06:37:41 +0000 (14:37 +0800)]
Revert "Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 2485dd9b4e219307f00d683077bbaf5a2add6604.

11 months agoRevert "Support AVX10.1 for AVX512DQ+AVX512VL intrins"
Haochen Jiang [Thu, 24 Aug 2023 06:37:07 +0000 (14:37 +0800)]
Revert "Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 1c3c405ecf23aeb3a2976350887bf2238719c71f.

11 months agoRevert "[Patch 3/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"
Haochen Jiang [Thu, 24 Aug 2023 06:36:48 +0000 (14:36 +0800)]
Revert "[Patch 3/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit d14ab07ee91de0ebf80b73a22c4a23ecf2a2572e.

11 months agoRevert "[Patch 4/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"
Haochen Jiang [Thu, 24 Aug 2023 06:36:32 +0000 (14:36 +0800)]
Revert "[Patch 4/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit aba10895052fcb2ab3c6d53ad98c855509877555.

11 months agoRevert "[Patch 5/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"
Haochen Jiang [Thu, 24 Aug 2023 06:35:55 +0000 (14:35 +0800)]
Revert "[Patch 5/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 0b20e0f17b47a86cddba68a2e016be0132ae9b0a.

11 months agoRevert "[Patch 6/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"
Haochen Jiang [Thu, 24 Aug 2023 06:35:26 +0000 (14:35 +0800)]
Revert "[Patch 6/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins"

This reverts commit 5ccdfd0870be168031f8902e1039e77be93b131a.

11 months agoRevert "i386: Add AVX2 pragma wrapper for AVX512DQVL intrins"
Haochen Jiang [Thu, 24 Aug 2023 06:35:03 +0000 (14:35 +0800)]
Revert "i386: Add AVX2 pragma wrapper for AVX512DQVL intrins"

This reverts commit 68f7cb6cf9e8b9f2254855507f3b479552adda5f.

11 months agodebug/111080 - avoid outputting debug info for unused restrict qualified type
Richard Biener [Mon, 21 Aug 2023 08:34:30 +0000 (10:34 +0200)]
debug/111080 - avoid outputting debug info for unused restrict qualified type

The following applies some maintenance with respect to type qualifiers
and kinds added by later DWARF standards to prune_unused_types_walk.
The particular case in the bug is not handling (thus marking required)
all restrict qualified type DIEs.  I've found more DW_TAG_*_type that
are unhandled, looked up the DWARF docs and added them as well based
on common sense.

PR debug/111080
* dwarf2out.cc (prune_unused_types_walk): Handle
DW_TAG_restrict_type, DW_TAG_shared_type, DW_TAG_atomic_type,
DW_TAG_immutable_type, DW_TAG_coarray_type, DW_TAG_unspecified_type
and DW_TAG_dynamic_type as to only output them when referenced.

* gcc.dg/debug/dwarf2/pr111080.c: New testcase.

11 months agoAdjust GCC V13 to GCC 13.1 in diagnostic.
liuhongt [Tue, 22 Aug 2023 23:31:13 +0000 (07:31 +0800)]
Adjust GCC V13 to GCC 13.1 in diagnostic.

gcc/ChangeLog:

* config/i386/i386.cc (ix86_invalid_conversion): Adjust GCC
V13 to GCC 13.1.

11 months agoFix target_clone ("arch=graniterapids-d") and target_clone ("arch=arrowlake-s")
liuhongt [Tue, 22 Aug 2023 10:18:31 +0000 (18:18 +0800)]
Fix target_clone ("arch=graniterapids-d") and target_clone ("arch=arrowlake-s")

Both "graniterapids-d" and "graniterapids" are attached to
PROCESSOR_GRANITERAPIDS in processor_alias_table but mapped to
different __cpu_subtype values in get_intel_cpu.

get_builtin_code_for_version will try to match the first
PROCESSOR_GRANITERAPIDS entry in processor_alias_table, which maps to
"graniterapids" here.

    else if (new_target->arch_specified && new_target->arch > 0)
      for (i = 0; i < pta_size; i++)
        if (processor_alias_table[i].processor == new_target->arch)
          {
            const pta *arch_info = &processor_alias_table[i];
            switch (arch_info->priority)
              {
              default:
                arg_str = arch_info->name;

This mismatch makes dispatch_function_versions check the predicate
__builtin_cpu_is ("graniterapids") for "graniterapids-d" and causes
the issue.
The patch explicitly adds PROCESSOR_ARROWLAKE_S and
PROCESSOR_GRANITERAPIDS_D to make a distinction.
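
The lookup problem can be modelled with a heavily reduced sketch.  Everything below (the enum, the two tables, round_trip) is a hypothetical stand-in for illustration and not the real i386 data structures: when two names share one processor value, mapping the processor back to a name always yields the first entry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-ins for the real tables.  */
enum proc { PROC_GRANITERAPIDS, PROC_GRANITERAPIDS_D };

struct alias { const char *name; enum proc processor; };

/* Before: both names share one processor value.  */
const struct alias shared_table[] = {
  { "graniterapids",   PROC_GRANITERAPIDS },
  { "graniterapids-d", PROC_GRANITERAPIDS },
};

/* After: each name has its own processor value.  */
const struct alias distinct_table[] = {
  { "graniterapids",   PROC_GRANITERAPIDS },
  { "graniterapids-d", PROC_GRANITERAPIDS_D },
};

/* Return the name of the first entry whose processor matches ARCH.  */
const char *
first_name_for (const struct alias *table, size_t n, enum proc arch)
{
  for (size_t i = 0; i < n; i++)
    if (table[i].processor == arch)
      return table[i].name;
  return NULL;
}

/* Look NAME up to get its processor, then map that back to a name.
   With the shared table this round trip loses the "-d" suffix.  */
const char *
round_trip (const struct alias *table, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (table[i].name, name) == 0)
      return first_name_for (table, n, table[i].processor);
  return NULL;
}
```

With shared_table, round_trip for "graniterapids-d" comes back as "graniterapids"; with distinct_table the name survives.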

"alderlake", "raptorlake" and "meteorlake" share the same ISA, cost and
tuning, and are mapped to the same __cpu_type/__cpu_subtype in
get_intel_cpu, so there is no need to add PROCESSOR_RAPTORLAKE and others.

gcc/ChangeLog:

* common/config/i386/i386-common.cc (processor_names): Add new
member graniterapids-d and arrowlake-s.
* config/i386/i386-options.cc (processor_alias_table): Update
table with PROCESSOR_ARROWLAKE_S and
PROCESSOR_GRANITERAPIDS_D.
(m_GRANITERAPIDS_D): New macro.
(m_ARROWLAKE_S): Ditto.
(m_CORE_AVX512): Add m_GRANITERAPIDS_D.
(processor_cost_table): Add icelake_cost for
PROCESSOR_GRANITERAPIDS_D and alderlake_cost for
PROCESSOR_ARROWLAKE_S.
* config/i386/x86-tune.def: Handle m_ARROWLAKE_S same as
m_ARROWLAKE.
* config/i386/i386.h (enum processor_type): Add new member
PROCESSOR_GRANITERAPIDS_D and PROCESSOR_ARROWLAKE_S.
* config/i386/i386-c.cc (ix86_target_macros_internal): Handle
PROCESSOR_GRANITERAPIDS_D and PROCESSOR_ARROWLAKE_S

11 months agotestsuite: Xfail gcc.dg/tree-ssa/update-threading.c for CRIS, PR110628
Hans-Peter Nilsson [Thu, 24 Aug 2023 00:55:41 +0000 (02:55 +0200)]
testsuite: Xfail gcc.dg/tree-ssa/update-threading.c for CRIS, PR110628

* gcc.dg/tree-ssa/update-threading.c: Xfail for cris-*-*.

11 months agoDaily bump.
GCC Administrator [Thu, 24 Aug 2023 00:18:18 +0000 (00:18 +0000)]
Daily bump.

11 months agoImprove quality of code from LRA register elimination
Jivan Hakobyan [Wed, 23 Aug 2023 20:10:30 +0000 (14:10 -0600)]
Improve quality of code from LRA register elimination

This is primarily Jivan's work, I'm mostly responsible for the write-up and
coordinating with Vlad on a few questions.

On targets with limitations on immediates usable in arithmetic instructions,
LRA's register elimination phase can construct fairly poor code.

This example (from the GCC testsuite) illustrates the problem well.

int  consume (void *);
int foo (void) {
  int x[1000000];
  return consume (x + 1000);
}

If you compile on riscv64-linux-gnu with "-O2 -march=rv64gc -mabi=lp64d", then
you'll get this code (up to the call to consume()).

        .cfi_startproc
        li      t0,-4001792
        li      a0,-3997696
        li      a5,4001792
        addi    sp,sp,-16
        .cfi_def_cfa_offset 16
        addi    t0,t0,1792
        addi    a0,a0,1696
        addi    a5,a5,-1792
        sd      ra,8(sp)
        add     a5,a5,a0
        add     sp,sp,t0
        .cfi_def_cfa_offset 4000016
        .cfi_offset 1, -8
        add     a0,a5,sp
        call    consume

Of particular interest is the value in a0 when we call consume. We compute that
horribly inefficiently.   If we back-substitute from the final assignment to a0
we get...

a0 = a5 + sp
a0 = a5 + (sp + t0)
a0 = (a5 + a0) + (sp + t0)
a0 = ((a5 - 1792) + a0) + (sp + t0)
a0 = ((a5 - 1792) + (a0 + 1696)) + (sp + t0)
a0 = ((a5 - 1792) + (a0 + 1696)) + (sp + (t0 + 1792))
a0 = (a5 + (a0 + 1696)) + (sp + t0)  // removed offsetting terms
a0 = (a5 + (a0 + 1696)) + ((sp - 16) + t0)
a0 = (4001792 + (a0 + 1696)) + ((sp - 16) + t0)
a0 = (4001792 + (-3997696 + 1696)) + ((sp - 16) + t0)
a0 = (4001792 + (-3997696 + 1696)) + ((sp - 16) + -4001792)
a0 = (-3997696 + 1696) + (sp -16) // removed offsetting terms
a0 = sp - 3996016

That's a pretty convoluted way to compute sp - 3996016.
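
The back-substitution can be double-checked mechanically.  The following sketch (function names are made up for this note) replays the constants from the li/addi pairs in the listing and returns the net offset of a0, once relative to the stack pointer on entry and once relative to the final stack pointer:

```c
/* Values taken from the li/addi pairs in the listing above.  */
long
a0_offset_from_entry_sp (void)
{
  long t0 = -4001792L + 1792L;   /* li t0 / addi t0: frame adjustment */
  long a0 = -3997696L + 1696L;   /* li a0 / addi a0 */
  long a5 = 4001792L - 1792L;    /* li a5 / addi a5 */
  long sp = -16L;                /* addi sp,sp,-16, relative to entry sp */

  a5 = a5 + a0;                  /* add a5,a5,a0 */
  sp = sp + t0;                  /* add sp,sp,t0 */
  return a5 + sp;                /* add a0,a5,sp */
}

/* The same value seen from the final sp: only the a5 and a0 constants
   remain, because the frame adjustment is already part of sp.  */
long
a0_offset_from_final_sp (void)
{
  return (4001792L - 1792L) + (-3997696L + 1696L);
}
```

Relative to the final sp the result is 4000 bytes, which is the address of x + 1000 for a 4-byte int and matches the li/addi pair 4096 - 96 in the improved sequence.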

Something like this would be notably better (not great, but we need both the
stack adjustment and the address of the object to pass to consume):

   addi sp,sp,-16
   sd ra,8(sp)
   li t0,-4001792
   addi t0,t0,1792
   add sp,sp,t0
   li a0,4096
   addi a0,a0,-96
   add a0,sp,a0
   call consume

The problem is LRA's elimination code is not handling the case where we have
(plus (reg1) (reg2)) where reg1 is an eliminable register and reg2 has a known
equivalency, particularly a constant.

If we can determine that reg2 is equivalent to a constant and treat (plus
(reg1) (reg2)) in the same way we'd treat (plus (reg1) (const_int)) then we can
get the desired code.

This eliminates about 19 billion instructions, or roughly 1% for deepsjeng on rv64.
There are improvements elsewhere, but they're relatively small.  This may
ultimately lessen the value of Manolis's fold-mem-offsets patch.  So we'll have
to evaluate that again once he posts a new version.

Bootstrapped and regression tested on x86_64 as well as bootstrapped on rv64.
Earlier versions have been tested against spec2017.  Pre-approved by Vlad in a
private email conversation (thanks Vlad!).

Committed to the trunk,

gcc/
* lra-eliminations.cc (eliminate_regs_in_insn): Use equivalences to
help simplify code further.

11 months agoFortran: improve diagnostic message for COMMON with automatic object [PR32986]
Harald Anlauf [Wed, 23 Aug 2023 19:08:01 +0000 (21:08 +0200)]
Fortran: improve diagnostic message for COMMON with automatic object [PR32986]

gcc/fortran/ChangeLog:

PR fortran/32986
* resolve.cc (is_non_constant_shape_array): Add forward declaration.
(resolve_common_vars): Diagnose automatic array object in COMMON.
(resolve_symbol): Prevent confusing follow-on error.

gcc/testsuite/ChangeLog:

PR fortran/32986
* gfortran.dg/common_28.f90: New test.

11 months agoPhi analyzer - Initialize with range instead of a tree.
Andrew MacLeod [Thu, 17 Aug 2023 16:34:59 +0000 (12:34 -0400)]
Phi analyzer - Initialize with range instead of a tree.

Ranger's PHI analyzer currently only allows a single initializer to a group.
This patch changes that to use an initialization range, which is
cumulative of all integer constants, plus a single symbolic value.
There is no other change to group functionality.

This patch also changes the way PHI groups are printed so they show up in the
listing as they are encountered, rather than as a list at the end.  It
was more difficult to see what was going on previously.

PR tree-optimization/110918 - Initialize with range instead of a tree.
gcc/
* gimple-range-fold.cc (fold_using_range::range_of_phi): Tweak output.
* gimple-range-phi.cc (phi_group::phi_group): Remove unused members.
Initialize using a range instead of value and edge.
(phi_group::calculate_using_modifier): Use initializer value and
process for relations after trying for iteration convergence.
(phi_group::refine_using_relation): Use initializer range.
(phi_group::dump): Rework the dump output.
(phi_analyzer::process_phi): Allow multiple constant initializers.
Dump groups immediately as created.
(phi_analyzer::dump): Tweak output.
* gimple-range-phi.h (phi_group::phi_group): Adjust prototype.
(phi_group::initial_value): Delete.
(phi_group::refine_using_relation): Adjust prototype.
(phi_group::m_initial_value): Delete.
(phi_group::m_initial_edge): Delete.
(phi_group::m_vr): Use int_range_max.
* tree-vrp.cc (execute_ranger_vrp): Don't dump phi groups.

gcc/testsuite/
* gcc.dg/pr102983.c: Adjust output expectations.
* gcc.dg/pr110918.c: New.

11 months agoDon't process phi groups with one phi.
Andrew MacLeod [Wed, 16 Aug 2023 17:23:06 +0000 (13:23 -0400)]
Don't process phi groups with one phi.

The phi analyzer should not create a phi group containing a single phi.

* gimple-range-phi.cc (phi_analyzer::operator[]): Return NULL if
no group was created.
(phi_analyzer::process_phi): Do not create groups of one phi node.

11 months agortl: use rtx_code for gen_ccmp_first and gen_ccmp_next
Richard Earnshaw [Tue, 22 Aug 2023 14:26:59 +0000 (15:26 +0100)]
rtl: use rtx_code for gen_ccmp_first and gen_ccmp_next

Now that we have a forward declaration of rtx_code in coretypes.h, we
can adjust these hooks to take rtx_code arguments rather than an int.

gcc/ChangeLog:

* target.def (gen_ccmp_first, gen_ccmp_next): Use rtx_code for
CODE, CMP_CODE and BIT_CODE arguments.
* config/aarch64/aarch64.cc (aarch64_gen_ccmp_first): Likewise.
(aarch64_gen_ccmp_next): Likewise.
* doc/tm.texi: Regenerated.

11 months agortl: Forward declare rtx_code
Richard Earnshaw [Thu, 27 Jul 2023 16:28:30 +0000 (17:28 +0100)]
rtl: Forward declare rtx_code

Now that we require C++ 11, we can safely forward declare rtx_code
so that we can use it in target hooks.

gcc/ChangeLog
* coretypes.h (rtx_code): Add forward declaration.
* rtl.h (rtx_code): Make compatible with forward declaration.

11 months agoi386: Fix register spill failure with concat RTX [PR111010]
Uros Bizjak [Wed, 23 Aug 2023 14:39:21 +0000 (16:39 +0200)]
i386: Fix register spill failure with concat RTX [PR111010]

Disable the (=&r,m,m) alternative for 32-bit targets.  The combination of two
memory operands (possibly with complex addressing modes), an early-clobbered
output, and frame pointer and PIC registers uses too many registers on
a register-constrained 32-bit target.

Also merge two similar patterns using DWIH mode iterator.

PR target/111010

gcc/ChangeLog:

* config/i386/i386.md (*concat<any_or_plus:mode><dwi>3_3):
Merge pattern from *concatditi3_3 and *concatsidi3_3 using
DWIH mode iterator.  Disable (=&r,m,m) alternative for
32-bit targets.
(*concat<any_or_plus:mode><dwi>3_3): Disable (=&r,m,m)
alternative for 32-bit targets.

11 months ago[PATCH] RISC-V:add a more appropriate type attribute
Zhangjin Liao [Wed, 23 Aug 2023 14:02:47 +0000 (08:02 -0600)]
[PATCH] RISC-V:add a more appropriate type attribute

Due to the more accurate type attribute added to the clz, ctz, and pcnt
operations in https://github.com/gcc-mirror/gcc/commit/07e2576d6f3 the
same type attribute should be used here.

gcc/ChangeLog:

* config/riscv/bitmanip.md (*<bitmanip_optab>disi2_sext): Add a more
appropriate type attribute.

11 months agoRISC-V: Add conditional unary neg/abs/not autovec patterns
Lehua Ding [Wed, 23 Aug 2023 03:25:20 +0000 (11:25 +0800)]
RISC-V: Add conditional unary neg/abs/not autovec patterns

Hi,

This patch adds conditional unary neg/abs/not autovec patterns to the RISC-V backend.
For this C code:

void
test_3 (float *__restrict a, float *__restrict b, int *__restrict pred, int n)
{
  for (int i = 0; i < n; i += 1)
    {
      a[i] = pred[i] ? __builtin_fabsf (b[i]) : a[i];
    }
}

Before this patch:
        ...
        vsetvli a7,zero,e32,m1,ta,ma
        vfabs.v v2,v2
        vmerge.vvm      v1,v1,v2,v0
        ...

After this patch:
        ...
        vsetvli a7,zero,e32,m1,ta,mu
        vfabs.v v1,v2,v0.t
        ...
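
The single masked instruction relies on the mask-undisturbed policy ("mu" in the vsetvli above).  A scalar model of that behavior, offered as a sketch with a made-up function name rather than anything from the backend:

```c
#include <stddef.h>

/* Scalar model of a masked absolute value with the mask-undisturbed
   policy: lanes whose mask is zero keep the value already in DEST.  */
void
masked_fabs_model (float *dest, const float *src, const int *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      dest[i] = __builtin_fabsf (src[i]);
}
```

Here dest stands for v1 (already holding a[i]), src for v2 and mask for v0, so no separate vmerge is needed to select between the old and new values.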

For the int neg/not and FP neg patterns, defining the corresponding cond_xxx patterns
is enough.
For the FP abs pattern, we need to change the definitions of the `abs<mode>2` and
`@vcond_mask_<mode><vm>` patterns from define_expand to define_insn_and_split
in order to fuse them into a new pattern `*cond_abs<mode>` in the combine pass.
The fusion process is similar to the one below:

(insn 30 29 31 4 (set (reg:RVVM1SF 152 [ vect_iftmp.15 ])
        (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))) "float.c":15:56 discrim 1 12799 {absrvvm1sf2}
     (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
        (nil)))

(insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
        (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
            (reg:RVVM1SF 152 [ vect_iftmp.15 ])
            (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 12707 {vcond_mask_rvvm1sfrvvmf32bi}
     (expr_list:REG_DEAD (reg:RVVM1SF 152 [ vect_iftmp.15 ])
        (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
            (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
                (nil)))))
==>

(insn 31 30 32 4 (set (reg:RVVM1SF 140 [ vect_iftmp.19 ])
        (if_then_else:RVVM1SF (reg:RVVMF32BI 136 [ mask__27.11 ])
            (abs:RVVM1SF (reg:RVVM1SF 137 [ vect__6.14 ]))
            (reg:RVVM1SF 139 [ vect_iftmp.18 ]))) 13444 {*cond_absrvvm1sf}
     (expr_list:REG_DEAD (reg:RVVM1SF 137 [ vect__6.14 ])
        (expr_list:REG_DEAD (reg:RVVMF32BI 136 [ mask__27.11 ])
            (expr_list:REG_DEAD (reg:RVVM1SF 139 [ vect_iftmp.18 ])
                (nil)))))

Best,
Lehua

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*cond_abs<mode>): New combine pattern.
(*copysign<mode>_neg): Ditto.
* config/riscv/autovec.md (@vcond_mask_<mode><vm>): Adjust.
(<optab><mode>2): Ditto.
(cond_<optab><mode>): New.
(cond_len_<optab><mode>): Ditto.
* config/riscv/riscv-protos.h (enum insn_type): New.
(expand_cond_len_unop): New helper func.
* config/riscv/riscv-v.cc (shuffle_merge_patterns): Adjust.
(expand_cond_len_unop): New helper func.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_unary-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-8.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_unary_run-8.c: New test.

11 months agoFix handling of static exits in loop_ch
Jan Hubicka [Wed, 23 Aug 2023 09:17:20 +0000 (11:17 +0200)]
Fix handling of static exits in loop_ch

This patch fixes a wrong return value in should_duplicate_loop_header_p.
Doing so uncovered suboptimal decisions on some jump threading testcases
where we chose to stop duplicating just before a basic block that has zero
cost, so duplicating it would always be a win.

This is because the heuristics that try to choose the right point to duplicate
all winning blocks and to get the loop into do_while form did not account for
zero_cost blocks in all cases.  The patch simplifies the logic by
simply remembering zero cost blocks and handling them last after
the right stopping point is chosen.
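
For context, the do_while form that header copying aims for can be shown at source level.  This is only a hand-written sketch of the transformation's effect (function names are invented), not output of the pass:

```c
/* Original form: the exit test sits in the loop header.  */
int
sum_while (const int *a, int n)
{
  int sum = 0;
  int i = 0;
  while (i < n)
    {
      sum += a[i];
      i++;
    }
  return sum;
}

/* After copying the header test in front of the loop, the loop itself is
   in do-while form and its body runs at least once when entered.  */
int
sum_header_copied (const int *a, int n)
{
  int sum = 0;
  int i = 0;
  if (i < n)
    do
      {
        sum += a[i];
        i++;
      }
    while (i < n);
  return sum;
}
```

Both functions compute the same result; the second has one copy of the test outside the loop and one at the bottom.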

gcc/ChangeLog:

* tree-ssa-loop-ch.cc (enum ch_decision): Fix comment.
(should_duplicate_loop_header_p): Fix return value for static exits.
(ch_base::copy_headers): Improve handling of ch_possible_zero_cost.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/copy-headers-9.c: Update template.

11 months agoAdd testcase for PR110940
Jan Hubicka [Wed, 23 Aug 2023 09:14:53 +0000 (11:14 +0200)]
Add testcase for PR110940

gcc/testsuite/ChangeLog:
PR middle-end/110940
* gcc.c-torture/compile/pr110940.c: New test.

11 months agolibffi: Backport of LoongArch support for libffi.
Lulu Cheng [Tue, 22 Aug 2023 11:56:21 +0000 (19:56 +0800)]
libffi: Backport of LoongArch support for libffi.

This is a backport of <https://github.com/libffi/libffi/commit/f259a6f6de>,
and contains modifications to commit 5a4774cd4d, as well as the LoongArch
schema portion of commit ee22ecbd11. This is needed for libgo.

libffi/ChangeLog:

PR libffi/108682
* configure.host: Add LoongArch support.
* Makefile.am: Likewise.
* Makefile.in: Regenerate.
* src/loongarch64/ffi.c: New file.
* src/loongarch64/ffitarget.h: New file.
* src/loongarch64/sysv.S: New file.

11 months agovect: Move VMAT_GATHER_SCATTER handlings from final loop nest
Kewen Lin [Wed, 23 Aug 2023 05:09:14 +0000 (00:09 -0500)]
vect: Move VMAT_GATHER_SCATTER handlings from final loop nest

Like r14-3317 which moves the handlings on memory access
type VMAT_GATHER_SCATTER in vectorizable_load final loop
nest, this one is to deal with vectorizable_store side.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Move the handlings on
VMAT_GATHER_SCATTER in the final loop nest to its own loop,
and update the final nest accordingly.

11 months agovect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest
Kewen Lin [Wed, 23 Aug 2023 05:09:14 +0000 (00:09 -0500)]
vect: Move VMAT_LOAD_STORE_LANES handlings from final loop nest

Like commit r14-3214 which moves the handlings on memory
access type VMAT_LOAD_STORE_LANES in vectorizable_load
final loop nest, this one is to deal with the function
vectorizable_store.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Move the handlings on
VMAT_LOAD_STORE_LANES in the final loop nest to its own loop,
and update the final nest accordingly.

11 months agovect: Remove some manual release in vectorizable_store
Kewen Lin [Wed, 23 Aug 2023 05:09:14 +0000 (00:09 -0500)]
vect: Remove some manual release in vectorizable_store

To avoid some duplication in follow-up patches to the
function vectorizable_store, this patch converts some
existing vecs to auto_vec and removes some manual release
invocations.  It also refactors a bit and removes some useless
code.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_store): Remove vec oprnds,
adjust vec result_chain, vec_oprnd with auto_vec, and adjust
gvec_oprnds with auto_delete_vec.

11 months agolibstdc++: Fix tests relying on operator new/delete overload
François Dumont [Mon, 21 Aug 2023 05:02:06 +0000 (07:02 +0200)]
libstdc++: Fix tests relying on operator new/delete overload

Fix tests that are checking for an expected allocation plan. They are failing if
an allocation is taking place outside the test main.

libstdc++-v3/ChangeLog

* testsuite/util/replacement_memory_operators.h
(counter::scope): New, capture and reset counter count at construction and
restore it at destruction.
(counter::check_new): Add scope instantiation.
* testsuite/23_containers/unordered_map/96088.cc (main):
Add counter::scope instantiation.
* testsuite/23_containers/unordered_multimap/96088.cc (main): Likewise.
* testsuite/23_containers/unordered_multiset/96088.cc (main): Likewise.
* testsuite/23_containers/unordered_set/96088.cc (main): Likewise.
* testsuite/ext/malloc_allocator/deallocate_local.cc (main): Likewise.
* testsuite/ext/new_allocator/deallocate_local.cc (main): Likewise.
* testsuite/ext/throw_allocator/deallocate_local.cc (main): Likewise.
* testsuite/ext/pool_allocator/allocate_chunk.cc (started): New global.
(operator new(size_t)): Check started.
(main): Set/Unset started.
* testsuite/17_intro/no_library_allocation.cc: New test case.

11 months agoRISC-V: Fix potential ICE of global vsetvl elimination
Juzhe-Zhong [Wed, 23 Aug 2023 02:42:01 +0000 (10:42 +0800)]
RISC-V: Fix potential ICE of global vsetvl elimination

Committed ahead of the following VSETVL refactor patch to make the V2 patch easier to review.
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc
(pass_vsetvl::global_eliminate_vsetvl_insn): Fix potential ICE.

11 months agoRISC-V: Fix VTYPE fuse rule bug
Juzhe-Zhong [Wed, 23 Aug 2023 02:32:30 +0000 (10:32 +0800)]
RISC-V: Fix VTYPE fuse rule bug

This bug was exposed by the refactor patch.
It is separated out and committed.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (ge_sew_ratio_unavailable_p):
Fix fuse rule bug.
* config/riscv/riscv-vsetvl.def (DEF_SEW_LMUL_FUSE_RULE): Ditto.

11 months agoRISC-V: Fix gather_load_run-12.c test
Juzhe-Zhong [Wed, 23 Aug 2023 02:21:22 +0000 (10:21 +0800)]
RISC-V: Fix gather_load_run-12.c test

FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c:
Add vsetvli asm.

11 months agoRISC-V: Add attribute to vtype change only vsetvl
Juzhe-Zhong [Wed, 23 Aug 2023 02:11:06 +0000 (10:11 +0800)]
RISC-V: Add attribute to vtype change only vsetvl

This is a preparatory patch for the VSETVL pass.

Committed.

gcc/ChangeLog:

* config/riscv/vector.md: Add attribute.

11 months agoRISC-V: Adapt live-1.c testcase
Juzhe-Zhong [Wed, 23 Aug 2023 01:19:15 +0000 (09:19 +0800)]
RISC-V: Adapt live-1.c testcase

Committed.

Fix failures:

FAIL: gcc.target/riscv/rvv/autovec/partial/live-1.c scan-tree-dump-times optimized ".VEC_EXTRACT" 10

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/live-1.c: Adapt test.

11 months agoDaily bump.
GCC Administrator [Wed, 23 Aug 2023 00:17:59 +0000 (00:17 +0000)]
Daily bump.

11 months agoRISC-V: Clang format riscv-vsetvl.cc[NFC]
Juzhe-Zhong [Tue, 22 Aug 2023 23:22:25 +0000 (07:22 +0800)]
RISC-V: Clang format riscv-vsetvl.cc[NFC]

Committed.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (change_insn): Clang format.
(vector_infos_manager::all_same_ratio_p): Ditto.
(vector_infos_manager::all_same_avl_p): Ditto.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::local_eliminate_vsetvl_insn): Ditto.
(pass_vsetvl::global_eliminate_vsetvl_insn): Ditto.
(pass_vsetvl::compute_probabilities): Ditto.

11 months agoRISC-V: Add riscv-vsetvl.def to t-riscv
Juzhe-Zhong [Tue, 22 Aug 2023 23:06:50 +0000 (07:06 +0800)]
RISC-V: Add riscv-vsetvl.def to t-riscv

This patch will be backported to GCC 13 and committed to trunk.
gcc/ChangeLog:

* config/riscv/t-riscv: Add riscv-vsetvl.def

11 months agolibgomp, testsuite: Do not call nonstandard functions
Francois-Xavier Coudert [Tue, 22 Aug 2023 08:15:00 +0000 (10:15 +0200)]
libgomp, testsuite: Do not call nonstandard functions

The following functions are not standard, and not always available
(e.g., on darwin). They should not be called unless available: gamma,
gammaf, scalb, scalbf, significand, and significandf.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp: Add effective target.
* testsuite/libgomp.c/simd-math-1.c: Avoid calling nonstandard
functions.

11 months agoanalyzer: reimplement kf_strlen [PR105899]
David Malcolm [Tue, 22 Aug 2023 22:36:54 +0000 (18:36 -0400)]
analyzer: reimplement kf_strlen [PR105899]

Reimplement kf_strlen in terms of the new string scanning
implementation, sharing strlen's implementation with
__analyzer_get_strlen.

gcc/analyzer/ChangeLog:
PR analyzer/105899
* kf-analyzer.cc (class kf_analyzer_get_strlen): Move to kf.cc.
(register_known_analyzer_functions): Use make_kf_strlen.
* kf.cc (class kf_strlen::impl_call_pre): Replace with
implementation of kf_analyzer_get_strlen from kf-analyzer.cc.
Handle "UNKNOWN" return from check_for_null_terminated_string_arg
by falling back to a conjured svalue.
(make_kf_strlen): New.
(register_known_functions): Use make_kf_strlen.
* known-function-manager.h (make_kf_strlen): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/null-terminated-strings-1.c: Update expected
results on symbolic values.
* gcc.dg/analyzer/strlen-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months agoc++: maybe_substitute_reqs_for fix
Jason Merrill [Fri, 18 Aug 2023 22:24:53 +0000 (18:24 -0400)]
c++: maybe_substitute_reqs_for fix

While working on PR109751 I found that maybe_substitute_reqs_for was doing
the wrong thing for a non-template friend, substituting in the template args
of the scope's original template rather than those of the instantiation.
This didn't end up being necessary to fix the PR, but it's still an
improvement.

gcc/cp/ChangeLog:

* pt.cc (outer_template_args): Handle non-template argument.
* constraint.cc (maybe_substitute_reqs_for): Pass decl to it.
* cp-tree.h (outer_template_args): Adjust.

11 months agoc++: constrained hidden friends [PR109751]
Jason Merrill [Thu, 17 Aug 2023 15:36:23 +0000 (11:36 -0400)]
c++: constrained hidden friends [PR109751]

r13-4035 avoided a problem with overloading of constrained hidden friends by
checking satisfaction, but checking satisfaction early is inconsistent with
the usual late checking and can lead to hard errors, so let's not do that
after all.

We were wrongly treating the different instantiations of the same friend
template as the same function because maybe_substitute_reqs_for was failing
to actually substitute in the case of a non-template friend.  But we don't
actually need to do the substitution anyway, because [temp.friend] says that
such a friend can't be the same as any other declaration.

After fixing that, instead of a redefinition error we got an ambiguous
overload error, fixed by allowing constrained hidden friends to coexist
until overload resolution, at which point they probably won't be in the same
ADL overload set anyway.

And we avoid mangling collisions by following the proposed mangling for
these friends as a member function with an extra 'F' before the name.  I
demangle this by just adding [friend] to the name of the function because
it's not feasible to reconstruct the actual scope of the function since the
mangling ABI doesn't distinguish between class and namespace scopes.

PR c++/109751

gcc/cp/ChangeLog:

* cp-tree.h (member_like_constrained_friend_p): Declare.
* decl.cc (member_like_constrained_friend_p): New.
(function_requirements_equivalent_p): Check it.
(duplicate_decls): Check it.
(grokfndecl): Check friend template constraints.
* mangle.cc (decl_mangling_context): Check it.
(write_unqualified_name): Check it.
* pt.cc (uses_outer_template_parms_in_constraints): Fix for friends.
(tsubst_friend_function): Don't check satisfaction.

include/ChangeLog:

* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_FRIEND.

libiberty/ChangeLog:

* cp-demangle.c (d_make_comp): Handle DEMANGLE_COMPONENT_FRIEND.
(d_count_templates_scopes): Likewise.
(d_print_comp_inner): Likewise.
(d_unqualified_name): Handle member-like friend mangling.
* testsuite/demangle-expected: Add test.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-friend11.C: Now works.  Add template.
* g++.dg/cpp2a/concepts-friend15.C: New test.

11 months agoRISC-V: output Autovec params explicitly in --help ...
Vineet Gupta [Tue, 22 Aug 2023 17:32:12 +0000 (10:32 -0700)]
RISC-V: output Autovec params explicitly in --help ...

... otherwise the user has no clue which --param to actually change

gcc/ChangeLog:
* config/riscv/riscv.opt: Add --param names
riscv-autovec-preference and riscv-autovec-lmul

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
11 months agoRISC-V: Add multiarch support on riscv-linux-gnu
Raphael Moreira Zinsly [Tue, 22 Aug 2023 17:37:04 +0000 (11:37 -0600)]
RISC-V: Add multiarch support on riscv-linux-gnu

This adds multiarch support to the RISC-V port so that bootstraps work with
Debian out-of-the-box.  Without this patch the stage1 compiler is unable to
find headers/libraries when building the stage1 runtime.

This is functionally (and possibly textually) equivalent to Debian's fix for
the same problem.

gcc/
* config/riscv/t-linux: Add MULTIARCH_DIRNAME.

11 months agoOpenMP: Handle 'all' as category in defaultmap
Tobias Burnus [Tue, 22 Aug 2023 15:06:50 +0000 (17:06 +0200)]
OpenMP: Handle 'all' as category in defaultmap

Specifying no category and specifying 'all' both imply
that the implicit-behavior applies to all categories.
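
As a minimal sketch of the 'all' category, assuming a compiler that
includes this change and is invoked with -fopenmp (without -fopenmp the
pragma is ignored and the loop simply runs on the host).  The function
name sum_to is hypothetical and is not taken from the new tests.

```c
#include <assert.h>

/* Hypothetical helper: sums 0..n-1 inside a target region.  */
int
sum_to (int n)
{
  assert (n >= 0);
  int total = 0;

  /* 'all' applies the tofrom implicit behavior to every variable
     category, so the scalar 'total' is mapped back to the host
     instead of being treated as firstprivate.  */
#pragma omp target defaultmap(tofrom: all)
  for (int i = 0; i < n; i++)
    total += i;

  return total;
}
```

Writing defaultmap(tofrom) with no category is expected to behave the
same way, which is the equivalence the paragraph above describes.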

gcc/c/ChangeLog:

* c-parser.cc (c_parser_omp_clause_defaultmap): Parse
'all' as category.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_omp_clause_defaultmap): Parse
'all' as category.

gcc/fortran/ChangeLog:

* gfortran.h (enum gfc_omp_defaultmap_category):
Add OMP_DEFAULTMAP_CAT_ALL.
* openmp.cc (gfc_match_omp_clauses): Parse
'all' as category.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle it.

gcc/ChangeLog:

* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_CATEGORY_ALL.
* gimplify.cc (gimplify_scan_omp_clauses): Handle it.
* tree-pretty-print.cc (dump_omp_clause): Likewise.

libgomp/ChangeLog:

* libgomp.texi (OpenMP 5.2 status): Add depobj with
destroy-var argument as 'N'. Mark defaultmap with
'all' category as 'Y'.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* c-c++-common/gomp/defaultmap-5.c: New test.
* c-c++-common/gomp/defaultmap-6.c: New test.
* gfortran.dg/gomp/defaultmap-10.f90: New test.
* gfortran.dg/gomp/defaultmap-9.f90: New test.

11 months agodoc: Remove obsolete sentence about _Float* not being supported in C++ [PR106652]
Jakub Jelinek [Tue, 22 Aug 2023 14:13:44 +0000 (16:13 +0200)]
doc: Remove obsolete sentence about _Float* not being supported in C++ [PR106652]

As mentioned in the PR, these types are supported in C++ since GCC 13,
so we shouldn't confuse users.

2023-08-22  Jakub Jelinek  <jakub@redhat.com>

PR c++/106652
* doc/extend.texi (_Float<n>): Drop obsolete sentence that the
types aren't supported in C++.

11 months agoVECT: Add LEN_FOLD_EXTRACT_LAST pattern
Juzhe-Zhong [Tue, 22 Aug 2023 10:51:37 +0000 (18:51 +0800)]
VECT: Add LEN_FOLD_EXTRACT_LAST pattern

Hi, Richard and Richi.

This is the last autovec pattern I want to add for RVV (length loop control).

This patch is supposed to handle the following case:

int __attribute__ ((noinline, noclone))
condition_reduction (int *a, int min_v, int n)
{
  int last = 66; /* High start value.  */

  for (int i = 0; i < n; i++)
    if (a[i] < min_v)
      last = i;

  return last;
}

ARM SVE IR:

  ...
  mask__7.11_39 = vect__4.10_37 < vect_cst__38;
  _40 = loop_mask_36 & mask__7.11_39;
  last_5 = .FOLD_EXTRACT_LAST (last_15, _40, vect_vec_iv_.7_32);
  ...

RVV IR, we want to see:
 ...
 loop_len = SELECT_VL
 mask__7.11_39 = vect__4.10_37 < vect_cst__38;
 last_5 = .LEN_FOLD_EXTRACT_LAST (last_15, _40, vect_vec_iv_.7_32, loop_len, bias);
 ...
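
The intended semantics can be sketched with a scalar model.  This is an
assumption-laden illustration, not GCC code: the function name and the
byte-per-lane mask representation are hypothetical, and the model
assumes that only the first len + bias lanes are considered.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of a length-controlled "extract last" fold: return the
   last element of VEC whose MASK lane is set among the first
   LEN + BIAS lanes, or FALLBACK if no such lane exists.  */
int
len_fold_extract_last_model (int fallback, const unsigned char *mask,
                             const int *vec, size_t len, long bias)
{
  long active = (long) len + bias;
  assert (active >= 0);

  int result = fallback;
  for (long i = 0; i < active; i++)
    if (mask[i])
      result = vec[i];   /* Later active lanes override earlier ones.  */
  return result;
}
```

In the loop above, fallback plays the role of last_15, mask the role of
the comparison result, and vec the role of the induction-variable
vector.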

gcc/ChangeLog:

* doc/md.texi: Add LEN_FOLD_EXTRACT_LAST pattern.
* internal-fn.cc (fold_len_extract_direct): Ditto.
(expand_fold_len_extract_optab_fn): Ditto.
(direct_fold_len_extract_optab_supported_p): Ditto.
* internal-fn.def (LEN_FOLD_EXTRACT_LAST): Ditto.
* optabs.def (OPTAB_D): Ditto.

11 months agoSimplify interleaved store vectorization processing
Richard Biener [Tue, 22 Aug 2023 12:28:00 +0000 (14:28 +0200)]
Simplify interleaved store vectorization processing

When doing interleaving we perform code generation when visiting the
last store of a chain.  We keep track of this via DR_GROUP_STORE_COUNT;
the following localizes this to the caller of vectorizable_store,
also avoiding redundant non-processing of the other stores.

* tree-vect-stmts.cc (vectorizable_store): Do not bump
DR_GROUP_STORE_COUNT here.  Remove early out.
(vect_transform_stmt): Only call vectorizable_store on
the last element of an interleaving chain.

11 months agoMAINTAINERS: Update my email address
Filip Kastl [Tue, 22 Aug 2023 11:07:19 +0000 (13:07 +0200)]
MAINTAINERS: Update my email address

Signed-off-by: Filip Kastl <fkastl@suse.cz>
ChangeLog:

* MAINTAINERS: Update my email address.

11 months agotree-optimization/94864 - vector insert of vector extract simplification
Richard Biener [Wed, 12 Jul 2023 13:01:47 +0000 (15:01 +0200)]
tree-optimization/94864 - vector insert of vector extract simplification

The PRs ask for optimizing of

  _1 = BIT_FIELD_REF <b_3(D), 64, 64>;
  result_4 = BIT_INSERT_EXPR <a_2(D), _1, 64>;

to a vector permutation.  The following implements this as a
match.pd pattern, improving code generation on x86_64.
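
To illustrate why the two forms are equivalent, here is a scalar sketch
on a vector of two 64-bit elements.  The function names are
hypothetical, plain arrays stand in for vector registers, and the
selector { 0, 3 } is the one this sketch assumes for the two-element
case (indices below the element count select from the first input, the
rest from the second).

```c
#include <assert.h>
#include <stdint.h>

#define NELTS 2

/* Extract-then-insert form: copy bits 64..127 of B over those of A.  */
void
insert_of_extract (const uint64_t *a, const uint64_t *b, uint64_t *out)
{
  uint64_t extracted = b[1];   /* BIT_FIELD_REF <b, 64, 64>  */
  out[0] = a[0];
  out[1] = extracted;          /* BIT_INSERT_EXPR <a, _1, 64>  */
}

/* Permute form: SEL[i] in 0..NELTS-1 picks from A, and
   NELTS..2*NELTS-1 picks from B.  */
void
vec_perm_model (const uint64_t *a, const uint64_t *b,
                const unsigned *sel, uint64_t *out)
{
  for (int i = 0; i < NELTS; i++)
    {
      assert (sel[i] < 2 * NELTS);
      out[i] = sel[i] < NELTS ? a[sel[i]] : b[sel[i] - NELTS];
    }
}
```

Calling vec_perm_model with the selector { 0, 3 } yields the same
result as insert_of_extract for any inputs.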

On the RTL level we face the issue that backend patterns inconsistently
use vec_merge and vec_select of vec_concat to represent permutes.

I think using a (supported) permute is almost always better
than an extract plus insert, maybe excluding the case we extract
element zero and that's aliased to a register that can be used
directly for insertion (not sure how to query that).

The patch FAILs one case in gcc.target/i386/avx512fp16-vmovsh-1a.c
where we now expand from

 __A_28 = VEC_PERM_EXPR <x2.8_9, x1.9_10, { 0, 9, 10, 11, 12, 13, 14, 15 }>;

instead of

 _28 = BIT_FIELD_REF <x2.8_9, 16, 0>;
 __A_29 = BIT_INSERT_EXPR <x1.9_10, _28, 0>;

producing a vpblendw instruction instead of the expected vmovsh.  That's
either a missed vec_perm_const expansion optimization or even better,
an improvement - Zen4 for example has 4 ports to execute vpblendw
but only 3 for executing vmovsh and both instructions have the same size.

The patch XFAILs the sub-testcase.

PR tree-optimization/94864
PR tree-optimization/94865
PR tree-optimization/93080
* match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
for vector insertion from vector extraction.

* gcc.target/i386/pr94864.c: New testcase.
* gcc.target/i386/pr94865.c: Likewise.
* gcc.target/i386/avx512fp16-vmovsh-1a.c: XFAIL.
* gcc.dg/tree-ssa/forwprop-40.c: Likewise.
* gcc.dg/tree-ssa/forwprop-41.c: Likewise.

11 months agoFortran: implement vector sections in DATA statements [PR49588]
Harald Anlauf [Mon, 21 Aug 2023 19:23:57 +0000 (21:23 +0200)]
Fortran: implement vector sections in DATA statements [PR49588]

gcc/fortran/ChangeLog:

PR fortran/49588
* data.cc (gfc_advance_section): Derive next index set and next offset
into DATA variable also for array references using vector sections.
Use auxiliary array to keep track of offsets into indexing vectors.
(gfc_get_section_index): Set up initial indices also for DATA variables
with array references using vector sections.
* data.h (gfc_get_section_index): Adjust prototype.
(gfc_advance_section): Likewise.
* resolve.cc (check_data_variable): Pass vector offsets.

gcc/testsuite/ChangeLog:

PR fortran/49588
* gfortran.dg/data_vector_section.f90: New test.

11 months agoVECT: Support loop len control on EXTRACT_LAST vectorization
Juzhe-Zhong [Mon, 21 Aug 2023 10:59:55 +0000 (18:59 +0800)]
VECT: Support loop len control on EXTRACT_LAST vectorization

Hi, @Richi and @Richard, based on the previous discussion, I simply fixed the issues for
powerpc and s390 with your suggestions:

-  machine_mode len_load_mode = get_len_load_store_mode
-    (loop_vinfo->vector_mode, true).require ();
-  machine_mode len_store_mode = get_len_load_store_mode
-    (loop_vinfo->vector_mode, false).require ();
+  machine_mode len_load_mode, len_store_mode;
+  if (!get_len_load_store_mode (loop_vinfo->vector_mode, true)
+        .exists (&len_load_mode))
+    return false;
+  if (!get_len_load_store_mode (loop_vinfo->vector_mode, false)
+        .exists (&len_store_mode))
+    return false;

Co-Authored-By: Kewen.Lin <linkw@linux.ibm.com>
gcc/ChangeLog:

* tree-vect-loop.cc (vect_verify_loop_lens): Add exists check.
(vectorizable_live_operation): Add live vectorization for length loop
control.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/live-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/live_run-1.c: New test.

11 months agoTestcase fix.
liuhongt [Tue, 22 Aug 2023 02:51:57 +0000 (10:51 +0800)]
Testcase fix.

gcc/testsuite/ChangeLog:

* gcc.target/i386/invariant-ternlog-1.c: Only scan %rdx under
TARGET_64BIT.

11 months agoRISC-V: Change fnms testcases assertion to xfail
Lehua Ding [Tue, 22 Aug 2023 02:54:08 +0000 (10:54 +0800)]
RISC-V: Change fnms testcases assertion to xfail

Hi,

This patch fixes inappropriate assertions in fnms testcases since
we want to generate .COND_FNMS but actually generate .FNMS + .VCOND_MASK.
A patch to do this optimization will follow.

Best,
Lehua

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Adjust.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.

11 months agoanalyzer: check format strings for null termination [PR105899]
David Malcolm [Tue, 22 Aug 2023 01:13:19 +0000 (21:13 -0400)]
analyzer: check format strings for null termination [PR105899]

This patch extends -fanalyzer to check the format strings of calls
to functions marked with '__attribute__ ((format...))'.

The only checking done in this patch is to check that the format string
is a valid null-terminated string; this patch doesn't attempt to check
the content of the format string.
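
As a minimal sketch of the kind of function this affects, the
hypothetical helper below (log_to_buffer is not part of GCC or its
testsuite) is marked with the format attribute.  Under the check
described above, passing a format argument that lacks a terminating
zero byte would be the case the analyzer is expected to complain about.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: argument 3 is a printf-style format string and
   the variadic arguments start at position 4.  */
int log_to_buffer (char *buf, size_t size, const char *fmt, ...)
  __attribute__ ((format (printf, 3, 4)));

int
log_to_buffer (char *buf, size_t size, const char *fmt, ...)
{
  assert (buf != NULL && fmt != NULL);

  va_list ap;
  va_start (ap, fmt);
  int n = vsnprintf (buf, size, fmt, ap);
  va_end (ap);
  return n;
}

/* A call such as
     char fmt[3] = { '%', 'd', '!' };   (no room for the terminator)
     log_to_buffer (buf, sizeof buf, fmt, 1);
   passes an unterminated format string; it is shown only in this
   comment because executing it would be undefined behavior.  */
```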

gcc/analyzer/ChangeLog:
PR analyzer/105899
* call-details.cc (call_details::call_details): New ctor.
* call-details.h (call_details::call_details): New ctor decl.
(struct call_arg_details): Move here from region-model.cc.
* region-model.cc (region_model::check_call_format_attr): New.
(region_model::check_call_args): Call it.
(struct call_arg_details): Move it to call-details.h.
* region-model.h (region_model::check_call_format_attr): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/attr-format-1.c: New test.
* gcc.dg/analyzer/sprintf-1.c: Update expected results for
now-passing tests.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months agoanalyzer: add kf_fopen
David Malcolm [Tue, 22 Aug 2023 01:13:19 +0000 (21:13 -0400)]
analyzer: add kf_fopen

Add checking to -fanalyzer that both params of calls to "fopen" are
valid null-terminated strings.

gcc/analyzer/ChangeLog:
* kf.cc (class kf_fopen): New.
(register_known_functions): Register it.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fopen-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
11 months agoanalyzer: replace -Wanalyzer-unterminated-string with scan_for_null_terminator [PR105899]
David Malcolm [Tue, 22 Aug 2023 01:13:19 +0000 (21:13 -0400)]
analyzer: replace -Wanalyzer-unterminated-string with scan_for_null_terminator [PR105899]

In r14-3169-g325f9e88802daa I added check_for_null_terminated_string_arg
to -fanalyzer, calling it in various places, with a sole check for
unterminated string constants, adding -Wanalyzer-unterminated-string for
this case.

This patch adds region_model::scan_for_null_terminator, which simulates
scanning memory for a zero byte, complaining about uninitialized bytes
and out-of-range accesses seen before any zero byte is seen.
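
The idea can be sketched as a small standalone model.  This is only an
illustration under stated assumptions, not the analyzer's
implementation: the enum, the function name and the per-byte init
array (nonzero meaning "this byte has been written") are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum scan_result
{
  SCAN_TERMINATED,     /* Found a zero byte.  */
  SCAN_UNINITIALIZED,  /* Hit an unwritten byte first.  */
  SCAN_OUT_OF_RANGE    /* Ran off the end of the buffer first.  */
};

/* Walk BYTES from the start, stopping at the first problem or at the
   first zero byte.  *OUT_LEN receives the offset where the scan
   stopped.  */
enum scan_result
scan_for_terminator_model (const char *bytes, const unsigned char *init,
                           size_t size, size_t *out_len)
{
  assert (bytes != NULL && init != NULL && out_len != NULL);

  for (size_t i = 0; i < size; i++)
    {
      *out_len = i;
      if (!init[i])
        return SCAN_UNINITIALIZED;
      if (bytes[i] == '\0')
        return SCAN_TERMINATED;
    }
  *out_len = size;
  return SCAN_OUT_OF_RANGE;
}
```

For a 16-byte buffer in which only the first byte has been written,
this model stops at offset 1 with SCAN_UNINITIALIZED.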

This more flexible approach catches the issues we saw before with
-Wanalyzer-unterminated-string, and also catches uninitialized runs
of bytes, and I believe will be a better way to build checking of C
string operations in the analyzer.

Given that the patch makes -Wanalyzer-unterminated-string redundant
and that this option was only in trunk for 10 days and has no known
users, the patch simply removes the option without a compatibility
fallback.

The patch uses custom events and notes to provide context on where
the issues are coming from.  For example, given:

null-terminated-strings-1.c: In function ‘test_partially_initialized’:
null-terminated-strings-1.c:71:3: warning: use of uninitialized value ‘buf[1]’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
   71 |   __analyzer_get_strlen (buf);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  ‘test_partially_initialized’: events 1-3
    |
    |   69 |   char buf[16];
    |      |        ^~~
    |      |        |
    |      |        (1) region created on stack here
    |   70 |   buf[0] = 'a';
    |   71 |   __analyzer_get_strlen (buf);
    |      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (2) while looking for null terminator for argument 1 (‘&buf’) of ‘__analyzer_get_strlen’...
    |      |   (3) use of uninitialized value ‘buf[1]’ here
    |
analyzer-decls.h:59:22: note: argument 1 of ‘__analyzer_get_strlen’ must be a pointer to a null-terminated string
   59 | extern __SIZE_TYPE__ __analyzer_get_strlen (const char *ptr);
      |                      ^~~~~~~~~~~~~~~~~~~~~

gcc/analyzer/ChangeLog:
PR analyzer/105899
* analyzer.opt (Wanalyzer-unterminated-string): Delete.
* call-details.cc
(call_details::check_for_null_terminated_string_arg): Convert
return type from void to const svalue *.  Add param "out_sval".
* call-details.h
(call_details::check_for_null_terminated_string_arg): Likewise.
* kf-analyzer.cc (kf_analyzer_get_strlen::impl_call_pre): Wire up
to result of check_for_null_terminated_string_arg.
* region-model.cc (get_strlen): Delete.
(class unterminated_string_arg): Delete.
(struct fragment): New.
(class iterable_cluster): New.
(region_model::get_store_bytes): New.
(get_tree_for_byte_offset): New.
(region_model::scan_for_null_terminator): New.
(region_model::check_for_null_terminated_string_arg): Convert
return type from void to const svalue *.  Add param "out_sval".
Reimplement in terms of scan_for_null_terminator, dropping the
special-case for -Wanalyzer-unterminated-string.
* region-model.h (region_model::get_store_bytes): New decl.
(region_model::scan_for_null_terminator): New decl.
(region_model::check_for_null_terminated_string_arg): Convert
return type from void to const svalue *.  Add param "out_sval".
* store.cc (concrete_binding::get_byte_range): New.
* store.h (concrete_binding::get_byte_range): New decl.
(store_manager::get_concrete_binding): New overload.

gcc/ChangeLog:
PR analyzer/105899
* doc/invoke.texi: Remove -Wanalyzer-unterminated-string.

gcc/testsuite/ChangeLog:
PR analyzer/105899
* gcc.dg/analyzer/error-1.c: Update expected results to reflect
reimplementation of unterminated string detection.  Add test
coverage for uninitialized buffers.
* gcc.dg/analyzer/null-terminated-strings-1.c: Likewise.
* gcc.dg/analyzer/putenv-1.c: Likewise.
* gcc.dg/analyzer/strchr-1.c: Likewise.
* gcc.dg/analyzer/strcpy-1.c: Likewise.
* gcc.dg/analyzer/strdup-1.c: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>