David Malcolm [Tue, 14 Nov 2023 19:02:10 +0000 (14:02 -0500)]
diagnostics: make option-handling callbacks private
No functional change intended.
gcc/c-family/ChangeLog:
* c-warn.cc (conversion_warning): Update call to
global_dc->m_option_enabled to use option_enabled_p.
gcc/cp/ChangeLog:
* decl.cc (finish_function): Update call to
global_dc->m_option_enabled to use option_enabled_p.
gcc/ChangeLog:
* diagnostic-format-json.cc
(json_output_format::on_end_diagnostic): Update calls to m_context
callbacks to use member functions; tighten up scopes.
* diagnostic-format-sarif.cc (sarif_builder::make_result_object):
Likewise.
(sarif_builder::make_reporting_descriptor_object_for_warning):
Likewise.
* diagnostic.cc (diagnostic_context::initialize): Update for
callbacks being moved into m_option_callbacks and being renamed.
(diagnostic_context::set_option_hooks): New.
(diagnostic_option_classifier::classify_diagnostic): Update call
to global_dc->m_option_enabled to use option_enabled_p.
(diagnostic_context::print_option_information): Update calls to
m_context callbacks to use member functions; tighten up scopes.
(diagnostic_context::diagnostic_enabled): Likewise.
* diagnostic.h (diagnostic_option_enabled_cb): New typedef.
(diagnostic_make_option_name_cb): New typedef.
(diagnostic_make_option_url_cb): New typedef.
(diagnostic_context::option_enabled_p): New.
(diagnostic_context::make_option_name): New.
(diagnostic_context::make_option_url): New.
(diagnostic_context::set_option_hooks): New decl.
(diagnostic_context::m_option_enabled): Rename to
m_option_enabled_cb and move within m_option_callbacks, using
typedef.
(diagnostic_context::m_option_state): Move within
m_option_callbacks.
(diagnostic_context::m_option_name): Rename to
m_make_option_name_cb and move within m_option_callbacks, using
typedef.
(diagnostic_context::m_get_option_url): Likewise, renaming to
m_make_option_url_cb.
* lto-wrapper.cc (print_lto_docs_link): Update call to m_context
callback to use member function.
(main): Use diagnostic_context::set_option_hooks.
* opts-diagnostic.h (option_name): Make context param const.
(get_option_url): Likewise.
* opts.cc (option_name): Likewise.
(get_option_url): Likewise.
* toplev.cc (general_init): Use
diagnostic_context::set_option_hooks.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 14 Nov 2023 19:01:55 +0000 (14:01 -0500)]
diagnostics: make m_text_callbacks private
No functional change intended.
gcc/ChangeLog:
* diagnostic-show-locus.cc (diagnostic_context::show_locus):
Update for renaming of text callbacks fields.
* diagnostic.cc (diagnostic_context::initialize): Likewise.
* diagnostic.h (class diagnostic_context): Add "friend" for
accessors to m_text_callbacks.
(diagnostic_context::m_text_callbacks): Make private, and add an
"m_" prefix to field names.
(diagnostic_starter): Convert from macro to inline function.
(diagnostic_start_span): New.
(diagnostic_finalizer): Convert from macro to inline function.
gcc/fortran/ChangeLog:
* error.cc (gfc_diagnostics_init): Use diagnostic_start_span.
Jakub Jelinek [Tue, 14 Nov 2023 17:32:37 +0000 (18:32 +0100)]
libcpp, contrib: Update to Unicode 15.1
The following patch updates to Unicode 15.1 from the 15.0 we had last year
(the plaintext version is just a pseudo-patch in which the overly large parts
of the wget-downloaded or regenerated files are replaced with ...; the full
patch is attached compressed). Apparently Unicode forgot to add a new range
to the 4-8 Table we are using, but from the other files it is clear what
should have been added; I've filed a bug report against Unicode.
2023-11-14 Jakub Jelinek <jakub@redhat.com>
contrib/
* unicode/README: Adjust glibc git commit hash, number of Unicode
data files to be updated and latest Unicode version.
* unicode/from_glibc/utf8_gen.py: Update from glibc.
* unicode/UnicodeData.txt: Update from Unicode 15.1.
* unicode/EastAsianWidth.txt: Likewise.
* unicode/DerivedNormalizationProps.txt: Likewise.
* unicode/NameAliases.txt: Likewise.
* unicode/DerivedCoreProperties.txt: Likewise.
* unicode/PropList.txt: Likewise.
libcpp/
* makeucnid.cc (write_copyright): Update copyright year.
* makeuname2c.cc (write_copyright): Likewise.
(struct generated): Update latest Unicode version.
(generated_ranges): Add 2ebf0-2ee5d CJK UNIFIED IDEOGRAPH
range which was forgotten to be added to 4-8 table, but
clearly is expected to be there from the 15.1 additions.
* ucnid.h: Regenerated.
* uname2c.h: Regenerated.
* generated_cpp_wcwidth.h: Regenerated.
This paper voted in as DR makes some multi-character literals ill-formed.
'abcd' stays valid, but e.g. 'á' is newly invalid in UTF-8 exec charset
while valid e.g. in ISO-8859-1, because it is a single character which needs
2 bytes to be encoded.
The following patch implements that check (only pedantically, especially
because it is a DR): when we would emit a -Wmultichar warning because the
character constant has more than one byte in it, we test whether the number
of source characters is equal to the number of bytes in the multichar string.
If it is, it is a normal multi-character literal constant
and is diagnosed normally with -Wmultichar; otherwise at least one of the
c-chars in the sequence was encoded as 2+ bytes.
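The comparison described above can be sketched in C as follows. This is only an illustration under the assumption that the execution charset is UTF-8 and the input is valid; the helper names are hypothetical and are not the functions added to libcpp/charset.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count code points in a UTF-8 byte sequence by counting every byte
   that is not a continuation byte (10xxxxxx).  Assumes valid UTF-8.
   Hypothetical helper, not part of libcpp.  */
static size_t
count_utf8_chars (const unsigned char *bytes, size_t len)
{
  assert (bytes != NULL || len == 0);
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    if ((bytes[i] & 0xC0) != 0x80)
      n++;
  return n;
}

/* Return true if the body of a character literal has more than one byte
   and fewer source characters than bytes, i.e. at least one c-char
   needed 2+ bytes in the (assumed UTF-8) execution charset.  */
static bool
has_multibyte_c_char (const unsigned char *bytes, size_t len)
{
  return len > 1 && count_utf8_chars (bytes, len) < len;
}
```

With these helpers, the four bytes of 'abcd' count as four characters (an ordinary multi-character literal), while the two bytes of 'á' count as one character and would be flagged.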
2023-11-14 Jakub Jelinek <jakub@redhat.com>
PR c++/110341
libcpp/
* charset.cc: Implement C++26 P1854R4 - Making non-encodable string
literals ill-formed.
(one_count_chars, convert_count_chars, count_source_chars): New
functions.
(narrow_str_to_charconst): Change last arg type from cpp_ttype to
const cpp_token *. For C++ if pedantic and i > 1 in CPP_CHAR
interpret token also as CPP_STRING32 and if number of characters
in the CPP_STRING32 is larger than number of bytes in CPP_CHAR,
pedwarn on it. Make the diagnostics more detailed.
(wide_str_to_charconst): Change last arg type from cpp_ttype to
const cpp_token *. Make the diagnostics more detailed.
(cpp_interpret_charconst): Adjust narrow_str_to_charconst and
wide_str_to_charconst callers.
gcc/testsuite/
* g++.dg/cpp26/literals1.C: New test.
* g++.dg/cpp26/literals2.C: New test.
* g++.dg/cpp23/wchar-multi1.C: Adjust expected diagnostic wordings.
* g++.dg/cpp23/wchar-multi2.C: Likewise.
* gcc.dg/c23-utf8char-3.c: Likewise.
* gcc.dg/cpp/charconst-4.c: Likewise.
* gcc.dg/cpp/charconst.c: Likewise.
* gcc.dg/cpp/if-2.c: Likewise.
* gcc.dg/utf16-4.c: Likewise.
* gcc.dg/utf32-4.c: Likewise.
* g++.dg/cpp1z/utf8-neg.C: Likewise.
* g++.dg/cpp2a/ucn2.C: Likewise.
* g++.dg/ext/utf16-4.C: Likewise.
* g++.dg/ext/utf32-4.C: Likewise.
in favor of explicitly using a specific file_cache throughout, and only
using global_dc's file_cache in gcc-specific code.
Rather than creating global_dc's file_cache the first time it's needed,
this patch simply creates one when a diagnostic_context is initialized,
and eliminates diagnostic_file_cache_init.
No functional change intended.
gcc/c-family/ChangeLog:
* c-common.cc (c_get_substring_location): Use global_dc's
file_cache.
* c-format.cc (get_corrected_substring): Likewise.
* c-indentation.cc (get_visual_column): Add file_cache param.
(get_first_nws_vis_column): Likewise.
(detect_intervening_unindent): Likewise.
(should_warn_for_misleading_indentation): Use global_dc's
file_cache.
(assert_get_visual_column_succeeds): Add file_cache param.
(ASSERT_GET_VISUAL_COLUMN_SUCCEEDS): Likewise.
(assert_get_visual_column_fails): Likewise.
(define ASSERT_GET_VISUAL_COLUMN_FAILS): Likewise.
(selftest::test_get_visual_column): Create and use a temporary
file_cache.
gcc/cp/ChangeLog:
* contracts.cc (build_comment): Use global_dc's file_cache.
gcc/ChangeLog:
* diagnostic-format-sarif.cc (sarif_builder::get_sarif_column):
Use m_context's file_cache.
(sarif_builder::maybe_make_artifact_content_object): Likewise.
(sarif_builder::get_source_lines): Likewise.
* diagnostic-show-locus.cc
(exploc_with_display_col::exploc_with_display_col): Add file_cache
param.
(layout::m_file_cache): New field.
(make_range): Add file_cache param.
(selftest::test_layout_range_for_single_point): Create and use a
temporary file_cache.
(selftest::test_layout_range_for_single_line): Likewise.
(selftest::test_layout_range_for_multiple_lines): Likewise.
(layout::layout): Initialize m_file_cache from the context and use it.
(layout::maybe_add_location_range): Use m_file_cache.
(layout::calculate_x_offset_display): Likewise.
(get_affected_range): Add file_cache param.
(get_printed_columns): Likewise.
(line_corrections::line_corrections): Likewise.
(line_corrections::m_file_cache): New field.
(source_line::source_line): Add file_cache param.
(line_corrections::add_hint): Use m_file_cache.
(layout::print_trailing_fixits): Likewise.
(layout::print_line): Likewise.
(selftest::test_layout_x_offset_display_utf8): Create and use a
temporary file_cache.
(selftest::test_layout_x_offset_display_tab): Likewise.
(selftest::test_diagnostic_show_locus_one_liner_utf8): Likewise.
(selftest::test_add_location_if_nearby): Pass global_dc's
file_cache to temp_source_file ctor.
(selftest::test_overlapped_fixit_printing): Create and use a
temporary file_cache.
(selftest::test_overlapped_fixit_printing_utf8): Likewise.
(selftest::test_overlapped_fixit_printing_2): Use dc's file_cache.
* diagnostic.cc (diagnostic_context::initialize): Always create a
file_cache.
(diagnostic_context::initialize_input_context): Assume
m_file_cache has already been created.
(diagnostic_context::create_edit_context): Pass m_file_cache to
edit_context.
(convert_column_unit): Add file_cache param.
(diagnostic_context::converted_column): Use context's file_cache.
(print_parseable_fixits): Add file_cache param.
(diagnostic_context::report_diagnostic): Use context's file_cache.
(selftest::test_print_parseable_fixits_none): Create and use a
temporary file_cache.
(selftest::test_print_parseable_fixits_insert): Likewise.
(selftest::test_print_parseable_fixits_remove): Likewise.
(selftest::test_print_parseable_fixits_replace): Likewise.
(selftest::test_print_parseable_fixits_bytes_vs_display_columns):
Likewise.
* diagnostic.h (diagnostic_context::file_cache_init): Delete.
(diagnostic_context::get_file_cache): Convert return type from
pointer to reference.
* edit-context.cc (edited_file::get_file_cache): New.
(edited_file::m_edit_context): New.
(edit_context::edit_context): Add file_cache param.
(edit_context::get_or_insert_file): Pass this to edited_file's
ctor.
(edited_file::edited_file): Add edit_context param.
(edited_file::print_content): Use get_file_cache.
(edited_file::print_diff_hunk): Likewise.
(edited_file::print_run_of_changed_lines): Likewise.
(edited_file::get_or_insert_line): Likewise.
(edited_file::get_num_lines): Likewise.
(edited_line::edited_line): Pass in file_cache and use it.
(selftest::test_get_content): Create and use a
temporary file_cache.
(selftest::test_applying_fixits_insert_before): Likewise.
(selftest::test_applying_fixits_insert_after): Likewise.
(selftest::test_applying_fixits_insert_after_at_line_end):
Likewise.
(selftest::test_applying_fixits_insert_after_failure): Likewise.
(selftest::test_applying_fixits_insert_containing_newline):
Likewise.
(selftest::test_applying_fixits_growing_replace): Likewise.
(selftest::test_applying_fixits_shrinking_replace): Likewise.
(selftest::test_applying_fixits_replace_containing_newline):
Likewise.
(selftest::test_applying_fixits_remove): Likewise.
(selftest::test_applying_fixits_multiple): Likewise.
(selftest::test_applying_fixits_multiple_lines): Likewise.
(selftest::test_applying_fixits_modernize_named_init): Likewise.
(selftest::test_applying_fixits_unreadable_file): Likewise.
(selftest::test_applying_fixits_line_out_of_range): Likewise.
(selftest::test_applying_fixits_column_validation): Likewise.
* edit-context.h (edit_context::edit_context): Add file_cache
param.
(edit_context::get_file_cache): New.
(edit_context::m_file_cache): New.
* final.cc: Include "diagnostic.h".
(asm_show_source): Use global_dc's file_cache.
* gcc-rich-location.cc (blank_line_before_p): Add file_cache
param.
(use_new_line): Likewise.
(gcc_rich_location::add_fixit_insert_formatted): Use global dc's
file_cache.
* input.cc (diagnostic_file_cache_init): Delete.
(diagnostic_context::file_cache_init): Delete.
(diagnostics_file_cache_forcibly_evict_file): Delete.
(file_cache::missing_trailing_newline_p): New.
(file_cache::evicted_cache_tab_entry): Don't call
diagnostic_file_cache_init.
(location_get_source_line): Delete.
(get_source_text_between): Add file_cache param.
(get_source_file_content): Delete.
(location_missing_trailing_newline): Delete.
(location_compute_display_column): Add file_cache param.
(dump_location_info): Create and use temporary file_cache.
(get_substring_ranges_for_loc): Add file_cache param.
(get_location_within_string): Likewise.
(get_source_range_for_char): Likewise.
(get_num_source_ranges_for_substring): Likewise.
(selftest::test_reading_source_line): Create and use temporary
file_cache.
(selftest::lexer_test::m_file_cache): New field.
(selftest::assert_char_at_range): Use test.m_file_cache.
(selftest::assert_num_substring_ranges): Likewise.
(selftest::assert_has_no_substring_ranges): Likewise.
(selftest::test_lexer_string_locations_concatenation_2): Likewise.
* input.h (class file_cache): New forward decl.
(location_compute_display_column): Add file_cache param.
(location_get_source_line): Delete.
(get_source_text_between): Add file_cache param.
(get_source_file_content): Delete.
(location_missing_trailing_newline): Delete.
(file_cache::missing_trailing_newline_p): New decl.
(diagnostics_file_cache_forcibly_evict_file): Delete.
* selftest.cc (named_temp_file::named_temp_file): Add file_cache
param.
(named_temp_file::~named_temp_file): Optionally evict the file
from the given file_cache.
(temp_source_file::temp_source_file): Add file_cache param.
* selftest.h (class file_cache): New forward decl.
(named_temp_file::named_temp_file): Add file_cache param.
(named_temp_file::m_file_cache): New field.
(temp_source_file::temp_source_file): Add file_cache param.
* substring-locations.h (get_location_within_string): Add
file_cache param.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Use
global_dc's file cache.
* gcc.dg/plugin/expensive_selftests_plugin.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Tue, 14 Nov 2023 16:01:39 +0000 (11:01 -0500)]
json: reduce use of naked new in json-building code
No functional change intended.
gcc/ChangeLog:
* diagnostic-format-json.cc: Use type-specific "set_*" functions
of json::object to avoid naked new of json value subclasses.
* diagnostic-format-sarif.cc: Likewise.
* gcov.cc: Likewise.
* json.cc (object::set_string): New.
(object::set_integer): New.
(object::set_float): New.
(object::set_bool): New.
(selftest::test_writing_objects): Use object::set_string.
* json.h (object::set_string): New decl.
(object::set_integer): New decl.
(object::set_float): New decl.
(object::set_bool): New decl.
* optinfo-emit-json.cc: Use type-specific "set_*" functions of
json::object to avoid naked new of json value subclasses.
* timevar.cc: Likewise.
* tree-diagnostic-path.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The Xmethod for std::deque::size() assumed that the first element would
be at the start of the first node. That's only true if elements are only
added at the back. If an element is inserted at the front, or removed
from the front (or anywhere before the middle) then the first node will
not be completely populated, and the Xmethod will give the wrong result.
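A minimal C model of the size computation may help here. It assumes a deque made of fixed-size nodes with start and finish iterators; the struct and function are hypothetical, with fields named after libstdc++'s _M_cur, _M_first, _M_last and _M_node, and it is not the Python Xmethod itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a deque iterator (element type int).  */
typedef struct
{
  int *cur;    /* current element */
  int *first;  /* start of the node's buffer */
  int *last;   /* one past the end of the node's buffer */
  int **node;  /* position in the map of node pointers */
} deque_iter;

/* Number of elements between START and FINISH.  The first node only
   contributes the elements from start->cur to start->last, so a
   partially populated first node is counted correctly.  */
static size_t
deque_size (const deque_iter *start, const deque_iter *finish,
            size_t buf_size)
{
  assert (start != NULL && finish != NULL && buf_size > 0);
  if (start->cur == finish->cur)
    return 0;
  ptrdiff_t n = (ptrdiff_t) buf_size * (finish->node - start->node - 1)
                + (finish->cur - finish->first)
                + (start->last - start->cur);
  return (size_t) n;
}
```

When both iterators point into the same node, the first term is negative and cancels the part of the buffer outside the [cur, cur) range, so the result reduces to finish->cur - start->cur.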
libstdc++-v3/ChangeLog:
PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.size): Fix
calculation to use _M_start._M_cur.
* testsuite/libstdc++-xmethods/deque.cc: Check failing cases.
s390: Fix vec_scatter_element for vectors of floats
The offset for vec_scatter_element of floats should be a vector of type
UV4SI instead of V4SF. Note, this is an incompatible change.
gcc/ChangeLog:
* config/s390/s390-builtin-types.def: Add/remove types.
* config/s390/s390-builtins.def (s390_vec_scatter_element_flt):
The type for the offset should be UV4SI instead of V4SF.
The change in r14-2852-gf5fb9ff2396fd4 failed to update patch_loop_exit
to compensate for rewriting of a NE/EQ_EXPR to a new code. Fixed
with the following.
PR tree-optimization/111233
PR tree-optimization/111652
PR tree-optimization/111727
PR tree-optimization/111838
PR tree-optimization/112113
* tree-ssa-loop-split.cc (patch_loop_exit): Get the new
guard code instead of the old guard stmt.
(split_loop): Adjust.
Richard Biener [Tue, 14 Nov 2023 11:53:18 +0000 (12:53 +0100)]
Loop distribution fix for SCC detection
The following adjusts data_dep_in_cycle_p to properly consider the
whole loop nest when looking for data dep cycles and exempting
zero-distance DDRs instead of just the outermost loop.
* tree-loop-distribution.cc (loop_distribution::data_dep_in_cycle_p):
Consider all loops in the nest when looking for
lambda_vector_zerop.
Richard Biener [Tue, 14 Nov 2023 10:37:13 +0000 (11:37 +0100)]
tree-optimization/112281 - loop distribution and zero dependence distances
We currently distribute
  for (c = 2; c; c--)
    for (e = 0; e < 2; e++) {
      d[c] = b = d[c + 1];
      d[c + 1].a = 0;
    }
in a wrong way where the inner loop zero dependence distance should
make us preserve stmt execution order. We fail to do so since we
only look for a fully zero distance vector rather than looking at
the innermost loop distance. This is somewhat similar to PR87022
where we instead looked at the outermost loop distance and changed
this to what we do now. The following switches us to look at the
innermost loop distance.
PR tree-optimization/112281
* tree-loop-distribution.cc (pg_add_dependence_edges):
Preserve stmt order when the innermost loop has exact
overlap.
Jakub Jelinek [Tue, 14 Nov 2023 12:19:48 +0000 (13:19 +0100)]
i386: Fix up <insn><dwi>3_doubleword_lowpart [PR112523]
On Sun, Nov 12, 2023 at 09:03:42PM -0000, Roger Sayle wrote:
> This patch improves register pressure during reload, inspired by PR 97756.
> Normally, a double-word right-shift by a constant produces a double-word
> result, the highpart of which is dead when followed by a truncation.
> The dead code calculating the high part gets cleaned up post-reload, so
> the issue isn't normally visible, except for the increased register
> pressure during reload, sometimes leading to odd register assignments.
> Providing a post-reload splitter, which clobbers a single wordmode
> result register instead of a doubleword result register, helps (a bit).
Unfortunately this broke bootstrap on i686-linux, broke all ACATS tests
on x86_64-linux as well as miscompiled e.g. __floattisf in libgcc there
as well.
The bug is that shrd{l,q} instruction expects the low part of the input
to be the same register as the output, rather than the high part as the
patch implemented.
split_double_mode (<DWI>mode, &operands[1], 1, &operands[1], &operands[3]);
sets operands[1] to the lo_half and operands[3] to the hi_half, so if
operands[0] is not the same register as operands[1] (rather than [3]) after
RA, we should during splitting move operands[1] into operands[0].
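As a rough C model of why the low half has to end up in the output register, the sketch below mimics the effect of a 64-bit shrd for shift counts strictly between 0 and 64. The function name is hypothetical and this is not code from i386.md.

```c
#include <assert.h>
#include <stdint.h>

/* Low word of the double-word value (HI:LO) shifted right by N bits.
   Like shrd, the destination starts out holding the LOW half and
   receives the bits shifted in from the high half.  */
static uint64_t
shrd64_lowpart (uint64_t lo, uint64_t hi, unsigned int n)
{
  assert (n > 0 && n < 64);
  uint64_t dest = lo;                      /* must be the low half */
  dest = (dest >> n) | (hi << (64 - n));   /* high-half bits fill the top */
  return dest;
}
```

Seeding dest with the high half instead, as the broken splitter effectively did, would produce an unrelated value.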
Your testcase:
> #define MASK60 ((1ul << 60) - 1)
> unsigned long foo (__uint128_t n)
> {
>   unsigned long a = n & MASK60;
>   unsigned long b = (n >> 60);
>   b = b & MASK60;
>   unsigned long c = (n >> 120);
>   return a+b+c;
> }
still has the same number of instructions.
Bootstrapped/regtested on x86_64-linux (where it e.g. turns
=== acats Summary ===
-# of unexpected failures 2328
+# of expected passes 2328
+# of unexpected failures 0
and fixes gcc.dg/torture/fp-int-convert-*timode.c FAILs as well)
and i686-linux (where it previously didn't bootstrap, but compared to
Friday evening's bootstrap the testresults are ok).
2023-11-14 Jakub Jelinek <jakub@redhat.com>
PR target/112523
PR ada/112514
* config/i386/i386.md (<insn><dwi>3_doubleword_lowpart): Move
operands[1] aka low part of input rather than operands[3] aka high
part of input to output if not the same register.
The r14-5312-g040e5b0edbca861196d9e2ea2af5e805769c8d5d commit log contains
a line from git revert with the correct hash, but unfortunately hand-amended
with an explanation, so it got through the pre-commit hook but failed during
update_version_git generation. Please don't do this.
Georg-Johann Lay [Tue, 14 Nov 2023 11:05:19 +0000 (12:05 +0100)]
LibF7: sinh: Fix loss of precision due to cancellation for small values.
libgcc/config/avr/libf7/
* libf7-const.def [F7MOD_sinh_]: Add MiniMax polynomial.
* libf7.c (f7_sinh): Use it instead of (exp(x) - exp(-x)) / 2
when |x| < 0.5 to avoid loss of precision due to cancellation.
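The cancellation problem can be reproduced in plain double arithmetic. The sketch below is an assumption-laden illustration: it uses a truncated Taylor polynomial as a stand-in for the MiniMax polynomial in libf7-const.def, a small series-based exp so that it needs no math library, and hypothetical function names.

```c
#include <assert.h>

/* exp(x) by Taylor series; adequate only for small |x|.  */
static double
exp_taylor (double x)
{
  double sum = 1.0, term = 1.0;
  for (int n = 1; n <= 20; n++)
    {
      term *= x / n;
      sum += term;
    }
  return sum;
}

/* Textbook formula: for small x both exponentials are close to 1, so
   the subtraction cancels most of the significant digits.  */
static double
sinh_naive (double x)
{
  return (exp_taylor (x) - exp_taylor (-x)) / 2.0;
}

/* Odd polynomial x * (1 + x^2/6 + x^4/120 + x^6/5040) in Horner form.
   A Taylor stand-in, NOT the MiniMax polynomial used by libf7.  */
static double
sinh_poly (double x)
{
  assert (x > -0.5 && x < 0.5);
  double x2 = x * x;
  return x * (1.0 + x2 * (1.0 / 6.0 + x2 * (1.0 / 120.0 + x2 / 5040.0)));
}

/* Relative error of GOT with respect to a nonzero reference WANT.  */
static double
rel_err (double got, double want)
{
  double d = (got - want) / want;
  return d < 0 ? -d : d;
}
```

For a tiny argument such as 1e-10, sinh(x) equals x to double precision, so rel_err (sinh_naive (x), x) and rel_err (sinh_poly (x), x) can be compared directly; the polynomial form avoids the subtraction of two nearly equal values.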
Lehua Ding [Tue, 14 Nov 2023 08:42:19 +0000 (16:42 +0800)]
x86: Make testcase apx-spill_to_egprs-1.c more robust
Hi,
This little patch adjusts the assertion in the apx-spill_to_egprs-1.c testcase.
The -mapxf compilation option allows more registers to be used, which in
turn eliminates the need for local variables to be stored in stack memory.
Therefore, the assertion is changed to check that nothing is loaded from
memory through the %rsp register.
gcc/testsuite/ChangeLog:
* gcc.target/i386/apx-spill_to_egprs-1.c: Make sure that no local
variables are stored on the stack.
Andreas Krebbel [Tue, 14 Nov 2023 10:33:45 +0000 (11:33 +0100)]
IBM Z: Add GTY marker to builtin data structures
This adds GTY markers to s390_builtin_types, s390_builtin_fn_types,
and s390_builtin_decls. These were missing, causing problems in
particular when using builtins after including a precompiled header.
Unfortunately the declarations of these data structures use enum values
from s390-builtins.h. This file, however, is not included everywhere
and is rather large. In order to include it only for the purpose of
gtype-desc.cc, we place a preprocessed copy of it in the build
directory and include only this.
Andreas Krebbel [Tue, 14 Nov 2023 10:33:44 +0000 (11:33 +0100)]
IBM Z: Fix ICE with overloading and checking enabled
s390_resolve_overloaded_builtin, when called on NON_DEPENDENT_EXPR,
ICEs when using the type from it which ends up as error_mark_node.
This particular instance of the problem does not occur anymore since
NON_DEPENDENT_EXPR has been removed. Nevertheless that case needs to
be handled here.
gcc/ChangeLog:
* config/s390/s390-c.cc (s390_fn_types_compatible): Add a check
for error_mark_node.
Jonathan Wakely [Mon, 13 Nov 2023 12:03:31 +0000 (12:03 +0000)]
c++: Link extended FP conversion pedwarns to -Wnarrowing [PR111842]
Several users have been confused by the status of these warnings,
which can be misunderstood as "this might not be what you want",
rather than diagnostics required by the C++ standard. Add the text "ISO
C++ does not allow" to make this clear.
Also link them to -Wnarrowing so that they can be disabled or promoted
to errors independently of other pedwarns.
PR c++/111842
PR c++/112498
gcc/cp/ChangeLog:
* call.cc (convert_like_internal): Use OPT_Wnarrowing for
pedwarns about ill-formed conversions involving extended
floating-point types. Clarify that ISO C++ requires these
diagnostics.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/ext-floating16.C: New test.
* g++.dg/cpp23/ext-floating17.C: New test.
The following patch adds 6 new type-generic builtins,
__builtin_clzg
__builtin_ctzg
__builtin_clrsbg
__builtin_ffsg
__builtin_parityg
__builtin_popcountg
The g at the end stands for generic because the unsuffixed variants
of the builtins already have unsigned int or int arguments.
The main reason to add these is to support arbitrary unsigned (for
clrsb/ffs signed) bit-precise integer types and also __int128 which
wasn't supported by the existing builtins, so that e.g. <stdbit.h>
type-generic functions could then support not just bit-precise unsigned
integer type whose width matches a standard or extended integer type,
but others too.
None of these new builtins promote their first argument, so the argument
can be e.g. unsigned char or unsigned short or unsigned __int20 etc.
The first 2 support either 1 or 2 arguments, if only 1 argument is supplied,
the behavior is undefined for argument 0 like for other __builtin_c[lt]z*
builtins, if 2 arguments are supplied, the second argument should be int
that will be returned if the argument is 0. All other builtins have
just one argument. For __builtin_clrsbg and __builtin_ffsg the argument
shall be any signed standard/extended or bit-precise integer, for the others
any unsigned standard/extended or bit-precise integer (bool not allowed).
One possibility would be to also allow signed integer types for
the clz/ctz/parity/popcount ones (and just cast the argument to
unsigned_type_for during folding) and similarly unsigned integer types
for the clrsb/ffs ones, dunno what is better; for stdbit.h the current
version is sufficient and diagnoses use of the inappropriate sign,
though on the other side I wonder if users won't be confused by
__builtin_clzg (1) being an error and having to write __builtin_clzg (1U).
The new builtins are lowered to corresponding builtins with other suffixes
or internal calls (plus casts and adjustments where needed) during FE
folding or during gimplification at latest, the non-suffixed builtins
handling precisions up to precision of int, l up to precision of long,
ll up to precision of long long, up to __int128 precision lowered to
double-word expansion early and the rest (which must be _BitInt) lowered
to internal fn calls - those are then lowered during bitint lowering pass.
The patch also changes representation of IFN_CLZ and IFN_CTZ calls,
previously they were in the IL only if they are directly supported optab
and depending on C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 they had or didn't
have defined behavior at 0, now they are in the IL either if directly
supported optab, or for the large/huge BITINT_TYPEs and they have either
1 or 2 arguments. If one, the behavior is undefined at zero, if 2, the
second argument is an int constant that should be returned for 0.
As there is no extra support during expansion, for directly supported optab
the second argument if present should still match the
C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 value, but for BITINT_TYPE arguments
it can be arbitrary int INTEGER_CST.
The intended uses in stdbit.h are e.g.
#ifdef __has_builtin
#if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_ctzg) && __has_builtin(__builtin_popcountg)
#define stdc_leading_zeros(value) \
((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
#define stdc_leading_ones(value) \
((unsigned int) __builtin_clzg ((__typeof (value)) ~(value), __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
#define stdc_first_trailing_one(value) \
((unsigned int) (__builtin_ctzg (value, -1) + 1))
#define stdc_trailing_zeros(value) \
((unsigned int) __builtin_ctzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
#endif
#endif
where __builtin_popcountg ((__typeof (x)) -1) computes the bit precision
of x's type (kind of _Bitwidthof (x) alternative).
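For a single fixed type, the two-argument semantics and the popcount-based width trick can be sketched as below. The wrapper names are hypothetical; the code uses __builtin_clzg only when __has_builtin reports it, and otherwise falls back to the long-standing __builtin_clz with an explicit zero check, which is an emulation rather than the new builtin.

```c
#include <assert.h>
#include <limits.h>

#ifdef __has_builtin
# if __has_builtin (__builtin_clzg)
#  define HAVE_BUILTIN_CLZG 1
# endif
#endif

/* Leading-zero count of VALUE, returning AT_ZERO when VALUE is 0.  */
static int
clz_or (unsigned int value, int at_zero)
{
#ifdef HAVE_BUILTIN_CLZG
  return __builtin_clzg (value, at_zero);
#else
  /* Emulation: one-argument builtin guarded by a zero check.  */
  return value != 0 ? __builtin_clz (value) : at_zero;
#endif
}

/* Bit width of unsigned int, computed as the popcount of all-ones.  */
static int
uint_width (void)
{
  int width = __builtin_popcount (~0u);
  /* Sanity check: holds when unsigned int has no padding bits.  */
  assert (width == (int) (sizeof (unsigned int) * CHAR_BIT));
  return width;
}

/* Analogue of stdc_leading_zeros for unsigned int: zero maps to the
   full width of the type.  */
static unsigned int
leading_zeros (unsigned int value)
{
  return (unsigned int) clz_or (value, uint_width ());
}
```

The fallback branch mirrors the lowering described for c_gimplify_expr, where a two-argument call becomes a one-argument call inside a conditional that returns the second argument for zero.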
They also allow casting of arbitrary unsigned _BitInt other than
unsigned _BitInt(1) to corresponding signed _BitInt by using
signed _BitInt(__builtin_popcountg ((__typeof (a)) -1))
and of arbitrary signed _BitInt to corresponding unsigned _BitInt
using unsigned _BitInt(__builtin_clrsbg ((__typeof (a)) -1) + 1).
2023-11-14 Jakub Jelinek <jakub@redhat.com>
PR c/111309
gcc/
* builtins.def (BUILT_IN_CLZG, BUILT_IN_CTZG, BUILT_IN_CLRSBG,
BUILT_IN_FFSG, BUILT_IN_PARITYG, BUILT_IN_POPCOUNTG): New
builtins.
* builtins.cc (fold_builtin_bit_query): New function.
(fold_builtin_1): Use it for
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
(fold_builtin_2): Use it for BUILT_IN_{CLZ,CTZ}G.
* fold-const-call.cc: Fix comment typo on tm.h inclusion.
(fold_const_call_ss): Handle
CFN_BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
(fold_const_call_sss): New function.
(fold_const_call_1): Call it for 2 argument functions returning
scalar when passed 2 INTEGER_CSTs.
* genmatch.cc (cmp_operand): For function calls also compare
number of arguments.
(fns_cmp): New function.
(dt_node::gen_kids): Sort fns and generic_fns.
(dt_node::gen_kids_1): Handle fns with the same id but different
number of arguments.
* match.pd (CLZ simplifications): Drop checks for defined behavior
at zero. Add variant of simplifications for IFN_CLZ with 2 arguments.
(CTZ simplifications): Drop checks for defined behavior at zero,
don't optimize precisions above MAX_FIXED_MODE_SIZE. Add variant of
simplifications for IFN_CTZ with 2 arguments.
(a != 0 ? CLZ(a) : CST -> .CLZ(a)): Use TREE_TYPE (@3) instead of
type, add BITINT_TYPE handling, create 2 argument IFN_CLZ rather than
one argument. Add variant for matching CLZ with 2 arguments.
(a != 0 ? CTZ(a) : CST -> .CTZ(a)): Similarly.
* gimple-lower-bitint.cc (bitint_large_huge::lower_bit_query): New
method.
(bitint_large_huge::lower_call): Use it for IFN_{CLZ,CTZ,CLRSB,FFS}
and IFN_{PARITY,POPCOUNT} calls.
* gimple-range-op.cc (cfn_clz::fold_range): Don't check
CLZ_DEFINED_VALUE_AT_ZERO for m_gimple_call_internal_p, instead
assume defined value at zero if the call has 2 arguments and use
second argument value for that case.
(cfn_ctz::fold_range): Similarly.
(gimple_range_op_handler::maybe_builtin_call): Use op_cfn_clz_internal
or op_cfn_ctz_internal only if internal fn call has 2 arguments and
set m_op2 in that case.
* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern,
vect_recog_popcount_clz_ctz_ffs_pattern): For value defined at zero
use second argument of calls if present, otherwise assume UB at zero,
create 2 argument .CLZ/.CTZ calls if needed.
* tree-vect-stmts.cc (vectorizable_call): Handle 2 argument .CLZ/.CTZ
calls.
* tree-ssa-loop-niter.cc (build_cltz_expr): Create 2 argument
.CLZ/.CTZ calls if needed.
* tree-ssa-forwprop.cc (simplify_count_trailing_zeroes): Create 2
argument .CTZ calls if needed.
* tree-ssa-phiopt.cc (cond_removal_in_builtin_zero_pattern): Handle
2 argument .CLZ/.CTZ calls, handle BITINT_TYPE, create 2 argument
.CLZ/.CTZ calls.
* doc/extend.texi (__builtin_clzg, __builtin_ctzg, __builtin_clrsbg,
__builtin_ffsg, __builtin_parityg, __builtin_popcountg): Document.
gcc/c-family/
* c-common.cc (check_builtin_function_arguments): Handle
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
* c-gimplify.cc (c_gimplify_expr): If __builtin_c[lt]zg second
argument hasn't been folded into constant yet, transform it to one
argument call inside of a COND_EXPR which for first argument 0
returns the second argument.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote first argument
of BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
gcc/cp/
* call.cc (magic_varargs_p): Return 4 for
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
(build_over_call): Don't promote first argument of
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
* cp-gimplify.cc (cp_gimplify_expr): For BUILT_IN_C{L,T}ZG use
c_gimplify_expr.
gcc/testsuite/
* c-c++-common/pr111309-1.c: New test.
* c-c++-common/pr111309-2.c: New test.
* gcc.dg/torture/bitint-43.c: New test.
* gcc.dg/torture/bitint-44.c: New test.
Xi Ruoyao [Fri, 3 Nov 2023 13:19:59 +0000 (21:19 +0800)]
LoongArch: Disable relaxation if the assembler doesn't support conditional branch relaxation [PR112330]
As the commit message of r14-4674 has indicated, if the assembler does
not support conditional branch relaxation, a relocation overflow may
happen on conditional branches when relaxation is enabled because the
number of NOP instructions inserted by the assembler will be more than
the number estimated by GCC.
To work around this issue, disable relaxation by default if the
assembler is detected at GCC build time to be incapable of performing
conditional branch relaxation. We also need to pass -mno-relax to the assembler to
really disable relaxation. But, if the assembler does not support
-mrelax option at all, we should not pass -mno-relax to the assembler or
it will immediately error out. Also handle this with the build time
assembler capability probing, and add a pair of options
-m[no-]pass-mrelax-to-as to allow using a different assembler from the
build-time one.
With this change, if GCC is built with GAS 2.41, relaxation will be
disabled by default. So the default value of -mexplicit-relocs= is also
changed to 'always' if -mno-relax is specified or implied by the
build-time default, because using assembler macros for symbol addresses
produces no benefit when relaxation is disabled.
gcc/ChangeLog:
PR target/112330
* config/loongarch/genopts/loongarch.opt.in: Add
-m[no-]pass-mrelax-to-as. Change the default of -m[no-]relax to
account for conditional branch relaxation support status.
* config/loongarch/loongarch.opt: Regenerate.
* configure.ac (gcc_cv_as_loongarch_cond_branch_relax): Check if
the assembler supports conditional branch relaxation.
* configure: Regenerate.
* config.in: Regenerate. Note that there are some unrelated
changes introduced by r14-5424 (which does not contain a
config.in regeneration).
* config/loongarch/loongarch-opts.h
(HAVE_AS_COND_BRANCH_RELAXATION): Define to 0 if not defined.
* config/loongarch/loongarch-driver.h (ASM_MRELAX_DEFAULT):
Define.
(ASM_MRELAX_SPEC): Define.
(ASM_SPEC): Use ASM_MRELAX_SPEC instead of "%{mno-relax}".
* config/loongarch/loongarch.cc: Take the setting of
-m[no-]relax into account when determining the default of
-mexplicit-relocs=.
* doc/invoke.texi: Document -m[no-]relax and
-m[no-]pass-mrelax-to-as for LoongArch. Update the default
value of -mexplicit-relocs=.
Xi Ruoyao [Mon, 13 Nov 2023 21:32:38 +0000 (05:32 +0800)]
LoongArch: Use finer-grained DBAR hints
LA664 defines DBAR hints 0x1 - 0x1f (except 0xf and 0x1f) as follows [1-2]:
- Bit 4: kind of constraint (0: completion, 1: ordering)
- Bit 3: barrier for previous read (0: true, 1: false)
- Bit 2: barrier for previous write (0: true, 1: false)
- Bit 1: barrier for succeeding read (0: true, 1: false)
- Bit 0: barrier for succeeding write (0: true, 1: false)
LLVM has already utilized them for different memory orders [3]:
- Bit 4 is always set to one because it's only intended to be zero for
things like MMIO devices, which are out of the scope of memory orders.
- An acquire barrier is used to implement acquire loads like
ld.d $a1, $t0, 0
dbar acquire_hint
where the load operation (ld.d) should not be reordered with any load
or store operation after the acquire load. To accomplish this
constraint, we need to prevent the load operation from being reordered
after the barrier, and also prevent any following load/store operation
from being reordered before the barrier. Thus bits 0, 1, and 3 must
be zero, and bit 2 can be one, so acquire_hint should be 0b10100.
- A release barrier is used to implement release stores like
dbar release_hint
st.d $a1, $t0, 0
where the store operation (st.d) should not be reordered with any load
or store operation before the release store. So we need to prevent
the store operation from being reordered before the barrier, and also
prevent any preceding load/store operation from being reordered after
the barrier. So bits 0, 2, 3 must be zero, and bit 1 can be one. So
release_hint should be 0b10010.
A similar mapping has been utilized for RISC-V GCC [4], LoongArch Linux
kernel [1], and LoongArch LLVM [3]. So the mapping should be correct.
And I've also bootstrapped & regtested GCC on a LA664 with this patch.
The LoongArch CPUs should treat "unknown" hints as dbar 0, so we can
unconditionally emit the new hints without a compiler switch.
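The bit layout listed above can be checked with a few lines of C. This is only a sketch: dbar_hint, acquire_hint and release_hint are hypothetical helper names, not GCC or kernel functions, and the code simply encodes the rule that a set bit in positions 3..0 means "no barrier" for that class of access.

```c
#include <stdbool.h>

/* Hypothetical helper: build a DBAR hint from the bit layout above.
   For bits 3..0, a set bit means "no barrier" for that access class.  */
static unsigned
dbar_hint (bool ordering_only, bool order_prev_read, bool order_prev_write,
           bool order_succ_read, bool order_succ_write)
{
  unsigned hint = 0;
  if (ordering_only)
    hint |= 1u << 4;
  if (!order_prev_read)
    hint |= 1u << 3;
  if (!order_prev_write)
    hint |= 1u << 2;
  if (!order_succ_read)
    hint |= 1u << 1;
  if (!order_succ_write)
    hint |= 1u << 0;
  return hint;
}

/* Acquire: order the previous read against all succeeding accesses.  */
static unsigned
acquire_hint (void)
{
  return dbar_hint (true, true, false, true, true);   /* 0b10100 */
}

/* Release: order all previous accesses against the succeeding write.  */
static unsigned
release_hint (void)
{
  return dbar_hint (true, true, true, false, true);   /* 0b10010 */
}
```

With every barrier requested and bit 4 clear, the helper returns 0, which matches the full "dbar 0" fallback mentioned above.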
Jakub Jelinek [Tue, 14 Nov 2023 08:24:34 +0000 (09:24 +0100)]
tree: Handle BITINT_TYPE in type_contains_placeholder_1 [PR112511]
The following testcase ICEs because BITINT_TYPE isn't handled in
type_contains_placeholder_1. Given that Ada doesn't emit it, it doesn't
matter that much where exactly we handle it as right now it should never
contain a placeholder; I've picked the same spot as INTEGER_TYPE, but if
you prefer e.g. the one with OFFSET_TYPE above, I can move it there too.
2023-11-14 Jakub Jelinek <jakub@redhat.com>
PR middle-end/112511
* tree.cc (type_contains_placeholder_1): Handle BITINT_TYPE like
INTEGER_TYPE.
Jakub Jelinek [Tue, 14 Nov 2023 07:11:44 +0000 (08:11 +0100)]
i386: Don't optimize vshuf{i,f}{32x4,64x2} and vperm{i,f}128 to vblendps for %ymm16+ [PR112435]
The vblendps instruction is only VEX encoded, not EVEX, so can't be used if
there are %ymm16+ or EGPR registers involved.
2023-11-14 Jakub Jelinek <jakub@redhat.com>
Hu, Lin1 <lin1.hu@intel.com>
PR target/112435
* config/i386/sse.md (avx512vl_shuf_<shuffletype>32x4_1<mask_name>,
<mask_codefor>avx512dq_shuf_<shuffletype>64x2_1<mask_name>): Add
alternative with just x instead of v constraints and xjm instead of
vm and use vblendps as optimization only with that alternative.
* gcc.target/i386/avx512vl-pr112435-1.c: New test.
* gcc.target/i386/avx512vl-pr112435-2.c: New test.
* gcc.target/i386/avx512vl-pr112435-3.c: New test.
Arsen Arsenović [Fri, 19 May 2023 19:12:57 +0000 (21:12 +0200)]
*: add modern gettext
This patch updates gettext.m4 and related .m4 files and adds
gettext-runtime as a gmp/mpfr/... style host library, allowing newer
libintl to be used.
This patch /does not/ add build-time tools required for
internationalizing (msgfmt et al), instead, it just updates the runtime
library. The result should be a distribution that acts exactly the same
when a copy of gettext is present, and disables internationalization
otherwise.
There should be no changes in behavior when gettext is included in-tree.
When gettext is not included in tree, nor available on the system, the
programs will be built without localization.
ChangeLog:
PR bootstrap/12596
* .gitignore: Add '/gettext*'.
* configure.ac (host_libs): Replace intl with gettext.
(hbaseargs, bbaseargs, baseargs): Split baseargs into
{h,b}baseargs.
(skip_barg): New flag. Skips appending current flag to
bbaseargs.
<library exemptions>: Exempt --with-libintl-{type,prefix} from
target and build machine argument passing.
* configure: Regenerate.
* Makefile.def (host_modules): Replace intl module with gettext
module.
(configure-ld): Depend on configure-gettext.
* Makefile.in: Regenerate.
config/ChangeLog:
* intlmacosx.m4: Import from gettext-0.22 (serial 8).
* gettext.m4: Sync with gettext-0.22 (serial 77).
* gettext-sister.m4 (ZW_GNU_GETTEXT_SISTER_DIR): Load gettext's
uninstalled-config.sh, or call AM_GNU_GETTEXT if missing.
* iconv.m4: Sync with gettext-0.22 (serial 26).
* configure: Regenerate.
* aclocal.m4: Regenerate.
* Makefile.in (LIBDEPS): Remove (potential) ./ prefix from
LIBINTL_DEP.
* doc/install.texi: Document new (notable) flags added by the
optional gettext tree and by AM_GNU_GETTEXT. Document libintl/libc
with gettext dependency.
Jonathan Wakely [Sat, 11 Nov 2023 00:35:18 +0000 (00:35 +0000)]
libstdc++: Micro-optimization for std::optional [PR112480]
This small change removes a branch when clearing a std::optional<T> for
types with no-op destructors. For types where the destructor can be
optimized away (e.g. because it's trivial, or empty and can be inlined)
the _M_destroy() function does nothing but set _M_engaged to false.
Setting _M_engaged=false unconditionally is cheaper than only doing it
when initially true, because it allows the compiler to remove a branch.
The compiler thinks it would be incorrect to unconditionally introduce a
store there, because it could conflict with reads in other threads, so
it won't do that optimization itself. We know it's safe to do because
we're in a non-const member function, so the standard forbids any
potentially concurrent calls to other member functions of the same
object. Making the store unconditional can't create a data race that
isn't already present in the program.
libstdc++-v3/ChangeLog:
PR libstdc++/112480
* include/std/optional (_Optional_payload_base::_M_reset): Set
_M_engaged to false unconditionally.
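The difference between the two forms can be sketched in plain C. The struct and function names here are hypothetical stand-ins, not libstdc++ code, and the payload is assumed to be trivially destructible so that destroying it is a no-op.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an optional holding a trivially
   destructible payload, so "destroying" the value does nothing.  */
struct opt_int
{
  int payload;
  bool engaged;
};

/* Conditional form: the store happens only when the flag was set.  */
static void
reset_if_engaged (struct opt_int *o)
{
  if (o->engaged)
    o->engaged = false;
}

/* Unconditional form: same final state, no branch required.  */
static void
reset_always (struct opt_int *o)
{
  o->engaged = false;
}
```

Both functions leave the flag false. Only the second can be compiled to a single store without the compiler having to reason about other threads.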
Uros Bizjak [Mon, 13 Nov 2023 22:55:41 +0000 (23:55 +0100)]
i386: Rewrite pushfl<mode>2 and popfl<mode>1 as unspecs
Flags reg is valid only with CC mode.
gcc/ChangeLog:
* config/i386/i386-expand.h (gen_pushfl): New prototype.
(gen_popfl): Ditto.
* config/i386/i386-expand.cc (ix86_expand_builtin)
[case IX86_BUILTIN_READ_FLAGS]: Use gen_pushfl.
[case IX86_BUILTIN_WRITE_FLAGS]: Use gen_popfl.
* config/i386/i386.cc (gen_pushfl): New function.
(gen_popfl): Ditto.
* config/i386/i386.md (unspec): Add UNSPEC_PUSHFL and UNSPEC_POPFL.
(@pushfl<mode>2): Rename from *pushfl<mode>2.
Rewrite as unspec using UNSPEC_PUSHFL.
(@popfl<mode>1): Rename from *popfl<mode>1.
Rewrite as unspec using UNSPEC_POPFL.
Uros Bizjak [Mon, 13 Nov 2023 21:45:55 +0000 (22:45 +0100)]
i386: Return CCmode from ix86_cc_mode for unknown RTX code [PR112494]
Combine wants to combine following instructions into an insn that can
perform both an (arithmetic) operation and set the condition code. During
the conversion a new RTX is created, and combine passes the RTX code of the
innermost RTX expression of the CC use insn in which CC reg is used to
SELECT_CC_MODE, to determine the new mode of the comparison:
ix86_cc_mode (AKA SELECT_CC_MODE) is not prepared to handle random RTX
codes and triggers gcc_unreachable() when SET RTX code is passed to it.
The patch removes gcc_unreachable() and returns CCmode for unknown
RTX codes, so combine can try various combinations involving CC reg
without triggering ICE.
Please note that x86 MOV instructions do not set flags, so the above
combination is not recognized as a valid x86 instruction.
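The shape of the change can be sketched as a switch whose default case returns a generic mode instead of aborting. The enums below are hypothetical, heavily reduced stand-ins for GCC's RTX codes and CC modes, and the mapping for the known codes is illustrative only, not a transcription of ix86_cc_mode.

```c
/* Hypothetical, heavily reduced stand-ins for RTX codes and CC modes.  */
enum rtx_code_sketch { CODE_EQ, CODE_NE, CODE_GEU, CODE_SET };
enum cc_mode_sketch { MODE_CC, MODE_CCZ, MODE_CCC };

static enum cc_mode_sketch
select_cc_mode_sketch (enum rtx_code_sketch code)
{
  switch (code)
    {
    case CODE_EQ:
    case CODE_NE:
      return MODE_CCZ;   /* illustrative: only the zero flag matters */
    case CODE_GEU:
      return MODE_CCC;   /* illustrative: only the carry flag matters */
    default:
      /* Unknown code: fall back to the generic mode, not an abort.  */
      return MODE_CC;
    }
}
```

Passing CODE_SET, which is not a comparison, now yields MODE_CC, so a caller can go on to reject the combination by other means.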
Robin Dapp [Sat, 11 Nov 2023 11:47:57 +0000 (12:47 +0100)]
RISC-V: vsetvl: Refine REG_EQUAL equality.
This patch enhances the equality check for REG_EQUAL notes in the vsetvl
pass by using the == operator instead of rtx_equal_p. With that, in
situations like the following, a5 and a7 are not considered equal
anymore.
Gaius Mulley [Mon, 13 Nov 2023 15:11:50 +0000 (15:11 +0000)]
PR modula2/110779: Add reduced acinclude.m4 to allow interrogation of time features
This patch adds libgm2/acinclude.m4 and libgm2/configure.host which
are reduced versions from libstdc++-v3. They currently allow for
discovering the time features available in libc and will be extended
to discover availability of ieee128 long double support in the near
future. These files were also added to provide the functions:
GLIBCXX_CONFIGURE, GLIBCXX_CHECK_GETTIMEOFDAY and
GLIBCXX_ENABLE_LIBSTDCXX_TIME called by configure.ac.
This fixes a bunch more tests that try to override the default architecture;
some partially used the framework for doing this, others just blindly
added a -march option, which was doomed to cause problems. In most cases
we can now run these tests regardless of the user's testing options and
the base compiler configuration.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Add test for v7a_arm.
* gcc.target/arm/pr60650-2.c: Use require-effective-target and
add-options.
* gcc.target/arm/pr60657.c: Likewise.
* gcc.target/arm/pr60663.c: Likewise.
* gcc.target/arm/pr81863.c: Likewise.
* gcc.target/arm/pr97969.c: Likewise.
* gcc.target/arm/pr98931.c: Likewise.
* gcc.target/arm/tail-long-call.c: Likewise.
testsuite: arm: tighten up mode-specific ISA tests
Some of the standard Arm architecture tests require the test to use a
specific instruction set (arm or thumb). But although the framework
was checking that the flag was accepted, it wasn't checking that the
flag wasn't somehow being overridden (e.g. by run-specific options). We
can improve these tests easily by checking whether or not __thumb__ is
defined.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
For instruction-set specific tests, check that __thumb__ is, or
isn't defined as appropriate.
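A check of this kind comes down to testing a predefined macro. The function below is a minimal sketch of that test in C; compiled_for_thumb is a hypothetical name, and the real effective-target check lives in target-supports.exp rather than in C code like this.

```c
/* Returns 1 when this translation unit was compiled for Thumb state,
   0 otherwise, based on the __thumb__ predefined macro.  */
static int
compiled_for_thumb (void)
{
#ifdef __thumb__
  return 1;
#else
  return 0;
#endif
}
```

On a non-Arm host, or with -marm in effect, the function returns 0.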
arm: testsuite: improve compatibility of gcc.target/arm/optional_thumb-*.c
These tests deliberately pass invalid option combinations to check
that the compiler is generating the correct diagnostic. Nevertheless,
we can improve their compatibility with other testsuite options. For
optional_thumb-1.c we use a soft-float ABI, while for
optional_thumb-3.c we use arm_arch_v7em as the target architecture,
then set the architecture manually.
gcc/testsuite:
* gcc.target/arm/optional_thumb-1.c: Force a soft-float ABI.
* gcc.target/arm/optional_thumb-3.c: Check for armv7e-m compatibility,
then set the architecture explicitly.
arm: testsuite: improve compatibility of gcc.target/arm/macro_defs*.c
Convert these tests to use dg-add-options for increased compatibility.
Since they also result in an empty translation unit, override the
default testsuite options.
gcc/testsuite:
* gcc.target/arm/macro_defs0.c: Use dg-effective-target and
dg-add-options.
* gcc.target/arm/macro_defs1.c: Likewise.
* gcc.target/arm/macro_defs2.c: Likewise.
arm: testsuite: improve compatibility of ftest-armv7m-thumb.c
This test is specific to armv7m cores which do not support hardware
floating-point. We can improve its compatibility by having the default
options for this core specify -mfloat-abi=soft.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Use soft-float ABI for armv7m.
* gcc.target/arm/ftest-armv7m-thumb.c: Use dg-require-effective-target
to check flag compatibility.
arm: testsuite: improve compatibility of pragma_arch_switch_2.c
This test was explicitly setting the architecture on the command-line and
in the body of the test. In both cases this causes problems with the auto
FPU setting. Fix by using the testsuite infrastructure correctly and by
adding +fp to the pragma.
gcc/testsuite:
* gcc.target/arm/pragma_arch_switch_2.c: Use testsuite infrastructure
to set the architecture flags. Add +fp to the pragma that changes the
architecture.
arm: testsuite: improve compatibility of pragma_arch_attribute*.c
These tests use pragmas and attributes to change the architecture.
Sometimes they simply add a feature using "+crc", but other times they
try to completely reset the architecture using "arch=armv8-a+crc".
The latter fails on a hard-float ABI with -mfpu=auto because it also
clears the FP capability. Fix by adding +simd when the full
architecture is specified.
gcc/testsuite:
* gcc.target/arm/pragma_arch_attribute.c: Add +simd to pragmas that
set an explicit architecture.
* gcc.target/arm/pragma_arch_attribute_2.c: Likewise.
* gcc.target/arm/pragma_arch_attribute_3.c: Likewise.
gcc.target/arm/g2.c is an xscale-only test, but the test is quite old
and we have improved the infrastructure for setting up such tests now.
So make use of that to reduce the number of cases where this test fails
to run.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Add entry to check for xscale.
* gcc.target/arm/g2.c: Use it.
arm: testsuite: avoid problems with -mfpu=auto in attr_thumb-static2.c
This test overrides the architecture, but fails to describe which
floating-point features are needed. This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that one is specified.
gcc/testsuite:
* gcc.target/arm/attr_thumb-static2.c: Add +fp to the -march
specification.
arm: testsuite: avoid problems with -mfpu=auto in attr-crypto.c
This test overrides the architecture, but fails to describe which
floating-point features are needed. This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that one is specified.
gcc/testsuite:
* gcc.target/arm/attr-crypto.c: Add +simd to the -march
specification.
arm: testsuite: avoid problems with -mfpu=auto in pacbti-m-predef-11.c
This test overrides the architecture, but fails to describe which
floating-point features are needed. This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that one is specified.
gcc/testsuite:
* gcc.target/arm/acle/pacbti-m-predef-11.c: Add +fp to the -march
specification.
arm: testsuite: avoid hard-float ABI incompatibility with -march
A number of tests in the gcc testsuite, especially for arm-specific
targets, add various flags to control the architecture. These run
into problems when the compiler is configured with -mfpu=auto if the
new architecture lacks an architectural feature that implies we have
floating-point instructions.
The testsuite makes this worse as it falls foul of this requirement in
the base architecture strings provided by target-supports.exp.
To fix this we add "+fp", or something equivalent to this, to all the
base architecture specifications. The feature will be ignored if the
float ABI is set to soft.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Add base FPU specifications to all architectures that can support
one.
arm: testsuite: correctly detect armv6t2 hardware for acle execution tests
Some of the ACLE tests for Arm are executable, but we were only testing
that the compiler could generate code for them, not that the hardware
was capable of executing them. Fix this by adding an execution test for
suitable hardware.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_v6t2_hw_ok):
New function.
* gcc.target/arm/acle/data-intrinsics-armv6.c: Use it.
* gcc.target/arm/acle/data-intrinsics-rbit.c: Likewise.
Richard Biener [Mon, 13 Nov 2023 08:24:08 +0000 (09:24 +0100)]
middle-end/112487 - inline and parameter mismatch
When passing an aggregate to an implicitly declared function that's
later declared as receiving a register type we can run into a
sanity assert that cannot hold for such gross mismatches. Instead
of asserting, avoid emitting a debug temp that's invalid.
PR middle-end/112487
* tree-inline.cc (setup_one_parameter): When the parameter
is unused only insert a debug bind when there's not a gross
mismatch in value and declared parameter type. Do not assert
there effectively isn't.
Juzhe-Zhong [Mon, 13 Nov 2023 11:06:36 +0000 (19:06 +0800)]
RISC-V: Optimize combine sequence by merge approach
gcc/ChangeLog:
* config/riscv/riscv-v.cc
(rvv_builder::combine_sequence_use_merge_profitable_p): New function.
(expand_vector_init_merge_combine_sequence): Ditto.
(expand_vec_init): Adapt for new optimization.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-13.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-14.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-9.c: New test.
liuhongt [Wed, 8 Nov 2023 06:52:01 +0000 (14:52 +0800)]
Support vec_set/vec_extract/vec_init for V4HF/V2HF.
gcc/ChangeLog:
* config/i386/i386-expand.cc
(ix86_expand_vector_init_duplicate): Handle V4HF/V4BF and
V2HF/V2BF.
(ix86_expand_vector_init_one_nonzero): Ditto.
(ix86_expand_vector_init_one_var): Ditto.
(ix86_expand_vector_init_general): Ditto.
(ix86_expand_vector_set_var): Ditto.
(ix86_expand_vector_set): Ditto.
(ix86_expand_vector_extract): Ditto.
* config/i386/mmx.md
(mmxdoublevecmode): Extend to V4HF/V4BF/V2HF/V2BF.
(*mmx_pinsrw): Extend to V4FI_64, add a new alternative (&x,
x, x), add a new define_split after the pattern.
(*mmx_pextrw<mode>): New define_insn.
(mmx_pshufw_1): Rename to ..
(mmx_pshufw<mode>_1): .. this, extend to V4FI_64.
(*mmx_pblendw64): Extend to V4FI_64.
(*vec_dup<mode>): New define_insn.
(vec_setv4hi): Rename to ..
(vec_set<mode>): .. this, and extend to V4FI_64
(vec_extractv4hihi): Rename to ..
(vec_extract<mode><mmxscalarmodelower>): .. this, and extend
to V4FI_64.
(vec_init<mode><mmxscalarmodelower>): New define_insn.
(*pinsrw): Extend to V2FI_32, add a new alternative (&x,
x, x), and add a new define_split after it.
(*pextrw<mode>): New define_insn.
(vec_setv2hi): Rename to ..
(vec_set<mode>): .. this, extend to V2FI_32.
(vec_extractv2hihi): Rename to ..
(vec_extract<mode><mmxscalarmodelower>): .. this, extend to
V2FI_32.
(*punpckwd): Extend to V2FI_32.
(*pshufw_1): Rename to ..
(*pshufw<mode>_1): .. this, extend to V2FI_32.
(vec_initv2hihi): Rename to ..
(vec_init<mode><mmxscalarmodelower>): .. this, and extend to
V2FI_32.
(*vec_dup<mode>): New define_insn.
* config/i386/sse.md (*vec_extract<mode>): Refine constraint
from v to Yw.
gcc/testsuite/ChangeLog:
* gcc.target/i386/part-vect-vec_elem-1.c: New test.
* gcc.target/i386/part-vect-vec_elem-2.c: New test.
Roger Sayle [Mon, 13 Nov 2023 09:16:59 +0000 (09:16 +0000)]
ARC: Improved DImode rotates and right shifts by one bit.
This patch improves the code generated for DImode right shifts (both
arithmetic and logical) by a single bit, and also for DImode rotates
(both left and right) by a single bit. In approach, this is similar
to the recently added DImode left shift by a single bit patch, but
also builds upon the x86's UNSPEC carry flag representation:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632169.html
The benefits can be seen from the four new test cases:
On CPUs without a barrel shifter the improvements are even better.
2023-11-13 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/arc/arc.md (UNSPEC_ARC_CC_NEZ): New UNSPEC that
represents the carry flag being set if the operand is non-zero.
(adc_f): New define_insn representing adc with updated flags.
(ashrdi3): New define_expand that only handles shifts by 1.
(ashrdi3_cnt1): New pre-reload define_insn_and_split.
(lshrdi3): New define_expand that only handles shifts by 1.
(lshrdi3_cnt1): New pre-reload define_insn_and_split.
(rrcsi2): New define_insn for rrc (SImode rotate right through carry).
(rrcsi2_carry): Likewise for rrc.f, as above but updating flags.
(rotldi3): New define_expand that only handles rotates by 1.
(rotldi3_cnt1): New pre-reload define_insn_and_split.
(rotrdi3): New define_expand that only handles rotates by 1.
(rotrdi3_cnt1): New pre-reload define_insn_and_split.
(lshrsi3_cnt1_carry): New define_insn for lsr.f.
(ashrsi3_cnt1_carry): New define_insn for asr.f.
(btst_0_carry): New define_insn for asr.f without result.
gcc/testsuite/ChangeLog
* gcc.target/arc/ashrdi3-1.c: New test case.
* gcc.target/arc/lshrdi3-1.c: Likewise.
* gcc.target/arc/rotldi3-1.c: Likewise.
* gcc.target/arc/rotrdi3-1.c: Likewise.
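The idea of carrying one bit between the two halves of a double-word value can be modelled in portable C. This is only a sketch of the arithmetic: the struct and function names are hypothetical, and the code says nothing about the instruction sequences the ARC back end actually emits.

```c
#include <stdint.h>

/* Hypothetical two-word value standing in for a DImode register pair.  */
struct u64_pair
{
  uint32_t lo;
  uint32_t hi;
};

/* Logical right shift by one: the bit leaving the high word plays the
   role of the carry flag and enters the top of the low word.  */
static struct u64_pair
lshr1 (struct u64_pair v)
{
  uint32_t carry = v.hi & 1u;
  struct u64_pair r = { (v.lo >> 1) | (carry << 31), v.hi >> 1 };
  return r;
}

/* Rotate right by one: additionally feed the bit leaving the low
   word back into the top of the high word.  */
static struct u64_pair
rotr1 (struct u64_pair v)
{
  uint32_t carry_hi = v.hi & 1u;
  uint32_t carry_lo = v.lo & 1u;
  struct u64_pair r = { (v.lo >> 1) | (carry_hi << 31),
                        (v.hi >> 1) | (carry_lo << 31) };
  return r;
}
```

For the pair representing 0x0000000300000001, lshr1 gives 0x0000000180000000 and rotr1 gives 0x8000000180000000.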
Roger Sayle [Mon, 13 Nov 2023 09:11:42 +0000 (09:11 +0000)]
ARC: Provide a TARGET_FOLD_BUILTIN target hook.
This patch implements an arc_fold_builtin target hook to allow ARC
builtins to be folded at the tree-level. Currently this function
converts __builtin_arc_swap into an LROTATE_EXPR at the tree-level,
and evaluates __builtin_arc_norm and __builtin_arc_normw of integer
constant arguments at compile-time. Because ARC_BUILTIN_SWAP is
now handled at the tree-level, UNSPEC_ARC_SWAP is no longer used,
allowing it and the "swap" define_insn to be removed.
An example benefit of folding things at compile-time is that
calling __builtin_arc_swap on the result of __builtin_arc_swap
now eliminates both and generates no code, and likewise calling
__builtin_arc_swap of a constant integer argument is evaluated
at compile-time.
2023-11-13 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/arc/arc.cc (TARGET_FOLD_BUILTIN): Define to
arc_fold_builtin.
(arc_fold_builtin): New function. Convert ARC_BUILTIN_SWAP
into a rotate. Evaluate ARC_BUILTIN_NORM and
ARC_BUILTIN_NORMW of constant arguments.
* config/arc/arc.md (UNSPEC_ARC_SWAP): Delete.
(normw): Make output template/assembler whitespace consistent.
(swap): Remove define_insn, only use of SWAP UNSPEC.
* config/arc/builtins.def: Tweak indentation.
(SWAP): Expand using rotlsi2_cnt16 instead of using swap.
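Why folding helps can be seen from a small model of the operation. The sketch below assumes, as the rotlsi2_cnt16 expansion suggests, that the swap is a 32-bit rotate left by 16. The name swap_halves is hypothetical and the function is plain C, not the ARC builtin itself.

```c
#include <stdint.h>

/* Model of a halfword swap expressed as a 32-bit rotate left by 16.  */
static uint32_t
swap_halves (uint32_t x)
{
  return (x << 16) | (x >> 16);
}
```

Two rotates by 16 add up to a rotate by 32, which is the identity. Once the operation is visible as a rotate, swap_halves (swap_halves (x)) can therefore be simplified to x.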
Roger Sayle [Mon, 13 Nov 2023 09:05:16 +0000 (09:05 +0000)]
i386: Improve reg pressure of double word right shift then truncate.
This patch improves register pressure during reload, inspired by PR 97756.
Normally, a double-word right-shift by a constant produces a double-word
result, the highpart of which is dead when followed by a truncation.
The dead code calculating the high part gets cleaned up post-reload, so
the issue isn't normally visible, except for the increased register
pressure during reload, sometimes leading to odd register assignments.
Providing a post-reload splitter, which clobbers a single wordmode
result register instead of a doubleword result register, helps (a bit).
An example demonstrating this effect is:
unsigned long foo (__uint128_t n)
{
unsigned long a = n & MASK60;
unsigned long b = (n >> 60);
b = b & MASK60;
unsigned long c = (n >> 120);
return a+b+c;
}
With this patch, we generate one fewer mov (12 instructions):
foo: movabsq $1152921504606846975, %rcx
xchgq %rdi, %rsi
movq %rdi, %rdx
movq %rsi, %rax
movq %rdi, %rsi
shrdq $60, %rdi, %rdx
andq %rcx, %rax
shrq $56, %rsi
addq %rsi, %rax
andq %rcx, %rdx
addq %rdx, %rax
ret
The significant difference is easier to see via diff:
< shrdq $60, %rdi, %rax
< movq %rax, %rdx
---
> shrdq $60, %rdi, %rdx
Admittedly a single "mov" isn't much of a saving on modern architectures,
but as demonstrated by the PR, people still track the number of them.
2023-11-13 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (<insn><dwi>3_doubleword_lowpart): New
define_insn_and_split to optimize register usage of doubleword
right shifts followed by truncation.
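For readers who want to compile the example from the message, a self-contained variant might look as follows. It assumes that MASK60 selects the low 60 bits, which matches the constant 1152921504606846975 (2^60 - 1) in the assembly. The function name sum_fields is hypothetical, and __uint128_t requires a 64-bit target.

```c
/* Assumption: MASK60 selects the low 60 bits (2^60 - 1).  */
#define MASK60 ((1ull << 60) - 1)

static unsigned long long
sum_fields (__uint128_t n)
{
  /* Low 60 bits.  */
  unsigned long long a = (unsigned long long) n & MASK60;
  /* Next 60 bits: a double-word shift whose high part is dead.  */
  unsigned long long b = (unsigned long long) (n >> 60) & MASK60;
  /* Top 8 bits.  */
  unsigned long long c = (unsigned long long) (n >> 120);
  return a + b + c;
}
```

Each of the two shifts is followed by a truncation to a single word, which is the pattern the new splitter targets.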
Jakub Jelinek [Mon, 13 Nov 2023 08:49:09 +0000 (09:49 +0100)]
i386: Remove j constraint letter from list of unused letters
I've noticed the list of unused letters still lists j, even when that
constraint letter is now the first letter of jr, jR, jm, j<, j>, jo, jV, jp,
ja, jb and jc constraints.
2023-11-13 Jakub Jelinek <jakub@redhat.com>
* config/i386/constraints.md: Remove j constraint letter from list of
unused letters.
Florian Weimer [Mon, 13 Nov 2023 07:54:11 +0000 (08:54 +0100)]
C99 testsuite readiness: Cleanup of execute tests
This change updates the gcc.c-torture/execute/ tests to avoid obsolete
language constructs. In the changed tests, use of the features
appears to be accidental, and updating allows the tests to run with
the default compiler flags.
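A typical cleanup of this kind can be sketched as follows. The function names are hypothetical and the code is not taken from any of the listed tests. It only shows the two recurring fixes: explicit return and parameter types, and a builtin call in place of an undeclared library function.

```c
/* Pre-C99 style relied on implicit int and on an implicit declaration
   of strcmp (shown here only as a comment):

     f () { ... }
     main () { ... strcmp (buf, "abc") ... }

   The same logic with explicit types and a builtin call, so that no
   header and no implicit declaration is needed.  */
static void
fill (char *buf)
{
  buf[0] = 'a';
  buf[1] = 'b';
  buf[2] = 'c';
  buf[3] = '\0';
}

static int
buffer_matches (void)
{
  char buf[4];
  fill (buf);
  return __builtin_strcmp (buf, "abc") == 0;
}
```

Using __builtin_strcmp keeps the test free of #include lines while still giving the compiler a correctly typed declaration.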
gcc/testsuite/
* gcc.c-torture/execute/20000112-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/20000113-1.c (foobar): Add missing
void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/20000314-2.c (main): Likewise.
* gcc.c-torture/execute/20000402-1.c (main): Likewise.
* gcc.c-torture/execute/20000403-1.c (main): Likewise.
* gcc.c-torture/execute/20000503-1.c (main): Likewise.
* gcc.c-torture/execute/20000605-2.c (main): Likewise.
* gcc.c-torture/execute/20000717-1.c (main): Likewise.
* gcc.c-torture/execute/20000717-5.c (main): Likewise.
* gcc.c-torture/execute/20000726-1.c (main): Likewise.
* gcc.c-torture/execute/20000914-1.c (blah): Add missing
void types.
(main): Add missing int and void types.
* gcc.c-torture/execute/20001009-1.c (main): Likewise.
* gcc.c-torture/execute/20001013-1.c (main): Likewise.
* gcc.c-torture/execute/20001031-1.c (main): Likewise.
* gcc.c-torture/execute/20010221-1.c (main): Likewise.
* gcc.c-torture/execute/20010723-1.c (main): Likewise.
* gcc.c-torture/execute/20010915-1.c (s): Call
__builtin_strcmp instead of strcmp.
* gcc.c-torture/execute/20010924-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/20011128-1.c (main): Likewise.
* gcc.c-torture/execute/20020226-1.c (main): Likewise.
* gcc.c-torture/execute/20020328-1.c (foo): Add missing
void types.
* gcc.c-torture/execute/20020406-1.c (DUPFFexgcd): Call
__builtin_printf instead of printf.
(main): Likewise.
* gcc.c-torture/execute/20020508-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/20020508-2.c (main): Likewise.
* gcc.c-torture/execute/20020508-3.c (main): Likewise.
* gcc.c-torture/execute/20020611-1.c (main): Likewise.
* gcc.c-torture/execute/20021010-2.c (main): Likewise.
* gcc.c-torture/execute/20021113-1.c (foo): Add missing
void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/20021120-3.c (foo): Call
__builtin_sprintf instead of sprintf.
* gcc.c-torture/execute/20030125-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/20030216-1.c (main): Likewise.
* gcc.c-torture/execute/20030404-1.c (main): Likewise.
* gcc.c-torture/execute/20030606-1.c (main): Likewise.
Call __builtin_memset instead of memset.
* gcc.c-torture/execute/20030828-1.c (main): Add missing int
and void types.
* gcc.c-torture/execute/20030828-2.c (main): Likewise.
* gcc.c-torture/execute/20031012-1.c: Call __builtin_strlen
instead of strlen.
* gcc.c-torture/execute/20031211-1.c (main): Add missing int
and void types.
* gcc.c-torture/execute/20040319-1.c (main): Likewise.
* gcc.c-torture/execute/20040411-1.c (sub1): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/execute/20040423-1.c (sub1): Likewise.
* gcc.c-torture/execute/20040917-1.c (main): Add missing int
and void types.
* gcc.c-torture/execute/20050131-1.c (main): Likewise.
* gcc.c-torture/execute/20051113-1.c (main): Likewise.
* gcc.c-torture/execute/20121108-1.c (main): Call
__builtin_printf instead of printf.
* gcc.c-torture/execute/20170401-2.c (main): Add missing int
and void types.
* gcc.c-torture/execute/900409-1.c (main): Likewise.
* gcc.c-torture/execute/920202-1.c (f): Add int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920302-1.c (execute): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920410-1.c (main): Likewise.
* gcc.c-torture/execute/920501-2.c (main): Likewise.
* gcc.c-torture/execute/920501-3.c (execute): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920501-5.c (x): Add int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/920501-6.c (main): Add int return
type.
* gcc.c-torture/execute/920501-8.c (main): Add missing
int and void types. Call __builtin_strcmp instead of strcmp.
* gcc.c-torture/execute/920506-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/920612-2.c (main): Likewise.
* gcc.c-torture/execute/920618-1.c (main): Likewise.
* gcc.c-torture/execute/920625-1.c (main): Likewise.
* gcc.c-torture/execute/920710-1.c (main): Likewise.
* gcc.c-torture/execute/920721-1.c (main): Likewise.
* gcc.c-torture/execute/920721-4.c (main): Likewise.
* gcc.c-torture/execute/920726-1.c (first, second): Call
__builtin_strlen instead of strlen.
(main): Add missing int and void types. Call __builtin_strcmp
instead of strcmp.
* gcc.c-torture/execute/920810-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/920829-1.c (main): Likewise.
* gcc.c-torture/execute/920908-1.c (main): Likewise.
* gcc.c-torture/execute/920922-1.c (main): Likewise.
* gcc.c-torture/execute/920929-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/921006-1.c (main): Likewise. Call
__builtin_strcmp instead of strcmp.
* gcc.c-torture/execute/921007-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/921016-1.c (main): Likewise.
* gcc.c-torture/execute/921019-1.c (main): Likewise.
* gcc.c-torture/execute/921019-2.c (main): Likewise.
* gcc.c-torture/execute/921029-1.c (main): Likewise.
* gcc.c-torture/execute/921104-1.c (main): Likewise.
* gcc.c-torture/execute/921112-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/921113-1.c (w, f1, f2, gitter): Add
void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/921117-1.c (check): Call
__builtin_strcmp instead of strcmp.
(main): Add missing int and void types. Call __builtin_strcpy
instead of strcpy.
* gcc.c-torture/execute/921123-2.c (main): Add missing
int and void types.
* gcc.c-torture/execute/921202-2.c (main): Likewise.
* gcc.c-torture/execute/921204-1.c (main): Likewise.
* gcc.c-torture/execute/921208-1.c (main): Likewise.
* gcc.c-torture/execute/930123-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/930126-1.c (main): Likewise.
* gcc.c-torture/execute/930406-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/930408-1.c (p, f): Add missing void
types.
(main): Add missing int and void types.
* gcc.c-torture/execute/930429-1.c (main): Likewise.
* gcc.c-torture/execute/930603-2.c (f): Add missing void
types.
(main): Add missing int and void types.
* gcc.c-torture/execute/930608-1.c (main): Likewise.
* gcc.c-torture/execute/930614-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/930614-2.c (main): Likewise.
* gcc.c-torture/execute/930622-2.c (main): Likewise.
* gcc.c-torture/execute/930628-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/930725-1.c (main): Likewise. Call
__builtin_strcmp instead of strcmp.
* gcc.c-torture/execute/930930-2.c (main): Add missing
int and void types.
* gcc.c-torture/execute/931002-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-10.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-11.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-12.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-13.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-14.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-2.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-3.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-4.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-5.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-6.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-7.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-8.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931004-9.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/931005-1.c (main): Likewise.
* gcc.c-torture/execute/931110-1.c (main): Likewise.
* gcc.c-torture/execute/931110-2.c (main): Likewise.
* gcc.c-torture/execute/941014-1.c (main): Likewise.
* gcc.c-torture/execute/941014-2.c (main): Likewise.
* gcc.c-torture/execute/941015-1.c (main): Likewise.
* gcc.c-torture/execute/941021-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/941025-1.c (main): Likewise.
* gcc.c-torture/execute/941031-1.c (main): Likewise.
* gcc.c-torture/execute/950221-1.c (g2): Add int return type.
(f): Add missing void types. Call __builtin_strcpy instead
of strcpy.
(main): Add missing int and void types.
* gcc.c-torture/execute/950426-2.c (main): Likewise.
* gcc.c-torture/execute/950503-1.c (main): Likewise.
* gcc.c-torture/execute/950511-1.c (main): Likewise.
* gcc.c-torture/execute/950607-1.c (main): Likewise.
* gcc.c-torture/execute/950607-2.c (main): Likewise.
* gcc.c-torture/execute/950612-1.c (main): Likewise.
* gcc.c-torture/execute/950628-1.c (main): Likewise.
* gcc.c-torture/execute/950704-1.c (main): Likewise.
* gcc.c-torture/execute/950706-1.c (main): Likewise.
* gcc.c-torture/execute/950710-1.c (main): Likewise.
* gcc.c-torture/execute/950714-1.c (main): Likewise.
* gcc.c-torture/execute/950809-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/950906-1.c (g, f): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/950915-1.c (main): Likewise.
* gcc.c-torture/execute/950929-1.c (main): Likewise.
* gcc.c-torture/execute/951003-1.c (f): Add missing int
parameter type.
(main): Add missing int and void types.
* gcc.c-torture/execute/951115-1.c (g, f): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/951204-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/960116-1.c (p): Add int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/960117-1.c (main): Likewise.
* gcc.c-torture/execute/960209-1.c (main): Likewise.
* gcc.c-torture/execute/960215-1.c (main): Likewise.
* gcc.c-torture/execute/960219-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/960301-1.c (main): Likewise.
* gcc.c-torture/execute/960302-1.c (foo, main): Add missing
int and void types.
* gcc.c-torture/execute/960311-1.c (main): Likewise.
* gcc.c-torture/execute/960311-2.c (main): Likewise.
* gcc.c-torture/execute/960311-3.c (main): Likewise.
* gcc.c-torture/execute/960312-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/960317-1.c (main): Likewise.
* gcc.c-torture/execute/960321-1.c (main): Likewise.
* gcc.c-torture/execute/960326-1.c (main): Likewise.
* gcc.c-torture/execute/960327-1.c (g, main): Add missing
int and void types.
(f): Add missing void types.
* gcc.c-torture/execute/960405-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/960416-1.c (main): Likewise.
* gcc.c-torture/execute/960419-1.c (main): Likewise.
* gcc.c-torture/execute/960419-2.c (main): Likewise.
* gcc.c-torture/execute/960512-1.c (main): Likewise.
* gcc.c-torture/execute/960513-1.c (main): Likewise.
* gcc.c-torture/execute/960521-1.c (f): Add missing void
types.
(main): Add missing int and void types.
* gcc.c-torture/execute/960608-1.c (f): Add int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/960801-1.c (main): Likewise.
* gcc.c-torture/execute/960802-1.c (main): Likewise.
* gcc.c-torture/execute/960909-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/961004-1.c (main): Likewise.
* gcc.c-torture/execute/961017-1.c (main): Likewise.
* gcc.c-torture/execute/961017-2.c (main): Likewise.
* gcc.c-torture/execute/961026-1.c (main): Likewise.
* gcc.c-torture/execute/961122-1.c (addhi, subhi): Add void
return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/961122-2.c (main): Likewise.
* gcc.c-torture/execute/961125-1.c (main): Likewise.
* gcc.c-torture/execute/961206-1.c (main): Likewise.
* gcc.c-torture/execute/961213-1.c (main): Likewise.
* gcc.c-torture/execute/970214-1.c (main): Likewise.
* gcc.c-torture/execute/970214-2.c (main): Likewise.
* gcc.c-torture/execute/970217-1.c (sub): Add int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/970923-1.c (main): Likewise.
* gcc.c-torture/execute/980223.c (main): Likewise.
* gcc.c-torture/execute/980506-1.c (main): Likewise.
* gcc.c-torture/execute/980506-2.c (main): Likewise.
* gcc.c-torture/execute/980506-3.c (build_lookup): Call
__builtin_strlen instead of strlen and __builtin_memset
instead of memset.
* gcc.c-torture/execute/980526-3.c (main): Likewise.
* gcc.c-torture/execute/980602-1.c (main): Likewise.
* gcc.c-torture/execute/980604-1.c (main): Likewise.
* gcc.c-torture/execute/980605-1.c (dummy): Add missing int
parameter type.
(main): Add missing int and void types.
* gcc.c-torture/execute/980701-1.c (ns_name_skip): Add missing
int return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/980709-1.c (main): Likewise.
* gcc.c-torture/execute/990117-1.c (main): Likewise.
* gcc.c-torture/execute/990127-1.c (main): Likewise.
* gcc.c-torture/execute/990128-1.c (main): Likewise.
* gcc.c-torture/execute/990130-1.c (main): Likewise.
* gcc.c-torture/execute/990324-1.c (main): Likewise.
* gcc.c-torture/execute/990524-1.c (main): Likewise.
* gcc.c-torture/execute/990531-1.c (main): Likewise.
* gcc.c-torture/execute/990628-1.c (fetch, load_data): Call
__builtin_memset instead of memset.
(main): Add missing int and void types.
* gcc.c-torture/execute/991019-1.c (main): Likewise.
* gcc.c-torture/execute/991023-1.c (foo, main): Likewise.
* gcc.c-torture/execute/991112-1.c (isprint): Declare.
* gcc.c-torture/execute/991118-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/alias-1.c (ptr2): Add cast to float *
in initializer.
(typepun): Add missing void types.
(main): Add missing int and void types.
* gcc.c-torture/execute/alias-2.c (main): Likewise.
* gcc.c-torture/execute/alias-3.c (inc): Add missing
void types.
* gcc.c-torture/execute/alias-4.c (main): Add missing int
return type.
* gcc.c-torture/execute/arith-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/arith-rand-ll.c (main): Likewise.
* gcc.c-torture/execute/arith-rand.c (main): Likewise.
* gcc.c-torture/execute/bf-layout-1.c (main): Likewise.
* gcc.c-torture/execute/bf-pack-1.c (foo): Add missing
void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/bf-sign-1.c (main): Likewise.
* gcc.c-torture/execute/bf-sign-2.c (main): Likewise.
* gcc.c-torture/execute/bf64-1.c (main): Likewise.
* gcc.c-torture/execute/builtin-prefetch-2.c (stat_int_arr):
Add missing int array element type.
* gcc.c-torture/execute/builtin-prefetch-3.c (stat_int_arr):
Likewise.
* gcc.c-torture/execute/cbrt.c (main): Add missing int and
void types.
* gcc.c-torture/execute/complex-1.c (main): Likewise.
* gcc.c-torture/execute/complex-2.c (main): Likewise.
* gcc.c-torture/execute/complex-3.c (main): Likewise.
* gcc.c-torture/execute/complex-4.c (main): Likewise.
* gcc.c-torture/execute/complex-5.c (main): Likewise.
* gcc.c-torture/execute/compndlit-1.c (main): Likewise.
* gcc.c-torture/execute/conversion.c (test_integer_to_float)
(test_longlong_integer_to_float, test_float_to_integer)
(test_float_to_longlong_integer): Add missing void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/cvt-1.c (main): Likewise.
* gcc.c-torture/execute/divconst-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/divconst-2.c (main): Likewise.
* gcc.c-torture/execute/divconst-3.c (main): Likewise.
* gcc.c-torture/execute/enum-1.c (main): Likewise.
* gcc.c-torture/execute/func-ptr-1.c (main): Likewise.
* gcc.c-torture/execute/ieee/20011123-1.c (main): Likewise.
* gcc.c-torture/execute/ieee/920518-1.c (main): Likewise.
* gcc.c-torture/execute/ieee/920810-1.c (main): Likewise.
Call __builtin_strcmp instead of strcmp.
* gcc.c-torture/execute/ieee/930529-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/ieee/fp-cmp-1.c (main): Likewise.
* gcc.c-torture/execute/ieee/fp-cmp-2.c (main): Likewise.
* gcc.c-torture/execute/ieee/fp-cmp-3.c (main): Likewise.
* gcc.c-torture/execute/ieee/fp-cmp-6.c (main): Likewise.
* gcc.c-torture/execute/ieee/fp-cmp-9.c (main): Likewise.
* gcc.c-torture/execute/ieee/minuszero.c (main): Likewise.
* gcc.c-torture/execute/ieee/mzero2.c (expect): Call
__builtin_memcmp instead of memcmp.
(main): Add missing int and void types.
* gcc.c-torture/execute/ieee/mzero3.c (main): Likewise.
(expectd, expectf): Call __builtin_memcmp instead of memcmp.
* gcc.c-torture/execute/ieee/mzero5.c (negzero_check):
Likewise.
* gcc.c-torture/execute/ieee/rbug.c (main): Add missing
int and void types.
* gcc.c-torture/execute/index-1.c (main): Likewise.
* gcc.c-torture/execute/loop-1.c (main): Likewise.
* gcc.c-torture/execute/loop-2b.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/loop-6.c (main): Likewise.
* gcc.c-torture/execute/loop-7.c (main): Likewise.
* gcc.c-torture/execute/lto-tbaa-1.c (use_a, set_b, use_c):
Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/memcpy-1.c (main): Likewise.
* gcc.c-torture/execute/memcpy-2.c (main): Likewise.
* gcc.c-torture/execute/memcpy-bi.c (main): Likewise.
* gcc.c-torture/execute/memset-1.c (main): Likewise.
* gcc.c-torture/execute/memset-2.c: Include <string.h>.
* gcc.c-torture/execute/memset-3.c: Likewise.
* gcc.c-torture/execute/nest-stdar-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/nestfunc-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/packed-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/pr15262-1.c (main): Likewise. Call
__builtin_malloc instead of malloc.
* gcc.c-torture/execute/pr15262-2.c (foo): Add int return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/pr15262.c (main): Likewise.
* gcc.c-torture/execute/pr17252.c (main): Likewise.
* gcc.c-torture/execute/pr21331.c (main): Likewise.
* gcc.c-torture/execute/pr34176.c (foo): Add missing int
type to definition of foo.
* gcc.c-torture/execute/pr42231.c (max): Add missing int type
to definition.
* gcc.c-torture/execute/pr42614.c (expect_func): Call
__builtin_abs instead of abs.
* gcc.c-torture/execute/pr54937.c (t): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/ptr-arith-1.c (main): Likewise.
* gcc.c-torture/execute/regstack-1.c (main): Likewise.
* gcc.c-torture/execute/scope-1.c (f): Add missing void types.
(main): Add missing int and void types.
* gcc.c-torture/execute/simd-5.c (main): Call __builtin_memcmp
instead of memcmp.
* gcc.c-torture/execute/strcmp-1.c (main): Add missing
int and void types.
* gcc.c-torture/execute/strcpy-1.c (main): Likewise.
* gcc.c-torture/execute/strct-pack-1.c (main): Likewise.
* gcc.c-torture/execute/strct-pack-2.c (main): Likewise.
* gcc.c-torture/execute/strct-pack-4.c (main): Likewise.
* gcc.c-torture/execute/strct-stdarg-1.c (f): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/strct-varg-1.c (f): Add void return
type.
(main): Add missing int and void types.
* gcc.c-torture/execute/strlen-1.c (main): Likewise.
* gcc.c-torture/execute/strncmp-1.c (main): Likewise.
* gcc.c-torture/execute/struct-ini-1.c (main): Likewise.
* gcc.c-torture/execute/struct-ini-2.c (main): Likewise.
* gcc.c-torture/execute/struct-ini-3.c (main): Likewise.
* gcc.c-torture/execute/struct-ini-4.c (main): Likewise.
* gcc.c-torture/execute/struct-ret-1.c (main): Likewise.
* gcc.c-torture/execute/struct-ret-2.c (main): Likewise.
* gcc.c-torture/execute/va-arg-1.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/va-arg-10.c (main): Likewise.
* gcc.c-torture/execute/va-arg-2.c (main): Likewise.
* gcc.c-torture/execute/va-arg-4.c (main): Likewise.
* gcc.c-torture/execute/va-arg-5.c (va_double)
(va_long_double): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/va-arg-6.c (f): Add void return type.
(main): Add missing int and void types.
* gcc.c-torture/execute/va-arg-9.c (main): Likewise.
* gcc.c-torture/execute/widechar-1.c (main): Likewise.
The execute tests use abort/exit to report failure/success, but
they generally do not declare these functions (or include <stdlib.h>).
This change adds declarations as appropriate.
It would have been possible to switch to __builtin_abort and
__builtin_exit instead. Existing practice varies. Adding the
declarations makes it easier to write the GNU-style commit message
because it is not necessary to mention the function with the call
site.
Instead of this change, it would be possible to create a special
header file with the declarations that is included during the
test file compilation using -include, but that would mean that
many tests would no longer build standalone.
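As a rough illustration of the declaration approach described above (a sketch, not a copy of any particular test; the function f and the name run_test are made up here, and in the tests themselves the driver role is played by main):

```c
/* Sketch only: explicit declarations stand in for the implicit ones
   that the tests used to rely on.  No header is included, so the file
   still builds standalone.  */
void abort (void);
void exit (int);

/* Hypothetical function under test, with its return type spelled out.  */
int
f (int x)
{
  return x + 1;
}

/* Hypothetical driver; a real execute test would call abort on
   failure in the same way from main.  */
int
run_test (void)
{
  if (f (1) != 2)
    abort ();
  return 0;
}
```

With the declarations in place, the calls to abort and exit need no further changes at the call sites.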
Florian Weimer [Mon, 13 Nov 2023 07:54:10 +0000 (08:54 +0100)]
C99 testsuite readiness: -fpermissive tests
These tests use obsolete language constructs, but they are not
clearly targeting C89, either. So use -fpermissive to keep
future errors as warnings.
The reasons why obsolete constructs are used vary from
test to test. Some tests deliberately exercise later stages
of the compiler that only occur with those constructs. Some
tests have precise expectations about warnings that will become
errors with a future change, but do not specifically test a
particular warning/error (if that is the case, the later changes
tend to duplicate them into warning/error variants). In a few
cases, use of obsolete constructs is clearly due to test case
reduction, but it was not possible to un-reduce the test due
to its size.
Jakub Jelinek [Mon, 13 Nov 2023 07:47:41 +0000 (08:47 +0100)]
gimple-range-cache: Fix ICEs when dumping details [PR111967]
The following testcase ICEs when dumping details.
When m_ssa_ranges vector is created, it is safe_grow_cleared (num_ssa_names),
but when some new SSA_NAME is added, we strangely grow it to
num_ssa_names + 1 instead and later on the 3 argument dump method
iterates from 1 to m_ssa_ranges.length () - 1 and uses ssa_name (x)
on each; but because set_bb_range grew it one too much, ssa_name
(m_ssa_ranges.length () - 1) might be after the end of the ssanames
vector and ICE.
The fix grows the vector consistently only to num_ssa_names and
doesn't waste time checking m_ssa_ranges[0], because there is no
ssa_name (0); it is always NULL. Before using ssa_name (x), it checks
whether we'll need it at all (we check later if m_ssa_ranges[x] is non-NULL,
so we might as well check it earlier). In the last loop it also
iterates until m_ssa_ranges.length () rather than num_ssa_names. I don't
see a reason for the inconsistency, and in theory some SSA_NAME could be
added without set_bb_range being called for it, so the vector could be
shorter than the ssanames vector.
To actually fix the ICE, either the first hunk or the last 2 hunks
would be enough, but I think it doesn't hurt to change all the spots.
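The indexing rule can be modelled with a toy structure. This is only a sketch under simplifying assumptions: struct toy_cache, toy_grow, toy_dump_count and toy_dump_demo are invented names, fixed-size arrays stand in for the ssanames vector and m_ssa_ranges, and none of it is the GCC code.

```c
#include <stddef.h>

#define MAX_NAMES 16

/* Toy stand-ins, not GCC's data structures.  */
struct toy_cache
{
  const char *names[MAX_NAMES];  /* models the ssanames vector; entry 0 is NULL */
  size_t num_names;              /* number of entries in names, including entry 0 */
  const int *ranges[MAX_NAMES];  /* models m_ssa_ranges */
  size_t ranges_len;             /* models m_ssa_ranges.length () */
};

/* Grow the range table to num_names entries and no further, so every
   valid range index is also a valid name index.  */
void
toy_grow (struct toy_cache *c)
{
  while (c->ranges_len < c->num_names)
    c->ranges[c->ranges_len++] = NULL;
}

/* Count the entries a dump would print.  Index 0 is skipped because no
   name has that number, and the name is looked up only once a range is
   known to be recorded for it.  */
size_t
toy_dump_count (const struct toy_cache *c)
{
  size_t printed = 0;
  for (size_t x = 1; x < c->ranges_len; x++)
    {
      if (!c->ranges[x])
        continue;
      /* In bounds because ranges_len never exceeds num_names.  */
      if (c->names[x])
        printed++;
    }
  return printed;
}

/* Hypothetical usage: three names, one of which has a recorded range.  */
size_t
toy_dump_demo (void)
{
  static const int some_range = 0;
  struct toy_cache c = { 0 };
  c.num_names = 4;
  c.names[1] = "a_1";
  c.names[2] = "b_2";
  c.names[3] = "c_3";
  toy_grow (&c);
  c.ranges[2] = &some_range;
  return toy_dump_count (&c);
}
```

Growing to num_names + 1 instead would leave one range slot with no matching name, which is the situation the dump loop tripped over.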
2023-11-13 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/111967
* gimple-range-cache.cc (block_range_cache::set_bb_range): Grow
m_ssa_ranges to num_ssa_names rather than num_ssa_names + 1.
(block_range_cache::dump): Iterate from 1 rather than 0. Don't use
ssa_name (x) unless m_ssa_ranges[x] is non-NULL. Iterate to
m_ssa_ranges.length () rather than num_ssa_names.
Xi Ruoyao [Mon, 30 Oct 2023 12:24:58 +0000 (20:24 +0800)]
LoongArch: Optimize single-used address with -mexplicit-relocs=auto for fld/fst
fld and fst have the same address modes as ld.w and st.w, so the same
optimization as r14-4851 should be applied for them too.
gcc/ChangeLog:
* config/loongarch/loongarch.md (LD_AT_LEAST_32_BIT): New mode
iterator.
(ST_ANY): New mode iterator.
(define_peephole2): Use LD_AT_LEAST_32_BIT instead of GPR and
ST_ANY instead of QHWD for applicable patterns.
Pan Li [Mon, 13 Nov 2023 03:06:38 +0000 (11:06 +0800)]
RISC-V: Fix RVV dynamic frm tests failure
The enhancement of mode-switching performs some optimization when
emitting the frm backup insn, so some redundant fsrm insns are removed
for the following test cases.
This patch adjusts the asm checks to match the above optimization.
Pan Li [Sun, 12 Nov 2023 12:16:03 +0000 (20:16 +0800)]
RISC-V: Support FP l/ll round and rint HF mode autovec
This patch supports auto-vectorization of the FP APIs below,
with different type sizes:
+------------+-----------+----------+
| API | RV64 | RV32 |
+------------+-----------+----------+
| lrintf16 | HF => DI | HF => SI |
| llrintf16 | HF => DI | HF => DI |
| lroundf16 | HF => DI | HF => SI |
| llroundf16 | HF => DI | HF => DI |
+------------+-----------+----------+
Given the code below:
void
test_lrintf16 (long *out, _Float16 *in, int count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrintf16 (in[i]);
}
Before this patch:
.L3:
lhu a5,0(s0)
addi s0,s0,2
addi s1,s1,8
fmv.s.x fa0,a5
call lrintf16
sd a0,-8(s1)
bne s0,s2,.L3
After this patch:
.L3:
vsetvli a5,a2,e16,mf4,ta,ma
vle16.v v1,0(a1)
vfwcvt.f.f.v v2,v1
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.x.f.v v1,v2
vse64.v v1,0(a0)
slli a4,a5,1
add a1,a1,a4
slli a4,a5,3
add a0,a0,a4
sub a2,a2,a5
bne a2,zero,.L3
gcc/ChangeLog:
* config/riscv/autovec.md: Add bridge mode to lrint and lround
pattern.
* config/riscv/riscv-protos.h (expand_vec_lrint): Add new arg
bridge machine mode.
(expand_vec_lround): Ditto.
* config/riscv/riscv-v.cc (emit_vec_widden_cvt_f_f): New helper
func impl to emit vfwcvt.f.f.
(emit_vec_rounding_to_integer): Handle the HF to DI rounding
with the bridge mode.
(expand_vec_lrint): Reorder the args.
(expand_vec_lround): Ditto.
(expand_vec_lceil): Ditto.
(expand_vec_lfloor): Ditto.
* config/riscv/vector-iterators.md: Add vector HFmode and bridge
mode for converting to DI.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-llrintf16-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llroundf16-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrintf16-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lrintf16-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lroundf16-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llrintf16-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llroundf16-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrintf16-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lrintf16-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lroundf16-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lroundf16-rv64-0.c: New test.
Eric Botcazou [Sat, 11 Nov 2023 18:43:07 +0000 (19:43 +0100)]
Handle addresses of more constants in IPA-CP
IPA-CP can handle addresses of scalar constants (CONST_DECL) so this extends
that to addresses of constants in the pool (DECL_IN_CONSTANT_POOL). Again
this is helpful for so-called fat pointers in Ada, i.e. objects that are
semantically pointers but represented by structures made up of two pointers.
This also moves the unused function print_ipcp_constant_value from ipa-cp.cc
to ipa-prop.cc and renames it.
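For readers unfamiliar with fat pointers, a C analogy may help. This is only a sketch: the struct and function names are made up, and a static const object in C is an analogy for, not an instance of, a compiler-generated constant-pool entry.

```c
/* Hypothetical C rendering of a fat pointer: one pointer to the array
   data and one pointer to its bounds.  */
struct toy_bounds
{
  int first;
  int last;
};

struct toy_fat_pointer
{
  const int *data;
  const struct toy_bounds *bounds;
};

/* Constant bounds object; every caller below passes its address.  */
static const struct toy_bounds toy_b = { 1, 3 };
static const int toy_elems[] = { 10, 20, 30 };

/* Sum the elements designated by the fat pointer.  */
int
toy_sum (struct toy_fat_pointer p)
{
  int s = 0;
  for (int i = p.bounds->first; i <= p.bounds->last; i++)
    s += p.data[i - p.bounds->first];
  return s;
}

/* Hypothetical caller that builds the fat pointer from constants.  */
int
toy_sum_demo (void)
{
  struct toy_fat_pointer p = { toy_elems, &toy_b };
  return toy_sum (p);
}
```

When the bounds half of such a pair is the address of a constant at every call site, it is the kind of value that constant propagation across calls can hope to track.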
gcc/
* ipa-cp.cc (print_ipcp_constant_value): Move to...
(values_equal_for_ipcp_p): Deal with VAR_DECLs from the
constant pool.
* ipa-prop.cc (ipa_print_constant_value): ...here. Likewise.
(ipa_print_node_jump_functions_for_edge): Call the function
ipa_print_constant_value to print IPA_JF_CONST elements.
Jin Ma [Sat, 11 Nov 2023 20:11:45 +0000 (13:11 -0700)]
[PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.
CLOBBER and USE do not represent real instructions, but during
pipeline scheduling they wait in the ready list to be issued
like other insns, without regard to resource conflicts and
cycles. As a result, on a multi-issue CPU they can be issued
at any time when other regular insns have resource conflicts
or cannot be issued for other reasons. Their position is
therefore advanced in the generated insn sequence, which
affects register allocation and often leads to more redundant
mov instructions.
A simple example:
https://github.com/majin2020/gcc-test/blob/master/test.c
This is a function in the dhrystone benchmark.
https://github.com/majin2020/gcc-test/blob/0b08c1a13de9663d7d9aba7539b960ec0607ca24/test.c.299r.sched1
This is a log of the 'sched1' pass with -mtune=rocket but issue_rate == 2.
In this log, insns 13 and 14 are scheduled much too early, which risks
generating redundant mov instructions and seems unreasonable.
Therefore, I am submitting the patch again, revised according to the last
review comments, to try to solve this problem.
https://github.com/majin2020/gcc-test/commit/efcb43e3369e771bde702955048bfe3f501263dd#diff-805031b1be5092a2322852a248d0b0f92eef7cad5784a8209f4dfc6221407457L189
This is the diff of the sched1 log after the patch is applied.
The new pipeline is:
;; | insn | prio |
;; | 17 | 3 | r142=a0 alu
...
;; | 10 | 0 | [r144]=r141 alu
;; | 13 | 0 | clobber a0 nothing
;; | 14 | 0 | clobber r136 nothing
;; | 12 | 0 | a0=r136 alu
;; | 15 | 0 | use a0 nothing
gcc/ChangeLog:
* haifa-sched.cc (use_or_clobber_starts_range_p): New.
(prune_ready_list): USE or CLOBBER should delay execution
if it starts a new live range.
Jakub Jelinek [Sat, 11 Nov 2023 19:15:53 +0000 (20:15 +0100)]
tree-ssa-math-opts: Fix up gsi_remove order in match_uaddc_usubc [PR112430]
The following testcase ICEs, because the temp_stmts were removed in
the wrong order, from the ones appearing earlier in the IL to the later ones,
so insert_debug_temps_for_defs can reintroduce dead SSA_NAMEs back into the
IL.
The following patch fixes that by removing them in the order they were
pushed into the vector, which is from later ones to earlier ones.
Additionally, I've noticed I forgot to call release_defs on the removed
stmts.
2023-11-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/112430
* tree-ssa-math-opts.cc (match_uaddc_usubc): Remove temp_stmts in the
order they were pushed rather than in reverse order. Call
release_defs after gsi_remove.
This patch adds a way for targets to ask that selected mode changes
be brought forward, through a combination of:
(1) requiring a mode in blocks where the entity was previously
transparent
(2) pushing the transition at the head of a block onto incoming edges
SME has two uses for this:
- A "one-shot" entity that, for any given path of execution,
either stays off or makes exactly one transition from off to on.
This relies only on (1) above; see the hook description for more info.
The main purpose of using mode-switching for this entity is to
shrink-wrap the code that requires it.
- A second entity for which all transitions must be from known
modes, which is enforced using a combination of (1) and (2).
More specifically, (1) looks for edges B1->B2 for which:
- B2 requires a specific mode and
- B1 does not guarantee a specific starting mode
In this system, such an edge is only possible if the entity is
transparent in B1. (1) then forces B1 to require some safe common
mode. Applying this inductively means that all incoming edges are
from known modes. If different edges give different starting modes,
(2) pushes the transitions onto the edges themselves; this only
happens if the entity is not transparent in some predecessor block.
The patch also uses the back-propagation as an excuse to do a simple
on-the-fly optimisation.
Hopefully the comments in the patch explain things a bit better.
gcc/
* target.def (mode_switching.backprop): New hook.
* doc/tm.texi.in (TARGET_MODE_BACKPROP): New @hook.
* doc/tm.texi: Regenerate.
* mode-switching.cc (struct bb_info): Add single_succ.
(confluence_info): Add transp field.
(single_succ_confluence_n, single_succ_transfer): New functions.
(backprop_confluence_n, backprop_transfer): Likewise.
(optimize_mode_switching): Use them. Push mode transitions onto
a block's incoming edges, if the backprop hook requires it.
mode-switching: Add a target-configurable confluence operator
The mode-switching pass assumed that all of an entity's modes
were mutually exclusive. However, the upcoming SME changes
have an entity with some overlapping modes, so that there is
sometimes a "superunion" mode that contains two given modes.
We can use this relationship to pass something more helpful than
"don't know" to the emit hook.
This patch adds a new hook that targets can use to specify
a mode confluence operator.
With mutually exclusive modes, it's possible to compute a block's
incoming and outgoing modes by looking at its availability sets.
With the confluence operator, we instead need to solve a full
dataflow problem.
However, when emitting a mode transition, the upcoming SME use of
mode-switching benefits from having as much information as possible
about the starting mode. Calculating this information is definitely
worth the compile time.
The dataflow problem is written to work before and after the LCM
problem has been solved. A later patch makes use of this.
While there (since git blame would ping me for the reindented code),
I used a lambda to avoid the cut-&-pasted loops.
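To make the idea of a confluence operator concrete, here is a toy sketch. The mode encoding and all names are invented for illustration and say nothing about the real hook's signature or about the SME modes; it shows a single join step, not the full dataflow solver.

```c
#include <stddef.h>

/* Toy encoding: each known mode is a set of bits, so TOY_AB contains
   both TOY_A and TOY_B.  TOY_UNKNOWN plays the role of "don't know".  */
enum toy_mode
{
  TOY_UNKNOWN = -1,
  TOY_A = 1,
  TOY_B = 2,
  TOY_AB = 3
};

/* Combine the modes arriving along two paths.  With mutually exclusive
   modes, A meeting B could only give "don't know"; here it gives the
   mode that contains both.  */
enum toy_mode
toy_confluence (enum toy_mode m1, enum toy_mode m2)
{
  if (m1 == TOY_UNKNOWN || m2 == TOY_UNKNOWN)
    return TOY_UNKNOWN;
  return (enum toy_mode) (m1 | m2);
}

/* Incoming mode of a block, given the outgoing modes of its
   predecessors.  */
enum toy_mode
toy_mode_in (const enum toy_mode *pred_out, size_t n)
{
  if (n == 0)
    return TOY_UNKNOWN;
  enum toy_mode m = pred_out[0];
  for (size_t i = 1; i < n; i++)
    m = toy_confluence (m, pred_out[i]);
  return m;
}

/* Hypothetical usage: a block reached from one predecessor in mode A
   and another in mode B.  */
int
toy_confluence_demo (void)
{
  enum toy_mode preds[] = { TOY_A, TOY_B };
  return toy_mode_in (preds, 2);
}
```

A real solver would iterate such joins over the whole CFG until nothing changes, which is why the extra compile time is mentioned above.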
gcc/
* target.def (mode_switching.confluence): New hook.
* doc/tm.texi.in (TARGET_MODE_CONFLUENCE): New @hook.
* doc/tm.texi: Regenerate.
* mode-switching.cc (confluence_info): New variable.
(mode_confluence, forward_confluence_n, forward_transfer): New
functions.
(optimize_mode_switching): Use them to calculate mode_in when
TARGET_MODE_CONFLUENCE is defined.
The pass used the edge aux field to record which mode change
should happen on the edge, with -1 meaning "none". It's more
convenient for later patches to leave aux zero for "none",
and use numbers based at 1 to record a change.
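The encoding can be sketched as follows. This is an illustration only: struct toy_edge and the helper names are hypothetical, and the sketch assumes a pointer-sized scratch field that is zero when unused.

```c
#include <stdint.h>

/* Toy edge with a pointer-sized scratch field, zero when unused.  */
struct toy_edge
{
  void *aux;
};

/* Record that MODE should be set on edge E.  Modes start at 0, so the
   stored value is MODE + 1 and zero keeps meaning "none".  */
void
toy_set_mode_change (struct toy_edge *e, int mode)
{
  e->aux = (void *) (intptr_t) (mode + 1);
}

/* Return 1 and store the mode in *MODE if a change is recorded on E,
   otherwise return 0 and leave *MODE alone.  */
int
toy_get_mode_change (const struct toy_edge *e, int *mode)
{
  intptr_t v = (intptr_t) e->aux;
  if (v == 0)
    return 0;
  *mode = (int) (v - 1);
  return 1;
}

/* Hypothetical usage: a fresh edge has no change; after recording
   mode 0 the change is visible and decodes back to 0.  */
int
toy_aux_demo (void)
{
  struct toy_edge e = { 0 };
  int mode = -1;
  if (toy_get_mode_change (&e, &mode))
    return -1;
  toy_set_mode_change (&e, 0);
  if (!toy_get_mode_change (&e, &mode))
    return -2;
  return mode;
}
```

The benefit is that a freshly cleared aux field needs no initialisation to -1 before it can be read as "no change".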
gcc/
* mode-switching.cc (commit_mode_sets): Use 1-based edge aux values.
mode-switching: Pass set of live registers to the needed hook
The emit hook already takes the set of live hard registers as input.
This patch passes it to the needed hook too. SME uses this to
optimise the mode choice based on whether state is live or dead.
The main caller already had access to the required info, but the
special handling of return values did not.
mode-switching: Allow targets to set the mode for EH handlers
The mode-switching pass already had hooks to say what mode
an entity is in on entry to a function and what mode it must
be in on return. For SME, we also want to say what mode an
entity is guaranteed to be in on entry to an exception handler.
gcc/
* target.def (mode_switching.eh_handler): New hook.
* doc/tm.texi.in (TARGET_MODE_EH_HANDLER): New @hook.
* doc/tm.texi: Regenerate.
* mode-switching.cc (optimize_mode_switching): Use eh_handler
to get the mode on entry to an exception handler.