gcc.gnu.org Git - gcc.git/log
2 years ago AVX512FP16: Fix masm=intel output for vfc?(madd|mul)csh [PR 104977]
Hongyu Wang [Fri, 18 Mar 2022 15:47:35 +0000 (23:47 +0800)]
AVX512FP16: Fix masm=intel output for vfc?(madd|mul)csh [PR 104977]

Fix typo in subst for scalar complex mask_round operand.

gcc/ChangeLog:

PR target/104977
* config/i386/sse.md
(avx512fp16_fma<complexopname>sh_v8hf<mask_scalarcz_name><round_scalarcz_name>):
Correct round operand for intel dialect.

gcc/testsuite/ChangeLog:

PR target/104977
* gcc.target/i386/pr104977.c: New test.

2 years ago Daily bump.
GCC Administrator [Mon, 21 Mar 2022 00:16:22 +0000 (00:16 +0000)]
Daily bump.

2 years ago Fix testsuite fallout from pr104960 change
Jeff Law [Sun, 20 Mar 2022 21:29:29 +0000 (17:29 -0400)]
Fix testsuite fallout from pr104960 change

Recent changes altered the output for s390/arch13/sel-1.c, causing testsuite failures.  As far as I can tell, both sequences are equivalent from a performance standpoint.  This patch changes the test to accept both forms.

gcc/testsuite
* gcc.target/s390/arch13/sel-1.c: Adjust expected output.

2 years ago Daily bump.
GCC Administrator [Sun, 20 Mar 2022 00:16:30 +0000 (00:16 +0000)]
Daily bump.

2 years ago fortran: Separate associate character lengths earlier [PR104570]
Mikael Morin [Sun, 13 Mar 2022 21:22:55 +0000 (22:22 +0100)]
fortran: Separate associate character lengths earlier [PR104570]

This change works around an ICE in the evaluation of the character length
of an array expression referencing an associate variable; the code is
not prepared to see a non-scalar expression as it doesn’t initialize the
scalarizer.

Before this change, associate length symbols get a new gfc_charlen at
resolution stage to unshare them from the associate expression, so that
at translation stage it is a decl specific to the associate symbol that
is initialized, not the decl of some other symbol.  This
reinitialization of gfc_charlen happens after expressions referencing
the associate symbol have been parsed, so that those expressions retain
the original gfc_charlen they have copied from the symbol.
At translation stage, the gfc_charlen for the associate symbol is set up
with the decl holding the actual length value, but the expressions have
retained the original gfc_charlen without any decl.  So they need to
evaluate the character length, and this is where the ICE happens.

This change moves the reinitialization of gfc_charlen earlier at parsing
stage, so that at resolution stage the gfc_charlen can be retained as
it’s already not shared with any other symbol, and the expressions which
now share their gfc_charlen with the symbol are automatically updated
when the length decl is set up at translation stage.  There is no need
any more to evaluate the character length as it has all the required
information, and the ICE doesn’t happen.

The first resolve.cc hunk is necessary to avoid regressing on the
associate_35.f90 testcase.

PR fortran/104228
PR fortran/104570

gcc/fortran/ChangeLog:

* parse.cc (parse_associate): Use a new distinct gfc_charlen if the
copied type has one whose length is not known to be constant.
* resolve.cc (resolve_assoc_var): Reset charlen if it’s shared with
the associate target regardless of the expression type.
Don’t reinitialize charlen if it’s deferred.

gcc/testsuite/ChangeLog:

* gfortran.dg/associate_58.f90: New test.

2 years ago libgcc: m68k: avoid TEXTRELs in shared library (PR 86224)
Sergei Trofimovich [Sat, 19 Mar 2022 19:09:36 +0000 (15:09 -0400)]
libgcc: m68k: avoid TEXTRELs in shared library (PR 86224)

libgcc/
PR libgcc/86224
* config/m68k/lb1sf68.S (__mulsi3_internal): Internal, hidden alias
for __mulsi3.
(__udivsi3_internal, __divsi3_internal): Similarly.
(__umodsi3, __modsi3): Use the internal function names.

2 years ago selftest: Move C-specific tests to c_family
Arthur Cohen [Sat, 19 Mar 2022 18:25:51 +0000 (14:25 -0400)]
selftest: Move C-specific tests to c_family

When trying to make use of the selftest framework over on the rust
frontend, we ran into issues where rust1 was expected to produce errors
containing C-like type names such as `int`.

I had gotten in contact with David Malcolm on the gcc mailing list [1],
who advised moving some test functions to a better location. The
offending functions have also been renamed in order to better fit the C
family of tests, and are thus not called when performing general
selftests anymore.

Kindly,

[1]: https://gcc.gnu.org/pipermail/gcc/2021-November/237703.html

2022-02-16 Arthur Cohen <arthur.cohen@embecosm.com>

gcc/c-family/

* c-common.cc (c_family_tests): Call the new tests.
* c-common.h (c_diagnostic_tests): Declare.
(c_opt_problem_cc_tests): Likewise.

gcc/
* diagnostic.cc (diagnostic_cc_tests): Rename to...
(c_diagnostic_cc_tests): ...this.
* opt-problem.cc (opt_problem_cc_tests): Rename to...
(c_opt_problem_cc_tests): ...this.
* selftest-run-tests.cc (selftest::run_tests): No longer run
opt_problem_cc_tests or diagnostic_cc_tests.
* selftest.h (diagnostic_cc_tests): Remove declaration.
(opt_problem_cc_tests): Likewise.

2 years ago [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]
Marc Nieper-Wißkirchen [Sat, 19 Mar 2022 17:42:26 +0000 (13:42 -0400)]
[PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]

Before the patch, compiling the hello world example of libgccjit with
the external driver under Valgrind shows a loss of 12,611 (48 direct)
bytes.  After the patch, no memory leaks are reported anymore.
(Memory leaks occurring when using the internal driver are mostly in
the driver code in gcc/gcc.c and have to be fixed separately.)

The patch has been tested by fully bootstrapping the compiler with the
frontends C, C++, Fortran, LTO, ObjC, JIT and running the test suite
on an x86_64-pc-linux-gnu host.

gcc/ChangeLog:

PR jit/63854
* hash-traits.h (struct typed_const_free_remove): New.
(struct free_string_hash): New.
* pass_manager.h: Use free_string_hash.
* passes.cc (pass_manager::register_pass_name): Use free_string_hash.
(pass_manager::~pass_manager): Delete allocated m_name_to_pass_map.

2 years ago rename floatformat_ia64_quad_{big, little} to floatformat_ieee_quad_{big, little}
Tiezhu Yang [Sat, 19 Mar 2022 17:33:40 +0000 (13:33 -0400)]
rename floatformat_ia64_quad_{big, little} to floatformat_ieee_quad_{big, little}

I submitted a GDB patch [1] to rename floatformats_ia64_quad to
floatformats_ieee_quad to reflect the reality, and then we can
clean up the related code.

As GDB Global Maintainer Tom Tromey said [2]:

  These files are maintained in gcc and then imported into the
  binutils-gdb repository, so any changes to them will have to
  be proposed there first.

this GCC patch is preparation for the GDB patch, no functionality
change.

[1] https://sourceware.org/pipermail/gdb-patches/2022-March/186452.html
[2] https://sourceware.org/pipermail/gdb-patches/2022-March/186569.html

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
include/
* floatformat.h (floatformat_ieee_quad_big): Renamed from
floatformat_ia64_quad_big.
(floatformat_ieee_quad_little): Similarly.

libiberty/
* floatformat.c (floatformat_ieee_quad_big): Renamed from
floatformat_ia64_quad_big.
(floatformat_ieee_quad_little): Similarly.

2 years ago i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971]
Jakub Jelinek [Sat, 19 Mar 2022 12:53:12 +0000 (13:53 +0100)]
i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971]

__builtin_ia32_readeflags_u* aren't marked const or pure, I think
intentionally, so that they aren't CSEd across different regions of a function
etc., because we don't and can't easily track all dependencies between
them and the surrounding code (if somebody looks at the condition flags, the
result depends on the vast majority of instructions).
But the builtin itself doesn't have any side-effects, so if we ignore the
result of the builtin, there is no point in emitting anything.

There is an LRA bug that miscompiles the testcase, which this patch makes
latent and which is certainly worth fixing too, but IMHO this change
(and maybe one in ix86_gimple_fold_builtin too, which would fold it even earlier
when it loses its lhs) is worth it as well.

2022-03-19  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/104971
* config/i386/i386-expand.cc
(ix86_expand_builtin) <case IX86_BUILTIN_READ_FLAGS>: If ignore,
don't push/pop anything and just return const0_rtx.

* gcc.target/i386/pr104971.c: New test.

2 years ago c-family: Fix up ICE during pretty-printing of PMF related expression [PR101515]
Jakub Jelinek [Sat, 19 Mar 2022 07:40:47 +0000 (08:40 +0100)]
c-family: Fix up ICE during pretty-printing of PMF related expression [PR101515]

The intent of r11-6729 is that it prints something that helps the user figure
out what exactly is being accessed.
When we find a unique non-static data member that is being accessed, even
when we can't fold it nicely, IMNSHO it is better to print
  ((sometype *)&var)->field
or
  (*(sometype *)&var).field
instead of
  *(fieldtype *)((char *)&var + 56)
because the user doesn't know what is at offset 56; we shouldn't ask the user
to decipher the structure layout etc.
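
As a rough illustration of why the member form is easier to read, the
sketch below reads the same int once through a raw byte offset and once by
member name.  The type sometype, its 56 bytes of padding, and the helper
names are assumptions made up for this example; they are not taken from the
patch or from the testsuite.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout chosen so that 'field' lands at byte offset 56.
struct sometype {
  char pad[56];
  int field;
};

static_assert(offsetof(sometype, field) == 56,
              "example assumes 'field' sits at offset 56");

// Access spelled the way a raw-offset diagnostic would describe it:
// *(int *)((char *)&var + 56).  memcpy is used to read the bytes safely.
inline int read_by_offset(const sometype &var) {
  int out;
  std::memcpy(&out,
              reinterpret_cast<const char *>(&var) + offsetof(sometype, field),
              sizeof out);
  return out;
}

// The same access spelled by member name: var.field.
inline int read_by_name(const sometype &var) {
  return var.field;
}

// Both spellings observe the same object.
inline void demo_offset_vs_name() {
  sometype var{};
  var.field = 42;
  assert(read_by_offset(var) == read_by_name(var));
}
```

Both helpers return the same value; only the second spelling tells a reader
which member is involved.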

One question is if we could return something better for the TYPE_PTRMEMFUNC_FLAG
RECORD_TYPE members here (something that would print it more naturally/readably
in a C++ way), though the fact that the routine is in c-family makes it
harder.

Another one is whether we shouldn't punt for FIELD_DECLs that don't have
a nicely printable name for their containing scope, something like:
                if (tree scope = get_containing_scope (field))
                  if (TYPE_P (scope) && TYPE_NAME (scope) == NULL_TREE)
                    break;
                return cop;
or so.  This patch implements that.

Note the returned cop is a COMPONENT_REF where the first argument has a
nicely printable type name (x with type sp), but sp's TYPE_MAIN_VARIANT
is the unnamed TYPE_PTRMEMFUNC_FLAG.  So another possibility would be if
we see such a problem for the FIELD_DECL's scope, check if TYPE_MAIN_VARIANT
of the first COMPONENT_REF's argument is equal to that scope and in that
case use TREE_TYPE of the first COMPONENT_REF's argument as the scope
instead.

2022-03-19  Jakub Jelinek  <jakub@redhat.com>

PR c++/101515
* c-pretty-print.cc (c_fold_indirect_ref_for_warn): For C++ don't
return COMPONENT_REFs with FIELD_DECLs whose containing scope can't
be printed.

* g++.dg/warn/pr101515.C: New test.

2 years ago Daily bump.
GCC Administrator [Sat, 19 Mar 2022 00:16:22 +0000 (00:16 +0000)]
Daily bump.

2 years ago analyzer: extend state-purging to locals [PR104943]
David Malcolm [Wed, 8 Dec 2021 00:22:47 +0000 (19:22 -0500)]
analyzer: extend state-purging to locals [PR104943]

The existing analyzer code attempts to purge the state of SSA names
where it can in order to minimize the size of program_state instances,
and to increase the chances of being able to reuse exploded_node
instances whilst exploring the user's code.

PR analyzer/104943 identifies that we fail to purge state of local
variables, based on behavior seen in PR analyzer/104954 when attempting
to profile slow performance of -fanalyzer on a particular file in the
Linux kernel, where that testcase has many temporary "boxed" values of
structs containing ints, which are never cleaned up, leading to bloat
of the program_state instances (specifically, of the store objects).

This patch generalizes the state purging from just being on SSA names
to also work on local variables.  Doing so requires that we detect where
addresses of a local variable (or of locations within it) are taken; we assume that
once a pointer has been taken, it's no longer safe to purge the value
of that decl at any successor point within the function.
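
To make the two situations concrete, here is a small sketch of the kind of
code involved.  It is an assumption-laden illustration, not the kernel
testcase or any file from gcc.dg/analyzer: boxed_int, box, unbox and the
other names are hypothetical, and it is written in C++ only for consistency
with the other snippets in this log.

```cpp
#include <cassert>

// Hypothetical "boxed" value: a struct wrapping a single int.
struct boxed_int {
  int value;
};

inline boxed_int box(int v) { return boxed_int{v}; }
inline int unbox(boxed_int b) { return b.value; }

// The locals a, b and sum never have their address taken, so once each has
// been read for the last time its state is a candidate for purging.
inline int add_boxed(int x, int y) {
  boxed_int a = box(x);
  boxed_int b = box(y);
  boxed_int sum = box(unbox(a) + unbox(b));
  return unbox(sum);
}

inline void bump(int *p) { ++*p; }

// Here the address of 'local' is taken and passed on, so under the
// assumption described above its value is conservatively kept from that
// point onwards.
inline int address_taken(int x) {
  int local = x;
  bump(&local);
  return local;
}

inline void demo_boxed() {
  assert(add_boxed(2, 3) == 5);
  assert(address_taken(4) == 5);
}
```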

Doing so speeds up the PR analyzer/104954 Linux kernel analyzer testcase
from taking 254 seconds to "just" 186 seconds (and I have a followup
patch in development that seems to further reduce this to 37 seconds).

The patch may also help with scaling up taint-detection so that it can
eventually be turned on by default, but we're not quite there (this
is PR analyzer/103533).

gcc/analyzer/ChangeLog:
PR analyzer/104943
PR analyzer/104954
PR analyzer/103533
* analyzer.h (class state_purge_per_decl): New forward decl.
* engine.cc (impl_run_checkers): Pass region_model_manager to
state_purge_map ctor.
* program-point.cc (function_point::final_stmt_p): New.
(function_point::get_next): New.
* program-point.h (function_point::final_stmt_p): New decl.
(function_point::get_next): New decl.
* program-state.cc (program_state::prune_for_point): Generalize to
purge local decls as well as SSA names.
(program_state::can_purge_base_region_p): New.
* program-state.h (program_state::can_purge_base_region_p): New
decl.
* region-model.cc (struct append_ssa_names_cb_data): Rename to...
(struct append_regions_cb_data): ...this.
(region_model::get_ssa_name_regions_for_current_frame): Rename
to...
(region_model::get_regions_for_current_frame): ...this, updating
for other renamings.
(region_model::append_ssa_names_cb): Rename to...
(region_model::append_regions_cb): ...this, and drop the requirement
that the subregion be a SSA name.
* region-model.h (struct append_ssa_names_cb_data): Rename decl
to...
(struct append_regions_cb_data): ...this.
(region_model::get_ssa_name_regions_for_current_frame): Rename
decl to...
(region_model::get_regions_for_current_frame): ...this.
(region_model::append_ssa_names_cb): Rename decl to...
(region_model::append_regions_cb): ...this.
* state-purge.cc: Include "tristate.h", "selftest.h",
"analyzer/store.h", "analyzer/region-model.h", and
"gimple-walk.h".
(get_candidate_for_purging): New.
(class gimple_op_visitor): New.
(my_load_cb): New.
(my_store_cb): New.
(my_addr_cb): New.
(state_purge_map::state_purge_map): Add "mgr" param.  Update for
renamings.  Find uses of local variables.
(state_purge_map::~state_purge_map): Update for renaming of m_map
to m_ssa_map.  Clean up m_decl_map.
(state_purge_map::get_or_create_data_for_decl): New.
(state_purge_per_ssa_name::state_purge_per_ssa_name): Update for
inheriting from state_purge_per_tree.
(state_purge_per_ssa_name::add_to_worklist): Likewise.
(state_purge_per_decl::state_purge_per_decl): New.
(state_purge_per_decl::add_needed_at): New.
(state_purge_per_decl::add_pointed_to_at): New.
(state_purge_per_decl::process_worklists): New.
(state_purge_per_decl::add_to_worklist): New.
(same_binding_p): New.
(fully_overwrites_p): New.
(state_purge_per_decl::process_point_backwards): New.
(state_purge_per_decl::process_point_forwards): New.
(state_purge_per_decl::needed_at_point_p): New.
(state_purge_annotator::print_needed): Generalize to print local
decls as well as SSA names.
* state-purge.h (class state_purge_map): Update leading comment.
(state_purge_map::map_t): Rename to...
(state_purge_map::ssa_map_t): ...this.
(state_purge_map::iterator): Rename to...
(state_purge_map::ssa_iterator): ...this.
(state_purge_map::decl_map_t): New typedef.
(state_purge_map::decl_iterator): New typedef.
(state_purge_map::state_purge_map): Add "mgr" param.
(state_purge_map::get_data_for_ssa_name): Update for renaming.
(state_purge_map::get_any_data_for_decl): New.
(state_purge_map::get_or_create_data_for_decl): New decl.
(state_purge_map::begin): Rename to...
(state_purge_map::begin_ssas): ...this.
(state_purge_map::end): Rename to...
(state_purge_map::end_ssa): ...this.
(state_purge_map::begin_decls): New.
(state_purge_map::end_decls): New.
(state_purge_map::m_map): Rename to...
(state_purge_map::m_ssa_map): ...this.
(state_purge_map::m_decl_map): New field.
(class state_purge_per_tree): New class.
(class state_purge_per_ssa_name): Inherit from state_purge_per_tree.
(state_purge_per_ssa_name::get_function): Move to base class.
(state_purge_per_ssa_name::point_set_t): Likewise.
(state_purge_per_ssa_name::m_fun): Likewise.
(class state_purge_per_decl): New.

gcc/testsuite/ChangeLog:
PR analyzer/104943
PR analyzer/104954
PR analyzer/103533
* gcc.dg/analyzer/torture/boxed-ptr-1.c: Update expected number
of exploded nodes to reflect improvements in state purging.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

2 years ago analyzer: add tests of boxed values [PR104943]
David Malcolm [Thu, 17 Mar 2022 22:12:46 +0000 (18:12 -0400)]
analyzer: add tests of boxed values [PR104943]

This patch adds various regression tests as preparatory work for
purging irrelevant local decls from state (PR analyzer/104943).

gcc/testsuite/ChangeLog:
PR analyzer/104943
* gcc.dg/analyzer/boxed-malloc-1-29.c: New test.
* gcc.dg/analyzer/boxed-malloc-1.c: New test.
* gcc.dg/analyzer/taint-alloc-5.c: New test.
* gcc.dg/analyzer/torture/boxed-int-1.c: New test.
* gcc.dg/analyzer/torture/boxed-ptr-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

2 years ago [PR104961] LRA: split hard reg for reload pseudo with clobber.
Vladimir N. Makarov [Fri, 18 Mar 2022 18:23:40 +0000 (14:23 -0400)]
[PR104961] LRA: split hard reg for reload pseudo with clobber.

Splitting a hard register live range did not work for a subreg of a
multi-reg reload pseudo.  Reload insns for such a pseudo contain a clobber
of the pseudo, and splitting did not take this into account.  The patch
fixes it.

gcc/ChangeLog:

PR rtl-optimization/104961
* lra-assigns.cc (find_reload_regno_insns): Process reload pseudo clobber.

gcc/testsuite/ChangeLog:

PR rtl-optimization/104961
* gcc.target/i386/pr104961.c: New.

2 years ago tree: Add comment.
Jason Merrill [Tue, 8 Mar 2022 16:34:15 +0000 (11:34 -0500)]
tree: Add comment.

gcc/ChangeLog:

* tree.h (IDENTIFIER_LENGTH): Add comment.

2 years ago c++: using lookup within class defn [PR104476]
Jason Merrill [Fri, 25 Feb 2022 19:07:15 +0000 (15:07 -0400)]
c++: using lookup within class defn [PR104476]

The problem in both PR92918 and PR104476 is overloading of base member
functions brought in by 'using' with direct member functions during parsing
of the class body.  To this point they've had a troublesome coexistence
which was resolved by set_class_bindings when the class is complete, but we
also need to handle lookup within the class body, such as in a trailing
return type.

The problem was that push_class_level_binding would either clobber the
using-decl with the direct members or vice-versa.  In older versions of GCC
we only pushed dependent usings, and preferring the dependent using made
sense, as it expresses a type-dependent overload set that we can't do
anything useful with.  But when we started keeping non-dependent usings
around, push_class_level_binding in particular wasn't adjusted accordingly.

This patch makes that adjustment, and pushes the functions imported by a
non-dependent using immediately from finish_member_declaration.  This made
diagnosing redundant using-decls a bit awkward, since we no longer push the
using-decl itself; I handle that by noticing when we try to add the same
function again and searching TYPE_FIELDS for the previous using-decl.

PR c++/92918
PR c++/104476

gcc/cp/ChangeLog:

* class.cc (add_method): Avoid adding the same used function twice.
(handle_using_decl): Don't add_method.
(finish_struct): Don't using op= if we have one already.
(maybe_push_used_methods): New.
* semantics.cc (finish_member_declaration): Call it.
* name-lookup.cc (diagnose_name_conflict): No longer static.
(push_class_level_binding): Revert 92918 patch, limit
to dependent using.
* cp-tree.h: Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr85070.C: Remove expected error.
* g++.dg/lookup/using66a.C: New test.
* g++.dg/lookup/using67.C: New test.

2 years ago Allow (void *) 0xdeadbeef accesses without warnings [PR99578]
Jakub Jelinek [Fri, 18 Mar 2022 17:58:06 +0000 (18:58 +0100)]
Allow (void *) 0xdeadbeef accesses without warnings [PR99578]

Starting with GCC 11 we keep emitting false positive -Warray-bounds or
-Wstringop-overflow etc. warnings on widely used *(type *)0x12345000
style accesses (or memory/string routines called on such pointers).
This is a standard programming style supported by all C/C++ compilers
I've ever tried, used mostly in kernel or DSP programming, but sometimes
also together with mmap MAP_FIXED, when certain things (often I/O registers,
but could be anything else too) are known to be present at fixed
addresses.

Such INTEGER_CST addresses can appear in code either because a user
wrote them like that (in which case it is fine) or because somebody used
pointer arithmetic (including &((struct whatever *)NULL)->field) on
a NULL pointer.  The middle-end warning code wrongly assumes that the
latter case is the likely one, while the former is unlikely and
users should change their code.

The following patch adds a min-pagesize param defaulting to 4KB,
and treats INTEGER_CST addresses smaller than that as assumed results
of pointer arithmetic on NULL, and addresses equal to or larger than
that as expected user constant addresses.  For GCC 13 we can
represent results of pointer arithmetic on NULL using
&MEM[(void*)0 + offset] instead of (void*)offset INTEGER_CSTs.
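
The threshold rule described above can be sketched as a tiny classifier.
This is only an illustration of the heuristic as worded in this message, not
the code in pointer-query.cc; the names kMinPagesize, const_addr_kind and
classify_constant_address are hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-in for the 4KB default of --param=min-pagesize=.
constexpr std::uintptr_t kMinPagesize = 4096;

enum class const_addr_kind {
  offset_from_null,    // assumed result of pointer arithmetic on NULL
  user_fixed_address   // assumed deliberate constant address
};

// Classify a constant address the way the commit message describes:
// below the threshold it is treated as an offset from NULL, at or above
// the threshold it is treated as an address the user really meant.
constexpr const_addr_kind
classify_constant_address(std::uintptr_t addr,
                          std::uintptr_t min_pagesize = kMinPagesize) {
  return addr < min_pagesize ? const_addr_kind::offset_from_null
                             : const_addr_kind::user_fixed_address;
}

static_assert(classify_constant_address(0x10)
                  == const_addr_kind::offset_from_null,
              "small constants look like offsets from NULL");
static_assert(classify_constant_address(4096)
                  == const_addr_kind::user_fixed_address,
              "the threshold itself counts as a user address");
static_assert(classify_constant_address(0x12345000)
                  == const_addr_kind::user_fixed_address,
              "typical fixed-address access");
```

Under this reading, a target with meaningful data below 4KB would want a
smaller parameter value, which is why the threshold is a param rather than
a hard-coded constant.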

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/99578
PR middle-end/100680
PR tree-optimization/100834
* params.opt (--param=min-pagesize=): New parameter.
* pointer-query.cc
(compute_objsize_r) <case ARRAY_REF>: Formatting fix.
(compute_objsize_r) <case INTEGER_CST>: Use maximum object size instead
of zero for pointer constants equal or larger than min-pagesize.

* gcc.dg/tree-ssa/pr99578-1.c: New test.
* gcc.dg/pr99578-1.c: New test.
* gcc.dg/pr99578-2.c: New test.
* gcc.dg/pr99578-3.c: New test.
* gcc.dg/pr100680.c: New test.
* gcc.dg/pr100834.c: New test.

2 years ago c++: Fix up constexpr evaluation of new with zero sized types [PR104568]
Jakub Jelinek [Fri, 18 Mar 2022 17:49:23 +0000 (18:49 +0100)]
c++: Fix up constexpr evaluation of new with zero sized types [PR104568]

The constant expression evaluation of new expressions right now tries to
deduce how many elts the array it uses for the heap or heap [] vars
should have (or how many elts its trailing array should have if it has a
cookie at the start).  As new is lowered at that point to
(some_type *) ::operator new (size)
or so, it computes it by subtracting cookie size if any from size, then
divides the result by sizeof (some_type).
This works fine for most types, except when sizeof (some_type) is 0,
then we divide by zero; size is then equal to cookie_size (or if there
is no cookie, to 0).
The following patch special-cases those situations so that we don't divide
by zero, and also recovers the original outer_nelts from the expression
by forcing the size not to be folded in that case but to be an explicit
0 * outer_nelts or cookie_size + 0 * outer_nelts.
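
The arithmetic above can be written down as a small sketch.  The helper
heap_array_nelts is hypothetical and only mirrors the description; in
particular, the real patch recovers outer_nelts from the unfolded size
expression, whereas this sketch simply takes it as a parameter.

```cpp
#include <cstddef>

// Hypothetical helper: number of elements for the constexpr heap array.
//   size        - byte count passed to operator new
//   cookie_size - bytes reserved for the array cookie (0 if none)
//   elt_size    - sizeof (some_type), which may be 0 as a GNU extension
//   outer_nelts - element count the user wrote, needed when elt_size is 0
constexpr std::size_t heap_array_nelts(std::size_t size,
                                       std::size_t cookie_size,
                                       std::size_t elt_size,
                                       std::size_t outer_nelts) {
  // For zero sized elements, size == cookie_size, so the division would be
  // 0 / 0; fall back to the explicitly preserved element count instead.
  return elt_size == 0 ? outer_nelts : (size - cookie_size) / elt_size;
}

// new int[2] with an 8-byte cookie: (8 + 2 * 4 - 8) / 4 == 2.
static_assert(heap_array_nelts(8 + 2 * 4, 8, 4, 2) == 2, "ordinary case");
// Zero sized element type: size carries no information about the count.
static_assert(heap_array_nelts(8, 8, 0, 5) == 5, "zero sized case");
```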

Note, we have further issues, we accept-invalid various cases, for both
zero sized elt_type and even non-zero sized elts, we aren't able to
diagnose out of bounds POINTER_PLUS_EXPR like:
constexpr bool
foo ()
{
  auto p = new int[2];
  auto q1 = &p[0];
  auto q2 = &p[1];
  auto q3 = &p[2];
  auto q4 = &p[3];
  delete[] p;
  return true;
}
constexpr bool a = foo ();
That doesn't look like a regression so I think we should resolve that for
GCC 13, but there are 2 problems.  Figure out why
cxx_fold_pointer_plus_expression doesn't deal with the &heap []
etc. cases, and for the zero sized arrays, I think we really need to preserve
whether user wrote an array ref or pointer addition, because in the
&p[3] case if sizeof(p[0]) == 0 we know that if it has 2 elements it is
out of bounds, while if we see p p+ 0 the information if it was
p + 2 or p + 3 in the source is lost.
clang++ seems to handle it fine even in the zero sized cases or with
new expressions.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/104568
* init.cc (build_new_constexpr_heap_type): Remove FULL_SIZE
argument and its handling, instead add ITYPE2 argument.  Only
support COOKIE_SIZE != NULL.
(build_new_1): If size is 0, change it to 0 * outer_nelts if
outer_nelts is non-NULL.  Pass type rather than elt_type to
maybe_wrap_new_for_constexpr.
* constexpr.cc (build_new_constexpr_heap_type): New function.
(cxx_eval_constant_expression) <case CONVERT_EXPR>:
If elt_size is zero sized type, try to recover outer_nelts from
the size argument to operator new/new[] and pass that as
arg_size to build_new_constexpr_heap_type.  Pass ctx,
non_constant_p and overflow_p to that call too.

* g++.dg/cpp2a/constexpr-new22.C: New test.

2 years ago testsuite: Add missing <vector> header to test
Jonathan Wakely [Fri, 18 Mar 2022 17:45:07 +0000 (17:45 +0000)]
testsuite: Add missing <vector> header to test

gcc/testsuite/ChangeLog:

* g++.dg/torture/pr104601.C: Include <vector>.

2 years ago c++: alias template and empty parameter packs [PR104008]
Marek Polacek [Wed, 16 Mar 2022 13:34:34 +0000 (09:34 -0400)]
c++: alias template and empty parameter packs [PR104008]

Zero-length pack expansions are treated as if no list were provided
at all, that is, with

  template<typename...> struct S { };
  template<typename T, typename... Ts>
  void g() {
    S<std::is_same<T, Ts>...>;
  }

g<int> will result in S<>.  In the following test we have something
similar:

  template <typename T, typename... Ts>
  using IsOneOf = disjunction<is_same<T, Ts>...>;

and then we have "IsOneOf<OtherHolders>..." where OtherHolders is an
empty pack.  Since r11-7931, we strip_typedefs in TYPE_PACK_EXPANSION.
In this test that results in "IsOneOf<OtherHolders>" being turned into
"disjunction<>".  So the whole expansion is now "disjunction<>...".  But
then we error in make_pack_expansion because find_parameter_packs_r won't
find the pack OtherHolders.
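
The zero-length behaviour on its own can be checked with a few
static_asserts.  This is a minimal sketch that assumes C++17 for
std::disjunction; the alias name Wrapped is made up for the example, and the
snippet deliberately does not reproduce the nested "IsOneOf<OtherHolders>..."
expansion that triggered the bug.

```cpp
#include <type_traits>

template <typename...> struct S {};

// Same shape as the g() example above, written as an alias so the
// resulting type can be inspected.
template <typename T, typename... Ts>
using Wrapped = S<std::is_same<T, Ts>...>;

// Same shape as the alias from the test, using the standard traits.
template <typename T, typename... Ts>
using IsOneOf = std::disjunction<std::is_same<T, Ts>...>;

// With an empty Ts the expansion contributes no arguments at all.
static_assert(std::is_same<Wrapped<int>, S<>>::value,
              "empty pack expands to S<>");
static_assert(std::is_same<IsOneOf<int>, std::disjunction<>>::value,
              "empty pack expands to disjunction<>");

// disjunction<> is false; a non-empty pack behaves as expected.
static_assert(!IsOneOf<int>::value, "no alternatives means false");
static_assert(IsOneOf<int, char, int>::value, "int is one of char, int");
```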

We strip the alias template because dependent_alias_template_spec_p says
it's not dependent.  It is not dependent because this alias is not
TEMPLATE_DECL_COMPLEX_ALIAS_P.  My understanding is that currently we
consider an alias complex if it

1) expands a pack from the enclosing class, as in

    template<template<typename... U> typename... TT>
    struct S {
      template<typename... Args>
      using X = P<TT<Args...>...>;
    };

   where the alias expands TT; or

2) the expansion does *not* name all the template parameters, as in

    template<typename...> struct R;
    template<typename T, typename... Ts>
    using U = R<X<Ts>...>;

   where T is not named in the expansion.

But IsOneOf is neither.  And it can't know how it's going to be used.
Therefore I think we cannot make it complex (and in turn dependent) to fix
this bug.

After much gnashing of teeth, I think we simply want to avoid stripping
the alias if the new pattern doesn't have any parameter packs to expand.

PR c++/104008

gcc/cp/ChangeLog:

* tree.cc (strip_typedefs): Don't strip an alias template when
doing so would result in losing a parameter pack.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-alias3.C: New test.
* g++.dg/cpp0x/variadic-alias4.C: New test.

2 years ago Fortran/OpenMP: Fix privatization of associated names
Tobias Burnus [Fri, 18 Mar 2022 16:40:22 +0000 (17:40 +0100)]
Fortran/OpenMP: Fix privatization of associated names

gfc_omp_predetermined_sharing causes the associate-name pointer variable
to be OMP_CLAUSE_DEFAULT_FIRSTPRIVATE, which is fine. However, the associated
selector is shared. Thus, the target of the associate-name pointer should not get
copied. (It was copied before, but because gfc_omp_privatize_by_reference returned
false, the selector was not only wrongly copied but the copy was also not done
properly.)

gcc/fortran/ChangeLog:

PR fortran/103039
* trans-openmp.cc (gfc_omp_clause_copy_ctor, gfc_omp_clause_dtor):
Only privatize pointer for associate names.

libgomp/ChangeLog:

PR fortran/103039
* testsuite/libgomp.fortran/associate4.f90: New test.

2 years ago libstdc++: Simplify constraints for std::any construction [PR104242]
Jonathan Wakely [Fri, 18 Mar 2022 13:10:01 +0000 (13:10 +0000)]
libstdc++: Simplify constraints for std::any construction [PR104242]

Partially revert r12-4190-g6da36b7d0e43b6f9281c65c19a025d4888a25b2d
because using __and_<..., is_copy_constructible<T>> when T is incomplete
results in an error about deriving from is_copy_constructible<T> when
that is incomplete. I don't know how to fix that, so this simply
restores the previous constraint which worked in this case (even though
I think it's technically undefined to use is_copy_constructible<T> with
incomplete T). This doesn't restore exactly what we had before, but uses
the is_copy_constructible_v and __is_in_place_type_v variable templates
instead of the ::value member.

libstdc++-v3/ChangeLog:

PR libstdc++/104242
* include/std/any (any(T&&)): Revert change to constraints.
* testsuite/20_util/any/cons/104242.cc: New test.

2 years ago testsuite, modules, Darwin: Adjust expected output for older OS versions.
Iain Sandoe [Sun, 13 Mar 2022 16:38:49 +0000 (16:38 +0000)]
testsuite, modules, Darwin: Adjust expected output for older OS versions.

Darwin versions <= 10 (macOS 10.6) emit different diagnostics for the failure
case being tested by bad-mapper-1.C.  Adjust the dg- expressions to reflect this.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:

* g++.dg/modules/bad-mapper-1.C: Make dg- expressions that match the
diagnostics output by earlier Darwin too.

2 years ago Fix "[openmp] Set location for taskloop stmts"
Tom de Vries [Fri, 18 Mar 2022 15:19:25 +0000 (16:19 +0100)]
Fix "[openmp] Set location for taskloop stmts"

I accidentally committed an outdated version of patch "[openmp] Set location
for taskloop stmts".

Fix this by adding the missing changes.

gcc/ChangeLog:

2022-03-18  Tom de Vries  <tdevries@suse.de>

* gimplify.cc (gimplify_omp_for): Set location using 'input_location'.
Set gfor location only when dealing with an OMP_TASKLOOP.

2 years ago c++tools: Work around a BSD bug in getaddrinfo().
Iain Sandoe [Sun, 13 Mar 2022 16:34:54 +0000 (16:34 +0000)]
c++tools: Work around a BSD bug in getaddrinfo().

Some versions of the BSD getaddrinfo() call do not work with the specific
input of "0" for the servname entry (a segv results).  Since we are making
the call with a dummy port number, the value is actually not important, other
than that it should be in range.  Work around the BSD bug by using "1" instead.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
c++tools/ChangeLog:

* server.cc (accept_from): Use "1" as the dummy port number.

2 years ago libcody: Do not use a dummy port number in getaddrinfo().
Iain Sandoe [Sun, 13 Mar 2022 16:29:45 +0000 (16:29 +0000)]
libcody: Do not use a dummy port number in getaddrinfo().

getaddrinfo() requires either a non-null name for the server or
a port service / number.  In the code that opens a connection we have
been calling this with a dummy port number of "0".  Unfortunately this
triggers a bug in some BSD versions and OSes importing that code.

In this part of the code we do not really need a port number, since it
is not reasonable to open a connection to an unspecified host.

Setting hints info field to 0, and the servname parm to nullptr works
around the BSD bug in this case.
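
A call of that shape might look like the sketch below.  It is not the
libcody code: the helper name resolve_without_service is hypothetical, the
zero-initialised hints structure is an assumption about what "setting the
hints field to 0" amounts to, and AI_NUMERICHOST is added only so the
example does not depend on name resolution.

```cpp
#include <cassert>
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Look up a numeric IPv6 host while passing a null service string, so no
// dummy port number such as "0" is involved.  Returns the getaddrinfo()
// status, which is 0 on success.
inline int resolve_without_service(const char *host, addrinfo **out) {
  addrinfo hints;
  std::memset(&hints, 0, sizeof hints);  // every hint field starts at 0
  hints.ai_family = AF_INET6;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_NUMERICHOST;       // assumption: skip DNS in the example
  return getaddrinfo(host, nullptr, &hints, out);
}

inline void demo_resolve() {
  addrinfo *res = nullptr;
  int rc = resolve_without_service("::1", &res);
  assert(rc == 0);
  assert(res != nullptr);
  freeaddrinfo(res);
}
```

With a null service the returned socket address carries no port, so a
caller following this pattern has to fill in the port itself before
connecting.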

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libcody/ChangeLog:

* netclient.cc (OpenInet6): Do not provide a dummy port number
in the getaddrinfo() call.

2 years ago [openmp] Set location for taskloop stmts
Tom de Vries [Fri, 18 Mar 2022 09:34:02 +0000 (10:34 +0100)]
[openmp] Set location for taskloop stmts

The test-case included in this patch contains:
...
  #pragma omp taskloop simd shared(a) lastprivate(myId)
...

This is translated to 3 taskloop statements in gimple, visible with
-fdump-tree-gimple:
...
  #pragma omp taskloop private(D.2124)
    #pragma omp taskloop shared(a) shared(myId) private(i.0) firstprivate(a_h)
      #pragma omp taskloop lastprivate(myId)
...

But when exposing the gimple statement locations using
-fdump-tree-gimple-lineno, we find that only the first one has location
information.

Fix this by adding the missing location information.

Tested gomp.exp on x86_64.

Tested libgomp testsuite on x86_64 with nvptx accelerator.

gcc/ChangeLog:

2022-03-18  Tom de Vries  <tdevries@suse.de>

* gimplify.cc (gimplify_omp_for): Set taskloop location.

gcc/testsuite/ChangeLog:

2022-03-18  Tom de Vries  <tdevries@suse.de>

* c-c++-common/gomp/pr104968.c: New test.

2 years ago[openmp] Fix SIMT reduction using TRUTH_{AND,OR}IF_EXPR
Tom de Vries [Thu, 17 Mar 2022 13:37:28 +0000 (14:37 +0100)]
[openmp] Fix SIMT reduction using TRUTH_{AND,OR}IF_EXPR

Consider test-case pr104952-1.c, included in this commit, containing:
...
  #pragma omp target map(tofrom:result) map(to:arr)
  #pragma omp simd reduction(||: result)
...

When run on x86_64 with nvptx accelerator, the test-case either aborts or
hangs.

The reduction clause is translated by the SIMT code (active for nvptx) as a
butterfly reduction loop with this butterfly shuffle / update pair:
...
  D.2163 = D.2163 || .GOMP_SIMT_XCHG_BFLY (D.2163, D.2164)
...
in the loop body.

The problem is that the butterfly shuffle is possibly not executed, while it
needs to be executed unconditionally.

Fix this by translating instead as:
...
  D.tmp_bfly = .GOMP_SIMT_XCHG_BFLY (D.2163, D.2164)
  D.2163 = D.2163 || D.tmp_bfly
...
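
The underlying issue is ordinary short-circuit evaluation.  A minimal sketch,
assuming a hypothetical exchange() function that stands in for
.GOMP_SIMT_XCHG_BFLY and simply counts how often it is called:

```cpp
// Stand-in for the shuffle: every lane must call it, so count the calls.
int exchange_calls = 0;

bool exchange(bool value)
{
  ++exchange_calls;
  return value;
}

// Shape of the original lowering: the call sits on the right-hand side of
// `||` and is skipped whenever `acc` is already true.
bool reduce_short_circuit(bool acc, bool other)
{
  return acc || exchange(other);
}

// Shape of the fix: perform the exchange first, then combine the values.
bool reduce_unconditional(bool acc, bool other)
{
  bool tmp = exchange(other);
  return acc || tmp;
}
```

Both functions return the same value; only the second one guarantees that the
exchange runs, which is what a shuffle shared between lanes requires.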

Tested on x86_64-linux with nvptx accelerator.

gcc/ChangeLog:

2022-03-17  Tom de Vries  <tdevries@suse.de>

PR target/104952
* omp-low.cc (lower_rec_input_clauses): Make sure GOMP_SIMT_XCHG_BFLY
is executed unconditionally.

libgomp/ChangeLog:

2022-03-17  Tom de Vries  <tdevries@suse.de>

PR target/104952
* testsuite/libgomp.c/pr104952-1.c: New test.
* testsuite/libgomp.c/pr104952-2.c: New test.

2 years agoFortran/OpenMP: Improve associate-name diagnostic [PR103039]
Tobias Burnus [Fri, 18 Mar 2022 13:50:36 +0000 (14:50 +0100)]
Fortran/OpenMP: Improve associate-name diagnostic [PR103039]

gcc/fortran/ChangeLog:

PR fortran/103039
* openmp.cc (resolve_omp_clauses): Improve associate-name diagnostic
for select type/rank.

gcc/testsuite/ChangeLog:

PR fortran/103039
* gfortran.dg/gomp/associate1.f90: Update dg-error.
* gfortran.dg/gomp/associate2.f90: New test.

2 years agoRefine HImode movement for "v" to "v".
liuhongt [Fri, 18 Mar 2022 08:11:04 +0000 (16:11 +0800)]
Refine HImode movement for "v" to "v".

Set attr from HImode to HFmode, which uses vmovsh instead of vmovw for
movement between SSE registers.

gcc/ChangeLog:

PR target/104974
* config/i386/i386.md (*movhi_internal): Set attr type from HI
to HF for alternative 12 under TARGET_AVX512FP16.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104974.c: New test.

2 years agolibstdc++: Reduce header dependencies from PSTL headers [PR92546]
Jonathan Wakely [Thu, 17 Mar 2022 16:45:43 +0000 (16:45 +0000)]
libstdc++: Reduce header dependencies from PSTL headers [PR92546]

This avoids including the whole of <functional> in <algorithm>, as the
<pstl/glue_algorithm_defs.h> header only actually needs std::pair.

This also avoids including <iterator> in <pstl/utils.h>, which only
needs <type_traits>, std::bad_alloc, and std::terminate (which can be
replaced with std::__terminate). This matters less, because
<pstl/utils.h> is only included by the <pstl/*_impl.h> headers and they
all use <iterator> anyway, and are only included by <execution>.

libstdc++-v3/ChangeLog:

PR libstdc++/92546
* include/pstl/glue_algorithm_defs.h: Replace <functional> with
<bits/stl_pair.h>.
* include/pstl/utils.h: Replace <iterator> with <type_traits>.
(__pstl::__internal::__except_handler): Use std::__terminate
instead of std::terminate.
* src/c++17/fs_path.cc: Include <array>.
* testsuite/25_algorithms/adjacent_find/constexpr.cc: Include
<functional>.
* testsuite/25_algorithms/binary_search/constexpr.cc: Likewise.
* testsuite/25_algorithms/clamp/constrained.cc: Likewise.
* testsuite/25_algorithms/equal/constrained.cc: Likewise.
* testsuite/25_algorithms/for_each/constrained.cc: Likewise.
* testsuite/25_algorithms/includes/constrained.cc: Likewise.
* testsuite/25_algorithms/is_heap/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_heap_until/constexpr.cc: Likewise.
* testsuite/25_algorithms/is_permutation/constrained.cc: Include
<iterator>.
* testsuite/25_algorithms/is_sorted/constexpr.cc: Include
<functional>.
* testsuite/25_algorithms/is_sorted_until/constexpr.cc:
Likewise.
* testsuite/25_algorithms/lexicographical_compare/constexpr.cc:
Likewise.
* testsuite/25_algorithms/lexicographical_compare/constrained.cc:
Likewise.
* testsuite/25_algorithms/lexicographical_compare_three_way/1.cc:
Include <array>.
* testsuite/25_algorithms/lower_bound/constexpr.cc: Include
<functional>.
* testsuite/25_algorithms/max/constrained.cc: Likewise.
* testsuite/25_algorithms/max_element/constrained.cc: Likewise.
* testsuite/25_algorithms/min/constrained.cc: Likewise.
* testsuite/25_algorithms/min_element/constrained.cc: Likewise.
* testsuite/25_algorithms/minmax_element/constrained.cc:
Likewise.
* testsuite/25_algorithms/mismatch/constexpr.cc: Likewise.
* testsuite/25_algorithms/move/93872.cc: Likewise.
* testsuite/25_algorithms/move_backward/93872.cc: Include
<iterator>.
* testsuite/25_algorithms/nth_element/constexpr.cc: Include
<functional>.
* testsuite/25_algorithms/partial_sort/constexpr.cc: Likewise.
* testsuite/25_algorithms/partial_sort_copy/constexpr.cc:
Likewise.
* testsuite/25_algorithms/search/constexpr.cc: Likewise.
* testsuite/25_algorithms/search_n/constrained.cc: Likewise.
* testsuite/25_algorithms/set_difference/constexpr.cc: Likewise.
* testsuite/25_algorithms/set_difference/constrained.cc:
Likewise.
* testsuite/25_algorithms/set_intersection/constexpr.cc:
Likewise.
* testsuite/25_algorithms/set_intersection/constrained.cc:
Likewise.
* testsuite/25_algorithms/set_symmetric_difference/constexpr.cc:
Likewise.
* testsuite/25_algorithms/set_union/constexpr.cc: Likewise.
* testsuite/25_algorithms/set_union/constrained.cc: Likewise.
* testsuite/25_algorithms/sort/constexpr.cc: Likewise.
* testsuite/25_algorithms/sort_heap/constexpr.cc: Likewise.
* testsuite/25_algorithms/transform/constrained.cc: Likewise.
* testsuite/25_algorithms/unique/constexpr.cc: Likewise.
* testsuite/25_algorithms/unique/constrained.cc: Likewise.
* testsuite/25_algorithms/unique_copy/constexpr.cc: Likewise.
* testsuite/25_algorithms/upper_bound/constexpr.cc: Likewise.
* testsuite/std/ranges/adaptors/elements.cc: Include <vector>.
* testsuite/std/ranges/adaptors/lazy_split.cc: Likewise.
* testsuite/std/ranges/adaptors/split.cc: Likewise.

2 years agoopenmp: Fix up gomp_affinity_init_numa_domains
Jakub Jelinek [Fri, 18 Mar 2022 10:02:13 +0000 (11:02 +0100)]
openmp: Fix up gomp_affinity_init_numa_domains

On Thu, Nov 11, 2021 at 02:14:05PM +0100, Thomas Schwinge wrote:
> There appears to be yet another issue: there still are quite a number of
> 'FAIL: libgomp.c/places-10.c execution test' reports on
> <gcc-testresults@gcc.gnu.org>.  Also in my testing testing, on a system
> where '/sys/devices/system/node/online' contains '0-1', I get a FAIL:
>
>     [...]
>     OPENMP DISPLAY ENVIRONMENT BEGIN
>       _OPENMP = '201511'
>       OMP_DYNAMIC = 'FALSE'
>       OMP_NESTED = 'FALSE'
>       OMP_NUM_THREADS = '8'
>       OMP_SCHEDULE = 'DYNAMIC'
>       OMP_PROC_BIND = 'TRUE'
>       OMP_PLACES = '{0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30},{FAIL: libgomp.c/places-10.c execution test

I've finally managed to debug this (by dumping used /sys/ files from
an affected system in Fedora build system, replacing /sys/ with /tmp/
in gcc sources and populating those files there); I think the following patch
ought to fix it.

2022-03-18  Jakub Jelinek  <jakub@redhat.com>

* config/linux/affinity.c (gomp_affinity_init_numa_domains): Move seen
variable next to pl variable.

2 years agox86: Correct march=sapphirerapids to base on icelake server
Cui,Lili [Thu, 17 Mar 2022 06:34:49 +0000 (14:34 +0800)]
x86: Correct march=sapphirerapids to base on icelake server

march=sapphirerapids should be based on icelake server not cooperlake.

gcc/ChangeLog:

PR target/104963
* config/i386/i386.h (PTA_SAPPHIRERAPIDS): change it to base on ICX.
* doc/invoke.texi: Update documents for Intel sapphirerapids.

gcc/testsuite/ChangeLog:

PR target/104963
* gcc.target/i386/pr104963.c: New test case.

2 years agoDaily bump.
GCC Administrator [Fri, 18 Mar 2022 00:16:27 +0000 (00:16 +0000)]
Daily bump.

2 years agoanalyzer: fixes to -fdump-analyzer-state-purge
David Malcolm [Thu, 17 Mar 2022 20:08:59 +0000 (16:08 -0400)]
analyzer: fixes to -fdump-analyzer-state-purge

gcc/analyzer/ChangeLog:
* state-purge.cc (state_purge_annotator::add_node_annotations):
Avoid duplicate before-supernode annotations when returning from
an interprocedural call.  Show after-supernode annotations.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

2 years agoanalyzer: fix program_point::get_next for PK_BEFORE_STMT
David Malcolm [Thu, 17 Mar 2022 16:08:44 +0000 (12:08 -0400)]
analyzer: fix program_point::get_next for PK_BEFORE_STMT

gcc/analyzer/ChangeLog:
* program-point.cc (program_point::get_next): Fix missing
increment of index.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

2 years agoPR 90356: Use xor to load const_double 0.0 on SSE (always)
Roger Sayle [Thu, 17 Mar 2022 21:56:32 +0000 (21:56 +0000)]
PR 90356: Use xor to load const_double 0.0 on SSE (always)

Implementations of the x87 floating point instruction set have always
had some pretty strange characteristics.  For example on the original
Intel Pentium the FLDPI instruction (to load 3.14159... into a register)
took 5 cycles, and the FLDZ instruction (to load 0.0) took 2 cycles,
when a regular FLD (load from memory) took just 1 cycle!?  Given that
back then memory latencies were much lower (relatively) than they are
today, these instructions were all but useless except when optimizing
for size (impressively FLDZ/FLDPI require only two bytes).

Such was the world back in 2006 when Uros Bizjak first added support for
fldz https://gcc.gnu.org/pipermail/gcc-patches/2006-November/202589.html
and then shortly after sensibly disabled them for !optimize_size with
https://gcc.gnu.org/pipermail/gcc-patches/2006-November/204405.html

Alas this vestigial logic still persists in the compiler today,
so for example on x86_64 for the following function:

double foo(double x) { return x + 0.0; }

generates with -O2

foo:    addsd   .LC0(%rip), %xmm0
        ret
.LC0:   .long   0
        .long   0

preferring to read the constant 0.0 from memory [the constant pool],
except when optimizing for size.  With -Os we get:

foo:    xorps   %xmm1, %xmm1
        addsd   %xmm1, %xmm0
        ret

Which is not only smaller (the two instructions require seven bytes vs.
eight for the original addsd from mem, even without considering the
constant pool) but is also faster on modern hardware.  The latter code
sequence is generated by both clang and msvc with -O2.  Indeed Agner
Fog documents the set of floating point/SSE constants that it's
cheaper to materialize than to load from memory.
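
As a side note on why the constant has to be materialized at all: under
default IEEE semantics the addition in foo cannot simply be folded away,
because -0.0 + 0.0 yields +0.0 in round-to-nearest mode.  A small sketch
(foo_keeps_sign is a hypothetical helper added for illustration, and the
behaviour assumes no -ffast-math style flags):

```cpp
#include <cmath>

double foo(double x) { return x + 0.0; }

// True when foo returned a value with the same sign bit as its argument.
bool foo_keeps_sign(double x)
{
  return std::signbit(foo(x)) == std::signbit(x);
}
```

For every input except negative zero the sign is preserved, so the add is
observable only in that one case, but that is enough to keep it in the code.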

This patch shuffles the conditions on the i386 backend's *movtf_internal,
*movdf_internal and *movsf_internal define_insns to untangle the newer
TARGET_SSE_MATH clauses from the historical standard_80387_constant_p
conditions.  Amongst the benefits of this are that it improves the code
generated for PR tree-optimization/90356 and resolves PR target/86722.

2022-03-17  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR target/86722
PR tree-optimization/90356
* config/i386/i386.md (*movtf_internal): Don't guard
standard_sse_constant_p clause by optimize_function_for_size_p.
(*movdf_internal): Likewise.
(*movsf_internal): Likewise.

gcc/testsuite/ChangeLog
PR target/86722
PR tree-optimization/90356
* gcc.target/i386/pr86722.c: New test case.
* gcc.target/i386/pr90356.c: New test case.

2 years agoAlways use dominators in the cache when available.
Andrew MacLeod [Thu, 17 Mar 2022 14:52:10 +0000 (10:52 -0400)]
Always use dominators in the cache when available.

This patch adjusts range_from_dom to follow the dominator tree through the
cache until a value is found, then apply any outgoing ranges encountered
along the way.  This reduces the amount of cache storage required.
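
The idea can be sketched with a toy model.  Everything below is hypothetical
(the Range and Block types, the edge_in field, and toy_range_from_dom are not
GCC names); it only illustrates walking up the dominator chain until a cached
range is found while intersecting in the edge ranges met on the way.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Toy closed interval standing in for a value range.
struct Range { int lo, hi; };

Range intersect(Range a, Range b)
{
  return { std::max(a.lo, b.lo), std::min(a.hi, b.hi) };
}

// Hypothetical per-block data; none of these names come from GCC.
struct Block
{
  int idom;                      // immediate dominator, -1 for the entry block
  std::optional<Range> cached;   // range already stored for this block
  std::optional<Range> edge_in;  // range implied by the edge from idom
};

// Walk up the dominator tree from `bb` until a cached range is found,
// intersecting in every edge range encountered along the way.
Range toy_range_from_dom(const std::vector<Block> &blocks, int bb,
                         Range varying)
{
  Range refine = varying;
  for (int b = bb; b != -1; b = blocks[b].idom)
    {
      if (blocks[b].cached)
        return intersect(*blocks[b].cached, refine);
      if (blocks[b].edge_in)
        refine = intersect(refine, *blocks[b].edge_in);
    }
  return refine;  // nothing cached: only the edge ranges apply
}
```

In this model only the block where the value is first known needs a cache
entry; dominated blocks recompute their range from it on demand.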

PR tree-optimization/102943
* gimple-range-cache.cc (ranger_cache::range_from_dom): Find range via
dominators and apply intermediary outgoing edge ranges.

2 years agolibstdc++: Avoid including <algorithm> in <filesystem> [PR92546]
Jonathan Wakely [Thu, 17 Mar 2022 14:36:07 +0000 (14:36 +0000)]
libstdc++: Avoid including <algorithm> in <filesystem> [PR92546]

This only affects Windows, but reduces the preprocessed size of
<filesystem> significantly.

libstdc++-v3/ChangeLog:

PR libstdc++/92546
* include/bits/fs_path.h (path::make_preferred): Use
handwritten loop instead of std::replace.
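
A handwritten loop of this kind might look like the sketch below.  The free
function to_preferred_separators is a hypothetical stand-in; the actual change
lives inside path::make_preferred and is not reproduced here.

```cpp
#include <string>

// Hypothetical stand-in for the body of path::make_preferred on Windows:
// turn every '/' into the preferred separator '\\'.
void to_preferred_separators(std::wstring &pathname)
{
  // Same effect as std::replace(begin, end, L'/', L'\\'), but without
  // needing <algorithm>.
  for (wchar_t &ch : pathname)
    if (ch == L'/')
      ch = L'\\';
}
```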

2 years agolibstdc++: Rewrite __moneypunct_cache::_M_cache [PR104966]
Jonathan Wakely [Thu, 17 Mar 2022 13:33:07 +0000 (13:33 +0000)]
libstdc++: Rewrite __moneypunct_cache::_M_cache [PR104966]

GCC thinks the following can lead to a buffer overflow when __ns.size()
equals zero:

  const basic_string<_CharT>& __ns = __mp.negative_sign();
  _M_negative_sign_size = __ns.size();
  __negative_sign = new _CharT[_M_negative_sign_size];
  __ns.copy(__negative_sign, _M_negative_sign_size);

This happens because operator new might be replaced with something that
writes to this->_M_negative_sign_size and so the basic_string::copy call
could use a non-zero size to write to a zero-length buffer.

The solution suggested by Richi is to cache the size in a local variable
so that the compiler knows it won't be changed between the allocation
and the copy.

This commit goes further and rewrites the whole function to use RAII and
delay all modifications of *this until after all allocations have
succeeded. The RAII helper type caches the size and copies the string
and owns the memory until told to release it.
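
The pattern can be sketched as follows.  The names scoped_copy and toy_cache
are hypothetical and the sketch uses plain char strings; it is not the
libstdc++ implementation, only an illustration of caching the size locally
and delaying stores to *this until every allocation has succeeded.

```cpp
#include <cstddef>
#include <string>

// Hypothetical RAII helper: caches the size, copies the characters, and owns
// the buffer until release() is called.
struct scoped_copy
{
  std::size_t size;  // cached once; a later allocation cannot change it
  char *data;

  explicit scoped_copy(const std::string &s)
    : size(s.size()), data(new char[size])
  { s.copy(data, size); }

  ~scoped_copy() { delete[] data; }

  scoped_copy(const scoped_copy &) = delete;
  scoped_copy &operator=(const scoped_copy &) = delete;

  // Hand ownership of the buffer to the caller.
  char *release()
  {
    char *p = data;
    data = nullptr;
    return p;
  }
};

// Hypothetical cache; fill() is assumed to be called at most once.
struct toy_cache
{
  const char *grouping = nullptr;
  std::size_t grouping_size = 0;
  const char *negative_sign = nullptr;
  std::size_t negative_sign_size = 0;

  ~toy_cache() { delete[] grouping; delete[] negative_sign; }

  void fill(const std::string &g, const std::string &ns)
  {
    // All allocations happen first.  If one throws, the guards free what
    // was already allocated and *this is left untouched.
    scoped_copy g_copy(g);
    scoped_copy ns_copy(ns);

    // Nothing below can throw.
    grouping_size = g_copy.size;
    grouping = g_copy.release();
    negative_sign_size = ns_copy.size;
    negative_sign = ns_copy.release();
  }
};
```

Because the copy uses the helper's own size member rather than a member of
the cache, a replaced operator new has no way to alter the length between the
allocation and the copy.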

libstdc++-v3/ChangeLog:

PR middle-end/104966
* include/bits/locale_facets_nonio.tcc
(__moneypunct_cache::_M_cache): Replace try-catch with RAII and
make all string copies before any stores to *this.

2 years agolibatomic: Improve 16-byte atomics on Intel AVX [PR104688]
Jakub Jelinek [Thu, 17 Mar 2022 17:49:00 +0000 (18:49 +0100)]
libatomic: Improve 16-byte atomics on Intel AVX [PR104688]

As mentioned in the PR, the latest Intel SDM has added:
"Processors that enumerate support for Intel® AVX (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
guarantee that the 16-byte memory operations performed by the following instructions will always be
carried out atomically:
• MOVAPD, MOVAPS, and MOVDQA.
• VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
• VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and k0 (masking disabled).
(Note that these instructions require the linear addresses of their memory operands to be 16-byte
aligned.)"

The following patch deals with it just on the libatomic library side so far;
currently (since ~2017) we emit all the __atomic_* 16-byte builtins as
library calls, and this is something that we can hopefully backport.

The patch simply introduces yet another ifunc variant that takes priority
over the pure CMPXCHG16B one, one that checks AVX and CMPXCHG16B bits and
on non-Intel clears the AVX bit during detection for now (if AMD comes
with the same guarantee, we could revert the config/x86/init.c hunk),
which implements 16-byte atomic load as vmovdqa and 16-byte atomic store
as vmovdqa followed by mfence.
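
The detection step can be approximated with the <cpuid.h> helpers shipped
with GCC.  This is only a sketch of the condition described above, not the
libatomic code: can_use_vmovdqa_atomics is a hypothetical name, and only the
CPUID bits named in the text are tested.

```cpp
#if defined(__x86_64__)
#include <cpuid.h>
#endif

// Hypothetical check: true when the CPU is from Intel and reports both the
// AVX and CMPXCHG16B feature bits.  Always false on other architectures.
bool can_use_vmovdqa_atomics()
{
#if defined(__x86_64__)
  unsigned eax, ebx, ecx, edx;

  // Leaf 0 holds the vendor string; this mirrors clearing AVX on non-Intel.
  if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
    return false;
  bool intel = ebx == signature_INTEL_ebx
               && ecx == signature_INTEL_ecx
               && edx == signature_INTEL_edx;

  // Leaf 1 holds the feature bits in ECX.
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    return false;
  bool avx = (ecx & bit_AVX) != 0;
  bool cx16 = (ecx & bit_CMPXCHG16B) != 0;

  return intel && avx && cx16;
#else
  return false;
#endif
}
```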

2022-03-17  Jakub Jelinek  <jakub@redhat.com>

PR target/104688
* Makefile.am (IFUNC_OPTIONS): Change on x86_64 to -mcx16 -mcx16.
(libatomic_la_LIBADD): Add $(addsuffix _16_2_.lo,$(SIZEOBJS)) for
x86_64.
* Makefile.in: Regenerated.
* config/x86/host-config.h (IFUNC_COND_1): For x86_64 define to
both AVX and CMPXCHG16B bits.
(IFUNC_COND_2): Define.
(IFUNC_NCOND): For x86_64 define to 2 * (N == 16).
(MAYBE_HAVE_ATOMIC_CAS_16, MAYBE_HAVE_ATOMIC_EXCHANGE_16,
MAYBE_HAVE_ATOMIC_LDST_16): Define to IFUNC_COND_2 rather than
IFUNC_COND_1.
(HAVE_ATOMIC_CAS_16): Redefine to 1 whenever IFUNC_ALT != 0.
(HAVE_ATOMIC_LDST_16): Redefine to 1 whenever IFUNC_ALT == 1.
(atomic_compare_exchange_n): Define whenever IFUNC_ALT != 0
on x86_64 for N == 16.
(__atomic_load_n, __atomic_store_n): Redefine whenever IFUNC_ALT == 1
on x86_64 for N == 16.
(atomic_load_n, atomic_store_n): New functions.
* config/x86/init.c (__libat_feat1_init): On x86_64 clear bit_AVX
if CPU vendor is not Intel.

2 years agolibstdc++: Fix comment in testsuite utility
Jonathan Wakely [Thu, 17 Mar 2022 12:23:02 +0000 (12:23 +0000)]
libstdc++: Fix comment in testsuite utility

libstdc++-v3/ChangeLog:

* testsuite/util/testsuite_character.h: Fix comment.

2 years agotree-optimization/104960 - unsplit edges after late sinking
Richard Biener [Thu, 17 Mar 2022 07:10:59 +0000 (08:10 +0100)]
tree-optimization/104960 - unsplit edges after late sinking

Something went wrong when testing the earlier patch to move the
late sinking to before the late phiopt for PR102008.  The following
makes sure to unsplit edges after the late sinking since the split
edges confuse the following phiopt leading to missed optimizations.

I went for a new pass parameter for this to avoid changing the
CFG after the early sinking pass at this point.

2022-03-17  Richard Biener  <rguenther@suse.de>

PR tree-optimization/104960
* passes.def: Add pass parameter to pass_sink_code, mark
last one to unsplit edges.
* tree-ssa-sink.cc (pass_sink_code::set_pass_param): New.
(pass_sink_code::execute): Always execute TODO_cleanup_cfg
when we need to unsplit edges.

* gcc.dg/gimplefe-37.c: Adjust to allow either the true
or false edge to have a forwarder.

2 years agogimplify: Emit clobbers for TARGET_EXPR_SLOT vars later [PR103984]
Jakub Jelinek [Thu, 17 Mar 2022 08:23:45 +0000 (09:23 +0100)]
gimplify: Emit clobbers for TARGET_EXPR_SLOT vars later [PR103984]

As mentioned in the PR, we emit a bogus uninitialized warning but
easily could emit wrong-code for it or similar testcases too.
The bug is that we emit clobber for a TARGET_EXPR_SLOT too early:
          D.2499.e = B::qux (&h); [return slot optimization]
          D.2516 = 1;
          try
            {
              B::B (&D.2498, &h);
              try
                {
                  _2 = baz (&D.2498);
                  D.2499.f = _2;
                  D.2516 = 0;
                  try
                    {
                      try
                        {
                          bar (&D.2499);
                        }
                      finally
                        {
                          C::~C (&D.2499);
                        }
                    }
                  finally
                    {
                      D.2499 = {CLOBBER(eol)};
                    }
                }
              finally
                {
                  D.2498 = {CLOBBER(eol)};
                }
            }
          catch
            {
              if (D.2516 != 0) goto <D.2517>; else goto <D.2518>;
              <D.2517>:
              A::~A (&D.2499.e);
              goto <D.2519>;
              <D.2518>:
              <D.2519>:
            }
The CLOBBER for D.2499 is essentially only emitted on the non-exceptional
path, if B::B or baz throws, then there is no CLOBBER for it but there
is a conditional destructor A::~A (&D.2499.e).  Now, ehcleanup1
sink_clobbers optimization assumes that clobbers in the EH cases are
emitted after last use and so sinks the D.2499 = {CLOBBER(eol)}; later,
so we then have
  # _3 = PHI <1(3), 0(9)>
<L2>:
  D.2499 ={v} {CLOBBER(eol)};
  D.2498 ={v} {CLOBBER(eol)};
  if (_3 != 0)
    goto <bb 11>; [INV]
  else
    goto <bb 15>; [INV]

  <bb 11> :
  _35 = D.2499.a;
  if (&D.2499.b != _35)
where that _35 = D.2499.a comes from inline expansion of the A::~A dtor,
and that is a load from a clobbered memory.

Now, what the gimplifier sees in this case is a CLEANUP_POINT_EXPR with
somewhere inside of it a TARGET_EXPR for D.2499 (with the C::~C (&D.2499)
cleanup) which in its TARGET_EXPR_INITIAL has another TARGET_EXPR for
D.2516 bool flag which has CLEANUP_EH_ONLY which performs that conditional
A::~A (&D.2499.e) call.
The following patch ensures that CLOBBERs (and asan poisoning) are emitted
after even those gimple_push_cleanup pushed cleanups from within the
TARGET_EXPR_INITIAL gimplification (i.e. the last point where the slot could
be in theory used).  In my first version of the patch I've done it by just
moving the
      /* Add a clobber for the temporary going out of scope, like
         gimplify_bind_expr.  */
      if (gimplify_ctxp->in_cleanup_point_expr
          && needs_to_live_in_memory (temp))
        {
...
        }
block earlier in gimplify_target_expr, but that regressed a couple of tests
where temp is marked TREE_ADDRESSABLE only during (well, very early during
that) the gimplification of TARGET_EXPR_INITIAL, so we didn't emit e.g. on
pr80032.C or stack2.C tests any clobbers for the slots and thus stack slot
reuse wasn't performed.
So that we don't regress those tests, this patch gimplifies
TARGET_EXPR_INITIAL as before, but instead of emitting it directly into pre_p,
emits it into a temporary sequence.  It then emits the CLOBBER cleanup
into pre_p, then asan poisoning if needed, then appends the
TARGET_EXPR_INITIAL temporary sequence and finally adds TARGET_EXPR_CLEANUP
gimple_push_cleanup.  The earlier a GIMPLE_WCE appears in the sequence, the
further out its try/finally or try/catch is.
So, with this patch the part of the testcase in gimple dump cited above
looks instead like:
          try
            {
              D.2499.e = B::qux (&h); [return slot optimization]
              D.2516 = 1;
              try
                {
                  try
                    {
                      B::B (&D.2498, &h);
                      _2 = baz (&D.2498);
                      D.2499.f = _2;
                      D.2516 = 0;
                      try
                        {
                          bar (&D.2499);
                        }
                      finally
                        {
                          C::~C (&D.2499);
                        }
                    }
                  finally
                    {
                      D.2498 = {CLOBBER(eol)};
                    }
                }
              catch
                {
                  if (D.2516 != 0) goto <D.2517>; else goto <D.2518>;
                  <D.2517>:
                  A::~A (&D.2499.e);
                  goto <D.2519>;
                  <D.2518>:
                  <D.2519>:
                }
            }
          finally
            {
              D.2499 = {CLOBBER(eol)};
            }

2022-03-17  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/103984
* gimplify.cc (gimplify_target_expr): Gimplify type sizes and
TARGET_EXPR_INITIAL into a temporary sequence, then push clobbers
and asan unpoisoning, then append the temporary sequence and
finally the TARGET_EXPR_CLEANUP clobbers.

* g++.dg/opt/pr103984.C: New test.

2 years agoEnhance further testcases to verify OpenACC 'kernels' decomposition
Thomas Schwinge [Wed, 16 Mar 2022 13:19:41 +0000 (14:19 +0100)]
Enhance further testcases to verify OpenACC 'kernels' decomposition

gcc/testsuite/
* c-c++-common/goacc-gomp/nesting-1.c: Enhance.
* c-c++-common/goacc/kernels-loop-g.c: Likewise.
* c-c++-common/goacc/nesting-1.c: Likewise.
* gcc.dg/goacc/nested-function-1.c: Likewise.
* gfortran.dg/goacc/common-block-3.f90: Likewise.
* gfortran.dg/goacc/nested-function-1.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
Enhance.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise.
* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.

2 years agoEnhance further testcases to verify handling of OpenACC privatization level [PR90115]
Thomas Schwinge [Wed, 16 Mar 2022 11:15:01 +0000 (12:15 +0100)]
Enhance further testcases to verify handling of OpenACC privatization level [PR90115]

As originally introduced in commit 11b8286a83289f5b54e813f14ff56d730c3f3185
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

PR middle-end/90115
gcc/testsuite/
* c-c++-common/goacc-gomp/nesting-1.c: Enhance.
* gfortran.dg/goacc/common-block-3.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Enhance.
* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.

2 years agoDaily bump.
GCC Administrator [Thu, 17 Mar 2022 00:17:00 +0000 (00:17 +0000)]
Daily bump.

2 years agoFix strange binary corruption with last commit.
Roger Sayle [Wed, 16 Mar 2022 23:28:21 +0000 (23:28 +0000)]
Fix strange binary corruption with last commit.

2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/sse.md: Delete corrupt character/typo.

2 years agoPR c/98198: ICE-on-invalid-code error recovery.
Roger Sayle [Wed, 16 Mar 2022 23:20:34 +0000 (23:20 +0000)]
PR c/98198: ICE-on-invalid-code error recovery.

This is Christophe Lyon's fix to PR c/98198, an ICE-on-invalid-code
regression affecting mainline, and a suitable testcase.
Tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
with no new failures.  Ok for mainline?

2022-03-16  Christophe Lyon  <christophe.lyon@arm.com>
    Roger Sayle  <roger@nextmovesoftware.com>

gcc/c-family/ChangeLog
PR c/98198
* c-attribs.cc (decl_or_type_attrs): Add error_mark_node check.

gcc/testsuite/ChangeLog
PR c/98198
* gcc.dg/pr98198.c: New test case.

2 years agoPR target/94680: Clear upper bits of V2DF using movq (like V2DI).
Roger Sayle [Wed, 16 Mar 2022 23:15:20 +0000 (23:15 +0000)]
PR target/94680: Clear upper bits of V2DF using movq (like V2DI).

This simple i386 patch unblocks a more significant change.  The testcase
gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and
alas the fix for PR target/94680 doesn't (yet) handle V2DF mode.

For the first test from sse2-pr94680.c, below

v2df foo_v2df (v2df x) {
  return __builtin_shuffle (x, (v2df) { 0, 0 }, (v2di) { 0, 2 });
}

GCC on x86_64-pc-linux-gnu with -O2 currently generates:

        movhpd  .LC0(%rip), %xmm0
        ret
.LC0:
        .long   0
        .long   0

which passes the test as it contains a mov insn and no xor.
Alas reading a zero from the constant pool isn't quite the
desired implementation.  With this patch we now generate:

        movq    %xmm0, %xmm0
        ret

This is the same code as we generate for V2DI; the patch also adds a
stricter test case.  This implementation generalizes the sse2_movq128
pattern to V2DI and V2DF modes using a VI8F_128 mode iterator and
renames it *sse2_movq128_<mode>.  A new define_expand is
introduced for sse2_movq128 so that the existing builtin
interface (CODE_FOR_sse2_movq128) remains the same.

2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR target/94680
* config/i386/sse.md (sse2_movq128): New define_expand to
preserve previous named instruction.
(*sse2_movq128_<mode>): Renamed from sse2_movq128, and
generalized to VI8F_128 (both V2DI and V2DF).

gcc/testsuite/ChangeLog
PR target/94680
* gcc.target/i386/sse2-pr94680-2.c: New stricter V2DF test case.

2 years agolibstdc++: Fix symbol versioning for Solaris 11.3 [PR103407]
Jonathan Wakely [Wed, 16 Mar 2022 20:35:47 +0000 (20:35 +0000)]
libstdc++: Fix symbol versioning for Solaris 11.3 [PR103407]

The new std::from_chars implementation means that those symbols are now
defined on Solaris 11.3, which lacks uselocale. They were not present in
gcc-11, but the linker script gives them the GLIBCXX_3.4.29 symbol
version because that is the version where they appeared for systems with
uselocale.

This makes the version for those symbols depend on whether uselocale is
available or not, so that they get version GLIBCXX_3.4.30 on targets
where they weren't defined in gcc-11.

In order to avoid needing separate ABI baseline files for Solaris 11.3
and 11.4, the ABI checker program now treats the floating-point
std::from_chars overloads as undesignated if they are not found in the
baseline symbols file. This means they can be left out of the Solaris
baseline without causing the check-abi target to fail.

libstdc++-v3/ChangeLog:

PR libstdc++/103407
* config/abi/pre/gnu.ver: Make version for std::from_chars
depend on HAVE_USELOCALE macro.
* testsuite/util/testsuite_abi.cc (compare_symbols): Treat
std::from_chars for floating-point types as undesignated if
not found in the baseline symbols file.

2 years agolibgo: update to final Go 1.18 release
Ian Lance Taylor [Wed, 16 Mar 2022 17:31:57 +0000 (10:31 -0700)]
libgo: update to final Go 1.18 release

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/393377

2 years agoanalyzer: early rejection of disabled warnings [PR104955]
David Malcolm [Wed, 16 Mar 2022 14:54:44 +0000 (10:54 -0400)]
analyzer: early rejection of disabled warnings [PR104955]

Avoid generating execution paths for warnings that are ultimately
rejected due to -Wno-analyzer-* flags.

This improves the test case from taking at least several minutes
(before I killed it) to taking under a second.

This doesn't fix the slowdown seen in PR analyzer/104955 with large
numbers of warnings when the warnings are still enabled.
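
The early-rejection idea can be sketched with a toy model.  All names below
(option_id, toy_diagnostic, toy_manager and the two diagnostic kinds) are
hypothetical simplifications; only the notion of a per-diagnostic
get_controlling_option query and a counter of disabled diagnostics is taken
from the description in this commit.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Hypothetical option ids; the real code uses GCC's own option handling.
enum option_id { OPT_double_free, OPT_file_leak };

struct toy_diagnostic
{
  virtual ~toy_diagnostic() = default;
  // Which warning option controls this diagnostic.
  virtual option_id get_controlling_option() const = 0;
};

struct toy_double_free : toy_diagnostic
{
  option_id get_controlling_option() const override { return OPT_double_free; }
};

struct toy_file_leak : toy_diagnostic
{
  option_id get_controlling_option() const override { return OPT_file_leak; }
};

class toy_manager
{
public:
  explicit toy_manager(std::set<option_id> disabled)
    : m_disabled(std::move(disabled)) {}

  // Reject up front anything whose controlling option is disabled, so that
  // no expensive path is ever built for it.  Returns true if it was saved.
  bool add_diagnostic(std::unique_ptr<toy_diagnostic> d)
  {
    if (m_disabled.count(d->get_controlling_option()) != 0)
      {
        ++m_num_disabled;
        return false;
      }
    m_saved.push_back(std::move(d));
    return true;
  }

  int num_disabled() const { return m_num_disabled; }
  std::size_t num_saved() const { return m_saved.size(); }

private:
  std::set<option_id> m_disabled;
  std::vector<std::unique_ptr<toy_diagnostic>> m_saved;
  int m_num_disabled = 0;
};
```

Asking each diagnostic for its controlling option at the point where it is
added is what lets the manager drop it before any further work is done.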

gcc/analyzer/ChangeLog:
PR analyzer/104955
* diagnostic-manager.cc (get_emission_location): New.
(diagnostic_manager::diagnostic_manager): Initialize
m_num_disabled_diagnostics.
(diagnostic_manager::add_diagnostic): Reject diagnostics that
will eventually be rejected due to being disabled.
(diagnostic_manager::emit_saved_diagnostics): Log the number
of disabled diagnostics.
(diagnostic_manager::emit_saved_diagnostic): Split out logic for
determining emission location to get_emission_location.
* diagnostic-manager.h
(diagnostic_manager::m_num_disabled_diagnostics): New field.
* engine.cc (stale_jmp_buf::get_controlling_option): New.
(stale_jmp_buf::emit): Use it.
* pending-diagnostic.h
(pending_diagnostic::get_controlling_option): New vfunc.
* region-model.cc
(poisoned_value_diagnostic::get_controlling_option): New.
(poisoned_value_diagnostic::emit): Use it.
(shift_count_negative_diagnostic::get_controlling_option): New.
(shift_count_negative_diagnostic::emit): Use it.
(shift_count_overflow_diagnostic::get_controlling_option): New.
(shift_count_overflow_diagnostic::emit): Use it.
(dump_path_diagnostic::get_controlling_option): New.
(dump_path_diagnostic::emit): Use it.
(write_to_const_diagnostic::get_controlling_option): New.
(write_to_const_diagnostic::emit): Use it.
(write_to_string_literal_diagnostic::get_controlling_option): New.
(write_to_string_literal_diagnostic::emit): Use it.
* sm-file.cc (double_fclose::get_controlling_option): New.
(double_fclose::emit): Use it.
(file_leak::get_controlling_option): New.
(file_leak::emit): Use it.
* sm-malloc.cc (mismatching_deallocation::get_controlling_option):
New.
(mismatching_deallocation::emit): Use it.
(double_free::get_controlling_option): New.
(double_free::emit): Use it.
(possible_null_deref::get_controlling_option): New.
(possible_null_deref::emit): Use it.
(possible_null_arg::get_controlling_option): New.
(possible_null_arg::emit): Use it.
(null_deref::get_controlling_option): New.
(null_deref::emit): Use it.
(null_arg::get_controlling_option): New.
(null_arg::emit): Use it.
(use_after_free::get_controlling_option): New.
(use_after_free::emit): Use it.
(malloc_leak::get_controlling_option): New.
(malloc_leak::emit): Use it.
(free_of_non_heap::get_controlling_option): New.
(free_of_non_heap::emit): Use it.
* sm-pattern-test.cc (pattern_match::get_controlling_option): New.
(pattern_match::emit): Use it.
* sm-sensitive.cc
(exposure_through_output_file::get_controlling_option): New.
(exposure_through_output_file::emit): Use it.
* sm-signal.cc (signal_unsafe_call::get_controlling_option): New.
(signal_unsafe_call::emit): Use it.
* sm-taint.cc (tainted_array_index::get_controlling_option): New.
(tainted_array_index::emit): Use it.
(tainted_offset::get_controlling_option): New.
(tainted_offset::emit): Use it.
(tainted_size::get_controlling_option): New.
(tainted_size::emit): Use it.
(tainted_divisor::get_controlling_option): New.
(tainted_divisor::emit): Use it.
(tainted_allocation_size::get_controlling_option): New.
(tainted_allocation_size::emit): Use it.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/many-disabled-diagnostics.c: New test.
* gcc.dg/plugin/analyzer_gil_plugin.c
(gil_diagnostic::get_controlling_option): New.
(double_save_thread::emit): Use it.
(fncall_without_gil::emit): Likewise.
(pyobject_usage_without_gil::emit): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2 years agolibstdc++: Ensure that std::from_chars is declared when supported
Jonathan Wakely [Fri, 11 Mar 2022 14:36:18 +0000 (14:36 +0000)]
libstdc++: Ensure that std::from_chars is declared when supported

This adjusts the declarations in <charconv> to match when the definition
is present. This solves the issue that std::from_chars is present on
Solaris 11.3 (using fast_float) but was not declared in the header
(because the declarations were guarded by _GLIBCXX_HAVE_USELOCALE).

Additionally, do not define __cpp_lib_to_chars unless both from_chars
and to_chars are supported (which is only true for IEEE float and
double). We might still provide from_chars (via strtold) but if to_chars
isn't provided, we shouldn't define the feature test macro.
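
As a rough illustration of how user code might react to the feature test macro described above, here is a minimal sketch. The helper name parse_double and the strtod fallback are assumptions made for this example and are not part of libstdc++. Note that it is deliberately conservative: as the message says, from_chars may still be provided when the macro is not defined.

```cpp
#include <charconv>      // std::from_chars, __cpp_lib_to_chars
#include <cstdlib>       // std::strtod (fallback path)
#include <string>
#include <system_error>  // std::errc

// Hypothetical helper (not part of libstdc++): parse the whole string as a
// double, returning false if it is not a complete, valid number.
bool parse_double(const std::string& text, double& out)
{
#if defined(__cpp_lib_to_chars) && __cpp_lib_to_chars >= 201611L
  // The macro is defined, so floating-point from_chars is expected to exist.
  const char* first = text.data();
  const char* last = first + text.size();
  const std::from_chars_result res = std::from_chars(first, last, out);
  return res.ec == std::errc() && res.ptr == last;
#else
  // The macro is absent: fall back to strtod, which is locale-dependent.
  if (text.empty())
    return false;
  char* end = nullptr;
  const double value = std::strtod(text.c_str(), &end);
  if (end != text.c_str() + text.size())
    return false;
  out = value;
  return true;
#endif
}
```

Either branch leaves out untouched when parsing fails, so callers can keep a default value in it.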

Finally, this simplifies some of the preprocessor checks in the bodies
of std::from_chars in src/c++17/floating_from_chars.cc and hoists the
repeated code for the strtod version into a new function template.

N.B. the long double overload of std::from_chars will always be defined
if the float and double overloads are defined. We can always use one of
strtold or fast_float's binary64 routines (although the latter might
produce errors for some long double values if they are not representable
as binary64).

libstdc++-v3/ChangeLog:

* include/std/charconv (__cpp_lib_to_chars): Only define when
both from_chars and to_chars are supported for floating-point
types.
(from_chars, to_chars): Adjust preprocessor conditions guarding
declarations.
* include/std/version (__cpp_lib_to_chars): Adjust condition to
match <charconv> definition.
* src/c++17/floating_from_chars.cc (from_chars_strtod): New
function template.
(from_chars): Simplify preprocessor checks and use
from_chars_strtod when appropriate.

2 years agotree-optimization/104941: Actually assign the conversion result
Siddhesh Poyarekar [Wed, 16 Mar 2022 15:15:47 +0000 (20:45 +0530)]
tree-optimization/104941: Actually assign the conversion result

Assign the result of fold_convert to offset.  Also make the useless
conversion check lighter since the two way check is not needed here.

gcc/ChangeLog:

PR tree-optimization/104941
* tree-object-size.cc (size_for_offset): Make useless conversion
check lighter and assign result of fold_convert to OFFSET.

gcc/testsuite/ChangeLog:

PR tree-optimization/104941
* gcc.dg/builtin-dynamic-object-size-0.c (S1, S2): New structs.
(test_alloc_nested_structs, g): New functions.
(main): Call test_alloc_nested_structs.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2 years agoOpenMP, Fortran: Bugfix for omp_set_num_teams.
Marcel Vollweiler [Wed, 16 Mar 2022 14:38:54 +0000 (07:38 -0700)]
OpenMP, Fortran: Bugfix for omp_set_num_teams.

This patch fixes a small bug in the omp_set_num_teams implementation.

libgomp/ChangeLog:

* fortran.c (omp_set_num_teams_8_): Call omp_set_num_teams instead of
omp_set_max_active_levels.
* testsuite/libgomp.fortran/icv-8.f90: New test.

2 years agox86: Also check _SOFT_FLOAT in <x86gprintrin.h>
H.J. Lu [Sun, 13 Mar 2022 15:57:51 +0000 (08:57 -0700)]
x86: Also check _SOFT_FLOAT in <x86gprintrin.h>

Push target("general-regs-only") in <x86gprintrin.h> if x87 is enabled.

gcc/

PR target/104890
* config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before
pushing target("general-regs-only").

gcc/testsuite/

PR target/104890
* gcc.target/i386/pr104890.c: New test.

2 years agoRISC-V: Add version info for zk, zkn and zks
Kito Cheng [Tue, 15 Mar 2022 01:04:03 +0000 (09:04 +0800)]
RISC-V: Add version info for zk, zkn and zks

Previously we only expanded `zk`, `zkn` and `zks`, but version info is
needed to combine them back.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_ext_version_table):
Add version info for zk, zks and zkn.

2 years agoRISC-V: Handle combine extension in canonical ordering.
LiaoShihua [Tue, 8 Mar 2022 03:30:51 +0000 (11:30 +0800)]
RISC-V: Handle combine extension in canonical ordering.

The crypto extension has several shorthand extensions that don't add any extra instructions.
Take zk for example: it implies zkn, zkr and zkt.
Those 3 extensions should also combine back into zk to maintain the canonical order in ISA strings.
This patch addresses the above.
If other extensions are in the same situation, they can be added to riscv_combine_info[].

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_combine_info): New.
(riscv_subset_list::handle_combine_ext): Combine back into zk to
maintain the canonical order in isa strings.
(riscv_subset_list::parse): Ditto.
* config/riscv/riscv-subset.h (handle_combine_ext): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/predef-17.c: New test.

2 years agotree-optimization/102008 - restore if-conversion of adjacent loads
Richard Biener [Wed, 16 Mar 2022 12:39:31 +0000 (13:39 +0100)]
tree-optimization/102008 - restore if-conversion of adjacent loads

The following re-orders the newly added code sinking pass before
the last phiopt pass which performs hoisting of adjacent loads
with the intent to enable if-conversion on those.

I've added the aarch64 specific testcase from the PR.

2022-03-16  Richard Biener  <rguenther@suse.de>

PR tree-optimization/102008
* passes.def: Move the added code sinking pass before the
preceding phiopt pass.

* gcc.target/aarch64/pr102008.c: New testcase.

2 years agoc++: further lookup_member simplification
Patrick Palka [Wed, 16 Mar 2022 12:26:11 +0000 (08:26 -0400)]
c++: further lookup_member simplification

As a minor followup to r12-7656-gffe9c0a0d3564a, this condenses the
handling of ambiguity and access w.r.t. the value of 'protect' so that
the logic is more clear.

gcc/cp/ChangeLog:

* search.cc (lookup_member): Simplify by handling all values
of protect together in the ambiguous case.  Don't modify protect.

2 years agoc++: fold calls to std::move/forward [PR96780]
Patrick Palka [Wed, 16 Mar 2022 12:25:54 +0000 (08:25 -0400)]
c++: fold calls to std::move/forward [PR96780]

A well-formed call to std::move/forward is equivalent to a cast, but the
former being a function call means the compiler generates debug info,
which persists even after the call gets inlined, for an operation that's
never interesting to debug.

This patch addresses this problem by folding calls to std::move/forward
and other cast-like functions into simple casts as part of the frontend's
general expression folding routine.  This behavior is controlled by a
new flag -ffold-simple-inlines, and otherwise by -fno-inline, so that
users can enable this folding with -O0 (which implies -fno-inline).
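
To make the "equivalent to a cast" point above concrete, the following sketch spells out such a cast-like function by hand. The name cast_like_move is hypothetical and only used for illustration; it is not the folding code in cp_fold.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::move: the whole body is one static_cast,
// which is why a call to it can be replaced by the cast itself.
template <typename T>
constexpr typename std::remove_reference<T>::type&&
cast_like_move(T&& t) noexcept
{
  return static_cast<typename std::remove_reference<T>::type&&>(t);
}
```

Both std::move and this helper yield the same type for an lvalue argument, so replacing the call with the cast should not change the meaning of a well-formed program.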

After this patch with -O2 and a non-checking compiler, debug info size
for some testcases from range-v3 and cmcstl2 decreases by as much as ~10%
and overall compile time and memory usage decreases by ~2%.

PR c++/96780

gcc/ChangeLog:

* doc/invoke.texi (C++ Dialect Options): Document
-ffold-simple-inlines.

gcc/c-family/ChangeLog:

* c.opt: Add -ffold-simple-inlines.

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold) <case CALL_EXPR>: Fold calls to
std::move/forward and other cast-like functions into simple
casts.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr96780.C: New test.

2 years agotree-optimization/104942: Retain sizetype conversions till the end
Siddhesh Poyarekar [Wed, 16 Mar 2022 10:40:51 +0000 (16:10 +0530)]
tree-optimization/104942: Retain sizetype conversions till the end

Retain the sizetype alloc_object_size to guarantee the assertion in
size_for_offset and to avoid adding a conversion there.  nop conversions
are eliminated at the end anyway in dynamic object size computation.

gcc/ChangeLog:

PR tree-optimization/104942
* tree-object-size.cc (alloc_object_size): Remove STRIP_NOPS.

gcc/testsuite/ChangeLog:

PR tree-optimization/104942
* gcc.dg/builtin-dynamic-object-size-0.c (alloc_func_long,
test_builtin_malloc_long): New functions.
(main): Use it.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2 years agoaarch64: Fix up RTL sharing bug in aarch64_load_symref_appropriately [PR104910]
Jakub Jelinek [Wed, 16 Mar 2022 10:04:16 +0000 (11:04 +0100)]
aarch64: Fix up RTL sharing bug in aarch64_load_symref_appropriately [PR104910]

We unshare all RTL created during expansion, but when
aarch64_load_symref_appropriately is called after expansion like in the
following testcases, we use imm in both HIGH and LO_SUM operands.
If imm is some RTL that shouldn't be shared like a non-sharable CONST,
we get at least with --enable-checking=rtl a checking ICE, otherwise might
just get silently wrong code.

The following patch fixes that by copying it if it can't be shared.

2022-03-16  Jakub Jelinek  <jakub@redhat.com>

PR target/104910
* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately): Copy
imm rtx.

* gcc.dg/pr104910.c: New test.

2 years agoPerformance/size improvement to single_use when matching GIMPLE.
Roger Sayle [Wed, 16 Mar 2022 09:27:33 +0000 (09:27 +0000)]
Performance/size improvement to single_use when matching GIMPLE.

This patch improves the implementation of single_use as used in code
generated from match.pd for patterns using :s.  The current implementation
contains the logic "has_zero_uses (t) || has_single_use (t)" which
performs a loop over the uses to first check if there are zero non-debug
uses [which is rare], then another loop over these uses to check if there
is exactly one non-debug use.  This can be better implemented using a
single loop.
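
The single-loop idea described above can be sketched as follows. The Use type and the function name are simplified, hypothetical stand-ins; GCC's real implementation iterates over immediate uses of an SSA name rather than a std::vector.

```cpp
#include <vector>

// Hypothetical model of one use of a value; only the debug flag matters here.
struct Use
{
  bool is_debug;
};

// One pass over the uses: true when there are zero or one non-debug uses,
// i.e. the same answer as "has_zero_uses (t) || has_single_use (t)".
inline bool at_most_one_nondebug_use(const std::vector<Use>& uses)
{
  bool seen_nondebug = false;
  for (const Use& u : uses)
    {
      if (u.is_debug)
        continue;            // debug uses never count
      if (seen_nondebug)
        return false;        // second non-debug use: stop early
      seen_nondebug = true;
    }
  return true;
}
```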

This function is currently inlined over 800 times in gimple-match.cc,
whose .o on x86_64-pc-linux-gnu is now up to 30 Mbytes, so speeding up
and shrinking this function should help offset the growth in match.pd
for GCC 12.

I've also done an analysis of the stage3 sizes of gimple-match.o on
x86_64-pc-linux-gnu, which I believe is dominated by debug information,
the .o file is 30MB in stage3, but only 4.8M in stage2.  Before my
proposed patch gimple-match.o is 31385160 bytes.  The patch as proposed
yesterday (using a single loop in single_use) reduces that to 31105040
bytes, saving 280120 bytes.  The suggestion to remove the "inline"
keyword saves only 56 more bytes, but annotating ATTRIBUTE_PURE on a
function prototype was curiously effective, saving 1888 bytes.

before:   31385160
after:    31105040 saved 280120
-inline:  31104984 saved 56
+pure:    31103096 saved 1888

2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>
    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
* gimple-match-head.cc (single_use): Implement inline using a
single loop.

2 years agoSome minor HONOR_NANS improvements to match.pd
Roger Sayle [Wed, 16 Mar 2022 09:25:34 +0000 (09:25 +0000)]
Some minor HONOR_NANS improvements to match.pd

Tweak the constant folding of X CMP X when X can't be a NaN.
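
For readers less familiar with why NaN matters here, this small sketch shows the IEEE behaviour that the folding has to respect. The function names are invented for the example, and the second function only approximates LTGT with two ordinary comparisons.

```cpp
#include <limits>

// X == X: true for every value except NaN.
inline bool equals_itself(double x)
{
  return x == x;
}

// Approximation of X LTGT X (less than or greater than): false for every
// value, NaN included.
inline bool less_or_greater_than_itself(double x)
{
  return x < x || x > x;
}
```

This assumes default floating-point semantics; options such as -ffast-math let the compiler assume NaNs never occur.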

2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* match.pd (X CMP X -> true): Test tree_expr_maybe_nan_p
instead of HONOR_NANS.
(X LTGT X -> false): Enable if X is not tree_expr_maybe_nan_p, as
this can't trap/signal.

2 years agoOpenACC privatization diagnostics vs. 'assert' [PR102841]
Thomas Schwinge [Wed, 16 Mar 2022 07:02:39 +0000 (08:02 +0100)]
OpenACC privatization diagnostics vs. 'assert' [PR102841]

It's an orthogonal concern why these diagnostics do appear at all for
non-offloaded OpenACC constructs (where they're not relevant at all); PR90115.

Depending on how 'assert' is implemented, it may cause temporaries to be
created, and/or may lower into 'COND_EXPR's, and
'gcc/gimplify.cc:gimplify_cond_expr' uses 'create_tmp_var (type, "iftmp")'.

Fix-up for commit 11b8286a83289f5b54e813f14ff56d730c3f3185
"[OpenACC privatization] Largely extend diagnostics and
corresponding testsuite coverage [PR90115]".

PR testsuite/102841
libgomp/
* testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Adjust.

2 years agoDon't fold __builtin_ia32_blendvpd w/o sse4.2.
liuhongt [Wed, 16 Mar 2022 07:59:57 +0000 (15:59 +0800)]
Don't fold __builtin_ia32_blendvpd w/o sse4.2.

__builtin_ia32_blendvpd is defined under sse4.1 and gimple folded
to ((v2di) c) < 0 ? b : a where vec_cmpv2di is under sse4.2 w/o which
it's veclowered to scalar operations and not combined back in rtl.

gcc/ChangeLog:

PR target/104946
* config/i386/i386-builtin.def (BDESC): Add
CODE_FOR_sse4_1_blendvpd for IX86_BUILTIN_BLENDVPD.
* config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold
__builtin_ia32_blendvpd w/o sse4.2

gcc/testsuite/ChangeLog:

* gcc.target/i386/sse4_1-blendvpd-1.c: New test.

2 years agoMAINTAINERS: Add myself to DCO section
Chung-Ju Wu [Wed, 16 Mar 2022 03:20:00 +0000 (03:20 +0000)]
MAINTAINERS: Add myself to DCO section

ChangeLog:

* MAINTAINERS: Add myself to DCO section.

2 years agoDaily bump.
GCC Administrator [Wed, 16 Mar 2022 00:16:44 +0000 (00:16 +0000)]
Daily bump.

2 years agoanalyzer: add test coverage for PR 95000
David Malcolm [Tue, 15 Mar 2022 21:56:29 +0000 (17:56 -0400)]
analyzer: add test coverage for PR 95000

PR analyzer/95000 isn't fixed yet; add test coverage with XFAILs.

gcc/testsuite/ChangeLog:
PR analyzer/95000
* gcc.dg/analyzer/pr95000-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2 years agoanalyzer: presize m_cluster_map in store copy ctor
David Malcolm [Tue, 15 Mar 2022 21:55:14 +0000 (17:55 -0400)]
analyzer: presize m_cluster_map in store copy ctor

Testing cc1 on pr93032-mztools-unsigned-char.c

Benchmark #1: (without patch)
  Time (mean ± σ):     338.8 ms ±  13.6 ms    [User: 323.2 ms, System: 14.2 ms]
  Range (min … max):   326.7 ms … 363.1 ms    10 runs

Benchmark #2: (with patch)
  Time (mean ± σ):     332.3 ms ±  12.8 ms    [User: 316.6 ms, System: 14.3 ms]
  Range (min … max):   322.5 ms … 357.4 ms    10 runs

Summary
  ./cc1.new ran 1.02 ± 0.06 times faster than ./cc1.old

gcc/analyzer/ChangeLog:
* store.cc (store::store): Presize m_cluster_map.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2 years agors6000: Fix invalid address passed to __builtin_mma_disassemble_acc [PR104923]
Peter Bergner [Tue, 15 Mar 2022 13:46:47 +0000 (08:46 -0500)]
rs6000: Fix invalid address passed to __builtin_mma_disassemble_acc [PR104923]

The mma_disassemble_output_operand predicate is too lenient on the types
of addresses it will accept, leading to combine creating invalid addresses
that eventually lead to ICEs in LRA.  The solution is to restrict the
addresses to indirect, indexed or those valid for quad memory accesses.

2022-03-15  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/104923
* config/rs6000/predicates.md (mma_disassemble_output_operand): Restrict
acceptable MEM addresses.

gcc/testsuite/
PR target/104923
* gcc.target/powerpc/pr104923.c: New test.

2 years agoc++: extraneous access error with ambiguous lookup [PR103177]
Patrick Palka [Tue, 15 Mar 2022 12:50:24 +0000 (08:50 -0400)]
c++: extraneous access error with ambiguous lookup [PR103177]

When a lookup is ambiguous, lookup_member still attempts to check
access of the first member found before diagnosing the ambiguity and
propagating the error, and this may cause us to issue an extraneous
access error as in the testcase below (for B1::foo).

This patch fixes this by swapping the order of the ambiguity and access
checks within lookup_member.  In passing, since the only thing that could
go wrong during lookup_field_r is ambiguity, we might as well hardcode
that in lookup_member and get rid of lookup_field_info::errstr.

PR c++/103177

gcc/cp/ChangeLog:

* search.cc (lookup_field_info::errstr): Remove this data
member.
(lookup_field_r): Don't set errstr.
(lookup_member): Check ambiguity before checking access.
Simplify accordingly after errstr removal.  Exit early upon
error or empty result.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/ambig6.C: New test.

2 years agoriscv: Allow -Wno-psabi to turn off ABI warnings [PR91229]
Jakub Jelinek [Tue, 15 Mar 2022 12:34:33 +0000 (13:34 +0100)]
riscv: Allow -Wno-psabi to turn off ABI warnings [PR91229]

While checking if all targets honor -Wno-psabi for ABI related warnings
or messages, I found that almost all do, except for riscv.
In the testsuite when we want to ignore ABI related messages we
typically use -Wno-psabi -w, but it would be nice to get rid of those
-w uses eventually.

The following allows silencing those warnings with -Wno-psabi rather than
just -w even on riscv.

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

PR target/91229
* config/riscv/riscv.cc (riscv_pass_aggregate_in_fpr_pair_p,
riscv_pass_aggregate_in_fpr_and_gpr_p): Pass OPT_Wpsabi instead of 0
to warning calls.

2 years agoi386: Use no-mmx,no-sse for LIBGCC2_UNWIND_ATTRIBUTE [PR104890]
Jakub Jelinek [Tue, 15 Mar 2022 09:24:22 +0000 (10:24 +0100)]
i386: Use no-mmx,no-sse for LIBGCC2_UNWIND_ATTRIBUTE [PR104890]

Regardless of the outcome of the general-regs-only stuff in x86gprintrin.h,
apparently general-regs-only is a much bigger hammer than no-sse, and e.g.
using 387 instructions in the unwinder isn't a big deal, it never needs
to realign the stack because of it.

So, the following patch uses no-sse (and adds no-mmx to it, even when not
strictly needed).

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

PR target/104890
* config/i386/i386.h (LIBGCC2_UNWIND_ATTRIBUTE): Use no-mmx,no-sse
instead of general-regs-only.

2 years agoPR tree-optimization/101895: Fold VEC_PERM to help recognize FMA.
Roger Sayle [Tue, 15 Mar 2022 09:05:28 +0000 (09:05 +0000)]
PR tree-optimization/101895: Fold VEC_PERM to help recognize FMA.

This patch resolves PR tree-optimization/101895, a missed optimization
regression, by adding a constant folding simplification to match.pd to
simplify the transform "mult; vec_perm; plus" into "vec_perm; mult; plus"
with the aim that keeping the multiplication and addition next to each
other allows them to be recognized as fused-multiply-add on suitable
targets.  This transformation requires a tweak to match.pd's
vec_same_elem_p predicate to handle CONSTRUCTOR_EXPRs using the same
SSA_NAME_DEF_STMT idiom used for constructors elsewhere in match.pd.

The net effect is that the following code example:

void foo(float * __restrict__ a, float b, float *c) {
  a[0] = c[0]*b + a[0];
  a[1] = c[2]*b + a[1];
  a[2] = c[1]*b + a[2];
  a[3] = c[3]*b + a[3];
}

when compiled on x86_64-pc-linux-gnu with -O2 -march=cascadelake
currently generates:

        vbroadcastss    %xmm0, %xmm0
        vmulps  (%rsi), %xmm0, %xmm0
        vpermilps       $216, %xmm0, %xmm0
        vaddps  (%rdi), %xmm0, %xmm0
        vmovups %xmm0, (%rdi)
        ret

but with this patch now generates the improved:

        vpermilps       $216, (%rsi), %xmm1
        vbroadcastss    %xmm0, %xmm0
        vfmadd213ps     (%rdi), %xmm0, %xmm1
        vmovups %xmm1, (%rdi)
        ret
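
The reordering can also be checked with a scalar model. This is only a sketch: Vec4, Sel4 and the function names are made up for the example, and the selector {0, 2, 1, 3} corresponds to the $216 immediate in the assembly above. The two orders agree because the multiplier is the same in every lane, which is what the vec_same_elem_p check is there to establish.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Sel4 = std::array<std::size_t, 4>;

// Single-input vec_perm model: result[i] = v[sel[i]].
inline Vec4 permute(const Vec4& v, const Sel4& sel)
{
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = v[sel[i]];
  return r;
}

// "mult; vec_perm; plus": multiply first, then permute the product.
inline Vec4 mult_perm_plus(const Vec4& c, float b, const Vec4& a,
                           const Sel4& sel)
{
  Vec4 prod{};
  for (std::size_t i = 0; i < 4; ++i)
    prod[i] = c[i] * b;
  const Vec4 shuffled = permute(prod, sel);
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = shuffled[i] + a[i];
  return r;
}

// "vec_perm; mult; plus": permute the input, keeping multiply and add
// adjacent so that they can be fused.
inline Vec4 perm_mult_plus(const Vec4& c, float b, const Vec4& a,
                           const Sel4& sel)
{
  const Vec4 shuffled = permute(c, sel);
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = shuffled[i] * b + a[i];
  return r;
}
```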

2022-03-15  Roger Sayle  <roger@nextmovesoftware.com>
    Marc Glisse  <marc.glisse@inria.fr>
    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
PR tree-optimization/101895
* match.pd (vec_same_elem_p): Handle CONSTRUCTOR_EXPR def.
(plus (vec_perm (mult ...) ...) ...): New reordering simplification.

gcc/testsuite/ChangeLog
PR tree-optimization/101895
* gcc.target/i386/pr101895.c: New test case.

2 years agoc++: Fix up cp_parser_skip_to_pragma_eol [PR104623]
Jakub Jelinek [Tue, 15 Mar 2022 08:15:27 +0000 (09:15 +0100)]
c++: Fix up cp_parser_skip_to_pragma_eol [PR104623]

We ICE on the following testcase, because we tentatively parse it multiple
times and the erroneous attribute syntax results in
cp_parser_skip_to_end_of_statement, which when seeing CPP_PRAGMA (can be
any deferred one, OpenMP/OpenACC/ivdep etc.) calls
cp_parser_skip_to_pragma_eol, which calls cp_lexer_purge_tokens_after.
That call purges all the tokens from CPP_PRAGMA until CPP_PRAGMA_EOL,
excluding the initial CPP_PRAGMA though (but including the final
CPP_PRAGMA_EOL).  This means the second time we parse this, we see
CPP_PRAGMA with no tokens after it from the pragma, most importantly
not the CPP_PRAGMA_EOL, so either if it is the last pragma in the TU,
we ICE, or if there are other pragmas we treat everything in between
as a pragma.

I've tried various things, including making the CPP_PRAGMA token
itself also purged, or changing the cp_parser_skip_to_end_of_statement
(and cp_parser_skip_to_end_of_block_or_statement) to call it with
NULL instead of token, so that this purging isn't done there,
but each patch resulted in lots of regressions.
But removing the purging altogether surprisingly doesn't regress anything,
and I think it is the right thing, if we e.g. parse tentatively, why can't
we parse the pragma multiple times or at least skip over it?

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

PR c++/104623
* parser.cc (cp_parser_skip_to_pragma_eol): Don't purge any tokens.

* g++.dg/gomp/pr104623.C: New test.

2 years agoifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]
Jakub Jelinek [Tue, 15 Mar 2022 08:12:03 +0000 (09:12 +0100)]
ifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]

find_if_case_{1,2} implicitly assumes conditional jumps and rewrites them,
so if they have extra side-effects or are say asm goto, things don't work
well, either the side-effects are lost or we could ICE.
In particular, the testcase below on s390x has there a doloop instruction
that decrements a register in addition to testing it for non-zero and
conditionally jumping based on that.

The following patch fixes that by punting for !onlyjump_p case, i.e.
if there are side-effects in the jump instruction or it isn't a plain PC
setter.

Also, it assumes BB_END (test_bb) will be always non-NULL, because basic
blocks with 2 non-abnormal successor edges should always have some instruction
at the end that determines which edge to take.

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/104814
* ifcvt.cc (find_if_case_1, find_if_case_2): Punt if test_bb doesn't
end with onlyjump_p.  Assume BB_END (test_bb) is always non-NULL.

* gcc.c-torture/execute/pr104814.c: New test.

2 years agoAvoid -Wdangling-pointer for by-transparent-reference arguments [PR104436].
Martin Sebor [Tue, 15 Mar 2022 00:23:08 +0000 (18:23 -0600)]
Avoid -Wdangling-pointer for by-transparent-reference arguments [PR104436].

This change avoids -Wdangling-pointer for by-value arguments transformed
into by-transparent-reference.

Resolves:
PR middle-end/104436 - spurious -Wdangling-pointer assigning local address to a class passed by value

gcc/ChangeLog:

PR middle-end/104436
* gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores):
Check for warning suppression.  Avoid by-value arguments transformed
into by-transparent-reference.

gcc/testsuite/ChangeLog:

PR middle-end/104436
* c-c++-common/Wdangling-pointer-8.c: New test.
* g++.dg/warn/Wdangling-pointer-5.C: New test.

2 years agoDaily bump.
GCC Administrator [Tue, 15 Mar 2022 00:16:49 +0000 (00:16 +0000)]
Daily bump.

2 years agoUpdate gcc de.po, fr.po, sv.po
Joseph Myers [Mon, 14 Mar 2022 22:28:33 +0000 (22:28 +0000)]
Update gcc de.po, fr.po, sv.po

* de.po, fr.po, sv.po: Update.

2 years agoFix libitm.c/memset-1.c test fails with new peephole2s.
Roger Sayle [Mon, 14 Mar 2022 18:12:55 +0000 (18:12 +0000)]
Fix libitm.c/memset-1.c test fails with new peephole2s.

My sincere apologies for the breakage, but alas handling SImode in the
recently added "xorl;movb -> movzbl" peephole2 turns out to be slightly
more complicated than just using SWI48 as a mode iterator.  I'd failed
to check the machine description carefully, but the *zero_extend<mode>si2
define_insn is conditionally defined, based on x86 target tuning using
TARGET_ZERO_EXTEND_WITH_AND, and therefore unavailable on 486 and pentium
unless optimizing the code for size.  It turns out that the libitm testsuite
specifies -m486 with make check RUNTESTFLAGS="--target_board='unix{-m32}'"
and therefore catches this oversight.

Fixed by adding the appropriate conditions to the new peephole2 patterns.

2022-03-14  Roger Sayle  <roger@nextmovesoftware.com>
    Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/i386.md (peephole2 xorl;movb -> movzbl): Disable
transformation when *zero_extend<mode>si2 is not available.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr98335.c: Skip this test if tuning for i486
or pentium, and not optimizing for size.

2 years agoEnable libsanitizer build on mips64
Xi Ruoyao [Fri, 11 Mar 2022 03:07:00 +0000 (11:07 +0800)]
Enable libsanitizer build on mips64

Bootstrapped and regtested on mips64-linux-gnuabi64.

bootstrap-ubsan revealed 3 bugs (PR 104842, 104843, 104851).
bootstrap-asan did not reveal any new bug.

gcc/

* config/mips/mips.h (SUBTARGET_SHADOW_OFFSET): Define.
* config/mips/mips.cc (mips_option_override): Make
-fsanitize=address imply -fasynchronous-unwind-tables.  This is
needed by libasan for stack backtrace on MIPS.
(mips_asan_shadow_offset): Return SUBTARGET_SHADOW_OFFSET.

gcc/testsuite:

* c-c++-common/asan/global-overflow-1.c: Skip for MIPS with some
optimization levels because inaccurate debug info is causing
dg-output mismatch on line numbers.
* g++.dg/asan/large-func-test-1.C: Likewise.

libsanitizer/

* configure.tgt: Enable build on mips*64*-*-linux*.

2 years agolibsanitizer: cherry-pick db7bca28638e from upstream
Xi Ruoyao [Fri, 11 Mar 2022 02:59:29 +0000 (10:59 +0800)]
libsanitizer: cherry-pick db7bca28638e from upstream

libsanitizer/

* sanitizer_common/sanitizer_atomic_clang.h: Ensure that
sanitizer_atomic_clang_mips.h is only included for O32.

2 years agolra: Fix up debug_p handling in lra_substitute_pseudo [PR104778]
Jakub Jelinek [Mon, 14 Mar 2022 13:49:09 +0000 (14:49 +0100)]
lra: Fix up debug_p handling in lra_substitute_pseudo [PR104778]

The following testcase ICEs on powerpc-linux, because lra_substitute_pseudo
substitutes (const_int 1) into a subreg operand.  First a subreg of subreg
of a reg appears in a debug insn (which surely is invalid outside of
debug insns, but in debug insns we allow even what is normally invalid in
RTL like subregs which the target doesn't like, because either dwarf2out
is able to handle it, or we just throw away the location expression,
making some var <optimized out>).

lra_substitute_pseudo already has some code to deal with specifically
SUBREG of REG with the REG being substituted for VOIDmode constant,
but that doesn't cover this case, so the following patch extends
lra_substitute_pseudo for debug_p mode to treat such operands the way e.g.
combiner's subst function does, to ensure we don't lose the mode, which is
essential for the IL.

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

PR debug/104778
* lra.cc (lra_substitute_pseudo): For debug_p mode, simplify
SUBREG, ZERO_EXTEND, SIGN_EXTEND, FLOAT or UNSIGNED_FLOAT if recursive
call simplified the first operand into VOIDmode constant.

* gcc.target/powerpc/pr104778.c: New test.

2 years agolibstdc++: Fix reading UTF-8 characters for 16-bit targets [PR104875]
Jonathan Wakely [Fri, 11 Mar 2022 14:52:38 +0000 (14:52 +0000)]
libstdc++: Fix reading UTF-8 characters for 16-bit targets [PR104875]

The current code in read_utf8_code_point assumes that integer promotion
will create a 32-bit int, but that's not true for 16-bit targets like
msp430 and avr. This changes the intermediate variables used for each
octet from unsigned char to char32_t, so that (c << N) works correctly
when N > 8.
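
The effect of holding each octet in a char32_t can be sketched with a simplified decoder. This is not the libstdc++ read_utf8_code_point code: the function name and the invalid_code_point sentinel are assumptions for the example, and overlong and surrogate checks are left out for brevity.

```cpp
#include <cstddef>

// Hypothetical sentinel for malformed or truncated input.
constexpr char32_t invalid_code_point = static_cast<char32_t>(-1);

inline bool is_continuation(char32_t c)
{
  return (c & 0xC0) == 0x80;
}

// Simplified UTF-8 decoder for one code point.  Every octet is stored in a
// char32_t before shifting, so (c << 12) or (c << 18) is evaluated in a type
// of at least 32 bits even on targets where int is 16 bits wide.
inline char32_t decode_utf8(const unsigned char* p, std::size_t n)
{
  if (n == 0)
    return invalid_code_point;
  const char32_t c1 = p[0];
  if (c1 < 0x80)
    return c1;
  if ((c1 & 0xE0) == 0xC0 && n >= 2)
    {
      const char32_t c2 = p[1];
      if (!is_continuation(c2))
        return invalid_code_point;
      return ((c1 & 0x1F) << 6) | (c2 & 0x3F);
    }
  if ((c1 & 0xF0) == 0xE0 && n >= 3)
    {
      const char32_t c2 = p[1], c3 = p[2];
      if (!is_continuation(c2) || !is_continuation(c3))
        return invalid_code_point;
      return ((c1 & 0x0F) << 12) | ((c2 & 0x3F) << 6) | (c3 & 0x3F);
    }
  if ((c1 & 0xF8) == 0xF0 && n >= 4)
    {
      const char32_t c2 = p[1], c3 = p[2], c4 = p[3];
      if (!is_continuation(c2) || !is_continuation(c3)
          || !is_continuation(c4))
        return invalid_code_point;
      return ((c1 & 0x07) << 18) | ((c2 & 0x3F) << 12)
             | ((c3 & 0x3F) << 6) | (c4 & 0x3F);
    }
  return invalid_code_point;
}
```

With unsigned char octets the shifts would instead be done after promotion to int, which is where a 16-bit int loses the upper bits.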

libstdc++-v3/ChangeLog:

PR libstdc++/104875
* src/c++11/codecvt.cc (read_utf8_code_point): Use char32_t to
hold octets that will be left-shifted.

2 years agotop-level: Fix comment about --enable-libstdcxx in configure
Jonathan Wakely [Fri, 11 Mar 2022 18:33:40 +0000 (18:33 +0000)]
top-level: Fix comment about --enable-libstdcxx in configure

The custom option for enabling/disabling libstdc++ is not spelled the
same as the directory name:

AC_ARG_ENABLE(libstdcxx,
AS_HELP_STRING([--disable-libstdcxx],
  [do not build libstdc++-v3 directory])

The comment referring to it later uses the wrong name.

ChangeLog:

* configure.ac: Fix incorrect option in comment.
* configure: Regenerate.

2 years agoc++: Reject __builtin_clear_padding on non-trivially-copyable types with one exceptio...
Jakub Jelinek [Mon, 14 Mar 2022 09:47:38 +0000 (10:47 +0100)]
c++: Reject __builtin_clear_padding on non-trivially-copyable types with one exception [PR102586]

As mentioned by Jason in the PR, non-trivially-copyable (or non-POD
for purposes of layout?) types can be base classes of derived classes in
which the padding in those non-trivially-copyable types can be reused for
some real data members or even the layout can change and data members can
be moved to other positions.
__builtin_clear_padding is right now used for multiple purposes:
in <atomic>, where it isn't used yet but was planned as the main spot,
it can be used for trivially copyable types only, ditto for std::bit_cast
where we also use it.  It is used for OpenMP long double atomics too but
long double is trivially copyable, and lastly for -ftrivial-auto-var-init=.

The following patch restricts the builtin to pointers to trivially-copyable
types, with the exception when it is called directly on an address of a
variable, in that case already the FE can verify it is the complete object
type and so it is safe to clear all the paddings in it.

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/102586
gcc/
* doc/extend.texi (__builtin_clear_padding): Clarify that for C++
argument type should be pointer to trivially-copyable type unless it
is address of a variable or parameter.
gcc/cp/
* call.cc (build_cxx_call): Diagnose __builtin_clear_padding where
first argument's type is pointer to non-trivially-copyable type unless
it is address of a variable or parameter.
gcc/testsuite/
* g++.dg/cpp2a/builtin-clear-padding1.C: New test.

2 years agoi386: Fix up _mm_loadu_si{16,32} [PR99754]
Jakub Jelinek [Mon, 14 Mar 2022 09:44:38 +0000 (10:44 +0100)]
i386: Fix up _mm_loadu_si{16,32} [PR99754]

These intrinsics are supposed to do an unaligned may_alias load
of a 16-bit or 32-bit value and store it as the first element of
a 128-bit integer vector, with all other elements cleared.
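
The intended semantics described above can be modelled portably, without any x86 headers. This is only a sketch of the behaviour, not the emmintrin.h implementation; Vec4i32 and loadu_si32_model are names invented for the example.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical model of a 128-bit vector of four 32-bit integers.
using Vec4i32 = std::array<std::int32_t, 4>;

// Model of the intended _mm_loadu_si32 behaviour: read 32 bits from a
// possibly unaligned address without violating aliasing rules, place them in
// element 0, and clear all other elements.
inline Vec4i32 loadu_si32_model(const void* p)
{
  std::int32_t value;
  std::memcpy(&value, p, sizeof value);  // unaligned, aliasing-safe load
  return Vec4i32{{value, 0, 0, 0}};
}
```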

The current _mm_storeu_* implementation implements that correctly, uses
__*_u types to do the store and extracts the first element of a vector into
it.
But _mm_loadu_si{16,32} gets it all wrong.  It performs an aligned
non-may_alias load and because _mm_set_epi{16,32} has the args reversed,
it also inserts it into the last vector element instead of first.

The following patch fixes that.

Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
for _mm_loadu_si16 it strangely says SSE.  But the intrinsic
returns __m128i, which is only defined in emmintrin.h, and
_mm_set_epi16 is also only SSE2 and later in emmintrin.h.
Even clang defines it in emmintrin.h and ends up with inlining
failure when calling _mm_loadu_si16 from sse,no-sse2 function.
So, isn't that a bug in the intrinsic guide instead?

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

PR target/99754
* config/i386/emmintrin.h (_mm_loadu_si32): Put loaded value into
first rather than last element of the vector, use __m32_u to do
a really unaligned load, use just 0 instead of (int)0.
(_mm_loadu_si16): Put loaded value into first rather than last
element of the vector, use __m16_u to do a really unaligned load,
use just 0 instead of (short)0.

* gcc.target/i386/pr99754-1.c: New test.
* gcc.target/i386/pr99754-2.c: New test.

2 years agoSpelling fix - cannott -> cannot [PR104899]
Jakub Jelinek [Mon, 14 Mar 2022 09:40:47 +0000 (10:40 +0100)]
Spelling fix - cannott -> cannot [PR104899]

This fixes typos and while changing that, also uses %< %> around attribute
names and fixes up formatting.

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

PR other/104899
* config/bfin/bfin.cc (bfin_handle_longcall_attribute): Fix a typo
in diagnostic message - cannott -> cannot.  Use %< and %> around
names of attribute.  Avoid too long line.
* range-op.cc (operator_logical_and::op1_range): Fix up a typo
in comment - cannott -> cannot.  Use 2 spaces after . instead of one.

2 years agoDon't fold builtin into gimple when isa mismatches.
liuhongt [Thu, 24 Feb 2022 06:42:14 +0000 (14:42 +0800)]
Don't fold builtin into gimple when isa mismatches.

The patch fixes an ICE in ix86_gimple_fold_builtin.

gcc/ChangeLog:

PR target/104666
* config/i386/i386-expand.cc
(ix86_check_builtin_isa_match): New func.
(ix86_expand_builtin): Move code to
ix86_check_builtin_isa_match and call it.
* config/i386/i386-protos.h
(ix86_check_builtin_isa_match): Declare.
* config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold
builtin into gimple when isa mismatches.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104666.c: New test.

2 years agoDaily bump.
GCC Administrator [Mon, 14 Mar 2022 00:16:20 +0000 (00:16 +0000)]
Daily bump.

2 years agod: Merge upstream dmd 02a3fafc6, druntime 26b58167, phobos 16cb085b5.
Iain Buclaw [Sun, 13 Mar 2022 11:28:05 +0000 (12:28 +0100)]
d: Merge upstream dmd 02a3fafc6, druntime 26b58167, phobos 16cb085b5.

D front-end changes:

    - Import dmd v2.099.0.
    - The deprecation period for D1-style operators has ended, any use
      of the D1 overload operators will now result in a compiler error.
    - `scope' as a type constraint on class, struct, union, and enum
      declarations has been deprecated.
    - Fix segmentation fault when emplacing a new front-end Expression
      node during CTFE (PR104835).

D runtime changes:

    - Import druntime v2.099.0.
    - Fix C bindings for stdint types (PR104738).
    - Fix bus error when allocating new array on the GC (PR104742).
    - Fix bus error when allocating new pointer on the GC (PR104745).

Phobos changes:

    - Import phobos v2.099.0.
    - New function `bind' in `std.functional'.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 02a3fafc6.
* dmd/VERSION: Update version to v2.099.0.
* imports.cc (ImportVisitor::visit (EnumDeclaration *)): Don't cache
decl in front-end AST node.
(ImportVisitor::visit (AggregateDeclaration *)): Likewise.
(ImportVisitor::visit (ClassDeclaration *)): Likewise.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 26b58167.
* src/MERGE: Merge upstream phobos 16cb085b5.

2 years agotexi + c-target.def: Fix typos
Tobias Burnus [Sun, 13 Mar 2022 09:23:07 +0000 (10:23 +0100)]
texi + c-target.def: Fix typos

gcc/c-family/ChangeLog:

* c-target.def (check_string_object_format_arg): Fix description typo.

gcc/ChangeLog:

* doc/invoke.texi: Fix typos.
* doc/tm.texi.in: Remove duplicated word.
* doc/tm.texi: Regenerate.

libgomp/ChangeLog:

* libgomp.texi: Fix typo.

2 years agoDaily bump.
GCC Administrator [Sun, 13 Mar 2022 00:16:20 +0000 (00:16 +0000)]
Daily bump.

2 years agoc++: naming a dependently-scoped template for CTAD [PR104641]
Patrick Palka [Sat, 12 Mar 2022 20:00:51 +0000 (15:00 -0500)]
c++: naming a dependently-scoped template for CTAD [PR104641]

In order to be able to perform CTAD for a dependently-scoped template
(such as A<T>::B in the testcase below), we need to permit a
typename-specifier to resolve to a template as per [dcl.type.simple]/3,
at least when it appears in a CTAD-enabled context.

This patch implements this using a new tsubst flag tf_tst_ok to control
when a TYPENAME_TYPE is allowed to name a template, and sets this flag
when substituting into the type of a CAST_EXPR, CONSTRUCTOR or VAR_DECL
(each of which is a CTAD-enabled context).

PR c++/104641

gcc/cp/ChangeLog:

* cp-tree.h (tsubst_flags::tf_tst_ok): New flag.
* decl.cc (make_typename_type): Allow a typename-specifier to
resolve to a template when tf_tst_ok, in which case return
a CTAD placeholder for the template.
* pt.cc (tsubst_decl) <case VAR_DECL>: Set tf_tst_ok when
substituting the type.
(tsubst): Clear tf_tst_ok and remember if it was set.
<case TYPENAME_TYPE>: Pass tf_tst_ok to make_typename_type
appropriately.
(tsubst_copy) <case CAST_EXPR>: Set tf_tst_ok when substituting
the type.
(tsubst_copy_and_build) <case CAST_EXPR>: Likewise.
<case CONSTRUCTOR>: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction107.C: New test.

2 years agoc++: ICE with bad conversion shortcutting [PR104622]
Patrick Palka [Sat, 12 Mar 2022 20:00:49 +0000 (15:00 -0500)]
c++: ICE with bad conversion shortcutting [PR104622]

When shortcutting bad argument conversions during overload resolution,
we assume conversions get computed in sequential order and that therefore
the conversion array is incomplete iff the last conversion is missing.
But this assumption turns out to be wrong for templates, because during
deduction check_non_deducible_conversion can compute an argument
conversion out of order.

So in the testcase below, at the end of add_template_candidate the
conversion array looks like {bad_conv, NULL, good_conv} where the last
conversion was computed during deduction and the first one later from
add_function_candidate.  We need to add this candidate to bad_fns since
not all of its argument conversions were computed, but we don't do so
because the last conversion isn't missing.

This patch fixes this by checking for a missing conversion exhaustively
instead.  In passing, this cleans up check_non_deducible_conversion given
that the only values of 'strict' we expect to see here are the enumerators
of unification_kind_t.

PR c++/104622

gcc/cp/ChangeLog:

* call.cc (missing_conversion_p): Define.
(add_candidates): Use it.
* pt.cc (check_non_deducible_conversion): Change type of strict
parameter to unification_kind_t and directly test for DEDUCE_CALL.

gcc/testsuite/ChangeLog:

* g++.dg/template/conv18.C: New test.
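
A minimal sketch of the difference between the two completeness checks described in this commit message, assuming a plain vector of pointers as a model of the conversion array.  This is not the code from call.cc; Conv, last_conversion_missing and any_conversion_missing are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a computed argument conversion.
// A null pointer in the array means "not computed yet".
struct Conv {
  bool bad;
};

// The old assumption: conversions are computed in order, so the array is
// incomplete if and only if its last slot is still empty.
bool last_conversion_missing(const std::vector<const Conv *> &convs) {
  return !convs.empty() && convs.back() == nullptr;
}

// The exhaustive check: any empty slot means not all argument
// conversions were computed, regardless of its position.
bool any_conversion_missing(const std::vector<const Conv *> &convs) {
  for (const Conv *c : convs)
    if (c == nullptr)
      return true;
  return false;
}
```

For an array shaped like {bad_conv, NULL, good_conv}, the last-slot check reports the array as complete while the exhaustive check reports a missing conversion, which is the case the commit message says must be added to bad_fns.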

2 years agoc++: return-type-req in constraint using only outer tparms [PR104527]
Patrick Palka [Sat, 12 Mar 2022 20:00:40 +0000 (15:00 -0500)]
c++: return-type-req in constraint using only outer tparms [PR104527]

Here the template context for the atomic constraint has two levels of
template parameters, but since it depends only on the innermost parameter
T we use a single-level argument vector (built by get_mapped_args) during
substitution into the atom.  We eventually pass this vector to
do_auto_deduction as part of checking the return-type-requirement within
the atom, but do_auto_deduction expects outer_targs to be a full set of
arguments for the sake of satisfaction.

This patch fixes this by making get_mapped_args always return an
argument vector whose depth corresponds to the template depth of the
context in which the atomic constraint expression was written, instead
of the highest parameter level that the expression happens to use.

PR c++/104527

gcc/cp/ChangeLog:

* constraint.cc (normalize_atom): Set
ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P appropriately.
(get_mapped_args): Make static, adjust parameters.  Always
return a vector whose depth corresponds to the template depth of
the context of the atomic constraint expression.  Micro-optimize
by passing false as exact to safe_grow_cleared and by collapsing
a multi-level depth-one argument vector.
(satisfy_atom): Adjust call to get_mapped_args and
diagnose_atomic_constraint.
(diagnose_atomic_constraint): Replace map parameter with an args
parameter.
* cp-tree.h (ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P): Define.
(get_mapped_args): Remove declaration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-return-req4.C: New test.
