Richard Biener [Thu, 5 Dec 2019 13:02:57 +0000 (13:02 +0000)]
re PR tree-optimization/92818 (Typo in vec_perm -> bit_insert pattern)
2019-12-05 Richard Biener <rguenther@suse.de>
PR middle-end/92818
* tree-ssa-forwprop.c (simplify_vector_constructor): Improve
heuristics on what don't care element to choose.
* match.pd (VEC_PERM_EXPR -> BIT_INSERT_EXPR): Fix typo.
Jonathan Wakely [Thu, 5 Dec 2019 12:46:50 +0000 (12:46 +0000)]
libstdc++: Define std::lexicographical_compare_three_way for C++20
* include/bits/stl_algobase.h (lexicographical_compare_three_way):
Define for C++20.
* testsuite/25_algorithms/lexicographical_compare_three_way/1.cc: New
test.
* testsuite/25_algorithms/lexicographical_compare_three_way/
constexpr.cc: New test.
Jakub Jelinek [Thu, 5 Dec 2019 09:04:24 +0000 (10:04 +0100)]
re PR target/92791 (ICE in extract_insn, at recog.c:2311 since r278645)
PR target/92791
* config/i386/i386.md (movstrict<mode>): Move test for
TARGET_PARTIAL_REG_STALL and not optimizing for size from
expander's condition to the body - FAIL; in that case.
Jakub Jelinek [Thu, 5 Dec 2019 09:03:34 +0000 (10:03 +0100)]
re PR fortran/92781 (ICE in convert_nonlocal_reference_op, at tree-nested.c:1065)
PR fortran/92781
* trans-decl.c (gfc_get_symbol_decl): If sym->backend_decl is
current_function_decl, add length to current rather than parent
function and expect DECL_CONTEXT (length) to be current_function_decl.
Jonathan Wakely [Thu, 5 Dec 2019 00:42:06 +0000 (00:42 +0000)]
libstdc++: Implement spaceship for std::array (P1614R2)
As done for std::pair, this defines operator<=> as a non-member function
template and does not alter operator==, as expected to be proposed as
the resolution to an unpublished LWG issue.
Instead of calling std::lexicographical_compare_three_way the <=>
overload is implemented by hand to take advantage of the fact the
element types and array sizes are known to be the same.
* include/bits/cpp_type_traits.h (__is_byte<char8_t>): Add
specialization.
* include/std/array (operator<=>): Likewise.
* testsuite/23_containers/array/comparison_operators/constexpr.cc:
Test three-way comparisons and arrays of unsigned char.
* testsuite/23_containers/array/tuple_interface/get_neg.cc: Adjust
dg-error line numbers.
Joseph Myers [Wed, 4 Dec 2019 23:26:10 +0000 (23:26 +0000)]
Fix C handling of use of lvalues of incomplete types (PR c/36941, PR c/88827).
Bug 88827 points out that GCC should not be rejecting C code that
dereferences a pointer to an incomplete type in the case that uses &*
to take the address of the resulting lvalue, because no constraint is
violated in that case (other than for C90 when the incomplete type is
unqualified void, which we already handle correctly) and as the lvalue
never gets converted to an rvalue there is no undefined behavior
either.
This means that the diagnostic for such a dereference is bogus and
should be removed; if the lvalue gets converted to an rvalue, there
should be an appropriate error later for the use of the incomplete
type. In most cases, there is, but bug 36941 points out the lack of a
diagnostic when the incomplete (non-void) type gets cast to void
(where a diagnostic seems appropriate for this undefined behavior as a
matter of quality of implementation).
This patch removes the bogus diagnostic (and C_TYPE_ERROR_REPORTED
which was only used in the code that is removed - only that one, bogus
diagnostic had this duplicate suppression, not any of the other, more
legitimate diagnostics for use of incomplete types) and makes
convert_lvalue_to_rvalue call require_complete_type for arguments not
of void types, so that all relevant code paths (possibly except some
for ObjC) get incomplete types diagnosed. It's possible that this
makes some other checks for incomplete types obsolete, but no attempt
is made to remove any such checks.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
PR c/36941
PR c/88827
gcc/c:
* c-typeck.c (convert_lvalue_to_rvalue): Call
require_complete_type for arguments not of void types.
(build_indirect_ref): Do not diagnose dereferencing pointers to
incomplete types.
* c-tree.h (C_TYPE_ERROR_REPORTED): Remove.
Peter Bergner [Wed, 4 Dec 2019 19:53:26 +0000 (19:53 +0000)]
Do not define builtins that overload disabled builtins.
PR bootstrap/92661
* config/rs6000/rs6000-c.c (struct altivec_builtin_types): Move to
rs6000.h.
(altivec_overloaded_builtins): Move to rs6000-call.c.
* config/rs6000/rs6000.h (struct altivec_builtin_types): Moved from
rs6000-c.c.
* config/rs6000/rs6000-call.c (rs6000_builtin_info): Make static.
(altivec_overloaded_builtins): Moved from rs6000-c.c.
(rs6000_common_init_builtins): Do no define builtins that overload
builtins that have been disabled.
Wilco Dijkstra [Wed, 4 Dec 2019 15:40:41 +0000 (15:40 +0000)]
[ARM] Improve max_cond_insns setting for Cortex cores
To enable cores to use the correct max_cond_insns setting, use the core-specific
tuning when a CPU/tune is selected unless -mrestrict-it is explicitly set.
On Cortex-A57 this gives 1.1% performance gain on SPECINT2006 as well as a
0.4% codesize reduction.
gcc/
* config/arm/arm.c (arm_option_override_internal):
Use max_cond_insns from CPU tuning unless -mrestrict-it is used.
Wilco Dijkstra [Wed, 4 Dec 2019 14:45:59 +0000 (14:45 +0000)]
[AArch64] Add support for fused compare and branch
Add support for fused compare with branch. Rename the existing
AARCH64_FUSE_CMP_BRANCH to ALU_BRANCH, and AARCH64_FUSE_ALU_BRANCH
to ALU_CBZ to make it clear what is being fused.
gcc/
* config/aarch64/aarch64.c
(thunderxt88_tunings): Use AARCH64_FUSE_ALU_BRANCH.
(thunderx_tunings): Likewise.
(tsv110_tunings): Use AARCH64_FUSE_ALU_BRANCH and AARCH64_FUSE_ALU_CBZ.
(thunderx2t99_tunings): Likewise.
(aarch_macro_fusion_pair_p): Add support for AARCH64_FUSE_CMP_BRANCH.
* config/aarch64/aarch64-fusion-pairs.def: Add ALU_CBZ fusion.
Richard Biener [Wed, 4 Dec 2019 13:21:39 +0000 (13:21 +0000)]
tree-ssa-sccvn.c (vn_reference_lookup_3): Properly guard empty CTOR and memset partial-def registering.
2019-12-04 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (vn_reference_lookup_3): Properly guard
empty CTOR and memset partial-def registering. Take advantage
of fancy offset analysis in memset handling.
In r278410 I added code to handle VIEW_CONVERT_EXPRs between
variable-length vectors. This included support for decoding
a VECTOR_BOOLEAN_TYPE_P with subbyte elements.
However, it turns out that we were already mishandling such bool vectors
for fixed-length vectors: we treated each element as a stand-alone byte
instead of putting multiple elements into the same byte. I think in
principle this could have been an issue for AVX512 as well.
This patch adds encoding support for boolean vectors and reuses
a version of the new decode support for fixed-length vectors.
2019-12-04 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* fold-const.c (native_encode_vector_part): Handle
VECTOR_BOOLEAN_TYPE_Ps that have subbyte precision.
(native_decode_vector_tree): Delete, moving the bulk of the code to...
(native_interpret_vector_part): ...this new function. Use a pointer
and length instead of a vec<> and start index.
(native_interpret_vector): Use native_interpret_vector_part.
(fold_view_convert_vector_encoding): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/whilelt_5.c: New test.
Richard Biener [Wed, 4 Dec 2019 12:23:58 +0000 (12:23 +0000)]
tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): Handle non-constant defs in the most trivial way.
2019-12-04 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): Handle
non-constant defs in the most trivial way.
(vn_reference_lookup_3): Also push down SSA partial defs.
[C++] Opt out of GNU vector extensions for built-in SVE types
This is the C++ equivalent of r277950. The changes are very similar
to there. Perhaps the only noteworthy thing (that I know of) is that
the patch continues to treat !gnu_vector_type_p vector types as literal
types/potential constexprs. Disabling the GNU vector extensions
shouldn't in itself stop the types from being literal types, since
whatever the target provides instead might be constexpr material.
2019-12-04 Richard Sandiford <richard.sandiford@arm.com>
gcc/cp/
* cp-tree.h (CP_AGGREGATE_TYPE_P): Check for gnu_vector_type_p
instead of VECTOR_TYPE.
* call.c (build_conditional_expr_1): Restrict vector handling
to vectors that satisfy gnu_vector_type_p.
* cvt.c (ocp_convert): Only allow vectors to be converted
to bool if they satisfy gnu_vector_type_p.
(build_expr_type_conversion): Only allow conversions from
vectors if they satisfy gnu_vector_type_p.
* typeck.c (cp_build_binary_op): Only allow binary operators to be
applied to vectors if they satisfy gnu_vector_type_p.
(cp_build_unary_op): Likewise unary operators.
(build_reinterpret_cast_1):
gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/gnu_vectors_1.C: New test.
* g++.target/aarch64/sve/acle/general-c++/gnu_vectors_2.C: New test.
Kewen Lin [Wed, 4 Dec 2019 05:10:46 +0000 (05:10 +0000)]
[rs6000] Fix PR92760 by checking VECTOR_MEM_NONE_P instead
PR92760 exposed one issue that VECTOR_UNIT_NONE_P (V2DImode) is true on Power7
then we won't return it as preferred_simd_mode but ISA 2.06 (Power7) does
introduce partial support on vector doubleword (very limitted) and more basic
support origins from ISA 2.07 (Power8) though. To make vectorizer still
leverage those few but available V2DImode related instructions, we need to
claim it's available on VSX (Power7 and up).
gcc/ChangeLog
PR target/92760
* gcc/config/rs6000/rs6000.c (rs6000_preferred_simd_mode): Use
VECTOR_MEM_NONE_P instead of VECTOR_UNIT_NONE_P.
Jonathan Wakely [Tue, 3 Dec 2019 23:57:46 +0000 (23:57 +0000)]
libstdc++: Implement spaceship for std::pair (P1614R2)
This defines operator<=> as a non-member function template and does not
alter operator==. This contradicts the changes made by P1614R2, which
specify both as hidden friends, but that specification of operator<=> is
broken and the subject of a soon-to-be-published LWG issue.
* include/bits/stl_pair.h [__cpp_lib_three_way_comparison]
(operator<=>): Define for C++20.
* libsupc++/compare (__cmp2way_res_t): Rename to __cmp3way_res_t,
move into __detail namespace. Do not turn argument types into lvalues.
(__cmp3way_helper): Rename to __cmp3way_res_impl, move into __detail
namespace. Constrain with concepts instead of using void_t.
(compare_three_way_result): Adjust name of base class.
(compare_three_way_result_t): Use __cmp3way_res_impl directly.
(__detail::__3way_cmp_with): Add workaround for PR 91073.
(compare_three_way): Use workaround.
(__detail::__synth3way, __detail::__synth3way_t): Define new helpers
implementing synth-three-way and synth-three-way-result semantics.
* testsuite/20_util/pair/comparison_operators/constexpr_c++20.cc: New
test.
* g++.dg/cpp2a/srcloc1.C: New test.
* g++.dg/cpp2a/srcloc2.C: New test.
* g++.dg/cpp2a/srcloc3.C: New test.
* g++.dg/cpp2a/srcloc4.C: New test.
* g++.dg/cpp2a/srcloc5.C: New test.
* g++.dg/cpp2a/srcloc6.C: New test.
* g++.dg/cpp2a/srcloc7.C: New test.
* g++.dg/cpp2a/srcloc8.C: New test.
* g++.dg/cpp2a/srcloc9.C: New test.
* g++.dg/cpp2a/srcloc10.C: New test.
* g++.dg/cpp2a/srcloc11.C: New test.
* g++.dg/cpp2a/srcloc12.C: New test.
* g++.dg/cpp2a/srcloc13.C: New test.
* g++.dg/cpp2a/srcloc14.C: New test.
Jakub Jelinek [Tue, 3 Dec 2019 19:27:47 +0000 (20:27 +0100)]
re PR c++/91369 (Implement P0784R7: constexpr new)
PR c++/91369
* constexpr.c (struct constexpr_global_ctx): Add cleanups member,
initialize it in the ctor.
(cxx_eval_constant_expression) <case TARGET_EXPR>: If TARGET_EXPR_SLOT
is already in the values hash_map, don't evaluate it again. Put
TARGET_EXPR_SLOT into hash_map even if not lval, and push it into
save_exprs too. If there is TARGET_EXPR_CLEANUP and not
CLEANUP_EH_ONLY, push the cleanup to cleanups vector.
<case CLEANUP_POINT_EXPR>: Save outer cleanups, set cleanups to
local auto_vec, after evaluating the body evaluate cleanups and
restore previous cleanups.
<case TRY_CATCH_EXPR>: Don't crash if the first operand is NULL_TREE.
(cxx_eval_outermost_constant_expr): Set cleanups to local auto_vec,
after evaluating the expression evaluate cleanups.
Jan Hubicka [Tue, 3 Dec 2019 18:24:00 +0000 (19:24 +0100)]
Clear calls_comdat_local when comdat group is dissolved
while looking into Firefox inlining dumps I noticed that we often do not
inline because we think function calls comdat local while the comdat group
itself has been dissolved.
* cgraph.c (cgraph_node::verify_node): Check that calls_comdat_local
is set only for symbol in comdat group.
* symtab.c (symtab_node::dissolve_same_comdat_group_1): Clear it.
Even EXACT_DIV_EXPR doesn't distribute across addition for wrapping
types, so in general we can't fold EXACT_DIV_EXPRs of POLY_INT_CSTs
at compile time. This was causing an ICE when trying to gimplify the
element size field in an ARRAY_REF.
If the result of that EXACT_DIV_EXPR is an invariant, we don't bother
recording it in the ARRAY_REF and simply read the element size from the
element type. This avoids the overhead of doing:
/* ??? tree_ssa_useless_type_conversion will eliminate casts to
sizetype from another type of the same width and signedness. */
if (TREE_TYPE (aligned_size) != sizetype)
aligned_size = fold_convert_loc (loc, sizetype, aligned_size);
return size_binop_loc (loc, MULT_EXPR, aligned_size,
size_int (TYPE_ALIGN_UNIT (elmt_type)));
each time array_ref_element_size is called.
So rather than read array_ref_element_size, do some arithmetic on it,
and only then check whether the result is an invariant, we might as
well check whether the element size is an invariant to start with.
We're then directly testing whether array_ref_element_size gives
a reusable value.
For consistency, the patch makes the same change for the offset field
in a COMPONENT_REF, although I don't think that can trigger yet.
2019-12-03 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* gimplify.c (gimplify_compound_lval): Don't gimplify and install
an array element size if array_element_size is already an invariant.
Similarly don't gimplify and install a field offset if
component_ref_field_offset is already an invariant.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general-c/struct_1.c: New test.
Mark constant-sized objects as addressable if they have poly-int accesses
If SVE code is written for a specific vector length, it might load from
or store to fixed-sized objects. This needs to work even without
-msve-vector-bits=N (which should never be needed for correctness).
There's no way of handling a direct poly-int sized reference to a
fixed-size register; it would have to go via memory. And in that
case it's more efficient to mark the fixed-size object as
addressable from the outset, like we do for array references
with non-constant indices.
2019-12-03 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* cfgexpand.c (discover_nonconstant_array_refs_r): If an access
with POLY_INT_CST size is made to a fixed-size object, force the
object to live in memory.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/deref_1.c: New test.
Marek Polacek [Tue, 3 Dec 2019 15:59:40 +0000 (15:59 +0000)]
PR c++/91363 - P0960R3: Parenthesized initialization of aggregates.
This patch implements C++20 P0960R3: Parenthesized initialization of aggregates
(<wg21.link/p0960>; see R0 for more background info). Essentially, if you have
an aggregate, you can now initialize it by (x, y), similarly to {x, y}. E.g.
struct A {
int x, y;
// no A(int, int) ctor (see paren-init14.C for = delete; case)
};
A a(1, 2);
The difference between ()-init and {}-init is that narrowing conversions are
permitted, designators are not permitted, a temporary object bound to
a reference does not have its lifetime extended, and there is no brace elision.
Further, things like
int a[](1, 2, 3); // will deduce the array size
const A& r(1, 2.3, 3); // narrowing is OK
int (&&rr)[](1, 2, 3);
int b[3](1, 2); // b[2] will be value-initialized
now work as expected. Note that
char f[]("fluff");
has always worked and this patch keeps it that way. Also note that A a((1, 2))
is not the same as A a{{1,2}}; the inner (1, 2) remains a COMPOUND_EXPR.
The approach I took was to handle (1, 2) similarly to {1, 2} -- conjure up
a CONSTRUCTOR, and introduce LOOKUP_AGGREGATE_PAREN_INIT to distinguish
between the two. This kind of initialization is only supported in C++20;
I've made no attempt to support it in earlier standards, like we don't
support CTAD pre-C++17, for instance.
* c-cppbuiltin.c (c_cpp_builtins): Predefine
__cpp_aggregate_paren_init=201902 for -std=c++2a.
* call.c (build_new_method_call_1): Handle parenthesized initialization
of aggregates by building up a CONSTRUCTOR.
(extend_ref_init_temps): Do nothing for CONSTRUCTOR_IS_PAREN_INIT.
* cp-tree.h (CONSTRUCTOR_IS_PAREN_INIT, LOOKUP_AGGREGATE_PAREN_INIT):
Define.
* decl.c (grok_reference_init): Handle aggregate initialization from
a parenthesized list of values.
(reshape_init): Do nothing for CONSTRUCTOR_IS_PAREN_INIT.
(check_initializer): Handle initialization of an array from a
parenthesized list of values. Use NULL_TREE instead of NULL.
* tree.c (build_cplus_new): Handle BRACE_ENCLOSED_INITIALIZER_P.
* typeck2.c (digest_init_r): Set LOOKUP_AGGREGATE_PAREN_INIT if it
receives a CONSTRUCTOR with CONSTRUCTOR_IS_PAREN_INIT set. Allow
narrowing when LOOKUP_AGGREGATE_PAREN_INIT.
(massage_init_elt): Don't lose LOOKUP_AGGREGATE_PAREN_INIT when passing
flags to digest_init_r.
* g++.dg/cpp0x/constexpr-99.C: Only expect an error in C++17 and
lesser.
* g++.dg/cpp0x/explicit7.C: Likewise.
* g++.dg/cpp0x/initlist12.C: Adjust dg-error.
* g++.dg/cpp0x/pr31437.C: Likewise.
* g++.dg/cpp2a/feat-cxx2a.C: Add __cpp_aggregate_paren_init test.
* g++.dg/cpp2a/paren-init1.C: New test.
* g++.dg/cpp2a/paren-init10.C: New test.
* g++.dg/cpp2a/paren-init11.C: New test.
* g++.dg/cpp2a/paren-init12.C: New test.
* g++.dg/cpp2a/paren-init13.C: New test.
* g++.dg/cpp2a/paren-init14.C: New test.
* g++.dg/cpp2a/paren-init15.C: New test.
* g++.dg/cpp2a/paren-init16.C: New test.
* g++.dg/cpp2a/paren-init17.C: New test.
* g++.dg/cpp2a/paren-init18.C: New test.
* g++.dg/cpp2a/paren-init19.C: New test.
* g++.dg/cpp2a/paren-init2.C: New test.
* g++.dg/cpp2a/paren-init3.C: New test.
* g++.dg/cpp2a/paren-init4.C: New test.
* g++.dg/cpp2a/paren-init5.C: New test.
* g++.dg/cpp2a/paren-init6.C: New test.
* g++.dg/cpp2a/paren-init7.C: New test.
* g++.dg/cpp2a/paren-init8.C: New test.
* g++.dg/cpp2a/paren-init9.C: New test.
* g++.dg/ext/desig10.C: Adjust dg-error.
* g++.dg/template/crash107.C: Likewise.
* g++.dg/template/crash95.C: Likewise.
* g++.old-deja/g++.jason/crash3.C: Likewise.
* g++.old-deja/g++.law/ctors11.C: Likewise.
* g++.old-deja/g++.law/ctors9.C: Likewise.
* g++.old-deja/g++.mike/net22.C: Likewise.
* g++.old-deja/g++.niklas/t128.C: Likewise.
Andrew Stubbs [Tue, 3 Dec 2019 12:53:53 +0000 (12:53 +0000)]
Enable OpenACC GCN testing.
2019-12-03 Andrew Stubbs <ams@codesourcery.com>
libgomp/
* testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
Recognize amdgcn.
(check_effective_target_openacc_amdgcn_accel_present): New proc.
(check_effective_target_openacc_amdgcn_accel_selected): New proc.
* testsuite/libgomp.oacc-c++/c++.exp: Add support for amdgcn.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
Richard Biener [Tue, 3 Dec 2019 10:46:52 +0000 (10:46 +0000)]
re PR tree-optimization/92751 (VN partial def support confused about clobbers)
2019-12-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/92751
* tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): Fail
when a clobber ends up in the partial-def vector.
(vn_reference_lookup_3): Let clobbers be handled by the
assignment from CTOR handling.
Eric Botcazou [Tue, 3 Dec 2019 10:23:06 +0000 (10:23 +0000)]
utils.c (potential_alignment_gap): Delete.
* gcc-interface/utils.c (potential_alignment_gap): Delete.
(rest_of_record_type_compilation): Do not call above function. Use
the alignment of the field instead of that of its type, if need be.
When the original field has variable size, always lower the alignment
of the pointer type. Reset the bit-field status of the new field if
it does not encode a bit-field.
Eric Botcazou [Tue, 3 Dec 2019 10:06:15 +0000 (10:06 +0000)]
decl.c (gnat_to_gnu_subprog_type): With the Copy-In/ Copy-Out mechanism...
* gcc-interface/decl.c (gnat_to_gnu_subprog_type): With the Copy-In/
Copy-Out mechanism, do not promote the mode of the return type to an
integral mode if it contains a field on a non-integral type and even
demote it for 64-bit targets.
Jonathan Wakely [Tue, 3 Dec 2019 09:51:49 +0000 (09:51 +0000)]
libstdc++: Fix copyright date on new test header
The slow_clock type was introduced to the testsuite in 2018 in the
testsuite/30_threads/condition_variable/members/2.cc test, so the new
header should have that date.
Jason Merrill [Tue, 3 Dec 2019 08:20:18 +0000 (09:20 +0100)]
re PR c++/92705 (ICE: Segmentation fault (in build_new_op_1))
PR c++/92705
* call.c (strip_standard_conversion): New function.
(build_new_op_1): Use it for user_conv_p.
(compare_ics): Likewise.
(source_type): Likewise.
Jakub Jelinek [Tue, 3 Dec 2019 08:19:04 +0000 (09:19 +0100)]
re PR c++/92695 (P1064R0 - virtual constexpr fails if object taken from array)
PR c++/92695
* constexpr.c (cxx_bind_parameters_in_call): For virtual calls,
adjust the first argument to point to the derived object rather
than its base.
Joseph Myers [Tue, 3 Dec 2019 01:27:43 +0000 (01:27 +0000)]
Diagnose use of [*] in old-style parameter definitions (PR c/88704).
GCC wrongly accepts [*] in old-style parameter definitions because
because parm_flag is set on the scope used for those definitions and,
unlike the case of a prototype in a function definition, there is no
subsequent check to disallow this invalid usage. This patch adds such
a check. (At this point we don't have location information for the
[*], so the diagnostic location isn't ideal.)
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
PR c/88704
gcc/c:
* c-decl.c (store_parm_decls_oldstyle): Diagnose use of [*] in
old-style parameter definitions.
Jakub Jelinek [Tue, 3 Dec 2019 00:33:27 +0000 (01:33 +0100)]
inline-crossmodule-1_0.C: Use -fdump-ipa-inline-details instead of -fdump-ipa-inline.
* g++.dg/lto/inline-crossmodule-1_0.C: Use -fdump-ipa-inline-details
instead of -fdump-ipa-inline. Use "inline" instead of "inlined" as
last argument to scan-wpa-ipa-dump-times, use \\\( and \\\) instead of
( and ) in the regex.
Tighten check for vector types in fold_convertible_p (PR 92741)
In this PR, IPA-CP was misled into using NOP_EXPR rather than
VIEW_CONVERT_EXPR to reinterpret a vector of 4 shorts as a vector
of 2 ints. This tripped the tree-cfg.c assert I'd added in r278245.
2019-12-02 Richard Sandiford <richard.sandiford@arm.com>
[AArch64] Catch attempts to use SVE types when SVE is disabled
This patch reports an error if code tries to use variable-length
SVE types when SVE is disabled. We already report a similar error
for definitions or uses of SVE functions when SVE is disabled.
2019-12-02 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_report_sve_required): New function.
(aarch64_expand_mov_immediate): Use it when attempting to measure
the length of an SVE vector.
(aarch64_mov_operand_p): Only allow SVE CNT immediates when
SVE is enabled.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/nosve_4.c: New test.
* gcc.target/aarch64/sve/acle/general/nosve_5.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_4.c: Expected a second error
for the copy.
* gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise.
Now that the C frontend can cope with POLY_INT_CST-length initialisers,
we can make aarch64-sve-acle.exp run the full set of tests. This will
introduce new failures for -mabi=ilp32; I'll make the testsuite ILP32
clean separately.
2019-12-02 Richard Sandiford <richard.sandiford@arm.com>
gcc/testsuite/
* gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Run the
general/* tests too.
[AArch64] Add a couple of SVE ACLE comparison folds
When writing vector-length specific SVE code, it's useful to be able
to store an svbool_t predicate in a GNU vector of unsigned chars.
This patch makes sure that there is no overhead when converting
to that form and then immediately reading it back again.
2019-12-02 Richard Sandiford <richard.sandiford@arm.com>
Sudakshina Das [Mon, 2 Dec 2019 17:21:35 +0000 (17:21 +0000)]
[Committed][Arm][testsuite] Fix failure for arm-fp16-ops-*.C
Since r275022 which deprecates some uses of volatile, all arm-fp16-ops-*.C
were failing with warnings of deprecated valatile uses on arm-none-eabi and
arm-none-linux-gnueabihf. This patch removes the volatile declarations from
the header. Since none of the tests are run with any high optimization levels,
this should change should not prevent the real function of the tests.
Mike Crowe [Mon, 2 Dec 2019 16:23:14 +0000 (16:23 +0000)]
libstdc++: Fix try_lock_until and try_lock_shared_until on arbitrary clock
This is the equivalent to PR libstdc++/91906, but for shared_mutex.
A non-standard clock may tick more slowly than std::chrono::steady_clock.
This means that we risk returning false early when the specified timeout
may not have expired. This can be avoided by looping until the timeout time
as reported by the non-standard clock has been reached.
Unfortunately, we have no way to tell whether the non-standard clock ticks
more quickly that std::chrono::steady_clock. If it does then we risk
returning later than would be expected, but that is unavoidable without
waking up periodically to check, which would be rather too expensive.
François Dumont pointed out[1] a flaw in an earlier version of this patch
that revealed a hole in the test coverage, so I've added a new test that
try_lock_until acts as try_lock if the timeout has already expired.
Fix try_lock_until and try_lock_shared_until on arbitrary clock
* include/std/shared_mutex (shared_timed_mutex::try_lock_until)
(shared_timed_mutex::try_lock_shared_until): Loop until the absolute
timeout time is reached as measured against the appropriate clock.
* testsuite/30_threads/shared_timed_mutex/try_lock_until/1.cc: New
file. Test try_lock_until and try_lock_shared_until timeouts against
various clocks.
* testsuite/30_threads/shared_timed_mutex/try_lock_until/1.cc: New
file. Test try_lock_until and try_lock_shared_until timeouts against
various clocks.
Mike Crowe [Mon, 2 Dec 2019 16:23:10 +0000 (16:23 +0000)]
libstdc++: Add full steady_clock support to shared_timed_mutex
The pthread_rwlock_clockrdlock and pthread_rwlock_clockwrlock functions
were added to glibc in v2.30. They have also been added to Android
Bionic. If these functions are available in the C library then they can
be used to implement shared_timed_mutex::try_lock_until,
shared_timed_mutex::try_lock_for,
shared_timed_mutex::try_lock_shared_until and
shared_timed_mutex::try_lock_shared_for so that they are no longer
unaffected by the system clock being warped. (This is the shared_mutex
equivalent of PR libstdc++/78237 for mutex.)
If the new functions are available then steady_clock is deemed to be the
"best" clock available which means that it is used for the relative
try_lock_for calls and absolute try_lock_until calls using steady_clock
and user-defined clocks. It's not possible to have
_GLIBCXX_USE_PTHREAD_RWLOCK_CLOCKLOCK defined without
_GLIBCXX_USE_PTHREAD_RWLOCK_T, so the requirement that the clock be the
same as condition_variable is maintained. Calls explicitly using
system_clock (aka high_resolution_clock) continue to use CLOCK_REALTIME
via the old pthread_rwlock_timedrdlock and pthread_rwlock_timedwrlock
functions.
If the new functions are not available then system_clock is deemed to be
the "best" clock available which means that the previous suboptimal
behaviour remains.
Additionally, the user-defined clock used with
shared_timed_mutex::try_lock_for and shared_mutex::try_lock_shared_for
may have higher precision than __clock_t. We may need to round the
duration up to ensure that the timeout is long enough. (See
__timed_mutex_impl::_M_try_lock_for)
2019-12-02 Mike Crowe <mac@mcrowe.com>
Add full steady_clock support to shared_timed_mutex
* acinclude.m4 (GLIBCXX_CHECK_PTHREAD_RWLOCK_CLOCKLOCK): Define
to check for the presence of both pthread_rwlock_clockrdlock and
pthread_rwlock_clockwrlock.
* config.h.in: Regenerate.
* configure.ac: Call GLIBCXX_USE_PTHREAD_RWLOCK_CLOCKLOCK.
* configure: Regenerate.
* include/std/shared_mutex (shared_timed_mutex): Define __clock_t as
the best clock to use for relative waits.
(shared_timed_mutex::try_lock_for) Round up wait duration if necessary.
(shared_timed_mutex::try_lock_shared_for): Likewise.
(shared_timed_mutex::try_lock_until): Use existing try_lock_until
implementation for system_clock (which matches __clock_t when
_GLIBCCXX_USE_PTHREAD_RWLOCK_CLOCKLOCK is not defined). Add new
overload for steady_clock that uses pthread_rwlock_clockwrlock if it
is available. Simplify overload for non-standard clock to just call
try_lock_for with a relative timeout.
(shared_timed_mutex::try_lock_shared_until): Likewise.
Mike Crowe [Mon, 2 Dec 2019 16:23:06 +0000 (16:23 +0000)]
libstdc++: Fix timed_mutex::try_lock_until on arbitrary clock (PR 91906)
A non-standard clock may tick more slowly than
std::chrono::steady_clock. This means that we risk returning false
early when the specified timeout may not have expired. This can be
avoided by looping until the timeout time as reported by the
non-standard clock has been reached.
Unfortunately, we have no way to tell whether the non-standard clock
ticks more quickly that std::chrono::steady_clock. If it does then we
risk returning later than would be expected, but that is unavoidable and
permitted by the standard.
2019-12-02 Mike Crowe <mac@mcrowe.com>
PR libstdc++/91906 Fix timed_mutex::try_lock_until on arbitrary clock
* include/std/mutex (__timed_mutex_impl::_M_try_lock_until): Loop
until the absolute timeout time is reached as measured against the
appropriate clock.
* testsuite/util/slow_clock.h: New file. Move implementation of
slow_clock test class.
* testsuite/30_threads/condition_variable/members/2.cc: Include
slow_clock from header.
* testsuite/30_threads/shared_timed_mutex/try_lock/3.cc: Convert
existing test to templated function so that it can be called with
both system_clock and steady_clock.
* testsuite/30_threads/timed_mutex/try_lock_until/3.cc: Also run test
using slow_clock to test above fix.
* testsuite/30_threads/recursive_timed_mutex/try_lock_until/3.cc:
Likewise.
* testsuite/30_threads/recursive_timed_mutex/try_lock_until/4.cc: Add
new test that try_lock_until behaves as try_lock if the timeout has
already expired or exactly matches the current time.
Mike Crowe [Mon, 2 Dec 2019 16:23:01 +0000 (16:23 +0000)]
libstdc++: PR 78237 Add full steady_clock support to timed_mutex
The pthread_mutex_clocklock function is available in glibc since the
2.30 release. If this function is available in the C library it can be
used to fix PR libstdc++/78237 by supporting steady_clock properly with
timed_mutex.
This means that code using timed_mutex::try_lock_for or
timed_mutex::wait_until with steady_clock is no longer subject to timing
out early or potentially waiting for much longer if the system clock is
warped at an inopportune moment.
If pthread_mutex_clocklock is available then steady_clock is deemed to
be the "best" clock available which means that it is used for the
relative try_lock_for calls and absolute try_lock_until calls using
steady_clock and user-defined clocks. Calls explicitly using
system_clock (aka high_resolution_clock) continue to use CLOCK_REALTIME
via __gthread_cond_timedwait.
If pthread_mutex_clocklock is not available then system_clock is deemed
to be the "best" clock available which means that the previous
suboptimal behaviour remains.
2019-12-02 Mike Crowe <mac@mcrowe.com>
PR libstdc++/78237 Add full steady_clock support to timed_mutex
* acinclude.m4 (GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK): Define to
detect presence of pthread_mutex_clocklock function.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Call GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK.
* include/std/mutex (__timed_mutex_impl): Remove unnecessary __clock_t.
(__timed_mutex_impl::_M_try_lock_for): Use best clock to turn relative
timeout into absolute timeout.
(__timed_mutex_impl::_M_try_lock_until): Keep existing implementation
for system_clock. Add new implementation for steady_clock that calls
_M_clocklock. Modify overload for user-defined clock to use a relative
wait so that it automatically uses the best clock.
[_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK] (timed_mutex::_M_clocklock):
New member function.
(recursive_timed_mutex::_M_clocklock): Likewise.
Mike Crowe [Mon, 2 Dec 2019 16:22:53 +0000 (16:22 +0000)]
libstdc++: Improve tests for try_lock_until members of mutex types
2019-12-02 Mike Crowe <mac@mcrowe.com>
* testsuite/30_threads/recursive_timed_mutex/try_lock_until/3.cc:
New test. Ensure that timed_mutex::try_lock_until actually times out
after the specified time when using both system_clock and
steady_clock.
* testsuite/30_threads/timed_mutex/try_lock_until/3.cc: New test.
Likewise but for recursive_timed_mutex.
* testsuite/30_threads/timed_mutex/try_lock_until/57641.cc: Template
test functions and use them to test both steady_clock and system_clock.
* testsuite/30_threads/unique_lock/locking/4.cc: Likewise. Wrap call
to timed_mutex::try_lock_until in VERIFY macro to check its return
value.
Jakub Jelinek [Mon, 2 Dec 2019 08:51:49 +0000 (09:51 +0100)]
re PR tree-optimization/92712 (Performance regression with assumed values)
PR tree-optimization/92712
* match.pd ((A * B) +- A -> (B +- 1) * A,
A +- (A * B) -> (1 +- B) * A): Allow optimizing signed integers
even when we don't know anything about range of A, but do know
something about range of B and the simplification won't introduce
new UB.
* gcc.dg/tree-ssa/pr92712-1.c: New test.
* gcc.dg/tree-ssa/pr92712-2.c: New test.
* gcc.dg/tree-ssa/pr92712-3.c: New test.
* gfortran.dg/loop_versioning_1.f90: Adjust expected number of
likely to be innermost dimension messages.
* gfortran.dg/loop_versioning_10.f90: Likewise.
* gfortran.dg/loop_versioning_6.f90: Likewise.
Feng Xue [Mon, 2 Dec 2019 06:37:30 +0000 (06:37 +0000)]
Enable recursive function versioning
2019-12-02 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/92133
* doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
(ipa-cp-min-recursive-probability): Likewise.
* params.opt (ipa-cp-max-recursive-depth): New.
(ipa-cp-min-recursive-probability): Likewise.
* ipa-cp.c (ipcp_lattice<valtype>::add_value): Add two new parameters
val_p and unlimited.
(self_recursively_generated_p): New function.
(get_val_across_arith_op): Likewise.
(propagate_vals_across_arith_jfunc): Add constant propagation for
self-recursive function.
(incorporate_penalties): Do not penalize pure self-recursive function.
(good_cloning_opportunity_p): Dump node_is_self_scc flag.
(propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
(get_info_about_necessary_edges): Relax hotness check for edge to
self-recursive function.
* ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.
2019-12-02 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/92133
* gcc.dg/ipa/ipa-clone-2.c: New test.
Fix bugs relating to flexibly-sized objects in nios2 backend.
PR target/92499
gcc/c/
* c-decl.c (flexible_array_type_p): Move to common code.
gcc/
* config/nios2/nios2.c (nios2_in_small_data_p): Do not consider
objects of flexible types to be small if they have internal linkage
or are declared extern.
* config/nios2/nios2.h (ASM_OUTPUT_ALIGNED_LOCAL): Replace with...
(ASM_OUTPUT_ALIGNED_DECL_LOCAL): ...this. Use targetm.in_small_data_p
instead of the size of the object initializer.
* tree.c (flexible_array_type_p): Move from C front end, and
generalize to handle fields in non-C structures.
* tree.h (flexible_array_type_p): Declare.
Luo Xiong Hu [Mon, 2 Dec 2019 01:59:26 +0000 (01:59 +0000)]
PR92398: Fix testcase failure of pr72804.c
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
It can have longer latency, but latency via memory is not so critical,
and this does save decode and other resources. It's hard to choose
which is best. Update the test case to fix failures.
gcc/testsuite/ChangeLog:
2019-12-02 Luo Xiong Hu <luoxhu@linux.ibm.com>
PR testsuite/92398
* gcc.target/powerpc/pr72804.c: Split the store function to...
* gcc.target/powerpc/pr92398.h: ... this one. New.
* gcc.target/powerpc/pr92398.p9+.c: New.
* gcc.target/powerpc/pr92398.p9-.c: New.
* lib/target-supports.exp (check_effective_target_p8): New.
(check_effective_target_p9+): New.
Jan Hubicka [Sun, 1 Dec 2019 15:12:52 +0000 (16:12 +0100)]
profile-count.h (profile_count::operator<): Use IPA value for comparsion.
* profile-count.h (profile_count::operator<): Use IPA value for
comparsion.
(profile_count::operator>): Likewise.
(profile_count::operator<=): Likewise.
(profile_count::operator>=): Likewise.
* predict.c (maybe_hot_count_p): Do not convert to gcov_type.
[C] Add a target hook that allows targets to verify type usage
This patch adds a new target hook to check whether there are any
target-specific reasons why a type cannot be used in a certain
source-language context. It works in a similar way to existing
hooks like TARGET_INVALID_CONVERSION and TARGET_INVALID_UNARY_OP.
The reason for adding the hook is to report invalid uses of SVE types.
Throughout a TU, the SVE vector and predicate types represent values
that can be stored in an SVE vector or predicate register. At certain
points in the TU we might be able to generate code that assumes the
registers have a particular size, but often we can't. In some cases
we might even make multiple different assumptions in the same TU
(e.g. when implementing an ifunc for multiple vector lengths).
But SVE types themselves are the same type throughout. The register
size assumptions change how we generate code, but they don't change
the definition of the types.
This means that the types do not have a fixed size at the C level
even when -msve-vector-bits=N is in effect. It also means that the
size does not work in the same way as for C VLAs, where the abstract
machine evaluates the size at a particular point and then carries that
size forward to later code.
The SVE ACLE deals with this by making it invalid to use C and C++
constructs that depend on the size or layout of SVE types. The spec
refers to the types as "sizeless" types and defines their semantics as
edits to the standards. See:
However, since all current sizeless types are target-specific built-in
types, there's no real reason for the frontends to handle them directly.
They can just hand off the checks to target code instead. It's then
possible for the errors to refer to "SVE types" rather than "sizeless
types", which is likely to be more meaningful to users.
There is a slight overlap between the new tests and the ones for
gnu_vector_type_p in r277950, but here the emphasis is on testing
sizelessness.
2019-11-30 Richard Sandiford <richard.sandiford@arm.com>
gcc/c-family/
* c-common.c (pointer_int_sum): Use verify_type_context to check
whether the target allows pointer arithmetic for the types involved.
(c_sizeof_or_alignof_type, c_alignof_expr): Use verify_type_context
to check whether the target allows sizeof and alignof operations
for the types involved.
gcc/c/
* c-decl.c (start_decl): Allow initialization of variables whose
size is a POLY_INT_CST.
(finish_decl): Use verify_type_context to check whether the target
allows variables with a particular type to have static or thread-local
storage duration. Don't raise a second error if such variables do
not have a constant size.
(grokdeclarator): Use verify_type_context to check whether the
target allows fields or array elements to have a particular type.
* c-typeck.c (pointer_diff): Use verify_type_context to test whether
the target allows pointer difference for the types involved.
(build_unary_op): Likewise for pointer increment and decrement.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general-c/sizeless-1.c: New test.
* gcc.target/aarch64/sve/acle/general-c/sizeless-2.c: Likewise.
Jan Hubicka [Sat, 30 Nov 2019 16:56:45 +0000 (17:56 +0100)]
cgraph.c (cgraph_node::dump): Dump unit_id and merged_extern_inline.
* cgraph.c (cgraph_node::dump): Dump unit_id and merged_extern_inline.
* cgraph.h (cgraph_node): Add unit_id and
merged_extern_inline.
(symbol_table): Add max_unit.
(symbol_table::symbol_table): Initialize it.
* cgraphclones.c (duplicate_thunk_for_node): Copy unit_id.
merged_comdat, merged_extern_inline.
(cgraph_node::create_clone): Likewise.
(cgraph_node::create_version_clone): Likewise.
* ipa-fnsummary.c (dump_ipa_call_summary): Dump info about cross module
calls.
* ipa-fnsummary.h (cross_module_call_p): New inline function.
* ipa-inline-analyssi.c (simple_edge_hints): Use it.
* ipa-inline.c (inline_small_functions): Likewise.
* lto-symtab.c (lto_cgraph_replace_node): Record merged_extern_inline;
copy merged_comdat and merged_extern_inline.
* lto-cgraph.c (lto_output_node): Stream out merged_comdat,
merged_extern_inline and unit_id.
(input_overwrite_node): Stream in these.
(input_cgraph_1): Set unit_base.
* lto-streamer.h (lto_file_decl_data): Add unit_base.
* symtab.c (symtab_node::make_decl_local): Record former_comdat.
* g++.dg/lto/inline-crossmodule-1.h: New testcase.
* g++.dg/lto/inline-crossmodule-1_0.C: New testcase.
* g++.dg/lto/inline-crossmodule-1_1.C: New testcase.
driver: Do not warn about ineffective `-x' option if no inputs were given
Fix an issue with the GCC driver and the `-x' option where a warning is
issued in an invocation like:
$ riscv64-linux-gnu-gcc -print-multi-directory -x c++
riscv64-linux-gnu-gcc: warning: '-x c++' after last input file has no effect
lib64/lp64d
$
where no inputs were given and hence the use of `-x' is irrelevant.
The statement printed is also untrue as the `-x' does not come after the
last input file given that none was given. Do not print it then if no
inputs were supplied.
* gcc.c (process_command): Only warn about an ineffective `-x'
option if any input files have actually been supplied.