Jonathan Wakely [Fri, 29 Apr 2022 11:17:13 +0000 (12:17 +0100)]
libstdc++: Add missing exports for ppc64le --with-long-double-format=ibm [PR105417]
The --with-long-double-abi=ibm build is missing some exports that are
present in the --with-long-double-abi=ieee build. Those symbols never
should have been exported at all, but now that they have been, they
should be exported consistently by both ibm and ieee.
This simply defines them as aliases for equivalent symbols that are
already present. The abi-tag on num_get::_M_extract_int isn't really
needed, because it only uses a std::string as a local variable, not in
the return type or function parameters, so it's safe to define the
_M_extract_int[abi:cxx11] symbols as aliases for the corresponding
function without the abi-tag.
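As an illustration of the aliasing technique only (this is not the code in
compatibility-ldbl-alt128.cc), the following sketch uses the GCC alias
attribute to export a second symbol name for an existing function. It assumes
GCC or Clang on an ELF target, and the names extract_int_impl and
extract_int_cxx11 are hypothetical stand-ins for the real mangled symbols.

```cpp
// Sketch only: hypothetical names, GCC/Clang on an ELF target assumed.
extern "C" int extract_int_impl(int base, int digit)
{
  // Stand-in for the real work; any body would do.
  return base * 10 + digit;
}

// Emit a second symbol that resolves to the same code as extract_int_impl.
// The alias target must be defined in this translation unit.
extern "C" int extract_int_cxx11(int base, int digit)
  __attribute__((alias("extract_int_impl")));
```

Callers linked against either name run the same function body, which is why
no second definition is needed.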
This causes some new symbols to be added to the GLIBCXX_3.4.29 version
for the ibm long double build mode, but there is no advantage to adding
them to 3.4.30 for that build. That would just create more
inconsistencies.
libstdc++-v3/ChangeLog:
PR libstdc++/105417
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt:
Regenerate.
* src/c++11/compatibility-ldbl-alt128.cc [_GLIBCXX_USE_DUAL_ABI]:
Define __gnu_ieee128::num_get<C>::_M_extract_int[abi:cxx11]<I>
symbols as aliases for corresponding symbols without abi-tag.
Jason Merrill [Thu, 31 Mar 2022 22:15:24 +0000 (18:15 -0400)]
c++: dump alias-declaration scope
An alias can't be declared with a qualified-id in actual code, but in
diagnostics we want to know which scope it belongs to, and I think a
nested-name-specifier is the best way to provide that.
Jakub Jelinek [Fri, 29 Apr 2022 11:50:10 +0000 (13:50 +0200)]
c++: Improve diagnostics for template args terminated with >= or >>= [PR104319]
As mentioned in the PR, for C++98 we have diagnostics that expect
>> terminating template arguments to be a mistake for > > (C++11
said it has to be treated that way). But if a user trying to spare the
spacebar doesn't separate > from a following = or >> from a following =,
the diagnostic is confusing, whereas clang suggests adding a space in between.
The following patch does that for >= and >>= too.
For some strange reason the error recovery emits further errors,
not really sure what's going on because I overwrite the token->type
like the code does for the C++11 >> case or for the C++98 >> cases,
but at least the first error is nicer (well, for the C++98 nested
template case and >>= I need to overwrite it to > and so the = is lost,
so perhaps some follow-up errors are needed for that case).
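A minimal sketch of the kind of code affected, assuming C++14 or later. The
function name is_even is made up for illustration. The default template
argument below compiles only because > and = are separate tokens.

```cpp
#include <type_traits>

// The space between '>' and '=' is required: written as
// "std::enable_if_t<std::is_integral<T>::value, int>= 0" the lexer forms a
// single '>=' token, so the template argument list is never closed.
template <typename T,
          std::enable_if_t<std::is_integral<T>::value, int> = 0>
constexpr bool is_even(T value)
{
  return value % 2 == 0;
}
```

The same applies to >>= after a nested template-id, where the fix is to
separate >> from the following =.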
2022-04-29 Jakub Jelinek <jakub@redhat.com>
PR c++/104319
* parser.cc (cp_parser_template_argument): Treat >= like C++98 >>
after a type id by setting maybe_type_id and aborting tentative
parse.
(cp_parser_enclosed_template_argument_list): Handle
CPP_GREATER_EQ like misspelled CPP_GREATER CPP_EQ and
CPP_RSHIFT_EQ like misspelled CPP_GREATER CPP_GREATER_EQ
or CPP_RSHIFT CPP_EQ or CPP_GREATER CPP_GREATER CPP_EQ.
(cp_parser_next_token_ends_template_argument_p): Return true
also for CPP_GREATER_EQ and CPP_RSHIFT_EQ.
* g++.dg/parse/template28.C: Adjust expected diagnostics.
* g++.dg/parse/template30.C: New test.
Richard Biener [Mon, 11 Apr 2022 10:18:48 +0000 (12:18 +0200)]
Fix is_gimple_condexpr vs is_gimple_condexpr_for_cond
The following fixes wrongly used is_gimple_condexpr and makes
canonicalize_cond_expr_cond honor either, delaying final checking
to callers where all but two in ifcombine are doing the correct
thing already.
This fixes bugs but is now mainly in preparation for making
COND_EXPRs in GIMPLE assignments no longer have a GENERIC expression
as condition operand like we already transitioned VEC_COND_EXPR earlier.
2022-04-11 Richard Biener <rguenther@suse.de>
* gimple-expr.cc (is_gimple_condexpr): Adjust comment.
(canonicalize_cond_expr_cond): Move here from gimple.cc,
allow both COND_EXPR and GIMPLE_COND forms.
* gimple-expr.h (canonicalize_cond_expr_cond): Declare.
* gimple.cc (canonicalize_cond_expr_cond): Remove here.
* gimple.h (canonicalize_cond_expr_cond): Likewise.
* gimple-loop-versioning.cc (loop_versioning::version_loop):
Use is_gimple_condexpr_for_cond.
* tree-parloops.cc (gen_parallel_loop): Likewise.
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Check for
a proper cond expr after canonicalize_cond_expr_cond.
Use is_gimple_condexpr_for_cond where appropriate.
* tree-ssa-loop-manip.cc (determine_exit_conditions): Likewise.
* tree-vect-loop-manip.cc (slpeel_add_loop_guard): Likewise.
Richard Biener [Tue, 1 Feb 2022 09:48:07 +0000 (10:48 +0100)]
Add gsi_after_labels overload for gimple_seq
The following adds gsi_after_labels for gimple_seq so I do not have
to open-code it. I took the liberty to remove the two #defines
wrapping gsi_start_1 and gsi_last_1 as we now have C++ references.
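The macro removal can be pictured with a small sketch. The types node,
sequence and seq_iterator below are hypothetical stand-ins, not the GCC
gimple_seq or gimple_stmt_iterator definitions. Only the pointer-versus-
reference shape of the two functions is meant to match the description above.

```cpp
// Hypothetical stand-ins for a sequence and an iterator over it.
struct node { int value; node *next; };
typedef node *sequence;

struct seq_iterator
{
  node *ptr;       // current element, or nullptr at the end
  sequence *seq;   // the sequence being iterated, so it can be modified
};

// Pointer version: callers needed a wrapper macro such as
//   #define start(x) start_1 (&(x))
// to pass an lvalue conveniently.
inline seq_iterator start_1(sequence *seq)
{
  return seq_iterator{*seq, seq};
}

// Reference version: binds directly to the caller's lvalue, no macro needed.
inline seq_iterator start(sequence &seq)
{
  return seq_iterator{seq, &seq};
}
```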
2022-02-01 Richard Biener <rguenther@suse.de>
* gimple-iterator.h (gsi_after_labels): Add overload for
gimple_seq.
(gsi_start_1): Rename to gsi_start and take a reference.
(gsi_last_1): Likewise.
* gimple-iterator.cc (gsi_for_stmt): Use gsi_start.
* omp-low.cc (lower_rec_input_clauses): Likewise.
(lower_omp_scan): Likewise.
Richard Biener [Fri, 29 Apr 2022 06:45:48 +0000 (08:45 +0200)]
tree-optimization/105431 - another overflow in powi handling
This avoids undefined signed overflow when calling powi_as_mults_1.
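To illustrate the class of bug, here is a sketch of an unsigned absolute
value in the spirit of absu_hwi. The name abs_unsigned is hypothetical and a
64-bit integer is assumed in place of HOST_WIDE_INT. This is not the GCC
helper itself.

```cpp
#include <cstdint>

// Absolute value of a signed 64-bit integer, returned as unsigned.
// Negating INT64_MIN as a signed value overflows (undefined behaviour);
// converting to unsigned first makes the negation well defined modulo 2^64.
constexpr std::uint64_t abs_unsigned(std::int64_t x)
{
  return x < 0 ? std::uint64_t{0} - static_cast<std::uint64_t>(x)
               : static_cast<std::uint64_t>(x);
}
```

With the magnitude held in an unsigned type, a check like n != -n is no
longer needed to guard against the most negative value.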
2022-04-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/105431
* tree-ssa-math-opts.cc (powi_as_mults_1): Make n unsigned.
(powi_as_mults): Use absu_hwi.
(gimple_expand_builtin_powi): Remove now pointless n != -n
check.
Move common code from range-op.cc to header files.
In preparation for the agnostication of ranger, this patch moves
common code that can be shared between non-integer ranges (initially
pointers) into the relevant header files.
This is a relatively non-invasive change, as any changes that would
need to be ported to GCC 12, would occur in the range-op entries
themselves, not in the supporting glue which I'm moving.
Aldy Hernandez [Mon, 14 Mar 2022 08:57:48 +0000 (09:57 +0100)]
Remove various deprecated methods in class irange.
This patch cleans up some irange methods in preparation for other
cleanups later in the cycle.
First, we prefer the reference overloads for union and intersect as
the pointer versions have been deprecated for a couple releases.
Also, I've renamed the legacy union/intersect whose only function was
to provide additional verbosity for VRP into
legacy_verbose_{union,intersect}. This is a temporary rename to serve
as a visual reminder of which of the methods are bound for the chopping
block when the legacy code gets removed later this cycle.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-fold.cc (size_must_be_zero_p): Use reference
instead of pointer
* gimple-ssa-evrp-analyze.cc
(evrp_range_analyzer::record_ranges_from_incoming_edge): Rename
intersect to legacy_verbose_intersect.
* ipa-cp.cc (ipcp_vr_lattice::meet_with_1): Use reference instead
of pointer.
* tree-ssa-dom.cc (dom_jt_simplifier::simplify): Use value_range
instead of value_range_equiv.
* tree-vrp.cc (extract_range_from_plus_minus_expr): Use reference
instead of pointer.
(find_case_label_range): Same.
* value-range-equiv.cc (value_range_equiv::intersect): Rename to...
(value_range_equiv::legacy_verbose_intersect): ...this.
(value_range_equiv::union_): Rename to...
(value_range_equiv::legacy_verbose_union_): ...this.
* value-range-equiv.h (class value_range_equiv): Rename union and
intersect to legacy_verbose_{intersect,union}.
* value-range.cc (irange::union_): Rename to...
(irange::legacy_verbose_union_): ...this.
(irange::intersect): Rename to...
(irange::legacy_verbose_intersect): ...this.
* value-range.h (irange::union_): Rename union_ to
legacy_verbose_union.
(irange::intersect): Rename intersect to legacy_verbose_intersect.
* vr-values.cc (vr_values::update_value_range): Same.
(vr_values::extract_range_for_var_from_comparison_expr): Same.
(vr_values::extract_range_from_cond_expr): Rename union_ to
legacy_verbose_union.
(vr_values::extract_range_from_phi_node): Same.
Aldy Hernandez [Mon, 7 Mar 2022 13:56:34 +0000 (14:56 +0100)]
Prefer global range info setters that take a range.
This patch consolidates the multiple ways we have of storing global
ranges into one accepting a range.
In an upcoming patch series later this cycle we will be providing a
way to store iranges globally, as opposed to the mechanism we have now
which squishes wider ranges into value_range's. This is preparation
for such work.
Tested and benchmarked on x86-64 Linux.
gcc/ChangeLog:
* gimple-ssa-evrp-analyze.cc
(evrp_range_analyzer::set_ssa_range_info): Use *range_info methods
that take a range.
* gimple-ssa-sprintf.cc (try_substitute_return_value): Same.
* ipa-prop.cc (ipcp_update_vr): Same.
* tree-inline.cc (remap_ssa_name): Same.
* tree-ssa-copy.cc (fini_copy_prop): Same.
* tree-ssa-math-opts.cc (optimize_spaceship): Same.
* tree-ssa-phiopt.cc (replace_phi_edge_with_variable): Same.
* tree-ssa-pre.cc (insert_into_preds_of_block): Same.
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): Same.
* tree-ssa-strlen.cc (set_strlen_range): Same.
(strlen_pass::handle_builtin_string_cmp): Same.
* tree-ssanames.cc (set_range_info): Make static.
(duplicate_ssa_name_range_info): Make static and add a new variant
calling the static.
* tree-ssanames.h (set_range_info): Remove version taking wide ints.
(duplicate_ssa_name_range_info): Remove version taking a
range_info_def and replace with a version taking SSA names.
* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Use *range_info methods
that take a range.
(vect_do_peeling): Same.
* tree-vrp.cc (vrp_asserts::remove_range_assertions): Same.
* vr-values.cc (simplify_truth_ops_using_ranges): Same.
The changes to fix PR 105287 included a tightening of the constraints on which
variables are promoted to frame copies. This has exposed that we are failing
to name some variables that should be promoted.
We avoid the use of DECL_UID to build anonymous symbols since that might not
be stable for -fcompare-debug.
The long-term fix is to address the cases where the naming has been missed,
but for the short-term (and for the GCC-12 branch) backing out the additional
constraint is proposed.
Richard Biener [Wed, 27 Apr 2022 06:28:31 +0000 (08:28 +0200)]
middle-end/105376 - invalid REAL_CST for DFP constant
We are eventually ICEing in decimal_to_decnumber on non-decimal
REAL_VALUE_TYPE that creep in from uses of build_real (..., dconst*)
for DFP types. The following extends the decimal_to_decnumber
special-casing of dconst* to build_real, which keeps the bogus REAL_CSTs
from creeping into the IL and being modified into ones not handled by
the decimal_to_decnumber special casing. It also makes sure to
ICE on unhandled dconst* values at the point we build the REAL_CST.
2022-04-27 Richard Biener <rguenther@suse.de>
PR middle-end/105376
* tree.cc (build_real): Special case dconst* arguments
for decimal floating point types.
Jason Merrill [Wed, 23 Mar 2022 16:25:18 +0000 (12:25 -0400)]
c++: traits, array of unknown bound of incomplete
My r161129 changed check_trait_type to reject arrays of unknown bound of
incomplete type, but I can't find a rationale for that, and now think it's
wrong: the standard just requires that the type be "complete, cv void, or an
array of unknown bound." I imagine that allowing arrays of unknown bound is
because an array of unknown bound can't change from incomplete to complete
later in the translation unit, so there's no caching problem.
gcc/cp/ChangeLog:
* semantics.cc (check_trait_type): Don't check completeness
of element type of array of unknown bound.
Jason Merrill [Fri, 15 Apr 2022 04:11:00 +0000 (00:11 -0400)]
c++: typeid and instantiation [PR102651]
PR49387 was a problem with initially asking for a typeid for a class
template specialization before it was complete, and later actually filling
in the descriptor when the class was complete, and thus disagreeing on the
form of the descriptor. I fixed that by forcing the class to be complete,
but this testcase shows why that approach is problematic. So instead let's
adjust the type of the descriptor later if needed.
PR c++/102651
PR c++/49387
gcc/cp/ChangeLog:
* rtti.cc (get_tinfo_decl_direct): Don't complete_type.
(emit_tinfo_decl): Update tdesc type if needed.
c++: Add diagnostic when operator= is used as truth cond [PR25689]
When compiling the following code with g++ -Wparentheses, GCC does not
warn on the if statement:
  struct A {
    A& operator=(int);
    operator bool();
  };
  void f(A a) {
    if (a = 0); // no warning
  }
This is because a = 0 is a call to operator=, which GCC does not handle.
This patch fixes this issue by handling calls to operator= when deciding
to warn.
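A sketch of how an intentional assignment in a condition is usually written,
assuming the new warning follows the existing -Wparentheses convention for
built-in assignment, where an extra pair of parentheses marks the assignment
as deliberate. The function assign_and_test is a hypothetical example, and
C++14 is assumed for the constexpr members.

```cpp
struct A
{
  int v = 1;
  constexpr A &operator=(int x) { v = x; return *this; }
  constexpr explicit operator bool() const { return v != 0; }
};

// Returns true when the result of the assignment converts to true.
constexpr bool assign_and_test(int x)
{
  A a;
  // Intentional assignment in a condition: the extra parentheses are the
  // conventional way to tell -Wparentheses that '=' was meant, not '=='.
  if ((a = x))
    return true;
  return false;
}
```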
Bootstrapped and regression tested on x86_64-pc-linux-gnu.
PR c++/25689
gcc/cp/ChangeLog:
* call.cc (extract_call_expr): Return a NULL_TREE on failure
instead of asserting.
(build_new_method_call): Suppress -Wparentheses diagnostic for
MODIFY_EXPR.
* semantics.cc (is_assignment_op_expr_p): Add function to check
if an expression is a call to an op= operator expression.
(maybe_convert_cond): Handle the case of an op= operator expression
for the -Wparentheses diagnostic.
Sebastian Huber [Thu, 31 Mar 2022 08:10:02 +0000 (10:10 +0200)]
gcov: Record EOF error during read
Use an enum for file error codes.
gcc/
* gcov-io.cc (gcov_file_error): New enum.
(gcov_var): Use gcov_file_error enum for the error member.
(gcov_open): Use GCOV_FILE_NO_ERROR.
(gcov_close): Use GCOV_FILE_WRITE_ERROR.
(gcov_write): Likewise.
(gcov_write_unsigned): Likewise.
(gcov_write_string): Likewise.
(gcov_read_bytes): Set error code if EOF is reached.
(gcov_read_counter): Use GCOV_FILE_COUNTER_OVERFLOW.
Sebastian Huber [Wed, 30 Mar 2022 19:40:55 +0000 (21:40 +0200)]
gcov-tool: Support file input from stdin
gcc/
* gcov-io.cc (GCOV_MODE_STDIN): Define.
(gcov_position): For gcov-tool, return calculated position if file is
stdin.
(gcov_open): For gcov-tool, use stdin if filename is NULL.
(gcov_close): For gcov-tool, do not close stdin.
(gcov_read_bytes): For gcov-tool, update position if file is stdin.
(gcov_sync): For gcov-tool, discard input if file is stdin.
Sebastian Huber [Wed, 30 Mar 2022 19:45:23 +0000 (21:45 +0200)]
gcov: Add __gcov_filename_to_gcfn()
gcc/
* doc/invoke.texi (fprofile-info-section): Mention
__gcov_filename_to_gcfn(). Use "freestanding" to match with C11
standard language. Fix minor example code issues.
* gcov-io.h (GCOV_FILENAME_MAGIC): Define and document.
gcc/testsuite/
* gcc.dg/gcov-info-to-gcda.c: Test __gcov_filename_to_gcfn().
libgcc/
* gcov.h (__gcov_info_to_gcda): Mention __gcov_filename_to_gcfn().
(__gcov_filename_to_gcfn): Declare and document.
* libgcov-driver.c (dump_string): New.
(__gcov_filename_to_gcfn): Likewise.
(__gcov_info_to_gcda): Adjust comment to match C11 standard language.
Sebastian Huber [Thu, 24 Mar 2022 20:59:21 +0000 (21:59 +0100)]
gcov: Add mode to all gcov_open()
gcc/
* gcov-io.cc (gcov_open): Always use the mode parameter.
* gcov-io.h (gcov_open): Declare it unconditionally.
libgcc/
* libgcov-driver-system.c (gcov_exit_open_gcda_file): Open file for
reading and writing.
* libgcov-util.c (read_gcda_file): Open file for reading.
* libgcov.h (gcov_open): Delete declaration.
Sebastian Huber [Wed, 23 Mar 2022 09:20:56 +0000 (10:20 +0100)]
gcov-tool: Allow merging of empty profile lists
The gcov_profile_merge() already had code to deal with profile information
which had no counterpart to merge with. For profile information from files
with no associated counterpart, the profile information is simply used as is
with the weighting transformation applied. Make sure that gcov_profile_merge()
works with an empty target profile list. Return the merged profile list.
gcc/
* gcov-tool.cc (gcov_profile_merge): Adjust return type.
(profile_merge): Allow merging of directories which contain no profile
files.
libgcc/
* libgcov-util.c (gcov_profile_merge): Return the list of merged
profiles. Accept empty target and source profile lists.
if (next_off >= r->size)
/* We should have already returned if this is the case. */
__analyzer_dump_path (); /* { dg-bogus "path" } */
}
where the analyzer erroneously considers this path, where
(next_off >= r->size) is both false and then true:
symbolic-12.c: In function ‘test_1a’:
symbolic-12.c:22:5: note: path
22 | __analyzer_dump_path (); /* { dg-bogus "path" } */
| ^~~~~~~~~~~~~~~~~~~~~~~
‘test_1a’: events 1-5
|
| 17 | if (next_off >= r->size)
| | ^
| | |
| | (1) following ‘false’ branch...
|......
| 20 | if (next_off >= r->size)
| | ~ ~~~~~~~
| | | |
| | | (2) ...to here
| | (3) following ‘true’ branch...
| 21 | /* We should have already returned if this is the case. */
| 22 | __analyzer_dump_path (); /* { dg-bogus "path" } */
| | ~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (4) ...to here
| | (5) here
|
The root cause is that, at the call to the external function, the
analyzer considers the cluster for *p to have been touched, binding it
to a conjured_svalue, but because p is void * no particular size is
known for the write, and so the cluster is bound using a symbolic key
covering the base region. Later, the accesses to r->size are handled by
binding_cluster::get_any_binding, but binding_cluster::get_binding fails
to find a match for the concrete field lookup, due to the key for the
binding being symbolic, and reaching this code:
1522 /* If this cluster has been touched by a symbolic write, then the content
1523 of any subregion not currently specifically bound is "UNKNOWN". */
1524 if (m_touched)
1525 {
1526 region_model_manager *rmm_mgr = mgr->get_svalue_manager ();
1527 return rmm_mgr->get_or_create_unknown_svalue (reg->get_type ());
1528 }
Hence each access to r->size is an unknown svalue, and thus the
condition (next_off >= r->size) isn't tracked, leading to the path with
contradictory conditions being treated as satisfiable.
In the original reproducer in git's reftable/reader.c, the call to the
external fn is:
reftable_record_type(rec)
which is considered to possibly write to *rec, which is *tab, where tab
is the void * argument to reftable_reader_seek_void, and thus after the
call to reftable_record_type some arbitrary amount of *rec could have
been written to.
This patch fixes things by detecting the "this cluster has been 'filled'
with a conjured value of unknown size" case, and handling
get_any_binding on it by returning a sub_svalue of the conjured_svalue,
so that repeated accesses to r->size give the same symbolic value, so
that the constraint manager rejects the bogus execution path, fixing the
false positive.
gcc/analyzer/ChangeLog:
PR analyzer/105285
* store.cc (binding_cluster::get_any_binding): Handle accessing
sub_svalues of clusters where the base region has a symbolic
binding.
gcc/testsuite/ChangeLog:
PR analyzer/105285
* gcc.dg/analyzer/symbolic-12.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Patrick Palka [Thu, 28 Apr 2022 17:10:56 +0000 (13:10 -0400)]
c++: partial ordering and dependent operator expr [PR105425]
Here ever since r12-6022-gbb2a7f80a98de3 we stopped deeming the partial
specialization #2 to be more specialized than #1 ultimately because
dependent operator expressions now have a DEPENDENT_OPERATOR_TYPE type
instead of an empty type, and this made unify stop deducing T(2) == 1
for K during partial ordering for #1 and #2.
This minimal patch fixes this by making the relevant logic in unify
treat DEPENDENT_OPERATOR_TYPE like an empty type.
PR c++/105425
gcc/cp/ChangeLog:
* pt.cc (unify) <case TEMPLATE_PARM_INDEX>: Treat
DEPENDENT_OPERATOR_TYPE like an empty type.
gcc/testsuite/ChangeLog:
* g++.dg/template/partial-specialization13.C: New test.
Jakub Jelinek [Thu, 28 Apr 2022 15:41:49 +0000 (17:41 +0200)]
i386: Improve ix86_expand_int_movcc
When working on PR105338, I've noticed that in some cases we emit an
unnecessarily long sequence, which then has a higher seq_cost than necessary.
E.g. when ix86_expand_int_movcc is called with
operands[0] (reg/v:SI 83 [ i ])
operands[1] (eq (reg/v:SI 83 [ i ]) (const_int 0 [0]))
operands[2] (reg/v:SI 83 [ i ])
operands[3] (const_int -2 [0xfffffffffffffffe])
i.e. r83 = r83 == 0 ? r83 : -2 which with my PR105338 patch is equivalent to
r83 = r83 == 0 ? 0 : -2, we emit:
(insn 24 0 25 (set (reg:CC 17 flags)
(compare:CC (reg/v:SI 83 [ i ])
(const_int 1 [0x1]))) 11 {*cmpsi_1}
(nil))
(insn 25 24 26 (parallel [
(set (reg:SI 85)
(if_then_else:SI (ltu:SI (reg:CC 17 flags)
(const_int 0 [0]))
(const_int -1 [0xffffffffffffffff])
(const_int 0 [0])))
(clobber (reg:CC 17 flags))
]) 1192 {*x86_movsicc_0_m1}
(nil))
(insn 26 25 27 (set (reg:SI 85)
(not:SI (reg:SI 85))) 683 {*one_cmplsi2_1}
(nil))
(insn 27 26 28 (parallel [
(set (reg:SI 85)
(and:SI (reg:SI 85)
(const_int -2 [0xfffffffffffffffe])))
(clobber (reg:CC 17 flags))
]) 533 {*andsi_1}
(nil))
(insn 28 27 0 (set (reg/v:SI 83 [ i ])
(reg:SI 85)) 81 {*movsi_internal}
(nil))
which has seq_cost (seq, true) 24. But it could have just cost 20
if we didn't decide to use a fresh temporary r85 and used r83 instead
- we could avoid the copy at the end.
The reason for it is in the 2 reg_overlap_mentioned_p calls,
the destination (out) indeed overlaps op0 - it is the same register,
but I don't see why that is a problem, this is in a code path where
we've already called
ix86_expand_carry_flag_compare (code, op0, op1, &compare_op)
earlier, so the fact that out overlaps op0 or op1 shouldn't matter
because insn 24 above is already emitted, we should just care if
it overlaps whatever we got from that ix86_expand_carry_flag_compare
call, i.e. compare_op, otherwise we can overwrite out just fine;
we also know at that point that the last 2 operands of ?: are constants.
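For readers less used to RTL, insns 24 to 27 above can be paraphrased in
C++ roughly as follows. The function name select_zero_or_minus2 is
hypothetical, and unsigned arithmetic is used so that the bit patterns are
well defined. It is a sketch of what the sequence computes, not of the
expander code.

```cpp
#include <cstdint>

// Mirrors insns 24-27 above for "x == 0 ? 0 : -2", using unsigned arithmetic.
constexpr std::uint32_t select_zero_or_minus2(std::uint32_t x)
{
  // cmp x, 1 sets the carry flag when x < 1 (unsigned), i.e. when x == 0;
  // the *x86_movsicc_0_m1 pattern then yields -1 if carry is set, else 0.
  std::uint32_t mask = (x < 1u) ? 0xffffffffu : 0u;
  mask = ~mask;            // insn 26: all ones when x != 0
  mask &= 0xfffffffeu;     // insn 27: and with -2
  return mask;             // 0 when x == 0, 0xfffffffe (-2) otherwise
}
```

None of these steps reads x after the compare, which is why the result can
be built in the destination register directly when it does not overlap
compare_op.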
2022-04-28 Jakub Jelinek <jakub@redhat.com>
* config/i386/i386-expand.cc (ix86_expand_int_movcc): Create a
temporary only if out overlaps compare_op, not when it overlaps
op0 or op1.
Jakub Jelinek [Thu, 28 Apr 2022 14:22:42 +0000 (16:22 +0200)]
Update crontab and git_update_version.py
2022-04-28 Jakub Jelinek <jakub@redhat.com>
maintainer-scripts/
* crontab: Snapshots from trunk are now GCC 13 related.
Add GCC 12 snapshots from the respective branch.
contrib/
* gcc-changelog/git_update_version.py (active_refs): Add
releases/gcc-12.
Jakub Jelinek [Thu, 28 Apr 2022 13:45:33 +0000 (15:45 +0200)]
cgraph: Don't verify semantic_interposition flag for aliases [PR105399]
The following testcase ICEs, because the ctors during cc1plus all have
!opt_for_fn (decl, flag_semantic_interposition) - they have NULL
DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) and optimization_default_node
is for -Ofast and so has flag_semantic_interposition cleared.
During free lang data, we set DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl)
for the ctor which has a body (or for thunks), but don't touch it for
aliases.
During lto1 optimization_default_node reflects the lto1 flags which
are -O2 rather than -Ofast and so has flag_semantic_interposition
set. For the ctor which has a body that makes no difference, but as the
alias still doesn't have DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) set,
we now trigger this verification check.
The following patch just doesn't verify it for aliases during lto1.
Another possibility would be to set DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl)
during free lang data even for aliases.
2022-04-28 Jakub Jelinek <jakub@redhat.com>
PR lto/105399
* cgraph.cc (cgraph_node::verify_node): Don't verify
semantic_interposition flag against
opt_for_fn (decl, flag_semantic_interposition) for aliases in lto1.
c++, coroutines: Improve check for throwing final await [PR104051].
We check that the final_suspend () method returns a sane type (i.e. a class
or structure) but, unfortunately, that check has to be later than the one
for a throwing case. If the user returns some nonsensical type from the
method, we need to handle that in the checking for noexcept.
c++, coroutines: Account for overloaded promise return_value() [PR105301].
Whether it was intended or not, it is possible to define a coroutine promise
with multiple return_value() methods [which need not even have the same type].
We were not accounting for this possibility in the check to see whether both
return_value and return_void are specified (which is prohibited by the
standard). Fixed thus and provided an adjusted diagnostic for the case that
multiple return_value() methods are present.
c++, coroutines: Make sure our temporaries are in a bind expr [PR105287]
There are a few cases where we can generate a temporary that does not need
to be added to the coroutine frame (i.e. these are genuinely ephemeral). The
intent was that unnamed temporaries should not be 'promoted' to coroutine
frame entries. However there was a thinko and these were not actually ever
added to the bind expressions being generated for the expanded awaits. This
meant that they were showing in the global namespace, leading to an empty
DECL_CONTEXT and the ICE reported.
* coroutines.cc (maybe_promote_temps): Ensure generated temporaries
are added to the bind expr.
(add_var_to_bind): Fix local var naming to use portable punctuation.
(register_local_var_uses): Do not add synthetic names to unnamed
temporaries.
c++, coroutines: Avoid expanding within templates [PR103868]
This is a forward-port of a patch by Nathan (against 10.x) which fixes an open
PR.
We are ICEing because we ended up tsubst_copying something that had already
been tsubst, leading to an assert failure (mostly such repeated tsubsting is
harmless).
We had a non-dependent co_await in a non-dependent-type template fn, so we
processed it at definition time, and then reprocessed at instantiation time.
We fix this here by deferring substitution while processing templates.
Additional observations (for a better future fix, in the GCC13 timescale):
Exprs only have dependent type if at least one operand is dependent which was
what the current code was intending to do. Coroutines have the additional
wrinkle, that the current fn's type is an implicit operand.
So, if the coroutine function's type is not dependent, and the operand is not
dependent, we should determine the type of the co_await expression using the
DEPENDENT_EXPR wrapper machinery. That allows us to determine the
subexpression type, but leave its operand unchanged and then instantiate it
later.
PR c++/103868
gcc/cp/ChangeLog:
* coroutines.cc (finish_co_await_expr): Do not process non-dependent
coroutine expressions at template definition time.
(finish_co_yield_expr): Likewise.
(finish_co_return_stmt): Likewise.
Jonathan Wakely [Thu, 28 Apr 2022 12:06:31 +0000 (13:06 +0100)]
libstdc++: Fix error reporting in filesystem::copy [PR99290]
The recursive calls to filesystem::copy should stop if any of them
reports an error.
libstdc++-v3/ChangeLog:
PR libstdc++/99290
* src/c++17/fs_ops.cc (fs::copy): Pass error_code to
directory_iterator constructor, and check on each iteration.
* src/filesystem/ops.cc (fs::copy): Likewise.
* testsuite/27_io/filesystem/operations/copy.cc: Check for
errors during recursion.
* testsuite/experimental/filesystem/operations/copy.cc:
Likewise.
Jakub Jelinek [Thu, 28 Apr 2022 10:33:59 +0000 (12:33 +0200)]
i386: Fix up ix86_gimplify_va_arg [PR105331]
On the following testcase we emit a bogus
'va_arg_tmp.5' may be used uninitialized
warning. The reason is that when gimplifying the addr = &temp;
statement, the va_arg_tmp temporary var for which we emit ADDR_EXPR
is not TREE_ADDRESSABLE, so prepare_gimple_addressable emits some extra
code to initialize the newly addressable var from its previous value.
But it is a new variable which hasn't been initialized yet and will
be later, so we end up initializing it with an uninitialized SSA_NAME:
va_arg_tmp.6 = va_arg_tmp.5_14(D);
addr.2_16 = &va_arg_tmp.6;
_17 = MEM[(double *)sse_addr.4_13];
MEM[(double * {ref-all})addr.2_16] = _17;
and with -O1 we actually don't DSE it before the warning is emitted.
If we make the temp TREE_ADDRESSABLE before the gimplification, then
this prepare_gimple_addressable path isn't taken and we effectively
omit the first statement above and so the bogus warning is gone.
I went through other backends and didn't find another instance of this
problem.
2022-04-28 Jakub Jelinek <jakub@redhat.com>
PR target/105331
* config/i386/i386.cc (ix86_gimplify_va_arg): Mark va_arg_tmp
temporary TREE_ADDRESSABLE before trying to gimplify ADDR_EXPR
of it.
Richard Biener [Wed, 27 Apr 2022 12:06:12 +0000 (14:06 +0200)]
tree-optimization/105219 - bogus max iters for vectorized epilogue
The following makes sure to take into account prologue peeling
when trying to narrow down the maximum number of iterations
computed for the vectorized epilogue. A similar issue exists when
peeling for gaps.
2022-04-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/105219
* tree-vect-loop.cc (vect_transform_loop): Disable
special code narrowing the vectorized epilogue max
iterations when peeling for alignment or gaps was in effect.
Kewen Lin [Thu, 28 Apr 2022 03:34:27 +0000 (22:34 -0500)]
testsuite: Add test case for pack/unpack bifs at soft-float [PR105334]
This patch adds test coverage for the two recent fixes
r12-8091 and r12-8226 from Segher. AIX is skipped since it treats
soft-float and long-double-128 as incompatible.
testsuite: Skip target not support -pthread [PR104676].
The "ftree-parallelize-loops=" option implies -pthread in gcc/gcc.cc.
Some targets do not support pthread, such as ELF targets that use newlib,
and will get an error:
Marek Polacek [Tue, 26 Apr 2022 20:12:58 +0000 (16:12 -0400)]
c++: enum in generic lambda at global scope [PR105398]
We crash compiling this test since r11-7993 which changed
lookup_template_class_1 so that we only call tsubst_enum when
!uses_template_parms (current_nonlambda_scope ())
But here current_nonlambda_scope () is the global NAMESPACE_DECL ::, which
doesn't have a type, therefore is considered type-dependent. So we don't
call tsubst_enum, and crash in tsubst_copy/CONST_DECL because we didn't
find the e1 enumerator.
I don't think any namespace can depend on any template parameter, so
this patch tweaks uses_template_parms.
PR c++/105398
gcc/cp/ChangeLog:
* pt.cc (uses_template_parms): Return false for any NAMESPACE_DECL.
Jakub Jelinek [Wed, 27 Apr 2022 16:47:10 +0000 (18:47 +0200)]
testsuite: Add testcase for dangling pointer equality bogus warning [PR104492]
On Wed, Apr 27, 2022 at 12:02:33PM +0200, Richard Biener wrote:
> I did that but the reduction result did not resemble the same failure
> mode. I've failed to manually construct a testcase as well. Possibly
> a testcase using libstdc++ but less Qt internals might be possible.
Here is a testcase that I've managed to reduce, FAILs with:
FAIL: g++.dg/warn/pr104492.C -std=gnu++14 (test for bogus messages, line 111)
FAIL: g++.dg/warn/pr104492.C -std=gnu++17 (test for bogus messages, line 111)
FAIL: g++.dg/warn/pr104492.C -std=gnu++20 (test for bogus messages, line 111)
on both x86_64-linux and i686-linux without your commit and passes with it.
2022-04-27 Jakub Jelinek <jakub@redhat.com>
PR middle-end/104492
* g++.dg/warn/pr104492.C: New test.
Thomas Koenig [Wed, 27 Apr 2022 16:40:18 +0000 (18:40 +0200)]
Split test to remove failing run time test and add check for ICE.
gcc/testsuite/ChangeLog:
PR fortran/70673
PR fortran/78054
* gfortran.dg/pr70673.f90: Remove invalid statement.
* gfortran.dg/pr70673_2.f90: New test to check that
ICE does not re-appear.
The following patch updates baseline_symbols.txt on arches where I have
latest libstdc++ builds (my ws + Fedora package builds).
I've manually excluded:
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intB5cxx11IjEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intB5cxx11IlEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intB5cxx11ImEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intB5cxx11ItEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intB5cxx11IxEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14_M_extract_intB5cxx11IyEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE14_M_extract_intB5cxx11IjEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE14_M_extract_intB5cxx11IlEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE14_M_extract_intB5cxx11ImEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE14_M_extract_intB5cxx11ItEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE14_M_extract_intB5cxx11IxEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
+FUNC:_ZNKSt17__gnu_cxx_ieee1287num_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE14_M_extract_intB5cxx11IyEES4_S4_S4_RSt8ios_baseRSt12_Ios_IostateRT_@@GLIBCXX_IEEE128_3.4.29
additions on ppc64le as those look unexpected.
Those symbols didn't show up in Fedora 11.3.1 build with recent glibc,
while other GLIBCXX_IEEE128_3.4.29 symbols are in 11.x already.
What this patch includes are only @@GLIBCXX_3.4.30 symbol additions, same
symbols on all files, except that powerpc64 adds also
_ZNSt17__gnu_cxx_ieee12816__convert_from_vERKP15__locale_structPciPKcz@@GLIBCXX_IEEE128_3.4.30
so everything included in the patch looks right to me.
Jonathan Wakely [Wed, 27 Apr 2022 13:29:34 +0000 (14:29 +0100)]
libstdc++: Add pretty printer for std::atomic
For the atomic specializations for shared_ptr and weak_ptr we can reuse
the existing SharedPointerPrinter, with a small tweak.
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (SharedPointerPrinter): Add
support for atomic<shared_ptr<T>> and atomic<weak_ptr<T>>.
(StdAtomicPrinter): New printer.
(build_libstdcxx_dictionary): Register new printer.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Test std::atomic.
* testsuite/libstdc++-prettyprinters/cxx20.cc: Test atomic smart
pointers.
Richard Biener [Mon, 25 Apr 2022 08:46:16 +0000 (10:46 +0200)]
middle-end/104492 - avoid all equality compare dangling pointer diags
The following extends the equality compare dangling pointer diagnostics
suppression for uses following free or realloc to also cover those
following invalidation of auto variables via CLOBBERs. That avoids
diagnosing idioms like
for auto candidates which are prone to forwarding of the final
comparison across the storage invalidation as then seen by the
late run access warning pass.
2022-04-25 Richard Biener <rguenther@suse.de>
PR middle-end/104492
* gimple-ssa-warn-access.cc
(pass_waccess::warn_invalid_pointer): Exclude equality compare
diagnostics for all kinds of invalidations.
(pass_waccess::check_dangling_uses): Fix post-dominator query.
(pass_waccess::check_pointer_uses): Likewise.
This caused ICEs as the comparison of array spec supported only constant
explicit bounds, but dummy class variable descriptor types can have a
_data field with non-constant array spec bounds.
This change adds support for non-constant bounds. For that,
gfc_dep_compare_expr is used. It does probably more than strictly
necessary, but using it avoids rewriting a specific comparison function,
making mistakes and forgetting cases.
PR fortran/103662
PR fortran/105379
gcc/fortran/ChangeLog:
* array.cc (compare_bounds): Use bool as return type.
Support non-constant expressions.
(gfc_compare_array_spec): Update call to compare_bounds.
gcc/testsuite/ChangeLog:
* gfortran.dg/class_dummy_8.f90: New test.
* gfortran.dg/class_dummy_9.f90: New test.
Mikael Morin [Wed, 27 Apr 2022 09:36:00 +0000 (11:36 +0200)]
fortran: Avoid infinite self-recursion [PR105381]
Dummy array decls are local decls different from the argument decl
accessible through GFC_DECL_SAVED_DESCRIPTOR. If the argument decl has
a DECL_LANG_SPECIFIC set, it is copied over to the local decl at the
time the latter is created, so that the DECL_LANG_SPECIFIC object is
shared between local dummy decl and argument decl, and thus the
GFC_DECL_SAVED_DESCRIPTOR of the argument decl is the argument decl
itself.
The r12-8230-g7964ab6c364c410c34efe7ca2eba797d36525349 change introduced
the non_negative_strides_array_p predicate which recurses through
GFC_DECL_SAVED_DESCRIPTOR to avoid seeing dummy decls as purely local
decls. As the GFC_DECL_SAVED_DESCRIPTOR of the argument decl is itself,
this can cause infinite recursion.
This change adds a check to avoid infinite recursion.
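The guard described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Decl`, `saved_descriptor`, `known_non_negative` and `has_non_negative_strides` are hypothetical stand-ins for a decl, GFC_DECL_SAVED_DESCRIPTOR and the non_negative_strides_array_p predicate. It is not the code in trans-array.cc.

```cpp
#include <cassert>

// Hypothetical stand-in for a decl; this is not a GCC type.
struct Decl {
  const Decl* saved_descriptor = nullptr;  // models GFC_DECL_SAVED_DESCRIPTOR
  bool known_non_negative = false;         // property the predicate looks for
};

// Follows saved_descriptor links, but stops when a decl refers to itself.
bool has_non_negative_strides(const Decl* decl) {
  if (decl == nullptr)
    return false;
  if (decl->known_non_negative)
    return true;
  const Decl* next = decl->saved_descriptor;
  if (next != nullptr && next != decl)  // guard against self-reference
    return has_non_negative_strides(next);
  return false;
}

// Builds the situation from the text: a local dummy decl pointing at an
// argument decl whose saved descriptor is the argument decl itself.
bool self_referencing_chain_terminates() {
  Decl argument;
  argument.saved_descriptor = &argument;
  Decl local_dummy;
  local_dummy.saved_descriptor = &argument;
  return !has_non_negative_strides(&local_dummy);
}
```

Without the `next != decl` comparison, the call on `local_dummy` in this model would recurse on `argument` forever.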
PR fortran/102043
PR fortran/105381
gcc/fortran/ChangeLog:
* trans-array.cc (non_negative_strides_array_p): Inline variable
orig_decl and merge nested if conditions. Add condition to not
recurse if the next argument is the same as the current.
gcc/testsuite/ChangeLog:
* gfortran.dg/character_array_dummy_1.f90: New test.
The test is UNSUPPORTED with the first three ones (because of
-mcpu=cortex-a7), ignored with armv6s-m, and PASSes with all the other
ones, while it used to crash without Jakub's fix (r12-8263), i.e. FAIL
with options 5,6,7,8,10. The test passed without Jakub's fix with
option 9 because the problem happens only with an integer-only MVE.
Andreas Krebbel [Wed, 27 Apr 2022 07:20:41 +0000 (09:20 +0200)]
PR102024 - IBM Z: Add psabi diagnostics
For IBM Z in particular there is a problem with structs like:
struct A { float a; int :0; };
Our ABI document allows passing a struct in an FPR only if it has
exactly one member. On the other hand it says that structs of 1,2,4,8
bytes are passed in a GPR. So this struct is expected to be passed in
a GPR. Since we don't return structs in registers (regardless of the
number of members) it is always returned in memory.
The situation is as follows:
All compiler versions tested return it in memory - as expected.
gcc 11, gcc 12, g++ 12, and clang 13 pass it in a GPR - as expected.
g++ 11 as well as clang++ 13 pass it in an FPR.
For IBM Z we stick to the current GCC 12 behavior, i.e. zero-width
bitfields are NOT ignored. A struct as above will be passed in a
GPR. The rationale behind this is that not affecting the C ABI is more
important here.
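For reference, the struct in question can be written down as a small compilable sketch. The function name `read_a` is hypothetical, and the size check assumes a common target where float and int are both 4 bytes. Which register carries the argument cannot be observed from portable code; the difference discussed above would only show up in the assembly generated for the callee on s390x.

```cpp
#include <cassert>

// The struct from the text: one float member plus a zero-width bit-field.
struct A { float a; int : 0; };

// The zero-width bit-field adds no storage here, so the struct is still
// the size of its single float member (assumption: 4-byte float and int).
static_assert(sizeof(A) == sizeof(float),
              "zero-width bit-field adds no storage");

// Hypothetical callee taking A by value. Whether the argument arrives in
// a GPR or an FPR is decided by the psABI rules described above.
float read_a(A arg) { return arg.a; }
```

Compiling the same translation unit as C and as C++ and comparing the code for `read_a` is one way to check that both front ends agree on the convention.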
A patch for clang is in progress: https://reviews.llvm.org/D122388
In addition to the usual regression test I ran the compat and
struct-layout-1 testsuites comparing the compiler before and after the
patch.
gcc/testsuite/ChangeLog:
PR target/102024
* g++.target/s390/pr102024-1.C: New test.
* g++.target/s390/pr102024-2.C: New test.
* g++.target/s390/pr102024-3.C: New test.
* g++.target/s390/pr102024-4.C: New test.
* g++.target/s390/pr102024-5.C: New test.
* g++.target/s390/pr102024-6.C: New test.
Jakub Jelinek [Wed, 27 Apr 2022 06:34:18 +0000 (08:34 +0200)]
asan: Fix up asan_redzone_buffer::emit_redzone_byte [PR105396]
On the following testcase, we have in main's frame 3 variables,
some red zone padding, 4 byte d, followed by 12 bytes of red zone padding, then
8 byte b followed by 24 bytes of red zone padding, then 40 bytes c followed
by some red zone padding.
The intended content of shadow memory for that is (note, each byte describes
8 bytes of memory):
f1 f1 f1 f1  04 f2  00  f2 f2 f2  00 00 00 00 00  f3 f3 f3 f3 f3
left red     d  mr  b   middle r  c               right red zone
f1 is left red zone magic
f2 is middle red zone magic
f3 is right red zone magic
00 when all 8 bytes are accessible
01-07 when only 1 to 7 bytes are accessible followed by inaccessible bytes
The -fdump-rtl-expand-details dump makes it clear that it misbehaves:
Flushing rzbuffer at offset -160 with: f1 f1 f1 f1
Flushing rzbuffer at offset -128 with: 04 f2 00 00
Flushing rzbuffer at offset -128 with: 00 00 00 f2
Flushing rzbuffer at offset -96 with: f2 f2 00 00
Flushing rzbuffer at offset -64 with: 00 00 00 f3
Flushing rzbuffer at offset -32 with: f3 f3 f3 f3
In the end we end up with
f1 f1 f1 f1 00 00 00 f2 f2 f2 00 00 00 00 00 f3 f3 f3 f3 f3
shadow bytes because at offset -128 there are 2 overlapping stores
as asan_redzone_buffer::emit_redzone_byte has flushed the temporary 4 byte
buffer in the middle.
The function is called with an offset and value. If the passed offset is
consecutive with the prev_offset + buffer size (off == offset), then
we handle it correctly, similarly if the new offset is far enough from the
old one (we then flush whatever was in the buffer and if needed add up to 3
bytes of 00 before actually pushing value).
But what isn't handled correctly is when the offset isn't consecutive to
what has been added last time, but it is in the same 4 byte word of shadow
memory (32 bytes of actual memory), like the above case where
we have consecutive 04 f2 and then skip one shadow memory byte (aka 8 bytes
of real memory) and then want to emit f2. Emitting that as a store
of little-endian 0x0000f204 followed by a store of 0xf2000000 to the same
address doesn't work; we want to emit a single store of 0xf200f204.
The following patch does that by pushing one or two 00 bytes into the buffer.
Additionally, as a small cleanup, instead of using
m_shadow_bytes.safe_push (value);
flush_if_full ();
in all of if, else if and else bodies it sinks those 2 stmts to the end
of function as all do the same thing.
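The overlapping-store problem and the padding fix can be replayed with a small toy model. This is a sketch under stated assumptions, not the code in asan.cc: `RedzoneModel`, `replay_layout` and the `pad_small_gaps` switch are hypothetical names, shadow bytes are addressed by index rather than by frame offset, and every flush is modeled as one full 4-byte store padded with 00.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWord = 4;  // shadow bytes per store in this model

// Toy model of a 4-byte red zone buffer; all names are hypothetical.
class RedzoneModel {
 public:
  RedzoneModel(std::size_t shadow_size, bool pad_small_gaps)
      : shadow_(shadow_size, 0), pad_small_gaps_(pad_small_gaps) {}

  // Record VALUE for the shadow byte at INDEX (indices must not decrease).
  void emit(std::size_t index, std::uint8_t value) {
    const std::size_t next = start_ + buf_.size();
    if (!buf_.empty() && index == next) {
      // Consecutive shadow byte: nothing to prepare.
    } else if (pad_small_gaps_ && !buf_.empty() && index < start_ + kWord) {
      // Small gap inside the current word: fill it with 00 bytes.
      buf_.resize(index - start_, 0);
    } else {
      // Start a new word: flush, then pad from the word start up to INDEX.
      flush();
      start_ = index - index % kWord;
      buf_.assign(index - start_, 0);
    }
    buf_.push_back(value);
    if (buf_.size() == kWord)
      flush();
  }

  // Store the buffered bytes as one full word, padding the tail with 00.
  void flush() {
    if (buf_.empty())
      return;
    buf_.resize(kWord, 0);
    for (std::size_t i = 0; i < kWord; ++i)
      shadow_[start_ + i] = buf_[i];
    buf_.clear();
  }

  const std::vector<std::uint8_t>& shadow() const { return shadow_; }

 private:
  std::vector<std::uint8_t> shadow_;
  bool pad_small_gaps_;
  std::size_t start_ = 0;          // index of the first buffered byte
  std::vector<std::uint8_t> buf_;  // pending bytes of the current word
};

// Replays the non-zero shadow bytes of the layout from the text.
std::vector<std::uint8_t> replay_layout(bool pad_small_gaps) {
  RedzoneModel model(20, pad_small_gaps);
  for (std::size_t i = 0; i < 4; ++i) model.emit(i, 0xf1);    // left red zone
  model.emit(4, 0x04);                                         // 4 byte d
  model.emit(5, 0xf2);                                         // red zone after d
  for (std::size_t i = 7; i < 10; ++i) model.emit(i, 0xf2);   // red zone after b
  for (std::size_t i = 15; i < 20; ++i) model.emit(i, 0xf3);  // right red zone
  model.flush();
  return model.shadow();
}
```

In this model `replay_layout(false)` flushes the word at index 4 twice and loses the `04 f2` pair, while `replay_layout(true)` pads the gap inside the buffer and stores the word once.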
2022-04-27 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/105396
* asan.cc (asan_redzone_buffer::emit_redzone_byte): Handle the case
where offset is bigger than off but smaller than m_prev_offset + 32
bits by pushing one or more 0 bytes. Sink the
m_shadow_bytes.safe_push (value); flush_if_full (); statements from
all cases to the end of the function.
Kewen Lin [Tue, 26 Apr 2022 11:34:24 +0000 (06:34 -0500)]
rs6000: Move V2DI vec_neg under power8-vector [PR105271]
As PR105271 shows, __builtin_altivec_neg_v2di requires option
-mpower8-vector as its pattern expansion relies on subv2di which
has guard VECTOR_UNIT_P8_VECTOR_P (V2DImode). This fix is to move
the related lines for __builtin_altivec_neg_v2di to the section
of stanza power8-vector.
PR target/105271
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (NEG_V2DI): Move to [power8-vector]
stanza.
Jason Merrill [Tue, 26 Apr 2022 04:19:40 +0000 (00:19 -0400)]
c++: pack init-capture of unresolved overload [PR102629]
Here we were failing to diagnose that the initializer for the capture pack
is an unresolved overload. It turns out that the reason we didn't recognize
the deduction failure in do_auto_deduction was that the individual 'auto' in
the expansion of the capture pack was still marked as a parameter pack, so
we were deducing it to an empty pack instead of failing.
PR c++/102629
gcc/cp/ChangeLog:
* pt.cc (gen_elem_of_pack_expansion_instantiation): Clear
TEMPLATE_TYPE_PARAMETER_PACK on auto.
Patrick Palka [Tue, 26 Apr 2022 14:53:38 +0000 (10:53 -0400)]
c++: decltype of non-dependent call of class type [PR105386]
We need to pass tf_decltype when instantiating a non-dependent decltype
operand, like tsubst does in the dependent case, so that we don't force
completion of a prvalue operand's class type.
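The language rule behind this can be shown with a short sketch: a function call that is a prvalue of class type does not need a complete type when it is the operand of decltype. The names `Opaque`, `make_opaque`, `NoInst`, `make` and `Holder` are hypothetical and chosen for illustration; this is not the testcase from the PR.

```cpp
#include <cassert>
#include <type_traits>

struct Opaque;         // declared but never defined
Opaque make_opaque();  // declared only; never called

// A prvalue call as a decltype operand does not need a complete type.
using Result = decltype(make_opaque());
static_assert(std::is_same<Result, Opaque>::value,
              "decltype names the incomplete type");

// Instantiating this class template would trip the static_assert.
template <class T> struct NoInst {
  static_assert(sizeof(T) == 9999, "NoInst must not be instantiated");
};
template <class T> NoInst<T> make(T);

// The decltype operand is non-dependent even though it sits in a template.
template <class...> struct Holder {
  using type = decltype(make(0));
};
static_assert(std::is_same<Holder<int>::type, NoInst<int>>::value,
              "NoInst<int> is named but never completed");
```

If the compiler forced completion of the operand's class type, the second half would fail inside `NoInst<int>` instead of compiling cleanly.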
PR c++/105386
gcc/cp/ChangeLog:
* semantics.cc (finish_decltype_type): Pass tf_decltype to
instantiate_non_dependent_expr_sfinae.
Martin Liska [Tue, 26 Apr 2022 07:56:37 +0000 (09:56 +0200)]
lto: use diagnostics_context in print_lto_docs_link
Properly parse OPT_fdiagnostics_urls_ and then initialize both urls
and colors for global_dc. That way the configure option
--with-documentation-root-url is followed and -fdiagnostics-urls is
respected. In addition, warning and note messages are printed in color.
libphobos: Don't call free on the TLS array in the emutls destroy function.
Fixes a segfault seen on Darwin when a GC scan is run after a thread has
been destroyed, as the global emutlsArrays hash still has a reference
to the array itself and tries to iterate all elements.
Setting the length to zero frees all allocated elements in the array,
and ensures that it is skipped when the _d_emutls_scan is called.
This DR was approved at the February 2022 plenary.
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_atomic.h (atomic<shared_ptr>): Add
constructor for constant initialization from nullptr_t.
* testsuite/20_util/shared_ptr/atomic/atomic_shared_ptr.cc:
Check for new constructor.
The following testcase regressed on riscv due to the splitting of critical
edges in the sink pass. As on x86_64, compared to GCC 11 we now swap
the edges, i.e. which of the true and false edges goes to an empty forwarded bb.
From GIMPLE POV, those 2 forms are equivalent, but as can be seen here, for
some ifcvt opts it matters one way or another.
On this testcase, noce_try_store_flag_mask used to trigger and transformed
if (pseudo2) pseudo1 = 0;
into
pseudo1 &= -(pseudo2 == 0);
But with the swapped edges ifcvt actually sees
if (!pseudo2) pseudo3 = pseudo1; else pseudo3 = 0;
and noce_try_store_flag_mask punts. IMHO there is no reason why it
should punt those, it is equivalent to
pseudo3 = pseudo1 & -(pseudo2 == 0);
and especially if the target has 3 operand AND, it shouldn't be any more
costly (and even with 2 operand AND, it might very well happen that RA
can make it happen without any extra moves).
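The equivalence of the branchy and masked forms can be checked at the source level with a small sketch. The function names are hypothetical, and the pseudos are modeled as 32-bit unsigned values; the actual transformation happens on RTL in ifcvt.cc.

```cpp
#include <cassert>
#include <cstdint>

// First pattern, branchy form: if (pseudo2) pseudo1 = 0;
std::uint32_t clear_if_branch(std::uint32_t pseudo1, std::uint32_t pseudo2) {
  if (pseudo2)
    pseudo1 = 0;
  return pseudo1;
}

// Second pattern, as seen with swapped edges:
// if (!pseudo2) pseudo3 = pseudo1; else pseudo3 = 0;
std::uint32_t select_branch(std::uint32_t pseudo1, std::uint32_t pseudo2) {
  std::uint32_t pseudo3;
  if (!pseudo2)
    pseudo3 = pseudo1;
  else
    pseudo3 = 0;
  return pseudo3;
}

// Mask form: pseudo3 = pseudo1 & -(pseudo2 == 0);
std::uint32_t select_mask(std::uint32_t pseudo1, std::uint32_t pseudo2) {
  // (pseudo2 == 0) is 0 or 1; negating it gives 0 or all-ones.
  const std::uint32_t mask = -static_cast<std::uint32_t>(pseudo2 == 0);
  return pseudo1 & mask;
}

// True when all three forms produce the same value for the given inputs.
bool forms_agree(std::uint32_t pseudo1, std::uint32_t pseudo2) {
  const std::uint32_t expected = clear_if_branch(pseudo1, pseudo2);
  return select_branch(pseudo1, pseudo2) == expected &&
         select_mask(pseudo1, pseudo2) == expected;
}
```

The mask form needs one comparison, one negation and one AND, which is why a target with a three-operand AND should not pay extra for the separate destination.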
Initially I've just removed the rtx_equal_p calls from the conditions
and didn't add anything there, but that broke aarch64 bootstrap and
regressed some testcases on x86_64, where if_info->a or if_info->b could be
some larger expression that we can't force into a register.
Furthermore, the case where both if_info->a and if_info->b are constants is
better handled by other ifcvt optimizations like noce_try_store_flag
or noce_try_inverse_constants or noce_try_store_flag_constants.
So, I've restricted it to just a REG (perhaps SUBREG of REG might be ok too)
next to what has been handled previously.
2022-04-26 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/105314
* ifcvt.cc (noce_try_store_flag_mask): Don't require that the non-zero
operand is equal to if_info->x, instead use the non-zero operand
as one of the operands of AND with if_info->x as target.
Jakub Jelinek [Tue, 26 Apr 2022 07:57:34 +0000 (09:57 +0200)]
reassoc: Don't call fold_convert if !fold_convertible_p [PR105374]
As mentioned in the PR, we ICE because maybe_fold_*_comparisons returns
an expression with V4SImode type and we try to fold_convert it to
V4BImode, which isn't allowed.
IMHO no matter whether we change maybe_fold_*_comparisons we should
play safe on the reassoc side and punt if we can't convert like
we punt for many other reasons. This fixes the testcase on ARM.
Testcase not included, not exactly sure where and what directives it
should have in gcc.target/arm/ testsuite. Christophe, do you think you
could handle that incrementally?
2022-04-26 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/105374
* tree-ssa-reassoc.cc (eliminate_redundant_comparison): Punt if
!fold_convertible_p rather than assuming fold_convert must succeed.
Jakub Jelinek [Tue, 26 Apr 2022 07:52:22 +0000 (09:52 +0200)]
testsuite: Fix up g++.target/i386/vec-tmpl1.C testcase [PR65211]
This test fails on i686-linux:
Excess errors:
.../gcc/testsuite/g++.target/i386/vec-tmpl1.C:13:27: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi]
2022-04-26 Jakub Jelinek <jakub@redhat.com>
PR c++/65211
* g++.target/i386/vec-tmpl1.C: Add -Wno-psabi as
dg-additional-options.
Jakub Jelinek [Tue, 26 Apr 2022 07:40:03 +0000 (09:40 +0200)]
i386: Fix up ICE with -mveclibabi={acml,svml} [PR105367]
The following testcase ICEs, because conversions between scalar float types
which have the same mode are useless in GIMPLE, but for mathfn_built_in the
exact type matters (it treats say double and _Float64 or float and _Float32
differently, using different suffixes and for the _Float* sometimes
returning NULL when float/double do have a builtin).
In ix86_veclibabi_{svml,acml} we are using mathfn_built_in just so that
we don't have to translate the combined_fn and SFmode vs. DFmode into
strings ourselves, and we already punt earlier on anything but SFmode and
DFmode. So, this patch just uses the double or float types depending
on the modes, rather than the types we actually got and which might be
_Float64 or _Float32 etc.
2022-04-26 Jakub Jelinek <jakub@redhat.com>
PR target/105367
* config/i386/i386.cc (ix86_veclibabi_svml, ix86_veclibabi_acml): Pass
el_mode == DFmode ? double_type_node : float_type_node instead of
TREE_TYPE (type_in) as first arguments to mathfn_built_in.
On Mon, Apr 25, 2022 at 01:38:25PM +0200, Mikael Morin wrote:
> I have just pushed the attached fix for two UNRESOLVED checks at -O0 that I
> hadn’t seen.
I don't like forcing of DSE in -O0 compilation, wouldn't it be better
to just not check the dse dump at -O0 like in the following patch?
Even better would be to check that the z._data = stores are both present
in *.optimized dump, but that doesn't really work at -O2 or above because
we inline the functions and optimize it completely away (both the stores
and corresponding reads).
The first hunk is needed so that __OPTIMIZE__ effective target works in
Fortran testsuite, otherwise one gets a pedantic error and __OPTIMIZE__
is considered not to match at all.
2022-04-26 Jakub Jelinek <jakub@redhat.com>
PR fortran/103662
* lib/target-supports.exp (check_effective_target___OPTIMIZE__): Add
a var definition to avoid pedwarn about empty translation unit.
* gfortran.dg/unlimited_polymorphic_3.f03: Remove -ftree-dse from
dg-additional-options, guard scan-tree-dump-not directives on
__OPTIMIZE__ target.