Jakub Jelinek [Tue, 15 Dec 2020 08:51:28 +0000 (09:51 +0100)]
i386: Make -march=x86-64-v[234] behave more like other -march= options
If somebody has -march=x86-64-v2 (or -v3 or -v4) in $CFLAGS, $CXXFLAGS etc.,
then -m32 or -mabi=ms stops working.
What is worse, if one configures gcc --with-arch-64=x86-64-v2 (or -v3 or -v4),
then -mabi=ms stops working.
I think that is a nightmare user experience. It is ok that x86-64-v[234]
behave slightly differently from other -march= options (in that they imply
unless overridden -mtune=generic rather than -mtune= equal to the -march
argument), but the error when one mixes it with -mabi=ms, or -m32 doesn't
improve anything.
It is true that the exact option set is only defined in the x86-64 psABI
(IMHO that is a mistake too, we should copy that into the GCC documentation
like we document it for any other -march= option), but there is no reason
why that exact set of CPU features can't be used for other ABIs, it is just
a set of CPU features. If we add micro-architecture levels to the 32-bit
ABI (I doubt anyone wants to do that, but just hypothetically), then those
micro-architecture levels certainly wouldn't be called x86-64-v* but perhaps
i386-v*.
In the tests, __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 can't be expected on -m32
not because the CPU feature wouldn't be set, but because the instruction
is 64-bit only and 32-bit code doesn't have __int128 etc. support.
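To make the test adjustment concrete, here is a minimal sketch that reports whether the macro is defined for the current compilation. The function name `have_cas16` is hypothetical and not part of the testsuite; going by the explanation above, one would expect it to return 1 for 64-bit code built with -march=x86-64-v2 and 0 when -m32 is added.

```cpp
// Minimal sketch; have_cas16 is a hypothetical name, not taken from the tests.
// Returns 1 when the compiler advertises a 16-byte compare-and-swap, else 0.
int have_cas16() {
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
  return 1;
#else
  return 0;
#endif
}
```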
2020-12-15 Jakub Jelinek <jakub@redhat.com>
* config/i386/i386-options.c (ix86_option_override_internal): Don't
error on -march=x86-64-v[234] with -m32 or -mabi=ms.
* config.gcc: Don't reject --with-arch=x86-64-v[234] or
--with-arch_32=x86-64-v[234].
* doc/invoke.texi (-march=x86-64-v[234]): Document what the option
does for other ABIs.
* gcc.target/i386/x86-64-v2.c: Don't expect
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 to be defined with -m32.
* gcc.target/i386/x86-64-v2-other.c: New test.
* gcc.target/i386/x86-64-v2-msabi.c: New test.
* gcc.target/i386/x86-64-v3.c: Fix a comment pasto. Don't expect
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 to be defined with -m32.
* gcc.target/i386/x86-64-v3-other.c: New test.
* gcc.target/i386/x86-64-v3-msabi.c: New test.
* gcc.target/i386/x86-64-v4.c: Don't expect
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 to be defined with -m32.
* gcc.target/i386/x86-64-v4-other.c: New test.
* gcc.target/i386/x86-64-v4-msabi.c: New test.
Ilya Leoshkevich [Thu, 10 Dec 2020 23:54:49 +0000 (00:54 +0100)]
aix: Fixinclude updates [PR98208]
After 92648faa1cb2 ("aix: Fixinclude") make check-fixincludes began to
fail (at least on gcc121 machine). Fix by updating fixincludes/tests
and rerunning genfixes.
* fixincl.x: Rerun genfixes.
* inclhack.def (aix_physadr_t): Change test_text to something
that needs to be replaced.
* tests/base/sys/types.h (aix_physadr_t): Add expectation.
Resolves:
PR middle-end/98166 - bogus -Wmismatched-dealloc on user-defined allocator and inlining
PR c++/57111 - Generalize -Wfree-nonheap-object to delete
PR middle-end/98160 - ICE in default_tree_printer at gcc/tree-diagnostic.c:270
gcc/ChangeLog:
PR middle-end/98166
PR c++/57111
PR middle-end/98160
* builtins.c (check_access): Call tree_inlined_location.
(fndecl_alloc_p): Handle BUILT_IN_ALIGNED_ALLOC and
BUILT_IN_GOMP_ALLOC.
(call_dealloc_p): Remove unused function.
(new_delete_mismatch_p): Call valid_new_delete_pair_p and rework.
(matching_alloc_calls_p): Handle built-in deallocation functions.
(warn_dealloc_offset): Correct the handling of user-defined operators
delete.
(maybe_emit_free_warning): Avoid assuming expression is a decl.
Simplify.
* doc/extend.texi (attribute malloc): Update.
* tree-ssa-dce.c (valid_new_delete_pair_p): Factor code out into
valid_new_delete_pair_p in tree.c.
* tree.c (tree_inlined_location): Define new function.
(valid_new_delete_pair_p): Define.
* tree.h (tree_inlined_location): Declare.
(valid_new_delete_pair_p): Declare.
gcc/c-family/ChangeLog:
PR middle-end/98166
PR c++/57111
PR middle-end/98160
* c-attribs.c (maybe_add_noinline): New function.
(handle_malloc_attribute): Call it. Use ATTR_FLAG_INTERNAL.
Implicitly add attribute noinline to functions not declared inline
and warn on those.
PR middle-end/98166
PR c++/57111
PR middle-end/98160
* g++.dg/warn/Wmismatched-dealloc-2.C: Adjust test of expected warning.
* g++.dg/warn/Wmismatched-new-delete.C: Same.
* gcc.dg/Wmismatched-dealloc.c: Same.
* c-c++-common/Wfree-nonheap-object-2.c: New test.
* c-c++-common/Wfree-nonheap-object-3.c: New test.
* c-c++-common/Wfree-nonheap-object.c: New test.
* c-c++-common/Wmismatched-dealloc.c: New test.
* g++.dg/warn/Wfree-nonheap-object-3.C: New test.
* g++.dg/warn/Wfree-nonheap-object-4.C: New test.
* g++.dg/warn/Wmismatched-dealloc-2.C: New test.
* g++.dg/warn/Wmismatched-new-delete-2.C: New test.
* g++.dg/warn/Wmismatched-new-delete.C: New test.
* gcc.dg/Wmismatched-dealloc-2.c: New test.
* gcc.dg/Wmismatched-dealloc-3.c: New test.
* gcc.dg/Wmismatched-dealloc.c: New test.
Wilco Dijkstra [Thu, 3 Dec 2020 18:40:34 +0000 (18:40 +0000)]
AArch64: Add support for --with-tune
Add support for --with-tune. Like --with-cpu and --with-arch, the argument is
validated and transformed into a -mtune option to be processed like any other
command-line option. --with-tune has no effect if a -mcpu or -mtune option
is used. The validating code didn't allow --with-cpu=native, so explicitly
allow that.
Co-authored-by: Delia Burduv <delia.burduv@arm.com>
Bootstrap OK, regress pass, OK to commit?
Justin Squirek [Fri, 20 Nov 2020 13:11:12 +0000 (08:11 -0500)]
[Ada] Incorrect accessibility level on type in formal package
gcc/ada/
* sem_util.adb, sem_util.ads (In_Generic_Formal_Package):
Created to identify type declarations occurring within generic
formal packages.
* sem_res.adb (Resolve_Allocator): Add condition to avoid
emitting an error for allocators when the type being allocated
is class-wide and from a generic formal package.
Eric Botcazou [Sat, 21 Nov 2020 23:54:18 +0000 (00:54 +0100)]
[Ada] Adjust again previous change to System.Fat_Gen
gcc/ada/
* libgnat/s-fatgen.adb: Add with clause for Interfaces and use
type clause for Interfaces.Unsigned_64.
(Small): Comment out.
(Tiny): Likewise.
(Tiny16): New integer constant.
(Tiny32): Likewise.
(Tiny64): Likewise.
(Tiny80): New integer array constant.
(Pred): Declare a local overlay for Tiny.
(Succ): Likewise.
Eric Botcazou [Fri, 20 Nov 2020 20:29:13 +0000 (21:29 +0100)]
[Ada] Fix internal error on bit-packed array in Volatile_Full_Access record
gcc/ada/
* exp_pakd.adb (Expand_Bit_Packed_Element_Set): Fix again packed
array type in complex cases where array is Volatile.
* exp_util.adb (Remove_Side_Effects): Do not force a renaming to
be handled by the back-end.
Eric Botcazou [Fri, 20 Nov 2020 18:33:21 +0000 (19:33 +0100)]
[Ada] Adjust previous change to System.Fat_Gen
gcc/ada/
* libgnat/s-fatgen.adb: Remove use clause for
System.Unsigned_Types.
(Scaling): Add renaming of System.Unsigned_Types and use type
clause for Long_Long_Unsigned.
Gary Dismukes [Thu, 19 Nov 2020 20:18:39 +0000 (15:18 -0500)]
[Ada] Fix documentation of -gnatw.K switch (activates => disables)
gcc/ada/
* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
Correct documentation of the -gnatw.K switch to say that it
disables rather than activates the warning.
* gnat_ugn.texi: Regenerate.
Gary Dismukes [Wed, 18 Nov 2020 23:06:14 +0000 (18:06 -0500)]
[Ada] Additional fixes for Default_Initial_Condition
gcc/ada/
* exp_aggr.adb (Build_Array_Aggr_Code.Gen_Assign): Move
generation of the call for DIC check past the optional
generation of calls to controlled Initialize procedures.
* exp_ch3.adb
(Build_Array_Init_Proc.Init_One_Dimension.Possible_DIC_Call):
Suppress generation of a DIC call when the array component type
is controlled. The call will now be generated later inside the
array's DI (Deep_Initialize) procedure.
* exp_ch7.adb
(Make_Deep_Array_Body.Build_Initialize_Statements): Generate a
DIC call (when needed by the array component type) after any
call to the component type's controlled Initialize procedure, or
generate the DIC call by itself if there's no Initialize to
call.
* sem_aggr.adb (Resolve_Record_Aggregate.Add_Association):
Simplify condition to only test Is_Box_Init_By_Default (previous
condition was overkill, as well as incorrect in some cases).
* sem_elab.adb (Active_Scenarios.Output_Call): For
Default_Initial_Condition, suppress call to
Output_Verification_Call when the subprogram is a partial DIC
procedure.
Eric Botcazou [Wed, 18 Nov 2020 20:42:18 +0000 (21:42 +0100)]
[Ada] Fix couple of bugs in the implementation of Round attribute
gcc/ada/
* exp_attr.adb (Expand_N_Attribute_Reference) <Attribute_Round>:
Adjust commentary and set the Rounded_Result flag on the type
conversion node when the node is needed.
* exp_ch4.adb (Expand_N_Type_Conversion): Minor tweak.
(Fixup_Universal_Fixed_Operation): Look through the type conversion
only when it is to Universal_Real.
* exp_fixd.adb: Remove with and use clauses for Snames.
(Build_Divide): Remove redundant test.
(Expand_Convert_Float_To_Fixed): Use Rounded_Result flag on the
node to set the truncation parameter.
Eric Botcazou [Tue, 17 Nov 2020 08:21:19 +0000 (09:21 +0100)]
[Ada] Tidy up implementation of System.Fat_Gen.Valid and inline it again
gcc/ada/
* libgnat/s-fatgen.ads (Valid): Add again pragma Inline.
* libgnat/s-fatgen.adb (Valid): Improve commentary, tidy up left
and right, and remove superfluous trick for denormalized numbers.
Piotr Trojanek [Mon, 16 Nov 2020 14:21:20 +0000 (15:21 +0100)]
[Ada] Fix analysis of access objects in Depends contracts
gcc/ada/
* sem_prag.adb (Find_Role): Constant objects of
access-to-constant and access-to-subprogram types are not
writable.
(Collect_Subprogram_Inputs_Outputs): In-parameters of
access-to-variable type can act as outputs of the Depends
contracts.
Piotr Trojanek [Thu, 5 Nov 2020 09:14:36 +0000 (10:14 +0100)]
[Ada] Update comment for processing of pragma Assertion_Policy
gcc/ada/
* sa_messages.ads: Reference Subprogram_Variant in the comment
for Assertion_Check.
* sem_prag.adb (Analyze_Pragma): Add Subprogram_Variant as an
ID_ASSERTION_KIND; move Default_Initial_Condition as an
RM_ASSERTION_KIND.
Yannick Moy [Mon, 16 Nov 2020 11:06:32 +0000 (12:06 +0100)]
[Ada] Correctly mark subprogram as not always inlined in GNATprove mode
gcc/ada/
* inline.adb (Cannot_Inline): Add No_Info parameter to disable
info message.
* inline.ads (Cannot_Inline): When No_Info is set to True, do
not issue info message in GNATprove mode, but still mark the
subprogram as not always inlined.
* sem_res.adb (Resolve_Call): Always call Cannot_Inline inside
an assertion expression.
Nathan Sidwell [Mon, 14 Dec 2020 15:21:49 +0000 (07:21 -0800)]
preprocessor: Deferred macro support
For deferred macros we also need a new field on the macro itself, so
that the module machinery can determine the macro was imported. Also
the documentation for the hashnode's deferred field was incomplete.
libcpp/
* include/cpplib.h (struct cpp_macro): Add imported_p field.
(struct cpp_hashnode): Tweak deferred field documentation.
* macro.c (_cpp_new_macro): Clear new field.
(cpp_get_deferred_macro, get_deferred_or_lazy_macro): Assert
more.
Commit 2ead1ab91123 ("Limit perf data buffer during profiling") added
-m8 to perf invocations during running tests, but the same problem
exists for checking whether perf is working in the first place.
gcc/testsuite/ChangeLog:
2020-12-08 Ilya Leoshkevich <iii@linux.ibm.com>
* lib/target-supports.exp (check_profiling_available): Limit
perf data buffer.
Christophe Lyon [Mon, 7 Dec 2020 14:43:18 +0000 (14:43 +0000)]
arm: Auto-vectorization for MVE: vneg
This patch enables MVE vneg instructions for auto-vectorization. MVE
vnegq insns in mve.md are modified to use 'neg' instead of unspec
expression. The neg<mode>2 expander is added to vec-common.md.
Existing patterns in neon.md are prefixed with neon_.
It's not clear why we have different patterns for VDQW
and VH in neon.md, when VDQWH handles both, and patterns
with VDQ have provision for attributes for FP modes.
Another question is why <absneg_str><mode>2 always sets
neon_abs<q> type when it also handles neon_neg<q> cases.
Christophe Lyon [Wed, 2 Dec 2020 12:20:02 +0000 (12:20 +0000)]
arm: Auto-vectorization for MVE: vmvn
This patch enables MVE vmvnq instructions for auto-vectorization. MVE
vmvnq insns in mve.md are modified to use 'not' instead of unspec
expression to support one_cmpl<mode>2. The one_cmpl<mode>2 expander
is added to vec-common.md.
gcc/
* config/arm/iterators.md (supf): Remove VBICQ_S and VBICQ_U.
(VBICQ): Remove.
* config/arm/mve.md (mve_vbicq_u<mode>): New entry for vbic
instruction using expression and not.
(mve_vbicq_s<mode>): New expander.
(mve_vbicq_f<mode>): Replace use of unspec by 'and not'.
* config/arm/unspecs.md (VBICQ_S, VBICQ_U, VBICQ_F): Remove.
gcc/testsuite/
* gcc.target/arm/simd/mve-vbic.c: Add tests for vbic.
Christophe Lyon [Fri, 13 Nov 2020 13:05:43 +0000 (13:05 +0000)]
arm: Auto-vectorization for MVE: veor
This patch enables MVE veorq instructions for auto-vectorization. MVE
veorq insns in mve.md are modified to use xor instead of unspec
expression to support xor<mode>3. The xor<mode>3 expander is added to
vec-common.md.
Christophe Lyon [Mon, 14 Dec 2020 10:40:45 +0000 (10:40 +0000)]
arm,testsuite: Fix vect-half-floats.c test
This patch fixes typos in effective targets which otherwise lead to
DejaGnu errors.
It also replaces dg-additional-options with dg-options to avoid
compiling with -ansi -pedantic-errors, resulting in
error: ISO C does not support the '_Float16' type [-Wpedantic]
Nikhil Benesch [Mon, 14 Dec 2020 07:37:11 +0000 (23:37 -0800)]
-fgo-dump-spec: skip typedefs that match struct tag
gcc/:
* godump.c (go_output_typedef): Suppress typedefs whose name
matches the tag of the underlying struct, union, or enum.
Output declarations for enums that do not appear in typedefs.
gcc/testsuite:
* gcc.misc-tests/godump-1.c: Add test cases.
François Dumont [Sat, 12 Dec 2020 17:02:47 +0000 (18:02 +0100)]
libstdc++: Fix several _GLIBCXX_DEBUG tests
libstdc++-v3/ChangeLog:
* testsuite/23_containers/array/debug/back2_neg.cc: target c++14 because assertion
for constexpr is disabled in C++11.
* testsuite/23_containers/array/debug/front2_neg.cc: Likewise.
* testsuite/23_containers/array/debug/square_brackets_operator2_neg.cc: Likewise.
* testsuite/23_containers/vector/debug/multithreaded_swap.cc: Include <memory>
for shared_ptr.
Avoid the possibility of code discrepancies like one fixed with the
previous change and improve the structure of code by selecting between
push and non-push operations in a single place in `vax_output_int_move'.
The PUSHAB/MOVAB address moves are never actually produced from this
code as the SImode invocation of this function is guarded with the
`nonsymbolic_operand' predicate, but let's not mess with this code
too much on this occasion and keep the piece in place.
VAX: Check the correct operand for constant 0 push operation
Check the output operand for representing pushing a value onto the stack
rather than the constant 0 input in determining whether to use the PUSHL
or the CLRL instruction for a SImode move. The latter actually works by
means of using the predecrement addressing mode with the SP register and
the machine code produced even takes the same number of bytes, however
at least with some VAX implementations it incurs a performance penalty.
Besides, we don't want to check the wrong operand anyway and have code
that works by chance only.
Add a test case covering push operations; for operands different from
constant zero there is actually a code size advantage for using PUSHL
rather than the equivalent MOVL instruction.
gcc/
* config/vax/vax.c (vax_output_int_move): Check the correct
operand for constant 0 push operation.
VAX: Handle subtracting from self with QMATH DImode add/sub
Remove an assertion the failure of which has not been actually observed,
but which appears clearly dangerous, for when the QMATH DImode add/sub
handler is invoked with the subtrahend and the minuend both the same.
Instead handle the operation by emitting a move of constant 0 to the
output operand. Adjust the relevant inline comment accordingly.
gcc/
* config/vax/vax.c (vax_expand_addsub_di_operands): Handle equal
input operands with subtraction.
during RTL pass: expand
.../gcc/testsuite/gcc.c-torture/compile/sync-1.c: In function 'test_op_ignore':
.../gcc/testsuite/gcc.c-torture/compile/sync-1.c:33:10: internal compiler error: in vax_expand_addsub_di_operands, at config/vax/vax.c:2080
0x11815003 vax_expand_addsub_di_operands(rtx_def**, rtx_code)
.../gcc/config/vax/vax.c:2080
0x11d409af gen_adddi3(rtx_def*, rtx_def*, rtx_def*)
.../gcc/config/vax/vax.md:755
0x10ea2763 rtx_insn* insn_gen_fn::operator()<rtx_def*, rtx_def*, rtx_def*>(rtx_def*, rtx_def*, rtx_def*) const
.../gcc/recog.h:304
0x10f7fc8f maybe_gen_insn(insn_code, unsigned int, expand_operand*)
.../gcc/optabs.c:7402
0x10f67f8b expand_binop_directly
.../gcc/optabs.c:1122
0x10f684cf expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*, rtx_def*, int, optab_methods)
.../gcc/optabs.c:1209
0x10f6fb4f expand_unop(machine_mode, optab_tag, rtx_def*, rtx_def*, int)
.../gcc/optabs.c:3013
0x10f6c493 expand_simple_unop(machine_mode, rtx_code, rtx_def*, rtx_def*, int)
.../gcc/optabs.c:2200
0x10f7e2f3 expand_atomic_fetch_op(rtx_def*, rtx_def*, rtx_def*, rtx_code, memmodel, bool)
.../gcc/optabs.c:7021
0x107f7523 expand_builtin_sync_operation
.../gcc/builtins.c:7605
0x107ff547 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
.../gcc/builtins.c:9430
0x10acda63 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
.../gcc/expr.c:11249
0x10abeb9f expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
.../gcc/expr.c:8486
0x1085606b expand_expr
.../gcc/expr.h:282
0x1086157f expand_call_stmt
.../gcc/cfgexpand.c:2709
0x10865ab7 expand_gimple_stmt_1
.../gcc/cfgexpand.c:3713
0x108662fb expand_gimple_stmt
.../gcc/cfgexpand.c:3877
0x10870387 expand_gimple_basic_block
.../gcc/cfgexpand.c:5918
0x10872b6b execute
.../gcc/cfgexpand.c:6602
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
compiler exited with status 1
FAIL: gcc.c-torture/compile/sync-1.c -O0 (internal compiler error)
causing numerous failures in regression testing.
While requesting an addition operation to be produced for the constant
operands of 0 and -1 may seem silly, technically there is nothing wrong
with it, and non-QMATH code (as with the `-mno-qmath' option) has no
issues with that, so neither should QMATH code. This operation will
normally be folded in later passes anyway.
Observe then, that adding or subtracting constant 0 amounts to a move
(and we even have a machine instruction available to do that with a
single operation) so handle the case explicitly, swapping the addends if
so required, removing the assertion failure and along with that 70 test
suite failures like:
FAIL: gcc.c-torture/compile/sync-1.c -O0 (internal compiler error)
FAIL: gcc.c-torture/compile/sync-1.c -O0 fetch_and_nand (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-1.c -O0 nand_and_fetch (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-1.c -O0 (test for excess errors)
FAIL: gcc.c-torture/compile/sync-2.c -O0 (internal compiler error)
FAIL: gcc.c-torture/compile/sync-2.c -O0 (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-2.c -O0 (test for excess errors)
FAIL: gcc.c-torture/compile/sync-3.c -O0 (internal compiler error)
FAIL: gcc.c-torture/compile/sync-3.c -O0 (test for warnings, line )
FAIL: gcc.c-torture/compile/sync-3.c -O0 (test for excess errors)
and similarly across all the other optimization levels and compilation
options covered.
gcc/
* config/vax/vax.c (vax_expand_addsub_di_operands): Handle the
addition or subtraction of 0.
VAX: Remove unused register allocation from QMATH DImode add/sub handler
An allocation is made for a temporary register, however it is unneeded,
as actually explained in the comment preceding the conditional block in
question, and consequently never used, so remove it. The `temp' rtx is
already used elsewhere in the function, which is possibly why this dead
assignment has not been warned about.
Fix an issue with the `casesi' expander using `GEN_INT' to produce the
constant rtx for lower bound adjustment. This generates a VOIDmode
value which may overflow the SImode range required for the operand to
stay within to satisfy `general_operand', resulting in an ICE like:
.../gcc/testsuite/gcc.c-torture/compile/pr46934.c: In function 'caller':
.../gcc/testsuite/gcc.c-torture/compile/pr46934.c:17:1: error: unrecognizable insn:
(insn 5 2 6 2 (set (reg:SI 25)
(plus:SI (mem/c:SI (reg/f:SI 17 virtual-incoming-args) [1 reg_type+0 S4 A32])
(const_int 2147483648 [0x80000000]))) -1
(nil))
during RTL pass: vregs
.../gcc/testsuite/gcc.c-torture/compile/pr46934.c:17:1: internal compiler error: in extract_insn, at recog.c:2315
0x110d4673 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
.../gcc/rtl-error.c:108
0x110d46eb _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
.../gcc/rtl-error.c:116
0x1106578b extract_insn(rtx_insn*)
.../gcc/recog.c:2315
0x10b63f73 instantiate_virtual_regs_in_insn
.../gcc/function.c:1609
0x10b65b2f instantiate_virtual_regs
.../gcc/function.c:1979
0x10b65ca7 execute
.../gcc/function.c:2028
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
compiler exited with status 1
FAIL: gcc.c-torture/compile/pr46934.c -O0 (internal compiler error)
Use `gen_int_mode' to produce the rtx instead, requesting a SImode value
so that the constant gets correctly truncated:
@@ -199,7 +199,7 @@ caller (unsigned int reg_type)
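The truncation requested from `gen_int_mode' can be illustrated outside the compiler. The sketch below is an assumption-laden stand-in, not GCC code: the helper name `trunc_for_simode' is hypothetical, and it only mimics keeping the low 32 bits of a host-wide value and sign-extending from bit 31, which is why 2147483648 (0x80000000) ends up as a value that fits SImode.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only: keep the low 32 bits of a
// 64-bit host value and sign-extend the result from bit 31.
std::int64_t trunc_for_simode(std::int64_t c) {
  std::uint64_t low = static_cast<std::uint64_t>(c) & 0xffffffffu;
  // Flipping bit 31 and subtracting 2^31 performs the sign extension.
  return static_cast<std::int64_t>(low ^ 0x80000000u) - 0x80000000ll;
}
```

With this helper, trunc_for_simode (2147483648) yields -2147483648, while values already within the signed 32-bit range are returned unchanged.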
Jakub Jelinek [Sun, 13 Dec 2020 18:25:33 +0000 (19:25 +0100)]
widening_mul: Fix a > ~b to .ADD_OVERFLOW optimization [PR98256]
Unfortunately, my latest tree-ssa-math-opts.c patch broke the following
testcase. The problem is that the code is adding .ADD_OVERFLOW or
.SUB_OVERFLOW before or after the stmt on which the function has been
called, which is normally an addition or subtraction that has all the
operands.
But in the a > ~b optimization that stmt is the ~b stmt and the other
comparison operand might be defined only after that ~b stmt, so we can't
insert the .ADD_OVERFLOW next to ~b that we want to delete, but need to
insert it before the a > temp comparison that uses it; and in that case
when removing the BIT_NOT_EXPR stmt we need to ensure the caller doesn't do
gsi_next because gsi_remove already points the iterator to the next stmt.
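For readers unfamiliar with the idiom, the source-level equivalence behind the a > ~b optimization can be sketched as follows. The function names are hypothetical, and this only demonstrates that the comparison and an unsigned add-overflow check agree; it says nothing about how the pass itself is written.

```cpp
// Illustration only; both helper names are hypothetical.
// For unsigned operands, a > ~b holds exactly when a + b wraps around.
bool overflow_by_compare(unsigned a, unsigned b) {
  return a > ~b;
}

bool overflow_by_builtin(unsigned a, unsigned b) {
  unsigned sum;  // The wrapped sum is computed but ignored here.
  return __builtin_add_overflow(a, b, &sum);
}
```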
2020-12-13 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98256
* tree-ssa-math-opts.c (match_uaddsub_overflow): For BIT_NOT_EXPR,
only handle a single use, and insert .ADD_OVERFLOW before the
comparison rather than after the BIT_NOT_EXPR. Return true iff
it is BIT_NOT_EXPR and it has been removed.
(math_opts_dom_walker::after_dom_children) <case BIT_NOT_EXPR>:
If match_uaddsub_overflow returned true, continue instead of break.
Jakub Jelinek [Sun, 13 Dec 2020 16:08:08 +0000 (17:08 +0100)]
varasm: Reject soft frame or arg pointer registers for register vars [PR92469]
The following patch rejects frame, argp and retarg registers (unless they are equal
to hard frame pointer registers or if they aren't eliminable) from local or global
register vars.
These are just internal implementation details eliminated later into hard
frame pointer or stack pointer and using them as register variable leads
to numerous ICEs.
2020-12-13 Jakub Jelinek <jakub@redhat.com>
PR target/92469
* varasm.c (eliminable_regno_p): New function.
(make_decl_rtl): Reject asm vars for frame and argp
if they are different from hard frame pointer.
* gcc.target/i386/pr92469.c: New test.
* gcc.target/i386/pr79804.c: Adjust expected diagnostics.
* gcc.target/i386/pr88178.c: Expect an error.
Tamar Christina [Sun, 13 Dec 2020 13:56:30 +0000 (13:56 +0000)]
Arm: Add support for auto-vectorization using HF mode.
This adds HFmode auto-vectorization support for AArch32. It is
supported when +fp16 is used. I wonder if I should disable
the returning of the type if the option isn't enabled.
At the moment it will be returned but the vectorizer will try and fail to use
it. It wastes a few compile cycles but doesn't result in bad code.
Tamar Christina [Sun, 13 Dec 2020 13:54:48 +0000 (13:54 +0000)]
middle-end: Support complex Addition
This patch adds support for
* Complex Addition with rotation of 90 and 270.
Addition with rotation of the second argument around the Argand plane.
Supported rotations are 90 and 270.
c = a + (b * I) and c = a + (b * I * I * I)
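The two forms above can be written out on interleaved {real, imaginary} pairs, which is roughly the shape of scalar code such a pattern would have to recognise. This is a minimal sketch under that assumption; the function names `cadd_rot90' and `cadd_rot270' are hypothetical and are not the internal functions added by the patch.

```cpp
// Illustration only; names are hypothetical. Each complex value is stored
// as a two-element array: index 0 is the real part, index 1 the imaginary.

// c = a + (b * I): multiplying b by I maps (re, im) to (-im, re).
void cadd_rot90(const double a[2], const double b[2], double c[2]) {
  c[0] = a[0] - b[1];
  c[1] = a[1] + b[0];
}

// c = a + (b * I * I * I): multiplying b by -I maps (re, im) to (im, -re).
void cadd_rot270(const double a[2], const double b[2], double c[2]) {
  c[0] = a[0] + b[1];
  c[1] = a[1] - b[0];
}
```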
gcc/ChangeLog:
* tree-vect-slp-patterns.c: New file.
* Makefile.in: Add it.
* doc/passes.texi: Document it.
* internal-fn.def (COMPLEX_ADD_ROT90, COMPLEX_ADD_ROT270): New.
* optabs.def (cadd90_optab, cadd270_optab): New.
* doc/md.texi: Document them.
* tree-vect-loop.c (vect_analyze_loop_2): Add dissolve code.
* tree-vect-slp.c:
(vect_free_slp_instance, vect_create_new_slp_node): Export.
(vect_match_slp_patterns_2, vect_match_slp_patterns): New.
(vect_analyze_slp): Use it.
* tree-vectorizer.h (vect_free_slp_tree): Export.
(enum _complex_operation): Forward declare.
(class vect_pattern): New.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_arm_v8_3a_complex_neon_ok_nocache): Fix it.
(check_effective_target_vect_complex_add_byte
,check_effective_target_vect_complex_add_int
,check_effective_target_vect_complex_add_short
,check_effective_target_vect_complex_add_long
,check_effective_target_vect_complex_add_half
,check_effective_target_vect_complex_add_float
,check_effective_target_vect_complex_add_double): New.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-byte.c: New test.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c: New test.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c: New test.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-short.c: New test.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-byte.c: New test.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-int.c: New test.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c: New test.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-short.c: New test.
* gcc.dg/vect/complex/complex-add-pattern-template.c: New test.
* gcc.dg/vect/complex/complex-add-template.c: New test.
* gcc.dg/vect/complex/complex-operations-run.c: New test.
* gcc.dg/vect/complex/complex-operations.c: New test.
* gcc.dg/vect/complex/complex.exp: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c: New test.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-add-double.c: New test.
* gcc.dg/vect/complex/fast-math-complex-add-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-add-half-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c: New test.
* gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: New test.
* gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-byte.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-int.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-long.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-short.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-byte.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-int.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c: New test.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-short.c: New test.
This patch fixes this by stripping double quotes from section names.
* However, this didn't work initially (only the leading quote was
stripped), which is due to David's recent AIX patch: with the
introduction of the new capturing group to handle both .section (ELF)
and .csect (XCOFF), $full_section_directive would never be empty on
ELF and Mach-O targets, so the extraction of the section name didn't
work any longer. This had also broken the Darwin tests completely.
* With working double quote stripping, all but one of the tests PASSed
on Solaris/SPARC, the exception being:
FAIL: gcc.dg/20021029-1.c scan-assembler-symbol-section symbol ar (found __sparc_get_pc_thunk.l7) has section ^\\\\.(const|rodata)|\\\\[RO\\\\] (found .text.__sparc_get_pc_thunk.l7%__sparc_get_pc_thunk.l7)
This is due to the symbol name (ar) not being anchored in the test and
unexpectedly matching __sparc_get_pc_thunk.l7.
* Next, I ran the tests on Darwin 11 and found two failing tests:
FAIL: gcc.dg/darwin-sections.c scan-assembler-symbol-section symbol ^_a\$ (symbol not found) has section \\\\.data
FAIL: gcc.dg/darwin-sections.c scan-assembler-symbol-section symbol ^_b\$ (symbol not found) has section \\\\.data
This is due to Iain's recent "Darwin : Begin rework of zero-fill sections."
patch which emits
.globl _a
.zerofill __DATA,__common,_a,1,0
This is already scanned for, so the two scans above can just go.
The other failing test is
FAIL: g++.dg/gomp/tls-5.C -std=c++14 scan-assembler-symbol-section symbol ^_?_ZGR2ir_\$ (symbol not found) has section ^\\\\.tdata|\\\\[TL\\\\]
FAIL: g++.dg/gomp/tls-5.C -std=c++14 scan-assembler-symbol-section symbol ^_?ir\$ (symbol not found) has section ^\\\\.tbss|\\\\[TL\\\\]
Other scans are guarded by target tls_native, and indeed the assembler
output has
___emutls_v._ZGR2ir_:
___emutls_t._ZGR2ir_:
___emutls_v.ir:
Unfortunately scan-assembler-symbol-section doesn't support selectors
yet, which this patch implements both for the benefit of this test and
for symmetry.
With those changes, test results are clean now on sparc-sun-solaris2.11,
i386-pc-solaris2.11, i386-apple-darwin11.4.2, and
powerpc-ibm-aix7.2.4.0.
gcc:
* doc/sourcebuild.texi (Commands for use in dg-final, Scan the
assembly output, scan-assembler-symbol-section): Document.
(scan-symbol-section): Document.
gcc/testsuite:
* lib/scanasm.exp (scan-symbol-section): Pass args to
dg-scan-symbol-section.
(scan-assembler-symbol-section): Likewise.
(dg-scan-symbol-section): Handle selector from orig_args.
Get patterns from orig_args.
(parse_section_of_symbols): Fix section_pattern.
Strip double quotes from section name.
* g++.dg/gomp/tls-5.C: Restrict ir, _ZGR2ir_ scans to tls_native.
* gcc.dg/20021029-1.c: Anchor ar symbol.
* gcc.dg/darwin-sections.c: Remove obsolete scans for _a, _b in
.data.
But this doesn't scale well to larger hierarchies, because it only
defines ::test for an argument that is exactly “symtab_node *”
(and not for example “const symtab_node *” or something that
comes between cgraph_node and symtab_node in the hierarchy).
For example:
struct A { int x; };
struct B : A {};
struct C : B {};
It also adds a general specialisation of is_a_helper for const
pointers. Together, this makes both of the above examples work.
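The idea of a general const-pointer specialisation can be sketched in miniature. Everything below is a hypothetical stand-in and not the code from is-a.h: the `depth' tag, the hand-written tests and the `is_a' wrapper are assumptions made only so that the example is self-contained.

```cpp
// Hypothetical miniature; not the code from is-a.h.
// depth is an invented tag: 0 for A, 1 for B, 2 for C.
struct A { int depth; };
struct B : A { B() { depth = 1; } };
struct C : B { C() { depth = 2; } };

template <typename T> struct is_a_helper;

// Tests are written once, for the non-const pointer types, and accept a
// pointer to anything in the hierarchy by taking the common base.
template <> struct is_a_helper<B *> {
  static bool test(const A *p) { return p->depth >= 1; }
};
template <> struct is_a_helper<C *> {
  static bool test(const A *p) { return p->depth >= 2; }
};

// General specialisation: a query for "const T *" reuses the "T *" test.
template <typename T> struct is_a_helper<const T *> {
  static bool test(const A *p) { return is_a_helper<T *>::test(p); }
};

template <typename T, typename U> bool is_a(U *p) {
  return is_a_helper<T>::test(p);
}
```

With this arrangement, a query such as is_a<const B *> (p) needs no hand-written specialisation of its own.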
gcc/
* is-a.h (reinterpret_is_a_helper): New class.
(static_is_a_helper): Likewise.
(is_a_helper): Inherit from reinterpret_is_a_helper.
(is_a_helper<const T *>): New specialization.
Move iterator_range to a new iterator-utils.h file
A later patch will add more iterator-related utilities. Rather than
putting them all directly in coretypes.h, it seemed better to add a
new header file, here called "iterator-utils.h". This preliminary
patch moves the existing iterator_range class there too.
I used the same copyright date range as coretypes.h “just to be sure”.
noop_move_p currently keeps any instruction that has a REG_EQUAL
note, on the basis that the equality might be useful in future.
But this creates a perverse incentive not to add potentially-useful
REG_EQUAL notes, in case they prevent an instruction from later being
removed as dead.
The condition originates from flow.c:life_analysis_1 and predates
the changes tracked by the current repository (1992). It probably
made sense when most optimisations were done on RTL rather than FE
trees, but it seems counterproductive now.
gcc/
* rtlanal.c (noop_move_p): Don't check for REG_EQUAL notes.
The __glibcxx_check_can_[increment|decrement]_range macros use the
_GLIBCXX_DEBUG_VERIFY_COND_AT macro, which is not constexpr compliant and will produce nasty
diagnostics rather than the std::__failed_assertion dedicated to constexpr. Replace it with
the correct _GLIBCXX_DEBUG_VERIFY_AT_F.
libstdc++-v3/ChangeLog:
* include/debug/macros.h (__glibcxx_check_can_increment_range): Replace
_GLIBCXX_DEBUG_VERIFY_COND_AT usage with _GLIBCXX_DEBUG_VERIFY_AT_F.
(__glibcxx_check_can_decrement_range): Likewise.
* testsuite/25_algorithms/copy_backward/constexpr.cc (test03): New.
* testsuite/25_algorithms/copy/debug/constexpr_neg.cc: New test.
* testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc: New test.
* testsuite/25_algorithms/equal/constexpr_neg.cc: New test.
* testsuite/25_algorithms/equal/debug/constexpr_neg.cc: New test.
This patch adds the ~(X - Y) -> ~X + Y simplification requested
in the PR (plus also ~(X + C) -> ~X + (-C) for constants C that can
be safely negated).
The first two simplify blocks are what has been requested in the PR
and that makes the first testcase pass.
Unfortunately, that change also breaks the second testcase, because
while the same expressions appearing in the same stmt and split
across multiple stmts has been folded (not really) before, with
this optimization fold-const.c optimizes ~X + Y further into
(Y - X) - 1 in fold_binary_loc associate: code, but we have nothing
like that in GIMPLE and so end up with different expressions.
The last simplify is an attempt to deal with just this case.
I had to rule out the Y == -1U case there, because then we
reached infinite recursion as ~X + -1U was canonicalized by
the pattern into (-1U - X) + -1U but there is a canonicalization
-1 - A -> ~A that turns it back. Furthermore, I had to make it #if
GIMPLE only, because it otherwise resulted in infinite recursion
when interacting with the associate: optimization.
The end result is that we pass all 3 testcases and thus canonicalize
the 3 possible forms of writing the same thing.
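The equivalence of the three forms can be checked with a small sketch in plain C. This is only an illustration of the arithmetic for unsigned operands, not the match.pd patterns themselves; the function names and the sample values are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The three spellings discussed above (hypothetical helper names). */
static unsigned not_of_difference(unsigned x, unsigned y) { return ~(x - y); }
static unsigned not_plus(unsigned x, unsigned y) { return ~x + y; }
static unsigned difference_minus_one(unsigned x, unsigned y) { return (y - x) - 1u; }

/* ~(X + C) versus ~X + (-C); negation is written as 0u - c. */
static unsigned not_of_sum(unsigned x, unsigned c) { return ~(x + c); }
static unsigned not_plus_negated(unsigned x, unsigned c) { return ~x + (0u - c); }

/* Returns 1 if all forms agree for every pair from a small sample set. */
static int forms_agree(void)
{
    static const unsigned samples[] = {
        0u, 1u, 2u, 41u, UINT_MAX / 2u, UINT_MAX / 2u + 1u,
        UINT_MAX - 1u, UINT_MAX
    };
    const size_t count = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < count; i++) {
        for (size_t j = 0; j < count; j++) {
            unsigned x = samples[i], y = samples[j];
            if (not_of_difference(x, y) != not_plus(x, y))
                return 0;
            if (not_plus(x, y) != difference_minus_one(x, y))
                return 0;
            if (not_of_sum(x, y) != not_plus_negated(x, y))
                return 0;
        }
    }
    return 1;
}

/* Convenience wrapper that aborts if any form disagrees. */
static void assert_forms_agree(void)
{
    assert(forms_agree());
}
```

Unsigned arithmetic wraps modulo 2^N, so ~A equals -A - 1 and all three spellings reduce to Y - X - 1. The sample set includes the Y == -1U (UINT_MAX) case that the last simplify has to exclude for termination reasons, not for correctness.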
Jakub Jelinek [Sat, 12 Dec 2020 13:48:47 +0000 (14:48 +0100)]
widening_mul: Recognize another form of ADD_OVERFLOW [PR96272]
The following patch recognizes another form of hand written
__builtin_add_overflow (this time _p), in particular when
the code does unsigned
if (x > ~0U - y)
or
if (x <= ~0U - y)
it can be optimized (if the subtraction turned into ~y is single use)
into
if (__builtin_add_overflow_p (x, y, 0U))
or
if (!__builtin_add_overflow_p (x, y, 0U))
and generate better code, e.g. for the first function in the testcase:
- movl %esi, %eax
addl %edi, %esi
- notl %eax
- cmpl %edi, %eax
- movl $-1, %eax
- cmovnb %esi, %eax
+ jc .L3
+ movl %esi, %eax
+ ret
+.L3:
+ orl $-1, %eax
ret
on x86_64. As for the jumps vs. conditional move case, that is some CE
issue with complex branch patterns we should fix up no matter what, but
in this case I'm actually not sure if branchy code isn't better, since
overflow is something that isn't that common.
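As a rough illustration of the two spellings, the sketch below compares the hand-written check with an overflow builtin. It uses __builtin_add_overflow (the three-argument form with a result pointer) rather than the _p variant named above, and the saturating function is only a guess at the shape of the testcase based on the assembly shown; all function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hand-written test for "x + y wraps around" on unsigned operands. */
static int overflows_by_hand(unsigned x, unsigned y)
{
    return x > ~0U - y;
}

/* The same question asked through the builtin. */
static int overflows_builtin(unsigned x, unsigned y)
{
    unsigned sum;
    return __builtin_add_overflow(x, y, &sum) ? 1 : 0;
}

/* Saturating add: returns x + y, or all-ones if the sum wraps. */
static unsigned saturating_add(unsigned x, unsigned y)
{
    unsigned sum;
    if (__builtin_add_overflow(x, y, &sum))
        return ~0U;
    return sum;
}

/* Aborts if the two overflow checks disagree on any sample pair. */
static void check_overflow_forms(void)
{
    static const unsigned samples[] = {
        0u, 1u, 2u, UINT_MAX / 2u, UINT_MAX / 2u + 1u,
        UINT_MAX - 1u, UINT_MAX
    };
    const size_t count = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < count; j++)
            assert(overflows_by_hand(samples[i], samples[j])
                   == overflows_builtin(samples[i], samples[j]));
}
```

The hand-written form works because ~0U - y is the largest x that can still be added to y without wrapping.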
2020-12-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96272
* tree-ssa-math-opts.c (uaddsub_overflow_check_p): Add OTHER argument.
Handle BIT_NOT_EXPR.
(match_uaddsub_overflow): Optimize unsigned a > ~b into
__imag__ .ADD_OVERFLOW (a, b).
(math_opts_dom_walker::after_dom_children): Call match_uaddsub_overflow
even for BIT_NOT_EXPR.
Jakub Jelinek [Sat, 12 Dec 2020 07:36:02 +0000 (08:36 +0100)]
openmp, openacc: Fix up handling of data regions [PR98183]
While the data regions (target data and OpenACC counterparts) aren't
standalone directives, unlike most other OpenMP/OpenACC constructs
we allow (apparently as an extension) exceptions and goto out of
the block. During gimplification we place an *end* call into a finally
block so that it is reached even on exceptions, goto out of the block, etc.
During omplower pass we then add paired #pragma omp return for them,
but because exceptions mean the region is not SESE, we can end up
with #pragma omp return appearing only conditionally in the CFG etc.,
which the ompexp pass can't handle.
For the ompexp pass, we actually don't care about the end part or about
target data nesting, so we can treat it as a standalone directive.
2020-12-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/98183
* omp-low.c (lower_omp_target): Don't add OMP_RETURN for
data regions.
* omp-expand.c (expand_omp_target): Don't try to remove
OMP_RETURN for data regions.
(build_omp_regions_1, omp_make_gimple_edges): Don't expect
OMP_RETURN for data regions.
* gcc.dg/gomp/pr98183.c: New test.
* gcc.dg/goacc/pr98183.c: New test.
Jason Merrill [Fri, 11 Dec 2020 19:37:09 +0000 (14:37 -0500)]
c++: Avoid considering some conversion ops [PR97600]
Patrick's earlier patch to check convertibility before constraints for
conversion ops wasn't suitable because checking convertibility can also lead
to unwanted instantiations, but it occurs to me that there's a smaller check
we can do to avoid doing normal consideration of the conversion ops in this
case: since we're in the middle of a user-defined conversion, we can exclude
from consideration any conversion ops that return a type that would need an
additional user-defined conversion to reach the desired type: namely, a type
that differs in class-ness from the desired type.
[temp.inst]/9 allows optimizations like this: "If the function selected by
overload resolution can be determined without instantiating a class template
definition, it is unspecified whether that instantiation actually takes
place."
gcc/cp/ChangeLog:
PR libstdc++/97600
* call.c (build_user_type_conversion_1): Avoid considering
conversion functions that return a clearly unsuitable type.
Nathan Sidwell [Fri, 11 Dec 2020 19:10:40 +0000 (11:10 -0800)]
c++: Final module preparations
This adds the final few preparations to drop modules in. I'd missed a
couple of changes to the core compiler -- a new pair of preprocessor
options, and marking the boundary of fixed and lazy global trees.
For C++, we need to add module.cc to the GTY scanner. Parsing final
cleanups needs a few tweaks for modules. Lambdas used to initialize a
global (for instance) get an extra scope, but we now need to point
that object to the lambda too. Finally template instantiation needs
to do lazy loading before looking at the available instantiations and
specializations.
Jim Wilson [Thu, 10 Dec 2020 02:57:32 +0000 (18:57 -0800)]
Add missing varasm DECL_P check.
This fixes a riscv64-linux bootstrap failure.
get_constant_section calls the select_section target hook, and select_section
calls get_named_section which calls get_section. So it is possible to have
a constant that is not a decl in both of these functions. They already perform
DECL_P checks everywhere except for the new code HJ recently added. This adds
the missing DECL_P check.
gcc/
* varasm.c (get_section): Add DECL_P check before DECL_PRESERVE_P.
Ian Lance Taylor [Fri, 11 Dec 2020 05:07:27 +0000 (21:07 -0800)]
compiler: encode user visible names if necessary
Avoid putting weird characters into the user visible name.
It breaks stabs in particular, and may also cause debugger problems.
Instead, encode those names, and use a "g." prefix to tell the debugger.
Also dereference the type for the name of a recover thunk, to avoid a
pointless '*' that gets encoded.
Christophe Lyon [Fri, 11 Dec 2020 16:46:26 +0000 (16:46 +0000)]
arm: Auto-vectorization for MVE: clean condition for vand and vorr expanders
The patch restores the unconditional definition of the VDQ iterator,
and changes the conditions of the vand and vorr expanders to use
ARM_HAVE_<MODE>_ARITH.
Replace/update ARC700 cache hazard detection. The following situations
are handled:
- There are 2 back-to-back stores, then 3 loads in the next 3 or 4 instructions.
If there are 3 loads in 3 instructions, we insert 2 nops after the stores.
If there are 3 loads in 4 instructions, we insert 1 nop after the stores.
- 2 back-to-back stores, followed by at least 3 loads in the next 4 instructions.
st st ld ld ld ##
st st ## ld ld ld
st st ld ## ld ld
st st ld ld ## ld
## - any instruction
- store between non-store instructions, followed by 3 loads
$$ st SS ld ld ld
$$ - non-store instruction, even load.
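The first rule above can be modelled with a small standalone sketch. This is not the code in config/arc/arc.c: the enum, the function name, and the array-based instruction stream are all invented for illustration, and only the "two back-to-back stores" rule is covered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of an instruction stream. */
enum insn_kind { INSN_STORE, INSN_LOAD, INSN_OTHER };

/* Number of nops to insert after the store pair at insns[i], insns[i + 1]:
   2 if the next 3 instructions are all loads,
   1 if 3 loads appear within the next 4 instructions,
   0 otherwise (including when there is no store pair at i). */
static int nops_after_store_pair(const enum insn_kind *insns, size_t n, size_t i)
{
    assert(insns != NULL);

    if (i + 1 >= n || insns[i] != INSN_STORE || insns[i + 1] != INSN_STORE)
        return 0;

    size_t first = i + 2;
    int loads_in_3 = 0;
    int loads_in_4 = 0;

    /* Look at up to 4 instructions following the two stores. */
    for (size_t k = 0; k < 4 && first + k < n; k++) {
        if (insns[first + k] == INSN_LOAD) {
            loads_in_4++;
            if (k < 3)
                loads_in_3++;
        }
    }

    if (loads_in_3 == 3)
        return 2;
    if (loads_in_4 >= 3)
        return 1;
    return 0;
}
```

With this model, the pattern "st st ld ld ld ##" yields 2 nops and the three patterns with one non-load among the next four instructions yield 1 nop, matching the description above.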
* config/arc/arc.c (arc_active_insn): Ignore all non essential
instructions when getting the next active instruction.
(check_store_cacheline_hazard): Update.
(workaround_arc_anomaly): Remove obsolete cache hazard code.
BRcc instructions are generated quite late in the compilation
process. These instructions combine a compare with a regular
conditional branch if the result of the compare is not used
any longer. However, when compiling for size, it is better to avoid
BRcc instructions which introduce a 32-bit long immediate.
Nathan Sidwell [Fri, 11 Dec 2020 16:22:57 +0000 (08:22 -0800)]
c++: cp_tree_equal tweaks
When comparing streamed trees we can encounter NON_LVALUE_EXPR and
VIEW_CONVERT_EXPRs with null types. Also, when checking a potential
duplicate we don't want to reject PARM_DECLs with different contexts,
if those two contexts are the two decls of interest.
gcc/cp/
* cp-tree.h (map_context_from, map_context_to): Declare.
* module.cc (map_context_from, map_context_to): Define.
* tree.c (cp_tree_equal): Check map_context_{from,to} for parm
context difference. Allow NON_LVALUE_EXPR and VIEW_CONVERT_EXPR
with null types.
Christophe Lyon [Fri, 13 Nov 2020 12:34:12 +0000 (12:34 +0000)]
arm: Auto-vectorization for MVE: vorr
This patch enables MVE vorrq instructions for auto-vectorization. MVE
vorrq insns in mve.md are modified to use ior instead of unspec
expression to support ior<mode>3. The ior<mode>3 expander is added to
vec-common.md.
The compiler can match mpyd.eq r0,r1,r0 as a predicated instruction,
which is incorrect. The mpyd(u) instruction takes two 32-bit
registers as input and returns the result in a 64-bit even-odd register
pair. For the predicated case, the ARC instruction decoder expects the
destination register to be the same as the first input register. In
the big-endian case the result is swapped in the destination register
pair; however, the instruction encoding remains the same. Refurbish
the mpyd(u) patterns to take into account the above observation.
* config/arc/arc.md (mpyd<su_optab>_arcv2hs): New template
pattern.
(*pmpyd<su_optab>_arcv2hs): Likewise.
(*pmpyd<su_optab>_imm_arcv2hs): Likewise.
(mpyd_arcv2hs): Moved into above template.
(mpyd_imm_arcv2hs): Moved into above template.
(mpydu_arcv2hs): Likewise.
(mpydu_imm_arcv2hs): Likewise.
(su_optab): New optab prefix for sign/zero-extending operations.