Dimitar Dimitrov [Sat, 21 Jan 2023 16:10:59 +0000 (18:10 +0200)]
pru: Fix CLZ expansion for QI and HI modes
The recent gcc.dg/tree-ssa/clz-char.c test case failed for PRU target,
exposing a wrong code generation bug in the PRU backend. The "clz"
pattern did not produce correct output for QI and HI input operand
modes. SI mode is ok.
The "clz" pattern is expanded to an LMBD instruction to get the
left-most bit position having value "1". In turn, to get the correct
"clz" value, that bit position must be subtracted from the MSB bit
position of the input operand. The old behaviour of hard-coding 31
for MSB bit position is wrong.
The LMBD instruction returns 32 if input operand is zero, irrespective
of its register mode. This maps nicely for SI mode, where the "clz"
pattern outputs -1. It also leads to peculiar (but valid!) output
values from the "clz" pattern for QI and HI zero-valued inputs.
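A minimal sketch of the arithmetic described above, written as a plain C++ model rather than the actual pru.md expansion. The names lmbd_model and clz_model are hypothetical; the only facts taken from the description are that LMBD yields the left-most set bit position, yields 32 for a zero input, and that the result must be subtracted from the operand's own MSB position.

```cpp
#include <cstdint>

// Model of LMBD as described above: position of the left-most bit set
// to 1, or 32 when the input is zero (regardless of operand width).
int lmbd_model(std::uint32_t value)
{
  for (int pos = 31; pos >= 0; --pos)
    if (value & (std::uint32_t{1} << pos))
      return pos;
  return 32;
}

// clz for an operand that is `bits` wide (8 for QI, 16 for HI, 32 for SI):
// subtract the LMBD result from the MSB position of that operand width.
int clz_model(std::uint32_t value, int bits)
{
  return (bits - 1) - lmbd_model(value);
}
```

Under this model clz_model (1, 8) is 7, whereas hard-coding 31 as the MSB position would give 31 for the same 8-bit input. For zero inputs the model gives -1 in SI mode, and -25 and -17 for QI and HI.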
The corresponding commit in trunk contains two new test cases, which
have been removed here because they depend on r13-5195-g4798080d4a3530.
Regtested for pru-unknown-elf.
gcc/ChangeLog:
* config/pru/pru.h (CLZ_DEFINED_VALUE_AT_ZERO): Fix value for QI
and HI input modes.
* config/pru/pru.md (clz): Fix generated code for QI and HI
input modes.
Christophe Lyon [Tue, 14 Jun 2022 21:08:33 +0000 (21:08 +0000)]
aarch64: fix warning emission for ABI break since GCC 9.1
While looking at PR 105549, which is about fixing the ABI break
introduced in GCC 9.1 in parameter alignment with bit-fields, we
noticed that the GCC 9.1 warning is not emitted in all the cases where
it should be. This patch fixes that and the next patch in the series
fixes the GCC 9.1 break.
We split this into two patches since patch #2 introduces a new ABI
break starting with GCC 13.1. This way, patch #1 can be back-ported
to release branches if needed to fix the GCC 9.1 warning issue.
The main idea is to add a new global boolean that indicates whether
we're expanding the start of a function, so that aarch64_layout_arg
can emit warnings for callees as well as callers. This removes the
need for aarch64_function_arg_boundary to warn (with its incomplete
information). However, in the first patch there are still cases where
we emit warnings where we should not; this is fixed in patch #2 where
we can distinguish between GCC 9.1 and GCC 13.1 ABI breaks properly.
The fix in aarch64_function_arg_boundary (replacing & with &&) looks
like an oversight of a previous commit in this area which changed
'abi_break' from a boolean to an integer.
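To illustrate why & stops being equivalent to && once a boolean becomes an integer, here is a small sketch. The function names and the parameter warn_enabled are hypothetical stand-ins, and the value 16 used below is only an assumed example of a nonzero integer; none of this is the actual aarch64 code.

```cpp
// `warn_enabled` stands in for a boolean warning switch and `abi_break`
// for an integer that is nonzero when an ABI break was detected.
bool should_warn_bitwise(bool warn_enabled, unsigned abi_break)
{
  // The bool promotes to 0 or 1, so only bit 0 of abi_break is tested.
  return (warn_enabled & abi_break) != 0;
}

bool should_warn_logical(bool warn_enabled, unsigned abi_break)
{
  // True whenever both operands are nonzero, whatever bits are set.
  return warn_enabled && abi_break;
}
```

With abi_break equal to 16 the bitwise form yields false while the logical form yields true; the two only agree while abi_break is restricted to 0 or 1.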
We also take the opportunity to fix the comment above
aarch64_function_arg_alignment since the value of the abi_break
parameter was changed in a previous commit, no longer matching the
description.
2022-11-28 Christophe Lyon <christophe.lyon@arm.com>
Richard Sandiford <richard.sandiford@arm.com>
* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New
test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New
test.
* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning.h: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New
test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New
test.
* g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning.h: New test.
Richard Biener [Fri, 21 Oct 2022 07:45:44 +0000 (09:45 +0200)]
tree-optimization/107323 - loop distribution partition ordering issue
The following reverts part of the PR94125 fix which causes us to
use a bogus partition ordering after applying versioning for
alias to the testcase in PR107323. Instead PR94125 is fixed by
appropriately considering to-be-merged SCCs when skipping edges
we want to ignore because of the alias versioning.
PR tree-optimization/107323
* tree-loop-distribution.c (pg_unmark_merged_alias_ddrs):
New function.
(loop_distribution::break_alias_scc_partitions): Revert
postorder save/restore from the PR94125 fix. Instead
make sure to not ignore edges from SCCs we are going to
merge.
Richard Biener [Fri, 14 Oct 2022 09:14:59 +0000 (11:14 +0200)]
tree-optimization/107254 - check and support live lanes from permutes
The following fixes an omission from when SLP permute nodes were
added: live lanes originating from those nodes were not handled. We
have to check that we can extract the lane and have to actually
generate code for it.
PR tree-optimization/107254
* tree-vect-slp.c (vect_slp_analyze_node_operations_1):
For permutes also analyze live lanes.
(vect_schedule_slp_node): For permutes also code generate
live lane extracts.
Richard Biener [Tue, 11 Oct 2022 09:34:55 +0000 (11:34 +0200)]
tree-optimization/107212 - SLP reduction of reduction paths
The following fixes an issue with how we handle epilogue generation
for SLP reductions of reduction paths where the actual live lanes
are not "canonical". We need to make sure to identify all live
lanes as reductions and thus have to iterate over all participating
SLP lanes when walking the reduction SSA use-def chain. Also, the
previous attempt in vectorizable_live_operation, likely meant to
mitigate this issue, is misguided and has to be removed.
PR tree-optimization/107212
* tree-vect-loop.c (vectorizable_reduction): Make sure to
set STMT_VINFO_REDUC_DEF for all live lanes in a SLP
reduction.
(vectorizable_live_operation): Do not pun to the SLP
node representative for reduction epilogue generation.
* gcc.dg/vect/pr107212-1.c: New testcase.
* gcc.dg/vect/pr107212-2.c: Likewise.
`xputenv()` (and `putenv()`) don't copy strings and only store the
pointer in the `environ` global table. As a result `environ` got
corrupted as soon as `jinfo.skipped_makeflags` store got deallocated.
This started causing bootstrap crashes in `execv()` calls:
xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address
The change restores memory allocation for `xputenv()` argument.
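The pointer-storing behaviour of putenv() can be observed directly. The sketch below is not the driver code; the variable name DEMO_VAR and the function name are made up for illustration, and it relies on the POSIX rule that the string passed to putenv() becomes part of the environment.

```cpp
#include <cstdlib>
#include <cstring>
#include <stdlib.h>  // putenv (POSIX, not part of ISO C++)

// Returns true if an in-place edit of the buffer handed to putenv()
// is visible through getenv(), i.e. the environment aliases the buffer.
bool putenv_aliases_buffer()
{
  // Static storage, so the pointer kept in `environ` never dangles.
  static char entry[] = "DEMO_VAR=before";
  if (putenv(entry) != 0)
    return false;

  // Overwrite the value part in place, keeping the same length.
  char *value = std::strchr(entry, '=') + 1;
  std::memcpy(value, "after!", 6);

  const char *seen = std::getenv("DEMO_VAR");
  return seen != nullptr && std::strcmp(seen, "after!") == 0;
}
```

Had the buffer been heap memory that was later freed, the entry in `environ` would point at deallocated storage, which is the kind of corruption described above.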
Jonathan Wakely [Thu, 28 Jul 2022 15:15:58 +0000 (16:15 +0100)]
libstdc++: Unblock atomic wait on non-futex platforms [PR106183]
When using a mutex and condition variable, the notifying thread needs to
increment _M_ver while holding the mutex lock, and the waiting thread
needs to re-check after locking the mutex. This avoids a missed
notification as described in the PR.
By moving the increment of _M_ver to the base _M_notify we can make the
use of the mutex local to the use of the condition variable, and
simplify the code a little. We can use a relaxed store because the mutex
already provides sequential consistency. Also we don't need to check
whether __addr == &_M_ver because we know that's always true for
platforms that use a condition variable, and so we also know that we
always need to use notify_all() not notify_one().
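The locking protocol described above can be sketched with standard library primitives. This is a simplified model, not the libstdc++ implementation: the type version_waiter and its member names are hypothetical, and the increment is shown as a relaxed fetch_add.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a waiter pool that has no futex support.
struct version_waiter
{
  std::mutex mtx;
  std::condition_variable cv;
  std::atomic<unsigned> ver{0};

  void notify()
  {
    {
      // Increment while holding the lock so a waiter that has already
      // locked the mutex cannot miss the change.
      std::lock_guard<std::mutex> lock(mtx);
      ver.fetch_add(1, std::memory_order_relaxed);
    }
    cv.notify_all();
  }

  void wait_while_equal(unsigned old)
  {
    std::unique_lock<std::mutex> lock(mtx);
    // Re-check after locking: if notify() already ran, do not block.
    while (ver.load(std::memory_order_relaxed) == old)
      cv.wait(lock);
  }
};
```

If notify() runs between the waiter reading the old value and locking the mutex, the re-check under the lock sees the new value and the waiter returns without blocking.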
Reviewed-by: Thomas Rodgers <trodgers@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/106183
* include/bits/atomic_wait.h (__waiter_pool_base::_M_notify):
Move increment of _M_ver here.
[!_GLIBCXX_HAVE_PLATFORM_WAIT]: Lock mutex around increment.
Use relaxed memory order and always notify all waiters.
(__waiter_base::_M_do_wait) [!_GLIBCXX_HAVE_PLATFORM_WAIT]:
Check value again after locking mutex.
(__waiter_base::_M_notify): Remove increment of _M_ver.
Alex Coplan [Thu, 1 Dec 2022 17:36:02 +0000 (17:36 +0000)]
varasm: Fix type confusion bug
This patch fixes a type confusion bug in varasm.c:assemble_variable.
The problem is that the current code calls:
sect = get_variable_section (decl, false);
and then accesses sect->named.name without checking whether the section
is in fact a named section. In the surrounding else clause, we only know
that SECTION_STYLE (sect) != SECTION_NOSWITCH, so it is possible that
the section is an unnamed section.
In practice, this means that we end up doing a wild string compare
between a function pointer and the string literal ".vtable_map_vars".
This is because sect->named.name aliases sect->unnamed.callback in the
section union.
This can be seen in GDB with a simple testcase such as "int x;".
This patch fixes the issue by checking the SECTION_STYLE of the section
is in fact SECTION_NAMED before trying to do the string comparison.
We drop the existing check of whether sect->named.name is non-NULL
because this should presumably always be the case for a named section.
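The shape of the fix can be shown on a reduced model. The types below are hypothetical and much simpler than GCC's section union; they only keep the property that a name pointer and a callback pointer share storage and that a style tag says which one is valid.

```cpp
#include <cstring>

enum demo_style { DEMO_UNNAMED, DEMO_NAMED };  // hypothetical tags

struct demo_section
{
  demo_style style;
  union
  {
    const char *name;                // valid only when style == DEMO_NAMED
    void (*callback)(const void *);  // valid only when style == DEMO_UNNAMED
  };
};

// Check the tag first, and only then read the member it guards.
bool is_vtable_map_section(const demo_section &sect)
{
  return sect.style == DEMO_NAMED
         && std::strcmp(sect.name, ".vtable_map_vars") == 0;
}
```

Without the tag check, the string comparison would read the storage of the callback pointer as if it were a string.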
gcc/ChangeLog:
* varasm.c (assemble_variable): Fix type confusion bug when
checking for ".vtable_map_vars" section.
we found that the Um constraint would also allow through a
register offset writeback, resulting in an assembler error.
Here I have added a new constraint and predicate for these
instructions, which (uniquely, AFAICT), only support a `!` writeback
increment by the data size (inside the compiler this is a POST_INC).
No regressions in arm-none-eabi with MVE and MVE.FP.
gcc/ChangeLog:
PR target/107714
* config/arm/arm-protos.h (mve_struct_mem_operand): New prototype.
* config/arm/arm.c (mve_struct_mem_operand): New function.
* config/arm/constraints.md (Ug): New constraint.
* config/arm/mve.md (mve_vst4q<mode>): Change constraint.
(mve_vst2q<mode>): Likewise.
(mve_vld4q<mode>): Likewise.
(mve_vld2q<mode>): Likewise.
* config/arm/predicates.md (mve_struct_operand): New predicate.
gcc/testsuite/ChangeLog:
PR target/107714
* gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c: New test.
Harald Anlauf [Sat, 17 Dec 2022 21:04:32 +0000 (22:04 +0100)]
Fortran: incorrect array bounds when bound intrinsic used in decl [PR108131]
gcc/fortran/ChangeLog:
PR fortran/108131
* array.c (match_array_element_spec): Avoid too early simplification
of matched array element specs that can lead to a misinterpretation
when used as array bounds in array declarations.
Kewen Lin [Thu, 5 Jan 2023 05:31:45 +0000 (23:31 -0600)]
rs6000: Raise error for __vector_{quad,pair} uses without MMA enabled [PR106736]
As PR106736 shows, using the __vector_quad and __vector_pair
types without MMA support is unexpected and causes an ICE
when expanding the corresponding assignment. As Peter pointed
out in that PR, we can't guard the registration of these
built-in types under MMA support, because the registration is
global and would not work for target pragma/attribute support
with MMA enabled. The existing verify_type_context mentioned
in [2] can help improve the diagnostics for invalid built-in
type uses, but as Richard pointed out in [4], it can't deal
with all cases. Following the discussions in [1][3], this
patch checks for invalid uses of the built-in types
__vector_quad and __vector_pair in the mov patterns of OOmode
and XOmode, on the gimple assignment statement currently being
expanded. It still keeps an assertion in the else arm rather
than letting it fall through, to ensure we catch any other
unexpected cases early.
* config/rs6000/mma.md (define_expand movoo): Call function
rs6000_opaque_type_invalid_use_p to check and emit error message for
the invalid use of opaque type.
(define_expand movxo): Likewise.
* config/rs6000/rs6000-protos.h
(rs6000_opaque_type_invalid_use_p): New function declaration.
(currently_expanding_gimple_stmt): New extern declaration.
* config/rs6000/rs6000.c (rs6000_opaque_type_invalid_use_p): New
function.
Florian Weimer [Tue, 18 Oct 2022 14:58:48 +0000 (16:58 +0200)]
libiberty: Fix C89-isms in configure tests
libiberty/
* acinclude.m4 (ac_cv_func_strncmp_works): Add missing
int return type and parameter list to the definition of main.
Include <stdlib.h> and <string.h> for prototypes.
(ac_cv_c_stack_direction): Add missing
int return type and parameter list to the definitions of
main, find_stack_direction. Include <stdlib.h> for exit
prototype.
* configure: Regenerate.
Patrick Palka [Thu, 28 Oct 2021 14:05:14 +0000 (10:05 -0400)]
c++: quadratic constexpr behavior for left-assoc logical exprs [PR102780]
In the testcase below the two left fold expressions each expand into a
constant logical expression with 1024 terms, for which potential_const_expr
takes more than a minute to return true. This happens because p_c_e_1
performs trial evaluation of the first operand of a &&/|| in order to
determine whether to consider the potentiality of the second operand.
And because the expanded expression is left-associated, this trial
evaluation causes p_c_e_1 to be quadratic in the number of terms of the
expression.
This patch fixes this quadratic behavior by making p_c_e_1 preemptively
compute potentiality of the second operand of a &&/||, and perform trial
evaluation of the first operand only if the second operand isn't
potentially constant. We must be careful to avoid emitting bogus
diagnostics during the preemptive computation; to that end, we perform
this shortcut only when tf_error is cleared, and when tf_error is set we
now first check potentiality of the whole expression quietly and replay
the check noisily for diagnostics.
Apart from fixing the quadraticness for left-associated logical exprs,
this change also reduces compile time for the libstdc++ testcase
20_util/variant/87619.cc by about 15% even though our <variant> uses
right folds instead of left folds. Likewise for the testcase in the PR,
for which compile time is reduced by 30%. The reason for these speedups
is that p_c_e_1 no longer performs expensive trial evaluation of each term
of large constant logical expressions when determining their potentiality.
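For reference, the left-associated and right-associated shapes discussed above look as follows. This is an illustrative sketch and not the testcase from the PR; the function names are made up, and each term is a trivially true comparison.

```cpp
#include <cstddef>
#include <utility>

// Unary left fold: expands to (((t0 && t1) && t2) && ...), so the first
// operand of the outermost && contains all the other terms.
template <std::size_t... Is>
constexpr bool left_fold_all(std::index_sequence<Is...>)
{
  return (... && (Is < sizeof...(Is)));
}

// Unary right fold: expands to (t0 && (t1 && (t2 && ...))), so the first
// operand of each && is a single term.
template <std::size_t... Is>
constexpr bool right_fold_all(std::index_sequence<Is...>)
{
  return ((Is < sizeof...(Is)) && ...);
}
```

In the left fold, trial evaluation of the first operand at each nesting level revisits all the terms beneath it, which is where the quadratic cost described above comes from.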
PR c++/102780
PR c++/108138
gcc/cp/ChangeLog:
* constexpr.c (potential_constant_expression_1) <case TRUTH_*_EXPR>:
When tf_error isn't set, preemptively check potentiality of the
second operand before performing trial evaluation of the first
operand.
(potential_constant_expression_1): When tf_error is set, first check
potentiality quietly and return true if successful, otherwise
proceed noisily to give errors.
Sebastian Pop [Wed, 30 Nov 2022 19:45:24 +0000 (19:45 +0000)]
AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
Currently patchable area is at the wrong place on AArch64. It is placed
immediately after function label, before .cfi_startproc. This patch
adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
modifies aarch64_print_patchable_function_entry to avoid placing
patchable area before .cfi_startproc.
Iain Buclaw [Tue, 13 Dec 2022 22:46:39 +0000 (23:46 +0100)]
libphobos: Backport library and bindings fixes from mainline
D Runtime changes:
- Fix MIPS64 bindings for CRuntime_UClibc.
Phobos changes:
- Fix std.path.expandTilde erroneously raising onOutOfMemory
after failed call to getpwnam_r().
- Use GENERIC_IO on CRuntime_UClibc port of std.stdio.
libphobos/ChangeLog:
* libdruntime/core/stdc/fenv.d: Compile in MIPS uClibc bindings on
MIPS_Any targets.
* libdruntime/core/stdc/math.d: Likewise.
* libdruntime/core/sys/posix/dlfcn.d: Likewise.
* libdruntime/core/sys/posix/setjmp.d: Add MIPS64 definitions for
CRuntime_UClibc.
* libdruntime/core/sys/posix/sys/types.d: Likewise.
* src/std/path.d (expandTilde): Handle more errno codes that could be
left set by getpwnam_r.
* src/std/stdio.d: Set CRuntime_UClibc as GENERIC_IO target.
The following fixes an unintended(?) side-effect of the special
MODIFY_EXPR expression entries we add for tail-merging during VN.
We shouldn't value-number the virtual operand differently here.
PR tree-optimization/107107
* tree-ssa-sccvn.c (visit_reference_op_store): Do not
affect value-numbering when doing the tail merging
MODIFY_EXPR lookup.
Iain Buclaw [Sat, 10 Dec 2022 18:12:43 +0000 (19:12 +0100)]
d: Fix internal compiler error: in visit, at d/imports.cc:72 (PR108050)
The visitor for lowering IMPORTED_DECLs did not have an override for
dealing with importing OverloadSet symbols. This has now been
implemented in the code generator.
PR d/108050
gcc/d/ChangeLog:
* decl.cc (DeclVisitor::visit (Import *)): Handle build_import_decl
returning a TREE_LIST.
* imports.cc (ImportVisitor::visit (OverloadSet *)): New override.
liuhongt [Mon, 28 Nov 2022 01:59:47 +0000 (09:59 +0800)]
Fix unrecognizable insn due to illegal immediate_operand (const_int 255) of QImode.
For __builtin_ia32_vec_set_v16qi (a, -1, 2) with
!flag_signed_char, it's transformed to
__builtin_ia32_vec_set_v16qi (_4, 255, 2) in the gimple,
and expanded to (const_int 255) in the rtl. But immediate_operand
expects (const_int 255) to be sign-extended to
(const_int -1). The mismatch caused an unrecognizable insn error.
The patch converts (const_int 255) to (const_int -1) in the backend
expander.
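The conversion amounts to sign-extending the low 8 bits of the constant. A minimal sketch of that arithmetic follows; the helper name sign_extend_8 is hypothetical and this is not the code used in the expander.

```cpp
// Sign-extend the low 8 bits of `value`: the result lies in [-128, 127].
long sign_extend_8(unsigned long value)
{
  long low = static_cast<long>(value & 0xff);
  // Flip the sign bit, then subtract its weight to undo the flip for
  // non-negative values and to produce the negative value otherwise.
  return (low ^ 0x80) - 0x80;
}
```

With this helper, 255 maps to -1, which is the canonical form of that QImode constant, while values up to 127 are unchanged.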
Iain Buclaw [Fri, 11 Nov 2022 23:54:47 +0000 (00:54 +0100)]
d: Fix ICE on named continue label in an unrolled loop [PR107592]
Continue labels in an unrolled loop require a unique label per
iteration. Previously this used the Statement body node for each
unrolled iteration to generate a new entry in the label hash table.
This does not work when the continue label has an identifier, as said
named label is pointing to the outer UnrolledLoopStatement node.
What would happen is that during the lowering of `continue label', an
automatic label associated with the unrolled loop would be generated,
and a jump to that label inserted, but because it was never pushed by
the visitor for the loop itself, it subsequently never gets emitted.
To fix, correctly use the UnrolledLoopStatement as the key to look up
and store the break/continue label pair, but remove the continue label
from the value entry after every loop to force a new label to be
generated by the next call to `push_continue_label'.
PR d/107592
gcc/d/ChangeLog:
* toir.cc (IRVisitor::push_unrolled_continue_label): New method.
(IRVisitor::pop_unrolled_continue_label): New method.
(IRVisitor::visit (UnrolledLoopStatement *)): Use them instead of
push_continue_label and pop_continue_label.
While most PA 2.0 instructions support both 32 and 64-bit traps
and conditions, the addi and subi instructions only support 32-bit
traps and conditions. Thus, we need to force immediate operands
to register operands on the 64-bit target and use the add/sub
instructions which can trap on 64-bit signed overflow.
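The difference between a 32-bit and a 64-bit overflow condition can be seen with the GCC/Clang builtin __builtin_add_overflow. This sketch only illustrates the arithmetic; the function names are hypothetical and it says nothing about how the PA patterns are written.

```cpp
#include <cstdint>

// True if a + b does not fit in a signed 32-bit result.
bool add_overflows_32(std::int32_t a, std::int32_t b)
{
  std::int32_t sum;
  return __builtin_add_overflow(a, b, &sum);
}

// True if a + b does not fit in a signed 64-bit result; *sum receives
// the wrapped result either way.
bool add_overflows_64(std::int64_t a, std::int64_t b, std::int64_t *sum)
{
  return __builtin_add_overflow(a, b, sum);
}
```

Adding 1 to INT32_MAX overflows at 32 bits but is an ordinary 64-bit addition, so a check that only looks at the 32-bit condition gives the wrong answer for a DImode operation.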
2022-11-30 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.md (addvdi3): Force operand 2 to a register.
Remove "addi,tsv,*" instruction from unnamed pattern.
(subvdi3): Force operand 1 to a register.
Remove "subi,tsv" instruction from unnamed pattern.
Eric Botcazou [Fri, 25 Nov 2022 09:49:20 +0000 (10:49 +0100)]
Fix thinko in operator_bitwise_xor::op1_range
There is a thinko in the op1_range method of ranger's operator_bitwise_xor
class in a boolean context: if the result is known to be true, it may infer
that a specific operand is false without any basis.
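What can soundly be inferred about one operand of a boolean XOR is easy to enumerate. The sketch below models a boolean range as a two-bit set; the encoding and the function name are assumptions made for illustration and do not reflect ranger's data structures.

```cpp
// A boolean "range" as a set: bit 0 = may be false, bit 1 = may be true.
enum : unsigned { MAY_FALSE = 1u, MAY_TRUE = 2u, VARYING = 3u };

// Given the possible results of (op1 ^ op2) and the possible values of
// op2, return every value op1 may have.
unsigned xor_op1_range(unsigned lhs, unsigned op2)
{
  unsigned op1 = 0;
  for (unsigned r = 0; r <= 1; ++r)
    for (unsigned b = 0; b <= 1; ++b)
      if ((lhs & (1u << r)) && (op2 & (1u << b)))
        op1 |= 1u << (r ^ b);  // op1 ^ b == r implies op1 == r ^ b
  return op1;
}
```

When the result is known to be true but nothing is known about op2, op1 stays VARYING; it can only be narrowed to false once op2 is known to be true.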
Eric Botcazou [Tue, 22 Nov 2022 18:03:49 +0000 (19:03 +0100)]
Fix wrong array type conversion with different storage order
When two arrays of scalars have a different storage order in Ada, the
front-end makes sure that the conversion is performed component-wise
so that each component can be reversed. So it's a little bit
counterproductive that the ldist pass performs the opposite transformation
and synthesizes a memcpy/memmove in this case.
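A rough C++ analogue of the component-wise conversion follows. In Ada the storage order is a property of the array type, so the reversal is implicit; here it is spelled out by hand, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Reverse the four bytes of one 32-bit component.
std::uint32_t reverse_bytes(std::uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}

// Convert an array to the opposite storage order, one component at a time.
void convert_storage_order(std::uint32_t *dst, const std::uint32_t *src,
                           std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = reverse_bytes(src[i]);
}
```

Replacing such a loop with a plain memcpy would copy the bytes unreversed, which is why the pass has to bail out when source and destination differ in storage order.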
gcc/
* tree-loop-distribution.c (loop_distribution::classify_builtin_ldst):
Bail out if source and destination do not have the same storage order.