Harald Anlauf [Sat, 27 Jan 2024 16:41:43 +0000 (17:41 +0100)]
Fortran: fix bounds-checking errors for CLASS array dummies [PR104908]
Commit r11-1235 addressed issues with bounds of unlimited polymorphic array
dummies. However, using the descriptor from sym->backend_decl does break
the case of CLASS array dummies. The obvious solution is to restrict the
fix to the unlimited polymorphic case, thus keeping the original descriptor
in the ordinary case.
gcc/fortran/ChangeLog:
PR fortran/104908
* trans-array.cc (gfc_conv_array_ref): Restrict use of transformed
descriptor (sym->backend_decl) to the unlimited polymorphic case.
gcc/testsuite/ChangeLog:
PR fortran/104908
* gfortran.dg/pr104908.f90: New test.
Jonathan Wakely [Thu, 1 Feb 2024 21:40:33 +0000 (21:40 +0000)]
libstdc++: Allow explicit conversion of string views with different traits
This was changed by LWG 3857.
libstdc++-v3/ChangeLog:
* include/std/string_view (basic_string_view(R&&)): Remove
constraint that traits_type must be the same, as per LWG 3857.
* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc:
Explicit conversion between different specializations should be
allowed.
* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc:
Likewise.
Jonathan Wakely [Wed, 31 Jan 2024 10:41:49 +0000 (10:41 +0000)]
libstdc++: Avoid reusing moved-from iterators in PSTL tests [PR90276]
The reverse_invoker utility for PSTL tests uses forwarding references for
all parameters, but some of those parameters get forwarded to move
constructors which then leave the objects in a moved-from state. When
the parameters are forwarded a second time that results in making new
copies of moved-from iterators. For libstdc++ debug mode iterators, the
moved-from state is singular, which means copying them will abort at
runtime.
The fix is to make copies of iterator arguments instead of forwarding
them.
The callers of reverse_invoker::operator() also forward the iterators
multiple times, but that's OK because reverse_invoker accepts them by
forwarding reference but then breaks the chain of forwarding and copies
them as lvalues.
libstdc++-v3/ChangeLog:
PR libstdc++/90276
* testsuite/util/pstl/test_utils.h (reverse_invoker): Do not use
perfect forwarding for iterator arguments.
Jonathan Wakely [Thu, 11 Jan 2024 15:09:12 +0000 (15:09 +0000)]
libstdc++: Fix non-portable results from 64-bit std::subtract_with_carry_engine [PR107466]
I implemented the resolution of LWG 3809 in r13-4364-ga64775a0edd469 but
it was recently noted in the MSVC STL github repo that the change causes
possible truncation for 64-bit seeds. Whether the truncation occurs (and
to what value) depends on the width of uint_least32_t which is not
portable, so the output of the PRNG for 64-bit seed values is no longer
the same as in C++20, and no longer portable across platforms.
That new issue was filed as LWG 4014. I proposed a new change which
reduces the seed by the LCG's modulus before the conversion to
uint_least32_t. This ensures that 64-bit seed values are consistently
reduced by the modulus before any truncation. This removes the
platform-dependent behaviour and restores the old behaviour for
std::subtract_with_carry_engine specializations using a 64-bit result
type (such as std::ranlux48_base).
libstdc++-v3/ChangeLog:
PR libstdc++/107466
* include/bits/random.tcc (subtract_with_carry_engine::seed):
Implement proposed resolution of LWG 4014.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
Check for expected result of 64-bit engine with seed that
doesn't fit in 32-bits.
Jonathan Wakely [Tue, 9 Jan 2024 15:22:46 +0000 (15:22 +0000)]
libstdc++: Prefer posix_memalign for aligned-new [PR113258]
As described in PR libstdc++/113258 there are old versions of tcmalloc
which replace malloc and related APIs, but do not repalce aligned_alloc
because it didn't exist at the time they were released. This means that
when operator new(size_t, align_val_t) uses aligned_alloc to obtain
memory, it comes from libc's aligned_alloc not from tcmalloc. But when
operator delete(void*, size_t, align_val_t) uses free to deallocate the
memory, that goes to tcmalloc's replacement version of free, which
doesn't know how to free it.
If we give preference to the older posix_memalign instead of
aligned_alloc then we're more likely to use a function that will be
compatible with the replacement version of free. Because posix_memalign
has been around for longer, it's more likely that old third-party malloc
replacements will also replace posix_memalign alongside malloc and free.
libstdc++-v3/ChangeLog:
PR libstdc++/113258
* libsupc++/new_opa.cc: Prefer to use posix_memalign if
available.
Jason Merrill [Thu, 1 Jun 2023 18:41:07 +0000 (14:41 -0400)]
varasm: check float size [PR109359]
In PR95226, the testcase was failing because we tried to output_constant a
NOP_EXPR to float from a double REAL_CST, and so we output a double where
the caller wanted a float. That doesn't happen anymore, but with the
output_constant hunk we will ICE in that situation rather than emit the
wrong number of bytes.
Part of the problem was that initializer_constant_valid_p_1 returned true
for that NOP_EXPR, because it compared the sizes of integer types but not
floating-point types. So the C++ front end assumed it didn't need to fold
the initializer.
Jason Merrill [Mon, 5 Feb 2024 18:54:22 +0000 (13:54 -0500)]
c++: prvalue of array type [PR111286]
Here we want to build a prvalue array to bind to the T reference, but we
were wrongly trying to strip cv-quals from the array prvalue, which should
be treated the same as a class prvalue.
PR c++/111286
gcc/cp/ChangeLog:
* tree.cc (rvalue): Don't drop cv-quals from an array.
Xi Ruoyao [Fri, 2 Feb 2024 19:35:07 +0000 (03:35 +0800)]
MIPS: Fix wrong MSA FP vector negation
We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is
wrong because -0.0 is not 0 - 0.0. This causes some Python tests to
fail when Python is built with MSA enabled.
Use the bnegi.df instructions to simply reverse the sign bit instead.
gcc/ChangeLog:
* config/mips/mips-msa.md (elmsgnbit): New define_mode_attr.
(neg<mode>2): Change the mode iterator from MSA to IMSA because
in FP arithmetic we cannot use (0 - x) for -x.
(neg<mode>2): New define_insn to implement FP vector negation,
using a bnegi instruction to negate the sign bit.
Martin Jambor [Fri, 2 Feb 2024 12:27:50 +0000 (13:27 +0100)]
sra: Disqualify bases of operands of asm gotos
PR 110422 shows that SRA can ICE assuming there is a single edge
outgoing from a block terminated with an asm goto. We need that for
BB-terminating statements so that any adjustments they make to the
aggregates can be copied over to their replacements. Because we can't
have that after ASM gotos, we need to punt.
gcc/ChangeLog:
2024-01-17 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/110422
* tree-sra.cc (scan_function): Disqualify bases of operands of asm
gotos.
gcc/testsuite/ChangeLog:
2024-01-17 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/110422
* gcc.dg/torture/pr110422.c: New test.
Jonathan Wakely [Thu, 1 Feb 2024 18:37:34 +0000 (18:37 +0000)]
libstdc++: Force-inline shared_ptr::operator bool() for C++20 [PR108636]
This avoids a linker error with -fkeep-inline-functions when including
<filesystem>. We can't backport the fix from trunk because it adds an
export to the shared library. By marking the "missing" symbol
always_inline for C++20 mode we don't need a definition in the library.
libstdc++-v3/ChangeLog:
PR libstdc++/108636
* include/bits/shared_ptr_base.h (__shared_ptr::operator bool):
Add always_inline attribute for C++20 and later.
The first alternative stores the floating-point status register
in the destination. It should store zero. We need to copy %fr0
to another floating-point register to initialize it to zero.
2024-02-01 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.md (atomic_storedi_1): Fix bug in
alternative 1.
Lewis Hyatt [Tue, 5 Dec 2023 16:33:39 +0000 (11:33 -0500)]
c-family: Fix ICE with large column number after restoring a PCH [PR105608]
Users are allowed to define macros prior to restoring a precompiled header
file, as long as those macros are not defined (or are defined identically)
in the PCH. However, the PCH restoration process destroys all the macro
definitions, so libcpp has to record them before restoring the PCH and then
redefine them afterward.
This process does not currently assign great locations to the macros after
redefining them. Some work is needed to also remember the original locations
and get the line_maps instance in the right state (since, like all other
data structures, the line_maps instance is also reset after restoring a PCH).
This patch addresses a more pressing issue, which is that we ICE in some
cases since GCC 11, hitting an assert in line-maps.cc. It happens if the
first line encountered after the PCH restore requires an LC_RENAME map, such
as will happen if the line is sufficiently long. This is much easier to
fix, since we just need to call linemap_line_start before asking libcpp to
redefine the stored macros, instead of afterward, to avoid the unexpected
need for an LC_RENAME before an LC_ENTER has been seen.
gcc/c-family/ChangeLog:
PR preprocessor/105608
* c-pch.cc (c_common_read_pch): Start a new line map before asking
libcpp to restore macros defined prior to reading the PCH, instead
of afterward.
gcc/testsuite/ChangeLog:
PR preprocessor/105608
* g++.dg/pch/line-map-1.C: New test.
* g++.dg/pch/line-map-1.Hs: New test.
* g++.dg/pch/line-map-2.C: New test.
* g++.dg/pch/line-map-2.Hs: New test.
Jason Merrill [Tue, 23 Jan 2024 20:41:09 +0000 (15:41 -0500)]
c++: throwing cleanup after return [PR113347]
Here we were assuming that current_retval_sentinel would be set if we have
seen a throwing cleanup, but that's not the case if the cleanup is after all
returns.
This change isn't needed on trunk, where current_retval_sentinel is set for
all NRV functions.
Jason Merrill [Tue, 19 Dec 2023 21:12:02 +0000 (16:12 -0500)]
c++: xvalue array subscript [PR103185]
Normally we handle xvalue array subscripting with ARRAY_REF, but in this
case we weren't doing that because the operands were reversed. Handle that
case better.
Jason Merrill [Wed, 20 Dec 2023 16:06:27 +0000 (11:06 -0500)]
c++: throwing dtor and empty try [PR113088]
maybe_splice_retval_cleanup assumed that the function body can't be empty if
there's a throwing cleanup, but when I added cleanups to try blocks in r12-6333-gb10e031458d541 I didn't adjust that assumption.
PR c++/113088
PR c++/33799
gcc/cp/ChangeLog:
* except.cc (maybe_splice_retval_cleanup): Handle an empty block.
Jason Merrill [Sun, 4 Jun 2023 16:00:55 +0000 (12:00 -0400)]
c++: NRV and goto [PR92407]
Here our named return value optimization was breaking the required
destructor when the goto takes 'a' out of scope. A simple fix for the
release branches is to disable the optimization in the presence of backward
goto.
We could do better by disabling the optimization only if there is a backward
goto across the variable declaration, but we don't track that, and in GCC 14
we instead make the goto work with NRV.
PR c++/92407
gcc/cp/ChangeLog:
* cp-tree.h (struct language_function): Add backward_goto.
* decl.cc (check_goto): Set it.
* typeck.cc (check_return_expr): Prevent NRV if set.
Georg-Johann Lay [Mon, 15 Jan 2024 12:25:59 +0000 (13:25 +0100)]
AVR: target/107201: Make -nodevicelib work for all devices.
driver-avr.cc contains a spec that discriminates between cores
and devices by means of a mmcu=avr* spec pattern. This does not
work for new devices like AVR128* which also start with mmcu=avr
like all cores do. The patch uses a new spec function in order to
tell apart cores from devices.
gcc/
PR target/107201
* config/avr/avr.h (EXTRA_SPEC_FUNCTIONS): Add no-devlib, avr_no_devlib.
* config/avr/driver-avr.cc (avr_no_devlib): New function.
(avr_devicespecs_file): Use it to remove -nodevicelib from the
options for cores only.
* config/avr/avr-arch.h (avr_get_parch): New prototype.
* config/avr/avr-devices.cc (avr_get_parch): New function.
The get_target_expr call added in r12-7069-g119cea98f66476 causes us
for the below testcase to call build_vec_delete in a template context,
which builds a templated destructor call and checks expr_noexcept_p for
it, which ICEs because the call has templated form.
Much of the work of build_vec_delete however is code generation and thus
will just get discarded in a template context, and that includes the
code guarded by expr_noexcept_p. So this patch narrowly fixes this ICE
by eliding the expr_noexcept_p call when in a template context.
PR c++/109899
gcc/cp/ChangeLog:
* init.cc (build_vec_delete_1): Assume expr_noexcept_p returns
false in a template context.
Andrew Pinski [Mon, 15 Jan 2024 09:31:36 +0000 (10:31 +0100)]
AVR: target/113156 - Fix ICE due to missing "Save" on -m[long-]double= options.
Multilib options -mdouble= and -mlong-double= are not orthogonal:
TARGET_HANDLE_OPTION = avr-common.cc::avr_handle_option() sets them
such that sizeof(double) <= sizeof(long double) is always true.
Sandra Loosemore [Thu, 11 Jan 2024 21:12:56 +0000 (21:12 +0000)]
libgcc, nios2: Fix exception handling on nios2 with -fpic
Exception handling on nios2-linux-gnu with -fpic has been broken since
revision 790854ea7670f11c14d431c102a49181d2915965, "Use _dl_find_object
in _Unwind_Find_FDE". For whatever reason, this doesn't work on nios2.
Nios2 uses the GOT address as the base for DW_EH_PE_datarel
relocations in PIC; see my previous fix to make this work, revision 2d33dcfe9f0494c9b56a8d704c3d27c5a4329ebc, "Support for GOT-relative
DW_EH_PE_datarel encoding". So this may be a horrible bug in the ABI
or in my interpretation of it or just glibc's implementation of
_dl_find_object for this target, but there's existing code out there
that does things this way; and realistically, nobody is going to
re-engineer this now that the vendor has EOL'ed the nios2
architecture. So, just skip over the code trying to use
_dl_find_object on this target and fall back to the way that works.
libgcc/ChangeLog
* unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Do not try to use
_dl_find_object on nios2; it doesn't work.
The current constexpr implementation of std::char_traits<C>::move relies
on being able to compare the pointer parameters, which is not allowed
for unrelated pointers. We can use __builtin_constant_p to determine
whether it's safe to compare the pointers directly. If not, then we know
the ranges must be disjoint and so we can use char_traits<C>::copy to
copy forwards from the first character to the last. If the pointers can
be compared directly, then we can simplify the condition for copying
backwards to just two pointer comparisons.
libstdc++-v3/ChangeLog:
PR libstdc++/113200
* include/bits/char_traits.h (__gnu_cxx::char_traits::move): Use
__builtin_constant_p to check for unrelated pointers that cannot
be compared during constant evaluation.
* testsuite/21_strings/char_traits/requirements/113200.cc: New
test.
Ken Matsui [Thu, 11 Jan 2024 06:08:07 +0000 (22:08 -0800)]
libstdc++: Fix error handling in filesystem::equivalent [PR113250]
This patch made std::filesystem::equivalent correctly throw an exception
when either path does not exist as per [fs.op.equivalent]/4.
PR libstdc++/113250
libstdc++-v3/ChangeLog:
* src/c++17/fs_ops.cc (fs::equivalent): Use || instead of &&.
* src/filesystem/ops.cc (fs::equivalent): Likewise.
* testsuite/27_io/filesystem/operations/equivalent.cc: Handle
error codes.
* testsuite/experimental/filesystem/operations/equivalent.cc:
Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org> Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit df147e2ee7199d33d66959c6509ce9c21072077f)
Steve Baird [Wed, 16 Nov 2022 17:28:22 +0000 (09:28 -0800)]
Fix nternal compiler error for Sequential Partition_Elaboration_Policy
In some cases, compilation of a function with a limited class-wide result
type could fail with an ICE if a Sequential Partition_Elaboration_Policy is
specified. To prevent this, we really want that specifying a Sequential
Partition_Elaboration_Policy to have the side effect of imposing a
No_Task_Hierarchy restriction. But doing that in a straightforward
way leads to problems with incorrectly accepting violations of H.6(6).
So a new restriction, No_Task_Hierarchy_Implicit, is introduced.
gcc/ada/
PR ada/104354
* libgnat/s-rident.ads: Define a new restriction,
No_Task_Hierarchy_Implicit. This is like the No_Task_Hierarchy
restriction, but with the difference that setting this restriction
does not mean the H.6(6) post-compilation check is satisified.
* exp_ch6.adb (Add_Task_Actuals_To_Build_In_Place_Call): If it is
known that the function result cannot have tasks, then pass in a
null literal for the activation chain actual parameter. This
avoids generating a reference to an entity that
Build_Activation_Chain_Entity may have chosen not to generate a
declaration for.
* gnatbind.adb (List_Applicable_Restrictions): Do not list the
No_Task_Hierarchy_Implicit restriction.
* restrict.adb: Special treatment for the
No_Task_Hierarchy_Implicit restriction in functions
Get_Restriction_Id and Restriction_Active. The former is needed to
disallow the (unlikely) case that a user tries to explicitly
reference the No_Task_Hierarchy_Implicit restriction.
* sem_prag.adb (Analyze_Pragma): If a Sequential
Partition_Elaboration_Policy is specified (and the
No_Task_Hierarchy restriction is not already enabled), then enable
the No_Task_Hierarchy_Implicit restriction.
AVR: PR target/112952: Fix attribute "address", "io" and "io_low"
so they work with all combinations of -f[no-]data-sections -f[no-]common.
The patch also improves some diagnostics and adds additional checks, for
example these attributes must only be applied to variables in static storage.
gcc/
PR target/112952
* config/avr/avr.cc (avr_handle_addr_attribute): Also print valid
range when diagnosing attribute "io" and "io_low" are out of range.
(avr_eval_addr_attrib): Don't ICE on empty address at that place.
(avr_insert_attributes): Reject if attribute "address", "io" or "io_low"
in contexts other than static storage.
(avr_asm_output_aligned_decl_common): Move output of decls with
attribute "address", "io", and "io_low" to...
(avr_output_addr_attrib): ...this new function.
(avr_asm_asm_output_aligned_bss): Remove output for decls with
attribute "address", "io", and "io_low".
(avr_encode_section_info): Rectify handling of decls with attribute
"address", "io", and "io_low".
gcc/testsuite/
PR target/112952
* gcc.target/avr/attribute-io.h: New file.
* gcc.target/avr/pr112952-0.c: New test.
* gcc.target/avr/pr112952-1.c: New test.
* gcc.target/avr/pr112952-2.c: New test.
* gcc.target/avr/pr112952-3.c: New test.
Patrick Palka [Wed, 3 Jan 2024 02:31:20 +0000 (21:31 -0500)]
libstdc++: testsuite: Reduce max_size_type.cc exec time [PR113175]
The adjustment to max_size_type.cc in r14-205-g83470a5cd4c3d2
inadvertently increased the execution time of this test by over 5x due
to making the two main loops actually run in the signed_p case instead
of being dead code.
To compensate, this patch cuts the relevant loops' range [-1000,1000] by
10x as proposed in the PR. This shouldn't significantly weaken the test
since the same important edge cases are still checked in the smaller range
and/or elsewhere. On my machine this reduces the test's execution time by
roughly 10x (and 1.6x relative to before r14-205).
PR testsuite/113175
libstdc++-v3/ChangeLog:
* testsuite/std/ranges/iota/max_size_type.cc (test02): Reduce
'limit' to 100 from 1000 and adjust 'log2_limit' accordingly.
(test03): Likewise.
Patrick Palka [Fri, 22 Sep 2023 10:25:49 +0000 (06:25 -0400)]
c++: constraint rewriting during ttp coercion [PR111485]
In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters. The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2. This patch
fixes this by including the outer template arguments in the substitution,
which ought to match the depth of the ttp.
The second testcase demonstrates it's better to substitute the concrete
outer template arguments instead of generic ones since a ttp's constraints
could depend on outer parameters.
PR c++/111485
gcc/cp/ChangeLog:
* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
Patrick Palka [Tue, 25 Apr 2023 19:59:22 +0000 (15:59 -0400)]
c++: value dependence of by-ref lambda capture [PR108975]
We are still ICEing on the generic lambda version of the testcase from
this PR, even after r13-6743-g6f90de97634d6f, due to the by-ref capture
of the constant local variable 'dim' being considered value-dependent
when regenerating the lambda (at which point processing_template_decl is
set since the lambda is generic), which prevents us from constant folding
its uses. Later during prune_lambda_captures we end up not thoroughly
walking the body of the lambda and overlook the (non-folded) uses of
'dim' within the array bound and using-decls.
We could fix this by making prune_lambda_captures walk the body of the
lambda more thoroughly so that it finds these uses of 'dim', but ideally
we should be able to constant fold all uses of 'dim' ahead of time and
prune the implicit capture after all.
To that end this patch makes value_dependent_expression_p return false
for such by-ref captures of constant local variables, allowing their
uses to get constant folded ahead of time. It seems we just need to
disable the predicate's conservative early exit for reference variables
(added by r5-5022-g51d72abe5ea04e) when DECL_HAS_VALUE_EXPR_P. This
effectively makes us treat by-value and by-ref captures more consistently
when it comes to value dependence.
PR c++/108975
gcc/cp/ChangeLog:
* pt.cc (value_dependent_expression_p) <case VAR_DECL>:
Suppress conservative early exit for reference variables
when DECL_HAS_VALUE_EXPR_P.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-const11a.C: New test.
Jakub Jelinek [Tue, 19 Dec 2023 09:24:33 +0000 (10:24 +0100)]
i386: Fix mmx.md signbit expanders [PR112816]
Apparently when looking for "signbit<mode>2" vector expanders, I've only
looked at sse.md and forgot mmx.md, which has another one and the
following patch still ICEd.
2023-12-19 Jakub Jelinek <jakub@redhat.com>
PR target/112816
* config/i386/mmx.md (signbitv2sf2): Force operands[1] into a REG.
The following testcase ICEs because we aren't careful enough with
alloc_size attribute. We do check that such an argument exists
(although wouldn't handle correctly functions with more than INT_MAX
arguments), but didn't check that it is scalar integer, the ICE is
trying to fold_convert a structure to sizetype.
Given that the attribute can also appear on non-prototyped functions
where the arguments aren't known, I don't see how the FE could diagnose
that and because we already handle the case where argument doesn't exist,
I think we should also verify the argument is scalar integer convertible
to sizetype. Furthermore, given this is not just in diagnostics but
used for code generation, I think it is better to punt on arguments with
larger precision then sizetype, the upper bits are then truncated.
The patch also fixes some formatting issues and avoids duplication of the
fold_convert, plus removes unnecessary check for if (arg1 >= 0), that is
always the case after if (arg1 < 0) return ...;
2023-12-18 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113013
* tree-object-size.cc (alloc_object_size): Return size_unknown if
corresponding argument(s) don't have integral type or have integral
type with higher precision than sizetype. Don't check arg1 >= 0
uselessly. Compare argument indexes against gimple_call_num_args
in unsigned type rather than int. Formatting fixes.
Jakub Jelinek [Fri, 8 Dec 2023 19:56:48 +0000 (20:56 +0100)]
c++: Unshare folded SAVE_EXPR arguments during cp_fold [PR112727]
The following testcase is miscompiled because two ubsan instrumentations
run into each other.
The first one is the shift instrumentation. Before the C++ FE calls
it, it wraps the 2 shift arguments with cp_save_expr, so that side-effects
in them aren't evaluated multiple times. And, ubsan_instrument_shift
itself uses unshare_expr on any uses of the operands to make sure further
modifications in them don't affect other copies of them (the only not
unshared ones are the one the caller then uses for the actual operation
after the instrumentation, which means there is no tree sharing).
Now, if there are side-effects in the first operand like say function
call, cp_save_expr wraps it into a SAVE_EXPR, and ubsan_instrument_shift
in this mode emits something like
if (..., SAVE_EXPR <foo ()>, SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR <foo ()>, ...);
and caller adds
SAVE_EXPR <foo ()> << SAVE_EXPR <op1>
after it in a COMPOUND_EXPR. So far so good.
If there are no side-effects and cp_save_expr doesn't create SAVE_EXPR,
everything is ok as well because of the unshare_expr.
We have
if (..., SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., ptr->something[i], ...);
and
ptr->something[i] << SAVE_EXPR <op1>
where ptr->something[i] is unshared.
In the testcase below, the !x->s[j] ? 1 : 0 expression is wrapped initially
into a SAVE_EXPR though, and unshare_expr doesn't unshare SAVE_EXPRs nor
anything used in them for obvious reasons, so we end up with:
if (..., SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0>, SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0>, ...);
and
SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0> << SAVE_EXPR <op1>
So far good as well. But later during cp_fold of the SAVE_EXPR we find
out that VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1 is actually
invariant (has TREE_READONLY set) and so cp_fold simplifies the above to
if (..., SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., (bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1, ...);
and
((bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1) << SAVE_EXPR <op1>
with the s[j] ARRAY_REFs and other expressions shared in between the two
uses (and obviously the expression optimized away from the COMPOUND_EXPR in
the if condition.
Then comes another ubsan instrumentation at genericization time,
this time to instrument the ARRAY_REFs with strict bounds checking,
and replaces the s[j] in there with s[.UBSAN_BOUNDS (0B, SAVE_EXPR<j>, 8), SAVE_EXPR<j>]
As the trees are shared, it does that just once though.
And as the if body is gimplified first, the SAVE_EXPR<j> is evaluated inside
of the if body and when it is used again after the if, it uses a potentially
uninitialized value of j.1 (always uninitialized if the shift count isn't
out of bounds).
The following patch fixes that by unshare_expr unsharing the folded argument
of a SAVE_EXPR if we've folded the SAVE_EXPR into an invariant and it is
used more than once.
2023-12-08 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/112727
* cp-gimplify.cc (cp_fold): If SAVE_EXPR has been previously
folded, unshare_expr what is returned.
Jakub Jelinek [Wed, 29 Nov 2023 11:26:50 +0000 (12:26 +0100)]
fold-const: Fix up multiple_of_p [PR112733]
We ICE on the following testcase when wi::multiple_of_p is called on
widest_int 1 and -128 with UNSIGNED. I still need to work on the
actual wide-int.cc issue, the latest patch attached to the PR regressed
bitint-{38,39}.c, so will need to debug that, but there is a clear bug
on the fold-const.cc side as well - widest_int is a signed representation
by definition, using UNSIGNED with it certainly doesn't match what was
intended, because -128 as the second operand effectively means unsigned
131072 bit 0xfffff............ffff80 integer, not the signed char -128
that appeared in the source.
In the INTEGER_CST case a few lines above this we already use
case INTEGER_CST:
if (TREE_CODE (bottom) != INTEGER_CST || integer_zerop (bottom))
return false;
return wi::multiple_of_p (wi::to_widest (top), wi::to_widest (bottom),
SIGNED);
so I think using SIGNED with widest_int is best there (compared to the
other choices in the PR).
2023-11-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/112733
* fold-const.cc (multiple_of_p): Pass SIGNED rather than
UNSIGNED for wi::multiple_of_p on widest_int arguments.
Jakub Jelinek [Tue, 5 Dec 2023 12:17:57 +0000 (13:17 +0100)]
i386: Fix -fcf-protection -Os ICE due to movabsq peephole2 [PR112845]
The following testcase ICEs in the movabsq $(i32 << shift), r64 peephole2
I've added a while back to use smaller code than movabsq if possible.
If i32 is 0xfa1e0ff3 and shift is not divisible by 8, then it creates
an invalid insn (as 0xfa1e0ff3 CONST_INT is not allowed as
x86_64_immediate_operand nor x86_64_zext_immediate_operand), the peephole2
even triggers on it again and again (this time with shift 0) until it gives
up.
The following patch fixes that. As ix86_endbr_immediate_operand needs a
CONST_INT and it is hopefully rare, I chose to use FAIL rather than handling
it in the condition (where I'd probably need to call ctz_hwi again etc.).
2023-12-05 Jakub Jelinek <jakub@redhat.com>
PR target/112845
* config/i386/i386.md (movabsq $(i32 << shift), r64 peephole2): FAIL
if the new immediate is ix86_endbr_immediate_operand.
Jakub Jelinek [Mon, 4 Dec 2023 08:01:09 +0000 (09:01 +0100)]
i386: Fix rtl checking ICE in ix86_elim_entry_set_got [PR112837]
The following testcase ICEs with RTL checking, because it sets if
XINT (SET_SRC (set), 1) is UNSPEC_SET_GOT without checking if SET_SRC (set)
is actually an UNSPEC, so any time we see any other insn with PARALLEL
and a SET in it which is not an UNSPEC we ICE during RTL checking or
access there some other union member as if it was an rt_int.
The rest is just small cleanup.
2023-12-04 Jakub Jelinek <jakub@redhat.com>
PR target/112837
* config/i386/i386.cc (ix86_elim_entry_set_got): Before checking
for UNSPEC_SET_GOT check that SET_SRC is UNSPEC. Use SET_SRC and
SET_DEST macros instead of XEXP, rename vec variable to set.
Jakub Jelinek [Mon, 4 Dec 2023 08:00:18 +0000 (09:00 +0100)]
i386: Fix up signbit<mode>2 expander [PR112816]
The following testcase ICEs, because the signbit<mode>2 expander uses an
explicit SUBREG in the pattern around match_operand with register_operand
predicate. If we are unlucky enough that expansion tries to expand it
with some SUBREG as operands[1], we have two nested SUBREGs in the IL,
which is not valid and causes ICE later.
2023-12-04 Jakub Jelinek <jakub@redhat.com>
PR target/112816
* config/i386/sse.md (signbit<mode>2): Force operands[1] into a REG.
Jakub Jelinek [Mon, 4 Dec 2023 07:59:15 +0000 (08:59 +0100)]
c++: #pragma GCC unroll C++ fixes [PR112795]
foo in the unroll-5.C testcase ICEs because cp_parser_pragma_unroll
during parsing calls maybe_constant_value unconditionally, which is
fine if !processing_template_decl, but can ICE otherwise.
While just calling fold_non_dependent_expr there instead could be enough
to fix the ICE (and I guess the right thing to do for backports if any),
I don't see a reason why we couldn't handle a dependent #pragma GCC unroll
argument as well, the unrolling isn't done in the FE and all the middle-end
cares about is that ANNOTATE_EXPR has a 1..65534 last operand when it is
annot_expr_unroll_kind.
So, the following patch changes all the unsigned short unroll arguments
to tree unroll (and thus avoids the tree -> unsigned short -> tree
conversions), does the type and value checking during parsing only if
the argument isn't dependent and repeats it during instantiation.
2023-12-04 Jakub Jelinek <jakub@redhat.com>
PR c++/112795
gcc/cp/
* parser.cc (cp_parser_pragma_unroll): Use fold_non_dependent_expr
instead of maybe_constant_value.
gcc/testsuite/
* g++.dg/ext/unroll-5.C: New test.
Jakub Jelinek [Sat, 25 Nov 2023 09:31:55 +0000 (10:31 +0100)]
i386: Fix up *jcc_bt*_mask{,_1} [PR111408]
The following testcase is miscompiled in GCC 14 because the
*jcc_bt<mode>_mask and *jcc_bt<SWI48:mode>_mask_1 patterns have just
one argument in (match_operator 0 "bt_comparison_operator" [...])
but as bt_comparison_operator is eq,ne, we need two.
The md readers don't warn about it, after all, some checks can
be done in the predicate rather than specified explicitly, and the
behavior is that anything is accepted as the second argument.
I went through all other i386.md match_operator uses and all others
looked right (extract_operator using 3 operands, all others 2).
I think we'll want to fix this at different spots in older releases
because I think the bug was introduced already in 2008, though most
likely just latent.
2023-11-25 Jakub Jelinek <jakub@redhat.com>
PR target/111408
* config/i386/i386.md (*jcc_bt<mode>_mask): Add (const_int 0) as
expected second operand of bt_comparison_operator.