Eric Botcazou [Thu, 13 Apr 2023 21:11:38 +0000 (23:11 +0200)]
ada: Fix fallout of recent fix for missing finalization
The original fix makes it possible to create transient scopes around return
statements in more cases, but it overlooks that transient scopes are reused
and, in particular, that they can be promoted to secondary stack management.
gcc/ada/
* exp_ch7.adb (Find_Enclosing_Transient_Scope): Return the index in
the scope table instead of the scope's entity.
(Establish_Transient_Scope): If an enclosing scope already exists,
do not set the Uses_Sec_Stack flag on it if the node to be wrapped
is a return statement which requires secondary stack management.
Eric Botcazou [Tue, 11 Apr 2023 10:15:22 +0000 (12:15 +0200)]
ada: Fix visibility error with DIC or Type_Invariant aspect on generic type
The compiler fails to capture global references during the analysis of the
aspect on the generic type because it analyzes a copy of the expression.
gcc/ada/
* exp_util.adb (Build_DIC_Procedure_Body.Add_Own_DIC): When inside
a generic unit, preanalyze the expression directly.
(Build_Invariant_Procedure_Body.Add_Own_Invariants): Likewise.
Eric Botcazou [Sat, 8 Apr 2023 16:29:16 +0000 (18:29 +0200)]
ada: Fix memory leak in expression function returning Big_Integer
We fail to establish a transient scope around the return statement because
the function returns a controlled type, but this is no longer problematic
because controlled types are no longer returned on the secondary stack.
gcc/ada/
* exp_ch7.adb (Establish_Transient_Scope.Find_Transient_Context):
Bail out for a simple return statement only if the transient scope
and the function both require secondary stack management, or else
if the function is a thunk.
* sem_res.adb (Resolve_Call): Do not create a transient scope when
the call is the expression of a simple return statement.
Eric Botcazou [Sat, 8 Apr 2023 10:43:54 +0000 (12:43 +0200)]
ada: Call idiomatic routine in Expand_Simple_Function_Return
In the primary stack case, Insert_Actions is invoked when the expression is
being rewritten, whereas Insert_List_Before_And_Analyze is invoked in the
secondary stack case. The former is idiomatic, the latter is not.
gcc/ada/
* exp_ch6.adb (Expand_Simple_Function_Return): Call Insert_Actions
consistently when rewriting the expression.
Eric Botcazou [Fri, 7 Apr 2023 17:17:20 +0000 (19:17 +0200)]
ada: Fix wrong finalization for loop on indexed container
The problem is that a transient temporary created for the constant indexing
of the container is finalized almost immediately after its creation.
gcc/ada/
* exp_util.adb (Is_Finalizable_Transient.Is_Indexed_Container):
New predicate to detect a temporary created to hold the result of
a constant indexing on a container.
(Is_Finalizable_Transient.Is_Iterated_Container): Adjust a couple
of obsolete comments.
(Is_Finalizable_Transient): Return False if Is_Indexed_Container
returns True on the object.
Eric Botcazou [Fri, 7 Apr 2023 07:16:12 +0000 (09:16 +0200)]
ada: Fix bogus error on conditional expression with only user-defined literals
This implements the recursive resolution of conditional expressions whose
dependent expressions are (all) user-defined literals the same way it is
implemented for operators.
gcc/ada/
* sem_res.adb (Has_Applicable_User_Defined_Literal): Make it clear
that the predicate also checks the node itself.
(Try_User_Defined_Literal): Move current implementation to...
Deal only with literals, named numbers and conditional expressions
whose dependent expressions are literals or named numbers.
(Try_User_Defined_Literal_For_Operator): ...this. Remove multiple
return False statements and put a single one at the end.
(Resolve): Call Try_User_Defined_Literal instead of directly
Has_Applicable_User_Defined_Literal for all nodes. Call
Try_User_Defined_Literal_For_Operator for operator nodes.
Eric Botcazou [Tue, 4 Apr 2023 17:25:11 +0000 (19:25 +0200)]
ada: Fix internal error with pragma Compile_Time_{Warning,Error}
This happens when the pragmas are deferred to the back-end from an external
unit to the main unit that is generic, because the back-end does not compile
a main unit that is generic.
gcc/ada/
* sem_prag.adb (Process_Compile_Time_Warning_Or_Error): Do not defer
anything to the back-end when the main unit is generic.
Eric Botcazou [Wed, 5 Apr 2023 18:43:54 +0000 (20:43 +0200)]
ada: Fix remaining failures in Roman Numbers test
The test is inspired by the example of user-defined literals given in the
Ada 2022 RM. Mixed Arabic numbers/Roman numbers computations are rejected
because the second resolution pass would try to resolve Arabic numbers only
as user-defined literals.
gcc/ada/
* sem_res.adb (Try_User_Defined_Literal): For arithmetic operators,
also accept operands whose type is covered by the resolution type.
Eric Botcazou [Mon, 3 Apr 2023 08:53:30 +0000 (10:53 +0200)]
ada: Fix wrong finalization for call to BIP function in conditional expression
This happens when the call is a dependent expression of the conditional
expression, and the conditional expression is either the expression of a
simple return statement or the return expression of an expression function.
The reason is that the special processing of "tail calls" for BIP functions,
i.e. calls that are the expression of simple return statements or the return
expression of expression functions, is not applied.
This change makes sure that it is applied by distributing the simple return
statements enclosing conditional expressions into the dependent expressions
of the conditional expressions in almost all cases. As a side effect, this
elides a temporary in the nonlimited by-reference case, as well as a pair of
calls to Adjust/Finalize in the nonlimited controlled case.
gcc/ada/
* exp_ch4.adb (Expand_N_Case_Expression): Distribute simple return
statements enclosing the conditional expression into the dependent
expressions in almost all cases.
(Expand_N_If_Expression): Likewise.
(Process_Transient_In_Expression): Adjust to the above distribution.
* exp_ch6.adb (Expand_Ctrl_Function_Call): Deal with calls in the
dependent expressions of a conditional expression.
* sem_ch6.adb (Analyze_Function_Return): Deal with the rewriting of
a simple return statement during the resolution of its expression.
Eric Botcazou [Mon, 3 Apr 2023 15:11:11 +0000 (17:11 +0200)]
ada: Repair support for user-defined literals in arithmetic operators
It was partially broken to fix a regression in error reporting, because the
fix was applied to the first pass of resolution instead of the second pass,
as needs to be done for user-defined literals.
gcc/ada/
* sem_ch4.ads (Unresolved_Operator): New procedure.
* sem_ch4.adb (Has_Possible_Literal_Aspects): Rename into...
(Has_Possible_User_Defined_Literal): ...this. Tidy up.
(Operator_Check): Accept again unresolved operators if they have a
possible user-defined literal as operand. Factor out the handling
of the general error message into...
(Unresolved_Operator): ...this new procedure.
* sem_res.adb (Resolve): Be prepared for unresolved operators on
entry in Ada 2022 or later. If they are still unresolved on exit,
call Unresolved_Operator to give the error message.
(Try_User_Defined_Literal): Tidy up.
Eric Botcazou [Sat, 1 Apr 2023 19:57:21 +0000 (21:57 +0200)]
ada: Fix spurious error on nested instantiations with generic renaming
The problem is that the renaming slightly changes the form of a global
reference that was saved during the analysis of a generic package, and
that is sufficient to fool the code adjusting global references during
the instantiation.
gcc/ada/
* sem_ch12.adb (Copy_Generic_Node): Test the original node kind
for the sake of consistency. For identifiers and other entity
names and operators, accept an expanded name as associated node.
Replace "or" with "or else" in condition and fix its formatting.
Eric Botcazou [Sat, 25 Mar 2023 20:42:11 +0000 (21:42 +0100)]
ada: Fix internal error on Big_Integer conversion ghost instance
The problem is that the ghost mode of the instance is used to analyze the
parent of the generic body, whose own ghost mode has nothing to do with it.
gcc/ada/
* sem_ch12.adb (Instantiate_Package_Body): Set the ghost mode to
that of the instance only after loading the generic's parent.
(Instantiate_Subprogram_Body): Likewise.
Eric Botcazou [Sun, 5 Mar 2023 17:30:34 +0000 (18:30 +0100)]
ada: Reject thin 'Unrestricted_Access value to aliased constrained array
This rejects the Unrestricted_Access attribute applied to an aliased array
with a constrained nominal subtype when its type is resolved to be a thin
pointer. The reason is that supporting this case would require the aliased
array to contain its bounds, and this is the case only for aliased arrays
whose nominal subtype is unconstrained.
gcc/ada/
* sem_attr.adb (Is_Thin_Pointer_To_Unc_Array): New predicate.
(Resolve_Attribute): Apply the static matching legality rule to an
Unrestricted_Access attribute applied to an aliased prefix if the
type is a thin pointer. Call Is_Thin_Pointer_To_Unc_Array for the
aliasing legality rule as well.
Eric Botcazou [Wed, 1 Mar 2023 21:28:51 +0000 (22:28 +0100)]
ada: Rework fix for internal error on quantified expression with predicated type
It turns out that skipping compiler-generated block scopes is problematic
when computing the public status of a subprogram, because this subprogram
may end up being nested in the elaboration procedure of a package spec or
body, in which case it may not be public.
This replaces the original fix with a pair of Push_Scope/Pop_Scope in the
Build_Predicate_Function procedure, as done elsewhere in similar cases.
gcc/ada/
* sem_ch13.adb (Build_Predicate_Functions): If the current scope
is not that of the type, push this scope and pop it at the end.
* sem_util.ads (Current_Scope_No_Loops_No_Blocks): Delete.
* sem_util.adb (Current_Scope_No_Loops_No_Blocks): Likewise.
(Set_Public_Status): Call again Current_Scope.
Eric Botcazou [Wed, 15 Feb 2023 14:52:00 +0000 (15:52 +0100)]
ada: Fix internal error on quantified expression with predicated type
The problem is that the special function created by the compiler to check
the predicate does not inherit the public status of the type, because it
is generated as part of the freezing of the quantified expression, which
occurs from within a couple of intermediate internal scopes.
gcc/ada/
* sem_ch13.adb (Build_Predicate_Function_Declaration): Adjust the
commentary to the current implementation.
* sem_util.ads (Current_Scope_No_Loops): Move around.
(Current_Scope_No_Loops_No_Blocks): New declaration.
(Add_Block_Identifier): Fix formatting.
* sem_util.adb (Add_Block_Identifier): Likewise.
(Current_Scope_No_Loops_No_Blocks): New function.
(Set_Public_Status): Call Current_Scope_No_Loops_No_Blocks instead
of Current_Scope to get the current scope.
Eric Botcazou [Fri, 17 Feb 2023 17:01:52 +0000 (18:01 +0100)]
ada: Fix bogus error on predicated limited record declared in protected type
This happens when the limited record is initialized with a function call
because of a couple of issues: incorrect tree sharing when building the
predicate check and too late freezing for a compiler-generated subtype.
It turns out that building the predicate check manually is redundant here,
since predicate checks are automatically generated during the expansion of
assignment statements, and the late freezing can be easily fixed.
gcc/ada/
* exp_ch3.adb (Build_Record_Init_Proc.Build_Assignment): Do not
manually generate a predicate check. Call Unqualify before doing
pattern matching on the expression.
* sem_ch3.adb (Analyze_Object_Declaration): Also freeze the actual
subtype when it is built in the definite case.
Eric Botcazou [Wed, 8 Feb 2023 15:26:46 +0000 (16:26 +0100)]
ada: Fix spurious freezing error on nonabstract null extension
This prevents the wrapper function created for each nonoverridden inherited
function with a controlling result of nonabstract null extensions of tagged
types from causing premature freezing of types referenced in its profile.
gcc/ada/
* exp_ch3.adb (Make_Controlling_Function_Wrappers): Create the body
as the expanded body of an expression function.
Eric Botcazou [Wed, 1 Feb 2023 11:35:08 +0000 (12:35 +0100)]
ada: Fix error and crash on imported function with precondition and 'Base
This fixes a spurious error on an imported function with a precondition
and a parameter declared with a 'Base formal type, and even a crash in
the case where this function is declared in a generic package.
gcc/ada/
* freeze.adb (Wrap_Imported_Subprogram): Use Copy_Subprogram_Spec
to copy the spec from the subprogram to the generated subprogram
body.
(Freeze_Entity): Do not wrap imported subprograms inside generics.
Eric Botcazou [Thu, 26 Jan 2023 14:59:37 +0000 (15:59 +0100)]
ada: Use accumulator type in expansion of 'Reduce attribute
The current expansion of the 'Reduce attribute uses the resolution type of
the expression for the accumulator. Now this type can be unresolved or set
to a universal type, for example if it is itself the prefix of the 'Image
attribute, and this may yield a spurious type mismatch error in that case.
This changes the expansion to use the accumulator type instead as defined
by the RM 4.5.10 clause, albeit only in the prefixed case for now.
gcc/ada/
* exp_attr.adb (Expand_N_Attribute_Reference) <Attribute_Reduce>:
Use the canonical accumulator type as the type of the accumulator
in the prefixed case.
Eric Botcazou [Sun, 29 Jan 2023 23:05:42 +0000 (00:05 +0100)]
ada: Fix crash on iterated component in expression function
The problem is that the freeze node generated for the type of a static
subexpression present in the expression function is incorrectly placed
inside instead of outside the function.
gcc/ada/
* freeze.adb (Freeze_Expression): When the freezing is to be done
outside the current scope, skip any scope that is an internal loop.
Eric Botcazou [Fri, 27 Jan 2023 14:13:07 +0000 (15:13 +0100)]
ada: Fix internal error on chain of predicated record types
The preanalysis of a predicate set on one of the record types was causing
premature freezing of another record type.
gcc/ada/
* sem_ch13.adb: Add with and use clauses for Expander.
(Resolve_Aspect_Expressions) <Aspect_Predicate>: Emulate a
bona-fide preanalysis setup before calling
Resolve_Aspect_Expression.
Eric Botcazou [Fri, 27 Jan 2023 23:08:24 +0000 (00:08 +0100)]
ada: Implement inheritance of user-defined literal aspects for untagged types
In Ada 2022, user-defined literal aspects are nonoverridable but the named
subprograms present in them can be overridden, including for untagged types.
gcc/ada/
* sem_res.adb (Has_Applicable_User_Defined_Literal): Apply the
same processing for derived untagged types as for tagged types.
* sem_util.ads (Corresponding_Primitive_Op): Adjust description.
* sem_util.adb (Corresponding_Primitive_Op): Handle untagged
types.
Eric Botcazou [Thu, 12 Jan 2023 14:51:40 +0000 (15:51 +0100)]
ada: Fix internal error on instance in package body with -gnatn
This plugs a small loophole in the procedure responsible for attempting to
hide entities that have been previously made public by the semantic analyzer
in package bodies.
gcc/ada/
* sem_ch7.adb (Hide_Public_Entities): Use the same condition for
subprogram bodies without specification as for those with one.
Eric Botcazou [Wed, 4 Jan 2023 15:41:47 +0000 (16:41 +0100)]
ada: Fix invalid JSON for extended variant record with -gnatRj
This fixes the output of -gnatRj for an extension of a tagged type which has
a variant part and also deals with the case where the parent type is private
with unknown discriminants.
gcc/ada/
* repinfo.ads (JSON output format): Document special case of
Present member of a Variant object.
* repinfo.adb (List_Structural_Record_Layout): Change the type of
Ext_Level parameter to Integer. Restrict the first recursion with
increasing levels to the fixed part and implement a second
recursion with decreasing levels for the variant part. Deal with
an extension of a type with unknown discriminants.
Gaius Mulley [Tue, 26 Sep 2023 18:39:59 +0000 (19:39 +0100)]
PR modula2/111510 runtime ICE findChildAndParent has caused internal runtime error
This patch fixes the runtime bug above. The full runtime message is:
findChildAndParent has caused internal runtime error, RTentity is either
corrupt or the module storage has not been initialized yet. The bug is
due to a non-nul-terminated string determining the module initialization order.
This results in modules being uninitialized and the above crash. The bug
manifests itself on 32 bit systems - but obviously is latent on all
targets and the fix should be applied to both gcc-14 and gcc-13.
gcc/m2/ChangeLog:
PR modula2/111510
* gm2-compiler/M2GenGCC.mod (IsExportedGcc): Minor spacing changes.
(BuildTrashTreeFromInterface): Minor spacing changes.
* gm2-compiler/M2Options.mod (GetRuntimeModuleOverride): Call
string to generate a nul terminated C style string.
* gm2-compiler/M2Quads.mod (BuildStringAdrParam): New procedure.
(BuildM2InitFunction): Replace inline parameter generation with
calls to BuildStringAdrParam.
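The underlying hazard can be sketched in C as follows. This is only an illustration of why a counted string must be given a terminating nul before being handed to C-style code; make_c_string and its parameters are hypothetical names and are not part of the gm2 sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the gm2 sources): copy the LEN bytes of a
   counted string SRC, which carries no terminating nul, into DST and
   append the nul that C-style consumers rely on.  Returns 0 on success
   and -1 if DST, of size DST_SIZE, cannot hold LEN bytes plus the nul.  */
int
make_c_string (char *dst, size_t dst_size, const char *src, size_t len)
{
  if (len >= dst_size)
    return -1;
  memcpy (dst, src, len);
  dst[len] = '\0';
  return 0;
}
```

Without the explicit terminator, a consumer such as strcmp or strlen reads past the LEN bytes into whatever happens to follow in memory, which is consistent with a bug that shows up on some targets and stays latent on others.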
Andrew MacLeod [Tue, 26 Sep 2023 13:44:39 +0000 (09:44 -0400)]
Reduce the initial size of int_range_max.
This patch adds the ability to resize ranges as needed, defaulting to no
resizing. int_range_max now defaults to 3 sub-ranges (instead of 255)
and grows to 255 when the range being calculated does not fit.
PR tree-optimization/110315
* value-range-storage.h (vrange_allocator::alloc_irange): Adjust
new params.
* value-range.cc (irange::operator=): Resize range.
(irange::irange_union): Same.
(irange::irange_intersect): Same.
(irange::invert): Same.
* value-range.h (irange::maybe_resize): New.
(~int_range): New.
(int_range_max): Default to 3 sub-ranges and resize as needed.
(int_range::int_range): Adjust for resizing.
(int_range::operator=): Same.
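The idea of starting small and growing on demand can be sketched in plain C. This is a simplified model under stated assumptions, not the irange implementation: struct small_range and its functions are hypothetical, and only the sizes 3 and 255 come from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { INLINE_PAIRS = 3, MAX_PAIRS = 255 };

struct pair { long lo, hi; };

/* Hypothetical model: a container that holds INLINE_PAIRS sub-ranges
   inline and moves to a MAX_PAIRS heap buffer when more are needed.
   The struct must not be copied by value once initialised, because
   DATA may point into INLINE_BUF.  */
struct small_range
{
  struct pair inline_buf[INLINE_PAIRS];
  struct pair *data;
  unsigned capacity;
  unsigned count;
};

void
small_range_init (struct small_range *r)
{
  r->data = r->inline_buf;
  r->capacity = INLINE_PAIRS;
  r->count = 0;
}

/* Ensure room for NEEDED sub-ranges.  Returns 0 on success, -1 if NEEDED
   exceeds MAX_PAIRS or the allocation fails.  */
int
small_range_maybe_resize (struct small_range *r, unsigned needed)
{
  if (needed <= r->capacity)
    return 0;
  if (needed > MAX_PAIRS)
    return -1;
  struct pair *heap = malloc (MAX_PAIRS * sizeof *heap);
  if (heap == NULL)
    return -1;
  memcpy (heap, r->data, r->count * sizeof *heap);
  r->data = heap;
  r->capacity = MAX_PAIRS;
  return 0;
}

int
small_range_push (struct small_range *r, long lo, long hi)
{
  if (small_range_maybe_resize (r, r->count + 1) != 0)
    return -1;
  r->data[r->count].lo = lo;
  r->data[r->count].hi = hi;
  r->count++;
  return 0;
}

/* Release the heap buffer, if one was ever allocated.  */
void
small_range_destroy (struct small_range *r)
{
  if (r->data != r->inline_buf)
    free (r->data);
}
```

The destroy step is the reason a container of this kind needs a destructor once it can own heap storage; the common case of three or fewer sub-ranges never allocates.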
Patrick Palka [Fri, 22 Sep 2023 10:27:48 +0000 (06:27 -0400)]
c++: missing SFINAE in grok_array_decl [PR111493]
We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in a SFINAE context we don't issue any
diagnostics and correctly treat ill-formed C++23 multidimensional
subscript operator expressions as such.
PR c++/111493
gcc/cp/ChangeLog:
* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.
Patrick Palka [Fri, 22 Sep 2023 10:25:49 +0000 (06:25 -0400)]
c++: constraint rewriting during ttp coercion [PR111485]
In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters. The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2. This patch
fixes this by including the outer template arguments in the substitution,
which ought to match the depth of the ttp.
The second testcase demonstrates it's better to substitute the concrete
outer template arguments instead of generic ones since a ttp's constraints
could depend on outer parameters.
PR c++/111485
gcc/cp/ChangeLog:
* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
Patrick Palka [Tue, 19 Sep 2023 12:21:05 +0000 (08:21 -0400)]
c++: constness of decltype of NTTP object [PR99631]
This corrects resolving decltype of a (class) NTTP object as per
[dcl.type.decltype]/1.2 and [temp.param]/6 in the type-dependent case.
Note that in the non-dependent case we resolve the decltype ahead of
time, in which case finish_decltype_type drops the const VIEW_CONVERT_EXPR
wrapper around the TEMPLATE_PARM_INDEX, and the latter has the desired
non-const type.
In the type-dependent case, at instantiation time tsubst drops the
VIEW_CONVERT_EXPR since the substituted NTTP is the already-const object
created by get_template_parm_object. So in this case finish_decltype_type
sees the const object, which this patch now adds special handling for.
PR c++/99631
gcc/cp/ChangeLog:
* semantics.cc (finish_decltype_type): For an NTTP object,
return its type modulo cv-quals.
Paul Thomas [Sun, 24 Sep 2023 14:34:57 +0000 (15:34 +0100)]
Fortran: Supply a missing dereference [PR92586]
2023-09-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/92586
* trans-expr.cc (gfc_trans_arrayfunc_assign): Supply a missing
dereference for the call to gfc_deallocate_alloc_comp_no_caf.
gcc/testsuite/
PR fortran/92586
* gfortran.dg/pr92586.f90: New test.
Paul Thomas [Sun, 24 Sep 2023 14:26:01 +0000 (15:26 +0100)]
Fortran: Pad mismatched charlens in component initializers [PR68155]
2023-09-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.
gcc/testsuite/
PR fortran/68155
* gfortran.dg/pr68155.f90: New test.
Patrick Palka [Wed, 20 Sep 2023 16:09:36 +0000 (12:09 -0400)]
c++: improve class NTTP object pretty printing [PR111471]
1. Move class NTTP object pretty printing to a more general spot in
the pretty printer, so that we always print its value instead of
its (mangled) name even when it appears outside of a template
argument list.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
value, like dump_expr would have done.
3. Don't print const VIEW_CONVERT_EXPR wrappers for class NTTPs.
PR c++/111471
gcc/cp/ChangeLog:
* cxx-pretty-print.cc (cxx_pretty_printer::expression)
<case VAR_DECL>: Handle class NTTP objects by printing
their type and value.
<case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle class NTTP
objects here.
aarch64_operands_ok_for_ldpstp contained the code:
/* One of the memory accesses must be a mempair operand.
If it is not the first one, they need to be swapped by the
peephole. */
if (!aarch64_mem_pair_operand (mem_1, GET_MODE (mem_1))
&& !aarch64_mem_pair_operand (mem_2, GET_MODE (mem_2)))
return false;
But the requirement isn't just that one of the accesses must be a
valid mempair operand. It's that the lower access must be, since
that's the access that will be used for the instruction operand.
gcc/
PR target/111411
* config/aarch64/aarch64.cc (aarch64_operands_ok_for_ldpstp): Require
the lower memory access to be a mem-pair operand.
gcc/testsuite/
PR target/111411
* gcc.dg/rtl/aarch64/pr111411.c: New test.
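The difference between the two checks can be sketched with a simplified model. The functions below are hypothetical and work on plain offsets rather than RTL; the model assumes that a mem-pair operand needs an offset that is a multiple of the access size and fits a signed 7-bit scaled immediate.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-in for aarch64_mem_pair_operand: OFFSET
   must be a multiple of SIZE and lie in [-64 * SIZE, 63 * SIZE].  */
bool
mem_pair_offset_ok (long offset, long size)
{
  return offset % size == 0
	 && offset >= -64 * size
	 && offset <= 63 * size;
}

/* The loose check: accept if either access is a valid mem-pair operand.  */
bool
old_check (long off1, long off2, long size)
{
  return mem_pair_offset_ok (off1, size) || mem_pair_offset_ok (off2, size);
}

/* The strict check: the lower access is the one used for the instruction
   operand, so that is the one that must be valid.  */
bool
new_check (long off1, long off2, long size)
{
  long lower = off1 < off2 ? off1 : off2;
  return mem_pair_offset_ok (lower, size);
}
```

With 8-byte accesses at offsets -520 and -512, old_check accepts the pair because -512 is in range, while new_check rejects it because the lower offset, -520, is not.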
aarch64: Fix return register handling in untyped_call
While working on another patch, I hit a problem with the aarch64
expansion of untyped_call. The expander emits the usual:
(set (mem ...) (reg resN))
instructions to store the result registers to memory, but it didn't
say in RTL where those resN results came from. This eventually led
to a failure of gcc.dg/torture/stackalign/builtin-return-2.c,
via regrename.
This patch turns the untyped call from a plain call to a call_value,
to represent that the call returns (or might return) a useful value.
The patch also uses a PARALLEL return rtx to represent all the possible
return registers.
gcc/
* config/aarch64/aarch64.md (untyped_call): Emit a call_value
rather than a call. List each possible destination register
in the call pattern.
RISC-V: Remove phase 6 of vsetvl pass in GCC13[PR111412]
The vsetvl pass has been refactored in gcc14, and its optimization
is more reasonable than on releases/gcc-13. This problem does not
exist in gcc14.
Phase 6 of gcc13 is an optimization. It was not considered carefully
enough and hides some bugs, so we decided to remove phase 6.
Although the generated code will be redundant, the program is correct.
Gaius Mulley [Wed, 13 Sep 2023 19:48:53 +0000 (20:48 +0100)]
[PATCH] modula2: -Wcase-enum detect singular/plural and use switch during build
This patch generates a singular or plural message relating to the
number of enums missing. Use -Wcase-enum when building the
modula-2 libraries and m2/stage2/cc1gm2.
gcc/m2/ChangeLog:
* Make-lang.in (GM2_FLAGS): Add -Wcase-enum.
(GM2_ISO_FLAGS): Add -Wcase-enum.
* gm2-compiler/M2CaseList.mod (EnumerateErrors): Issue
singular or plural start text prior to the enum list.
Remove unused parameter tokenno.
(EmitMissingRangeErrors): New procedure.
(MissingCaseBounds): Call EmitMissingRangeErrors.
(MissingCaseStatementBounds): Call EmitMissingRangeErrors.
* gm2-libs-iso/TextIO.mod: Fix spacing.
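The singular/plural selection can be sketched as below. The function names and the message wording are hypothetical and do not reproduce the text emitted by M2CaseList.mod; the sketch only shows choosing the start text from the count before the enum list is printed.

```c
#include <stdio.h>

/* Hypothetical helper: suffix that makes a noun agree with COUNT.  */
const char *
plural_suffix (unsigned count)
{
  return count == 1 ? "" : "s";
}

/* Hypothetical: write the start text that precedes the list of missing
   enumeration values into BUF.  Returns the snprintf result.  */
int
format_missing_intro (char *buf, size_t size, unsigned count)
{
  return snprintf (buf, size,
		   "%u enumeration value%s missing from the case statement:",
		   count, plural_suffix (count));
}
```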
aarch64: Make stack smash canary protect saved registers
AArch64 normally puts the saved registers near the bottom of the frame,
immediately above any dynamic allocations. But this means that a
stack-smash attack on those dynamic allocations could overwrite the
saved registers without needing to reach as far as the stack smash
canary.
The same thing could also happen for variable-sized arguments that are
passed by value, since those are allocated before a call and popped on
return.
This patch avoids that by putting the locals (and thus the canary) below
the saved registers when stack smash protection is active.
The patch fixes CVE-2023-4039.
gcc/
* config/aarch64/aarch64.cc (aarch64_save_regs_above_locals_p):
New function.
(aarch64_layout_frame): Use it to decide whether locals should
go above or below the saved registers.
(aarch64_expand_prologue): Update stack layout comment.
Emit a stack tie after the final adjustment.
gcc/testsuite/
* gcc.target/aarch64/stack-protector-8.c: New test.
* gcc.target/aarch64/stack-protector-9.c: Likewise.
After previous patches, it's no longer necessary to store
saved_regs_size and below_hard_fp_saved_regs_size in the frame info.
All measurements instead use the top or bottom of the frame as
reference points.
aarch64: Explicitly record probe registers in frame info
The stack frame is currently divided into three areas:
A: the area above the hard frame pointer
B: the SVE saves below the hard frame pointer
C: the outgoing arguments
If the stack frame is allocated in one chunk, the allocation needs a
probe if the frame size is >= guard_size - 1KiB. In addition, if the
function is not a leaf function, it must probe an address no more than
1KiB above the outgoing SP. We ensured the second condition by
(1) using single-chunk allocations for non-leaf functions only if
the link register save slot is within 512 bytes of the bottom
of the frame; and
(2) using the link register save as a probe (meaning, for instance,
that it can't be individually shrink wrapped)
If instead the stack is allocated in multiple chunks, then:
* an allocation involving only the outgoing arguments (C above) requires
a probe if the allocation size is > 1KiB
* any other allocation requires a probe if the allocation size
is >= guard_size - 1KiB
* second and subsequent allocations require the previous allocation
to probe at the bottom of the allocated area, regardless of the size
of that previous allocation
The final point means that, unlike for single allocations,
it can be necessary to have both a non-SVE register probe and
an SVE register probe. For example:
* allocate A, probe using a non-SVE register save
* allocate B, probe using an SVE register save
* allocate C
The non-SVE register used in this case was again the link register.
It was previously used even if the link register save slot was some
bytes above the bottom of the non-SVE register saves, but an earlier
patch avoided that by putting the link register save slot first.
As a belt-and-braces fix, this patch explicitly records which
probe registers we're using and allows the non-SVE probe to be
whichever register comes first (as for SVE).
The patch also avoids unnecessary probes in sve/pcs/stack_clash_3.c.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::sve_save_and_probe)
(aarch64_frame::hard_fp_save_and_probe): New fields.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize them.
Rather than asserting that a leaf function saves LR, instead assert
that a leaf function saves something.
(aarch64_get_separate_components): Prevent the chosen probe
registers from being individually shrink-wrapped.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
probe registers that aren't at the bottom of the previous allocation.
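The first two multiple-chunk rules described above can be written as a small predicate. This is a sketch of the stated thresholds only, not the code in aarch64_allocate_and_probe_stack_space; the enum, the function name and the guard_size parameter are hypothetical.

```c
#include <stdbool.h>

enum { KiB = 1024 };

/* Hypothetical classification of one chunk of a multi-chunk frame.  */
enum alloc_kind
{
  ALLOC_OUTGOING_ARGS_ONLY,	/* area C in the description above */
  ALLOC_OTHER			/* any other allocation */
};

/* Return true if allocating SIZE bytes of kind KIND requires a probe,
   given a guard region of GUARD_SIZE bytes.  */
bool
allocation_needs_probe (enum alloc_kind kind, long size, long guard_size)
{
  if (kind == ALLOC_OUTGOING_ARGS_ONLY)
    return size > 1 * KiB;
  return size >= guard_size - 1 * KiB;
}
```

The third rule, that a second or subsequent allocation requires the previous one to have probed at the bottom of its area, is a property of the sequence of allocations and is not captured by this per-chunk predicate.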
Previous patches ensured that the final frame allocation only needs
a probe when the size is strictly greater than 1KiB. It's therefore
safe to use the normal 1024 probe offset in all cases.
The main motivation for doing this is to simplify the code and
reduce the number of special cases.
gcc/
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Always probe the residual allocation at offset 1024, asserting
that that is in range.
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: Expect the probe
to be at offset 1024 rather than offset 0.
* gcc.target/aarch64/stack-check-prologue-18.c: Likewise.
* gcc.target/aarch64/stack-check-prologue-19.c: Likewise.
-fstack-clash-protection uses the save of LR as a probe for the next
allocation. The next allocation could be:
* another part of the static frame, e.g. when allocating SVE save slots
or outgoing arguments
* an alloca in the same function
* an allocation made by a callee function
However, when -fomit-frame-pointer is used, the LR save slot is placed
above the other GPR save slots. It could therefore be up to 80 bytes
above the base of the GPR save area (which is also the hard fp address).
aarch64_allocate_and_probe_stack_space took this into account when
deciding how much subsequent space could be allocated without needing
a probe. However, it interacted badly with:
/* If doing a small final adjustment, we always probe at offset 0.
This is done to avoid issues when LR is not at position 0 or when
the final adjustment is smaller than the probing offset. */
else if (final_adjustment_p && rounded_size == 0)
residual_probe_offset = 0;
which forces any allocation that is smaller than the guard page size
to be probed at offset 0 rather than the usual offset 1024. It was
therefore possible to construct cases in which we had:
* a probe using LR at SP + 80 bytes (or some other value >= 16)
* an allocation of the guard page size - 16 bytes
* a probe at SP + 0
which allocates guard page size + 64 consecutive unprobed bytes.
This patch requires the LR probe to be in the first 16 bytes of the
save area when stack clash protection is active. Doing it
unconditionally would cause code-quality regressions.
Putting LR before other registers prevents push/pop allocation
when shadow call stacks are enabled, since LR is restored
separately from the other callee-saved registers.
The new comment doesn't say that the probe register is required
to be LR, since a later patch removes that restriction.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Ensure that
the LR save slot is in the first 16 bytes of the register save area.
Only form STP/LDP push/pop candidates if both registers are valid.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
when LR was not in the first 16 bytes.
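The arithmetic behind the problematic case can be checked with a short sketch. The function is hypothetical and only models addresses relative to the stack pointer; it is not taken from the aarch64 back end.

```c
/* Hypothetical model.  Take SP after the first allocation as address 0.
   The first probe lands at FIRST_PROBE_OFFSET above that SP.  A further
   ALLOCATION bytes are then allocated, moving SP down to -ALLOCATION,
   and the second probe lands SECOND_PROBE_OFFSET above the new SP.
   Return the distance in bytes between the two probed addresses.  */
long
unprobed_gap (long first_probe_offset, long allocation,
	      long second_probe_offset)
{
  long first_probe = first_probe_offset;
  long second_probe = -allocation + second_probe_offset;
  return first_probe - second_probe;
}
```

For a guard page size G, unprobed_gap (80, G - 16, 0) evaluates to G + 64, which is the figure quoted above. With the LR probe kept in the first 16 bytes of the save area, for example at offset 8, the same allocation gives G - 8, which stays below the guard page size.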
The AArch64 ABI says that, when stack clash protection is used,
there can be a maximum of 1KiB of unprobed space at sp on entry
to a function. Therefore, we need to probe when allocating
>= guard_size - 1KiB of data (>= rather than >). This is what
GCC does.
If an allocation is exactly guard_size bytes, it is enough to allocate
those bytes and probe once at offset 1024. It isn't possible to use a
single probe at any other offset: higher would complicate later code,
by leaving more unprobed space than usual, while lower would risk
leaving an entire page unprobed. For simplicity, the code probes all
allocations at offset 1024.
Some register saves also act as probes. If we need to allocate
more space below the last such register save probe, we need to
probe the allocation if it is > 1KiB. Again, this allocation is
then sometimes (but not always) probed at offset 1024. This sort of
allocation is currently only used for outgoing arguments, which are
rarely this big.
However, the code also probed if this final outgoing-arguments
allocation was == 1KiB, rather than just > 1KiB. This isn't
necessary, since the register save then probes at offset 1024
as required. Continuing to probe allocations of exactly 1KiB
would complicate later patches.
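The two thresholds described above can be summarised as a pair of predicates. This is a minimal sketch under the assumptions stated in the message (1KiB of permitted unprobed space); the function names are hypothetical and the real logic in aarch64_allocate_and_probe_stack_space handles more cases.

```cpp
#include <cassert>
#include <cstdint>

// The ABI allows up to 1KiB of unprobed space at sp on function entry.
const int64_t max_unprobed_bytes = 1024;

// Hypothetical: does an initial allocation of ALLOC_SIZE bytes need a probe?
// Note the >= rather than >.
bool
initial_allocation_needs_probe (int64_t alloc_size, int64_t guard_size)
{
  assert (guard_size > max_unprobed_bytes);
  return alloc_size >= guard_size - max_unprobed_bytes;
}

// Hypothetical: does the final (outgoing-arguments) allocation below the
// last register-save probe need its own probe?  ALLOC_SIZE is the size
// after any unprobed space above the allocation has been deducted.
// Exactly 1KiB no longer triggers a probe, since the register save
// already probes at offset 1024.
bool
final_allocation_needs_probe (int64_t alloc_size)
{
  return alloc_size > max_unprobed_bytes;
}
```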
gcc/
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Don't probe final allocations that are exactly 1KiB in size (after
unprobed space above the final allocation has been deducted).
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: New test.
After previous patches, it no longer really makes sense to allocate
the top of the frame in terms of varargs_and_saved_regs_size and
saved_regs_and_above.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Simplify
the allocation of the top of the frame.
aarch64: Measure reg_offset from the bottom of the frame
reg_offset was measured from the bottom of the saved register area.
This made perfect sense with the original layout, since the bottom
of the saved register area was also the hard frame pointer address.
It became slightly less obvious with SVE, since we save SVE
registers below the hard frame pointer, but it still made sense.
However, if we want to allow different frame layouts, it's more
convenient and obvious to measure reg_offset from the bottom of
the frame. After previous patches, it's also a slight simplification
in its own right.
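As a rough illustration of the change of reference point, the conversion between the two conventions is a single addition. The helper name is hypothetical; bytes_below_saved_regs is the aarch64_frame field introduced elsewhere in this series.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert an offset measured from the bottom of the
// saved register area into one measured from the bottom of the frame.
// BYTES_BELOW_SAVED_REGS is the amount of frame that lies below the
// saved register area.
int64_t
reg_offset_from_frame_bottom (int64_t offset_from_saved_regs_bottom,
			      int64_t bytes_below_saved_regs)
{
  assert (bytes_below_saved_regs >= 0);
  return bytes_below_saved_regs + offset_from_saved_regs_bottom;
}
```

For example, a register saved 16 bytes above the bottom of the saved register area, in a frame with 32 bytes below that area, has a reg_offset of 48 under the new convention.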
gcc/
* config/aarch64/aarch64.h (aarch64_frame): Add comment above
reg_offset.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Walk offsets
from the bottom of the frame, rather than the bottom of the saved
register area. Measure reg_offset from the bottom of the frame
rather than the bottom of the saved register area.
(aarch64_save_callee_saves): Update accordingly.
(aarch64_restore_callee_saves): Likewise.
(aarch64_get_separate_components): Likewise.
(aarch64_process_components): Likewise.
aarch64: Rename hard_fp_offset to bytes_above_hard_fp
Similarly to the previous locals_offset patch, hard_fp_offset
was described as:
/* Offset from the base of the frame (incomming SP) to the
hard_frame_pointer. This value is always a multiple of
STACK_BOUNDARY. */
poly_int64 hard_fp_offset;
which again took an “upside-down” view: higher offsets meant lower
addresses. This patch renames the field to bytes_above_hard_fp instead.
aarch64: Rename locals_offset to bytes_above_locals
locals_offset was described as:
/* Offset from the base of the frame (incomming SP) to the
top of the locals area. This value is always a multiple of
STACK_BOUNDARY. */
This is implicitly an “upside down” view of the frame: the incoming
SP is at offset 0, and anything N bytes below the incoming SP is at
offset N (rather than -N).
However, reg_offset instead uses a “right way up” view; that is,
it views offsets in address terms. Something above X is at a
positive offset from X and something below X is at a negative
offset from X.
Also, even on FRAME_GROWS_DOWNWARD targets like AArch64,
target-independent code views offsets in address terms too:
locals are allocated at negative offsets to virtual_stack_vars.
It seems confusing to have *_offset fields of the same structure
using different polarities like this. This patch tries to avoid
that by renaming locals_offset to bytes_above_locals.
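The difference between the two views can be made concrete with a short sketch. The helper below is hypothetical and only illustrates the naming: a bytes_above_* value is a byte count, so turning it into an address means subtracting it from the incoming SP.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: address of the top of the locals area, given the
// incoming SP and the number of bytes that lie above the locals.
// In address terms the locals sit at a negative offset from the
// incoming SP, which the old name locals_offset did not make obvious.
uint64_t
top_of_locals_address (uint64_t incoming_sp, uint64_t bytes_above_locals)
{
  assert (bytes_above_locals <= incoming_sp);
  return incoming_sp - bytes_above_locals;
}
```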
aarch64_save_callee_saves and aarch64_restore_callee_saves took
a parameter called start_offset that gives the offset of the
bottom of the saved register area from the current stack pointer.
However, it's more convenient for later patches if we use the
bottom of the entire frame as the reference point, rather than
the bottom of the saved registers.
Doing that removes the need for the callee_offset field.
Other than that, this is not a win on its own. It only really
makes sense in combination with the follow-on patches.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::callee_offset): Delete.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Remove
callee_offset handling.
(aarch64_save_callee_saves): Replace the start_offset parameter
with a bytes_below_sp parameter.
(aarch64_restore_callee_saves): Likewise.
(aarch64_expand_prologue): Update accordingly.
(aarch64_expand_epilogue): Likewise.
Following on from the previous bytes_below_saved_regs patch, this one
records the number of bytes that are below the hard frame pointer.
This eventually replaces below_hard_fp_saved_regs_size.
If a frame pointer is not needed, the epilogue adds final_adjust
to the stack pointer before restoring registers:
Therefore, if the epilogue needs to restore the stack pointer from
the hard frame pointer, the directly corresponding offset is:
-bytes_below_hard_fp + final_adjust
i.e. go from the hard frame pointer to the bottom of the frame,
then add the same amount as if we were using the stack pointer
from the outset.
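The formula can be sanity-checked by modelling the frame with plain addresses. This is a sketch with hypothetical helper names, not the code in aarch64_expand_epilogue.

```cpp
#include <cassert>
#include <cstdint>

// Offset to add to the hard frame pointer to obtain the stack pointer
// value that the epilogue would have after adding final_adjust.
int64_t
sp_offset_from_hard_fp (int64_t bytes_below_hard_fp, int64_t final_adjust)
{
  return -bytes_below_hard_fp + final_adjust;
}

// Hypothetical check: starting from the bottom of the frame, the hard
// frame pointer is bytes_below_hard_fp higher, and the SP we want is
// final_adjust higher.
int64_t
restored_sp (int64_t frame_bottom, int64_t bytes_below_hard_fp,
	     int64_t final_adjust)
{
  int64_t hard_fp = frame_bottom + bytes_below_hard_fp;
  return hard_fp + sp_offset_from_hard_fp (bytes_below_hard_fp, final_adjust);
}
```

With a frame bottom of 0x1000, 48 bytes below the hard frame pointer and a final_adjust of 32, the restored SP is 0x1020, the same as frame bottom + final_adjust.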
gcc/
* config/aarch64/aarch64.h (aarch64_frame::bytes_below_hard_fp): New
field.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize it.
(aarch64_expand_epilogue): Use it instead of
below_hard_fp_saved_regs_size.
The frame layout code currently hard-codes the assumption that
the number of bytes below the saved registers is equal to the
size of the outgoing arguments. This patch abstracts that
value into a new field of aarch64_frame.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::bytes_below_saved_regs): New
field.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize it,
and use it instead of crtl->outgoing_args_size.
(aarch64_get_separate_components): Use bytes_below_saved_regs instead
of outgoing_args_size.
(aarch64_process_components): Likewise.
aarch64: Explicitly handle frames with no saved registers
If a frame has no saved registers, it can be allocated in one go.
There is no need to treat the areas below and above the saved
registers as separate.
And if we allocate the frame in one go, it should be allocated
as the initial_adjust rather than the final_adjust. This allows the
frame size to grow to guard_size - guard_used_by_caller before a stack
probe is needed. (A frame with no register saves is necessarily a
leaf frame.)
This is a no-op as things stand, since a leaf function will have
no outgoing arguments, and so all the frame will be above where
the saved registers normally go.
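A heavily simplified sketch of the idea follows. The struct and function names are hypothetical, and the real aarch64_layout_frame distinguishes many more cases; the only point illustrated is that a frame without saved registers is assigned entirely to initial_adjust.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduction of the adjustment fields discussed above.
struct frame_adjustments
{
  int64_t initial_adjust;
  int64_t final_adjust;
};

// With no saved registers there is nothing to separate the areas below
// and above them, so the whole frame is allocated in one go as the
// initial adjustment.
frame_adjustments
allocate_frame_without_saved_regs (int64_t frame_size)
{
  assert (frame_size >= 0);
  return { frame_size, 0 };
}

// The initial adjustment can grow to guard_size - guard_used_by_caller
// before a stack probe is needed.
bool
initial_adjust_needs_probe (int64_t initial_adjust, int64_t guard_size,
			    int64_t guard_used_by_caller)
{
  return initial_adjust >= guard_size - guard_used_by_caller;
}
```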
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Explicitly
allocate the frame in one go if there are no saved registers.
When we emit the frame chain, i.e. when we reach Here in this statement
of aarch64_expand_prologue:
if (emit_frame_chain)
{
// Here
...
}
the stack is in one of two states:
- We've allocated up to the frame chain, but no more.
- We've allocated the whole frame, and the frame chain is within easy
reach of the new SP.
The offset of the frame chain from the current SP is available
in aarch64_frame as callee_offset. It is also available as the
chain_offset local variable, where the latter is calculated from other
data. (However, chain_offset is not always equal to callee_offset when
!emit_frame_chain, so chain_offset isn't redundant.)
But the later REG_CFA_ADJUST_CFA handling still used callee_offset.
I think the difference is harmless, but it's more logical for the
CFA note to be in sync, and it's more convenient for later patches
if it uses chain_offset.
gcc/
* config/aarch64/aarch64.cc (aarch64_expand_prologue): Use
chain_offset rather than callee_offset.
aarch64: Use local frame vars in shrink-wrapping code
aarch64_layout_frame uses a shorthand for referring to
cfun->machine->frame:
aarch64_frame &frame = cfun->machine->frame;
This patch does the same for some other heavy users of the structure.
No functional change intended.
gcc/
* config/aarch64/aarch64.cc (aarch64_save_callee_saves): Use
a local shorthand for cfun->machine->frame.
(aarch64_restore_callee_saves, aarch64_get_separate_components):
(aarch64_process_components): Likewise.
(aarch64_allocate_and_probe_stack_space): Likewise.
(aarch64_expand_prologue, aarch64_expand_epilogue): Likewise.
(aarch64_layout_frame): Use existing shorthand for one more case.
Gaius Mulley [Tue, 12 Sep 2023 12:04:20 +0000 (13:04 +0100)]
[PATCH] modula2: new option -Wcase-enum and associated fixes
This patch introduces -Wcase-enum, which lists each enumeration
field that is missing from a case statement without an else clause,
provided that the type of the selector expression is an enumeration.
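The check can be modelled outside the compiler as a set difference that preserves declaration order. The following is an illustrative C++ sketch only; the function name and data representation are assumptions and do not reflect how M2CaseList.mod implements it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: given the fields of an enumeration in declaration
// order and the set of labels that appear in a case statement with no
// else clause, return the fields that a warning would need to list.
std::vector<std::string>
missing_case_fields (const std::vector<std::string> &declared_fields,
		     const std::set<std::string> &case_labels)
{
  std::vector<std::string> missing;
  for (const std::string &field : declared_fields)
    if (case_labels.count (field) == 0)
      missing.push_back (field);
  // Every reported field comes from the declaration list.
  assert (missing.size () <= declared_fields.size ());
  return missing;
}
```

For an enumeration declared as (red, green, blue) and a case statement that only handles red, the sketch reports green and blue, in that order.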
gcc/ChangeLog:
* doc/gm2.texi (Compiler options): Document new option
-Wcase-enum.
gcc/m2/ChangeLog:
* gm2-compiler/M2CaseList.def (PushCase): Rename parameters
r to rec and v to va. Add expr parameter.
(MissingCaseStatementBounds): New procedure function.
* gm2-compiler/M2CaseList.mod (RangePair): Add expression.
(PushCase): Rename parameters r to rec and v to va. Add
expr parameter.
(RemoveRange): New procedure function.
(SubBitRange): Detect the case when the range in the set matches
lo..hi.
(CheckLowHigh): New procedure.
(ExcludeCaseRanges): Rename parameter c to cd. Rename local
variables q to cl and r to rp.
(High): Remove.
(Low): Remove.
(DoEnumValues): Remove.
(IncludeElement): New procedure.
(IncludeElements): New procedure.
(ErrorRangeEnum): New procedure.
(ErrorRange): Remove.
(ErrorRanges): Remove.
(appendEnum): New procedure.
(appendStr): New procedure.
(EnumerateErrors): New procedure.
(MissingCaseBounds): Re-implement.
(InRangeList): Remove.
(MissingCaseStatementBounds): New procedure function.
(checkTypes): Re-format.
(inRange): Re-format.
(TypeCaseBounds): Re-format.
* gm2-compiler/M2Error.mod (GetAnnounceScope): Add noscope to
case label list.
* gm2-compiler/M2GCCDeclare.mod: Replace ForeachFieldEnumerationDo
with ForeachLocalSymDo.
* gm2-compiler/M2Options.def (SetCaseEnumChecking): New procedure.
(CaseEnumChecking): New variable.
* gm2-compiler/M2Options.mod (SetCaseEnumChecking): New procedure.
(Module initialization): set CaseEnumChecking to FALSE.
* gm2-compiler/M2Quads.def (QuadOperator): Alphabetically ordered.
* gm2-compiler/M2Quads.mod (IsBackReferenceConditional): Add else
clause.
(BuildCaseStart): Pass selector expression to InitCaseBounds.
(CheckUninitializedVariablesAreUsed): Remove.
(IsInlineWithinBlock): Remove.
(AsmStatementsInBlock): Remove.
(CheckVariablesInBlock): Remove commented code.
(BeginVarient): Pass NulSym to InitCaseBounds.
* gm2-compiler/M2Range.mod (FoldCaseBounds): New local variable
errorGenerated. Add call to MissingCaseStatementBounds.
* gm2-compiler/P3Build.bnf (CaseEndStatement): Call ElseCase.
* gm2-compiler/PCSymBuild.mod (InitDesExpr): Add else clause.
(InitFunction): Add else clause.
(InitConvert): Add else clause.
(InitLeaf): Add else clause.
(InitBinary): Add else clause.
(InitUnary): Add else clause.
* gm2-compiler/SymbolTable.def (GetNth): Re-write comment.
(ForeachFieldEnumerationDo): Re-write comment stating alphabetical
traversal.
* gm2-compiler/SymbolTable.mod (GetNth): Re-write comment.
Add case label for EnumerationSym and call GetItemFromList.
(ForeachFieldEnumerationDo): Re-write comment stating alphabetical
traversal.
(SymEnumeration): Add ListOfFields used for declaration order.
(MakeEnumeration): Initialize ListOfFields.
(PutFieldEnumeration): Include Field in ListOfFields.
* gm2-gcc/m2options.h (M2Options_SetCaseEnumChecking): New
function.
* gm2-lang.cc (gm2_langhook_handle_option): Add
OPT_Wcase_enum case and call M2Options_SetCaseEnumChecking.
* lang.opt (Wcase-enum): Add.
gcc/testsuite/ChangeLog:
* gm2/switches/case/fail/missingclause.mod: New test.
* gm2/switches/case/fail/switches-case-fail.exp: New test.
* gm2/switches/case/pass/enumcase.mod: New test.
* gm2/switches/case/pass/enumcase2.mod: New test.
* gm2/switches/case/pass/switches-case-pass.exp: New test.
Gaius Mulley [Mon, 11 Sep 2023 21:29:27 +0000 (22:29 +0100)]
PR modula2/111330 Bootstrap failure building SeqFile.lo
cc1gm2 issues a runtime case statement error and terminates
when building SeqFile.lo on Fedora mock. There are four
missing labels from the largest case statement in M2SymInit.mod.
This patch adds the case labels and appropriate actions.
gcc/m2/ChangeLog:
PR modula2/111330
* gm2-compiler/M2SymInit.mod (CheckReadBeforeInitQuad): Add
case labels LogicalDiffOp, DummyOp, OptParamOp and
InitAddressOp.