This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [047/nnn] poly_int: argument sizes
- From: Richard Sandiford <richard dot sandiford at linaro dot org>
- To: Jeff Law <law at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 20 Dec 2017 11:37:05 +0000
- Subject: Re: [047/nnn] poly_int: argument sizes
- Authentication-results: sourceware.org; auth=none
- References: <871sltvm7r.fsf@linaro.org> <878tg1n5mf.fsf@linaro.org> <47796586-120a-d7f0-6f48-147a683698b4@redhat.com>
Jeff Law <law@redhat.com> writes:
> On 10/23/2017 11:20 AM, Richard Sandiford wrote:
>> This patch changes various bits of state related to argument sizes so
>> that they have type poly_int64 rather than HOST_WIDE_INT. This includes:
>>
>> - incoming_args::pops_args and incoming_args::size
>> - rtl_data::outgoing_args_size
>> - pending_stack_adjust
>> - stack_pointer_delta
>> - stack_usage::pushed_stack_size
>> - args_size::constant
>>
>> It also changes TARGET_RETURN_POPS_ARGS so that the size of the
>> arguments passed in and the size returned by the hook are both
>> poly_int64s.
>>
>>
>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>> Alan Hayward <alan.hayward@arm.com>
>> David Sherwood <david.sherwood@arm.com>
>>
>> gcc/
>> * target.def (return_pops_args): Treat both the input and output
>> sizes as poly_int64s rather than HOST_WIDE_INTS.
>> * targhooks.h (default_return_pops_args): Update accordingly.
>> * targhooks.c (default_return_pops_args): Likewise.
>> * doc/tm.texi: Regenerate.
>> * emit-rtl.h (incoming_args): Change pops_args, size and
>> outgoing_args_size from int to poly_int64_pod.
>> * function.h (expr_status): Change x_pending_stack_adjust and
>> x_stack_pointer_delta from int to poly_int64.
>> (args_size::constant): Change from HOST_WIDE_INT to poly_int64.
>> (ARGS_SIZE_RTX): Update accordingly.
>> * calls.c (highest_outgoing_arg_in_use): Change from int to
>> unsigned int.
>> (stack_usage_watermark, stored_args_watermark): New variables.
>> (stack_region_maybe_used_p, mark_stack_region_used): New functions.
>> (emit_call_1): Change the stack_size and rounded_stack_size
>> parameters from HOST_WIDE_INT to poly_int64. Track n_popped
>> as a poly_int64.
>> (save_fixed_argument_area): Check stack_usage_watermark.
>> (initialize_argument_information): Change old_pending_adj from
>> a HOST_WIDE_INT * to a poly_int64_pod *.
>> (compute_argument_block_size): Return the size as a poly_int64
>> rather than an int.
>> (finalize_must_preallocate): Track polynomial argument sizes.
>> (compute_argument_addresses): Likewise.
>> (internal_arg_pointer_based_exp): Track polynomial offsets.
>> (mem_overlaps_already_clobbered_arg_p): Rename to...
>> (mem_might_overlap_already_clobbered_arg_p): ...this and take the
>> size as a poly_uint64 rather than an unsigned HOST_WIDE_INT.
>> Check stored_args_used_watermark.
>> (load_register_parameters): Update accordingly.
>> (check_sibcall_argument_overlap_1): Likewise.
>> (combine_pending_stack_adjustment_and_call): Take the unadjusted
>> args size as a poly_int64 rather than an int. Return a bool
>> indicating whether the optimization was possible and return
>> the new adjustment by reference.
>> (check_sibcall_argument_overlap): Track polynomail argument sizes.
>> Update stored_args_watermark.
>> (can_implement_as_sibling_call_p): Handle polynomial argument sizes.
>> (expand_call): Likewise. Maintain stack_usage_watermark and
>> stored_args_watermark. Update calls to
>> combine_pending_stack_adjustment_and_call.
>> (emit_library_call_value_1): Handle polynomial argument sizes.
>> Call stack_region_maybe_used_p and mark_stack_region_used.
>> Maintain stack_usage_watermark.
>> (store_one_arg): Likewise. Update call to
>> mem_overlaps_already_clobbered_arg_p.
>> * config/arm/arm.c (arm_output_function_prologue): Add a cast to
>> HOST_WIDE_INT.
>> * config/avr/avr.c (avr_outgoing_args_size): Likewise.
>> * config/microblaze/microblaze.c (microblaze_function_prologue):
>> Likewise.
>> * config/cr16/cr16.c (cr16_return_pops_args): Update for new
>> TARGET_RETURN_POPS_ARGS interface.
>> (cr16_compute_frame, cr16_initial_elimination_offset): Add casts
>> to HOST_WIDE_INT.
>> * config/ft32/ft32.c (ft32_compute_frame): Likewise.
>> * config/i386/i386.c (ix86_return_pops_args): Update for new
>> TARGET_RETURN_POPS_ARGS interface.
>> (ix86_expand_split_stack_prologue): Add a cast to HOST_WIDE_INT.
>> * config/moxie/moxie.c (moxie_compute_frame): Likewise.
>> * config/m68k/m68k.c (m68k_return_pops_args): Update for new
>> TARGET_RETURN_POPS_ARGS interface.
>> * config/vax/vax.c (vax_return_pops_args): Likewise.
>> * config/pa/pa.h (STACK_POINTER_OFFSET): Add a cast to poly_int64.
>> (EXIT_IGNORE_STACK): Update reference to crtl->outgoing_args_size.
>> * config/arm/arm.h (CALLER_INTERWORKING_SLOT_SIZE): Likewise.
>> * config/powerpcspe/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>> * config/powerpcspe/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>> * config/powerpcspe/powerpcspe.h (STACK_DYNAMIC_OFFSET): Likewise.
>> * config/rs6000/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>> * config/rs6000/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>> * config/rs6000/rs6000.h (STACK_DYNAMIC_OFFSET): Likewise.
>> * dojump.h (saved_pending_stack_adjust): Change x_pending_stack_adjust
>> and x_stack_pointer_delta from int to poly_int64.
>> * dojump.c (do_pending_stack_adjust): Update accordingly.
>> * explow.c (allocate_dynamic_stack_space): Handle polynomial
>> stack_pointer_deltas.
>> * function.c (STACK_DYNAMIC_OFFSET): Add a cast to poly_int64.
>> (pad_to_arg_alignment): Track polynomial offsets.
>> (assign_parm_find_stack_rtl): Likewise.
>> (assign_parms, locate_and_pad_parm): Handle polynomial argument sizes.
>> * toplev.c (output_stack_usage): Update reference to
>> current_function_pushed_stack_size.
> I haven't been able to convince myself that the changes to the
> stack_usage_map are correct, particularly in the case where the
> UPPER_BOUND is not constant. But I'm willing to let it go given the
> only target that could potentially be affected would be SVE and I'd
> expect that if you'd gotten it wrong it would have showed up in your
> testing.
>
> I'm also slightly worried about how we handle ARGS_GROW_DOWNWARD targets
> (pa, stormy16). I couldn't convince myself those changes were correct
> either. Again, I'm willing on fall back to lean on your extensive
> experience and testing here.
>
> Of all the patches to-date I've looked at, this one worries me the most
> (which is why it's next to last according to my records). The potential
> for a goof in the argument setup, padding, stack save area, and the
> consequences for such a goof worry me.
>
> I'm going to conditionally ack this -- my condition is that you
> re-review the the calls.c/function.c changes as well.
I agree this is probably the least obvious of the changes. I've had
another look and I still think the approach is OK.
For stack_usage_map: we're trying to detect whether an instruction
references a region of stack that has already been clobbered.
We can conservatively do this as a range from the lowest possible byte
offset in the region to the highest possible byte offset in the region.
This works both when recording a clobber and when testing for an overlap
with a previous clobber.
For polynomial lower bounds, the lowest possible offset is given by
constant_lower_bound. For polynomial upper bounds, the highest
possible offset is HOST_WIDE_INT_M1U (since we track unsigned offsets
from a zero base). But it's not possible to have the stack_usage_map
array itself go up to HOST_WIDE_INT_M1U, so instead, the watermark
records the lowest lower bound whose corresponding upper bound is
HOST_WIDE_INT_M1U, so that every index of stack_usage_map starting at
the watermark is conceptually set.
The patch doesn't really change the handling of ARGS_GROW_DOWNWARD,
it just rewrites the tests into a different form. E.g. there were
two instances of code like:
if (ARGS_GROW_DOWNWARD)
highest_outgoing_arg_in_use
= MAX (initial_highest_arg_in_use, needed + 1);
else
highest_outgoing_arg_in_use
= MAX (initial_highest_arg_in_use, needed);
and the patch splits out the calculation of the second MAX argument:
poly_int64 limit = needed;
if (ARGS_GROW_DOWNWARD)
limit += 1;
The third ARGS_GROW_DOWNARD-related hunk is for pad_to_arg_alignment.
Previously we had:
offset_ptr->constant = -sp_offset +
(ARGS_GROW_DOWNWARD
? FLOOR_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes)
: CEIL_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes));
Expanding FLOOR_ROUND, and using the fact that boundary_in_bytes is
a power of 2, the ARGS_GROW_DOWNARD case is equivalent to:
offset_ptr->constant = -sp_offset
+ (offset_ptr->constant + sp_offset)
- ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1));
The -sp_offset + sp_offset cancels out, giving:
offset_ptr->constant
= offset_ptr->constant
- ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1));
or:
offset_ptr->constant
-= (offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1);
But since (offset_ptr->constant + sp_offset) is polynomial, we can't
always calculate this value, so the code is conditional on a call
to known_misalignment:
if (...
|| !known_misalignment (offset_ptr->constant + sp_offset,
boundary_in_bytes, &misalign))
...
else
{
if (ARGS_GROW_DOWNWARD)
offset_ptr->constant -= misalign;
Thanks,
Richard