This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [047/nnn] poly_int: argument sizes


Jeff Law <law@redhat.com> writes:
> On 10/23/2017 11:20 AM, Richard Sandiford wrote:
>> This patch changes various bits of state related to argument sizes so
>> that they have type poly_int64 rather than HOST_WIDE_INT.  This includes:
>> 
>> - incoming_args::pops_args and incoming_args::size
>> - rtl_data::outgoing_args_size
>> - pending_stack_adjust
>> - stack_pointer_delta
>> - stack_usage::pushed_stack_size
>> - args_size::constant
>> 
>> It also changes TARGET_RETURN_POPS_ARGS so that the size of the
>> arguments passed in and the size returned by the hook are both
>> poly_int64s.
>> 
>> 
>> 2017-10-23  Richard Sandiford  <richard.sandiford@linaro.org>
>> 	    Alan Hayward  <alan.hayward@arm.com>
>> 	    David Sherwood  <david.sherwood@arm.com>
>> 
>> gcc/
>> 	* target.def (return_pops_args): Treat both the input and output
>> 	sizes as poly_int64s rather than HOST_WIDE_INTS.
>> 	* targhooks.h (default_return_pops_args): Update accordingly.
>> 	* targhooks.c (default_return_pops_args): Likewise.
>> 	* doc/tm.texi: Regenerate.
>> 	* emit-rtl.h (incoming_args): Change pops_args, size and
>> 	outgoing_args_size from int to poly_int64_pod.
>> 	* function.h (expr_status): Change x_pending_stack_adjust and
>> 	x_stack_pointer_delta from int to poly_int64.
>> 	(args_size::constant): Change from HOST_WIDE_INT to poly_int64.
>> 	(ARGS_SIZE_RTX): Update accordingly.
>> 	* calls.c (highest_outgoing_arg_in_use): Change from int to
>> 	unsigned int.
>> 	(stack_usage_watermark, stored_args_watermark): New variables.
>> 	(stack_region_maybe_used_p, mark_stack_region_used): New functions.
>> 	(emit_call_1): Change the stack_size and rounded_stack_size
>> 	parameters from HOST_WIDE_INT to poly_int64.  Track n_popped
>> 	as a poly_int64.
>> 	(save_fixed_argument_area): Check stack_usage_watermark.
>> 	(initialize_argument_information): Change old_pending_adj from
>> 	a HOST_WIDE_INT * to a poly_int64_pod *.
>> 	(compute_argument_block_size): Return the size as a poly_int64
>> 	rather than an int.
>> 	(finalize_must_preallocate): Track polynomial argument sizes.
>> 	(compute_argument_addresses): Likewise.
>> 	(internal_arg_pointer_based_exp): Track polynomial offsets.
>> 	(mem_overlaps_already_clobbered_arg_p): Rename to...
>> 	(mem_might_overlap_already_clobbered_arg_p): ...this and take the
>> 	size as a poly_uint64 rather than an unsigned HOST_WIDE_INT.
>> 	Check stored_args_used_watermark.
>> 	(load_register_parameters): Update accordingly.
>> 	(check_sibcall_argument_overlap_1): Likewise.
>> 	(combine_pending_stack_adjustment_and_call): Take the unadjusted
>> 	args size as a poly_int64 rather than an int.  Return a bool
>> 	indicating whether the optimization was possible and return
>> 	the new adjustment by reference.
>> 	(check_sibcall_argument_overlap): Track polynomail argument sizes.
>> 	Update stored_args_watermark.
>> 	(can_implement_as_sibling_call_p): Handle polynomial argument sizes.
>> 	(expand_call): Likewise.  Maintain stack_usage_watermark and
>> 	stored_args_watermark.  Update calls to
>> 	combine_pending_stack_adjustment_and_call.
>> 	(emit_library_call_value_1): Handle polynomial argument sizes.
>> 	Call stack_region_maybe_used_p and mark_stack_region_used.
>> 	Maintain stack_usage_watermark.
>> 	(store_one_arg): Likewise.  Update call to
>> 	mem_overlaps_already_clobbered_arg_p.
>> 	* config/arm/arm.c (arm_output_function_prologue): Add a cast to
>> 	HOST_WIDE_INT.
>> 	* config/avr/avr.c (avr_outgoing_args_size): Likewise.
>> 	* config/microblaze/microblaze.c (microblaze_function_prologue):
>> 	Likewise.
>> 	* config/cr16/cr16.c (cr16_return_pops_args): Update for new
>> 	TARGET_RETURN_POPS_ARGS interface.
>> 	(cr16_compute_frame, cr16_initial_elimination_offset): Add casts
>> 	to HOST_WIDE_INT.
>> 	* config/ft32/ft32.c (ft32_compute_frame): Likewise.
>> 	* config/i386/i386.c (ix86_return_pops_args): Update for new
>> 	TARGET_RETURN_POPS_ARGS interface.
>> 	(ix86_expand_split_stack_prologue): Add a cast to HOST_WIDE_INT.
>> 	* config/moxie/moxie.c (moxie_compute_frame): Likewise.
>> 	* config/m68k/m68k.c (m68k_return_pops_args): Update for new
>> 	TARGET_RETURN_POPS_ARGS interface.
>> 	* config/vax/vax.c (vax_return_pops_args): Likewise.
>> 	* config/pa/pa.h (STACK_POINTER_OFFSET): Add a cast to poly_int64.
>> 	(EXIT_IGNORE_STACK): Update reference to crtl->outgoing_args_size.
>> 	* config/arm/arm.h (CALLER_INTERWORKING_SLOT_SIZE): Likewise.
>> 	* config/powerpcspe/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>> 	* config/powerpcspe/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>> 	* config/powerpcspe/powerpcspe.h (STACK_DYNAMIC_OFFSET): Likewise.
>> 	* config/rs6000/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>> 	* config/rs6000/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>> 	* config/rs6000/rs6000.h (STACK_DYNAMIC_OFFSET): Likewise.
>> 	* dojump.h (saved_pending_stack_adjust): Change x_pending_stack_adjust
>> 	and x_stack_pointer_delta from int to poly_int64.
>> 	* dojump.c (do_pending_stack_adjust): Update accordingly.
>> 	* explow.c (allocate_dynamic_stack_space): Handle polynomial
>> 	stack_pointer_deltas.
>> 	* function.c (STACK_DYNAMIC_OFFSET): Add a cast to poly_int64.
>> 	(pad_to_arg_alignment): Track polynomial offsets.
>> 	(assign_parm_find_stack_rtl): Likewise.
>> 	(assign_parms, locate_and_pad_parm): Handle polynomial argument sizes.
>> 	* toplev.c (output_stack_usage): Update reference to
>> 	current_function_pushed_stack_size.
> I haven't been able to convince myself that the changes to the
> stack_usage_map are correct, particularly in the case where the
> UPPER_BOUND is not constant.  But I'm willing to let it go given the
> only target that could potentially be affected would be SVE and I'd
> expect that if you'd gotten it wrong it would have showed up in your
> testing.
>
> I'm also slightly worried about how we handle ARGS_GROW_DOWNWARD targets
> (pa, stormy16).  I couldn't convince myself those changes were correct
> either.  Again, I'm willing on fall back to lean on your extensive
> experience and testing here.
>
> Of all the patches to-date I've looked at, this one worries me the most
> (which is why it's next to last according to my records).  The potential
> for a goof in the argument setup, padding, stack save area, and the
> consequences for such a goof worry me.
>
> I'm going to conditionally ack this -- my condition is that you
> re-review the the calls.c/function.c changes as well.

I agree this is probably the least obvious of the changes.  I've had
another look and I still think the approach is OK.

For stack_usage_map: we're trying to detect whether an instruction
references a region of stack that has already been clobbered.
We can conservatively do this as a range from the lowest possible byte
offset in the region to the highest possible byte offset in the region.
This works both when recording a clobber and when testing for an overlap
with a previous clobber.

For polynomial lower bounds, the lowest possible offset is given by
constant_lower_bound.  For polynomial upper bounds, the highest
possible offset is HOST_WIDE_INT_M1U (since we track unsigned offsets
from a zero base).  But it's not possible to have the stack_usage_map
array itself go up to HOST_WIDE_INT_M1U, so instead, the watermark
records the lowest lower bound whose corresponding upper bound is
HOST_WIDE_INT_M1U, so that every index of stack_usage_map starting at
the watermark is conceptually set.

The patch doesn't really change the handling of ARGS_GROW_DOWNWARD,
it just rewrites the tests into a different form.  E.g. there were
two instances of code like:

                  if (ARGS_GROW_DOWNWARD)
                    highest_outgoing_arg_in_use
                      = MAX (initial_highest_arg_in_use, needed + 1);
                  else
                    highest_outgoing_arg_in_use
                      = MAX (initial_highest_arg_in_use, needed);

and the patch splits out the calculation of the second MAX argument:

                  poly_int64 limit = needed;
                  if (ARGS_GROW_DOWNWARD)
                    limit += 1;

The third ARGS_GROW_DOWNARD-related hunk is for pad_to_arg_alignment.
Previously we had:

          offset_ptr->constant = -sp_offset +
            (ARGS_GROW_DOWNWARD
            ? FLOOR_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes)
            : CEIL_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes));

Expanding FLOOR_ROUND, and using the fact that boundary_in_bytes is
a power of 2, the ARGS_GROW_DOWNARD case is equivalent to:

          offset_ptr->constant = -sp_offset
            + (offset_ptr->constant + sp_offset)
            - ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1));

The -sp_offset + sp_offset cancels out, giving:

          offset_ptr->constant
            = offset_ptr->constant
            - ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1));

or:

          offset_ptr->constant
            -= (offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1);

But since (offset_ptr->constant + sp_offset) is polynomial, we can't
always calculate this value, so the code is conditional on a call
to known_misalignment:

      if (...
          || !known_misalignment (offset_ptr->constant + sp_offset,
                                  boundary_in_bytes, &misalign))
        ...
      else
        {
          if (ARGS_GROW_DOWNWARD)
            offset_ptr->constant -= misalign;
        
Thanks,
Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]