This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][PR84877]Dynamically align the address for local parameter copy on the stack when required alignment is larger than MAX_SUPPORTED_STACK_ALIGNMENT


Hi H.J.

On 22/03/18 12:55, H.J. Lu wrote:
On Thu, Mar 22, 2018 at 5:52 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
On Thu, Mar 22, 2018 at 4:56 AM, Renlin Li <renlin.li@foss.arm.com> wrote:
Hi all,

As described in PR84877. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877
The local copy of parameter on stack is not aligned.

For BLKmode paramters, a local copy on the stack will be saved.
There are three cases:
1) arguments passed partially on the stack, partially via registers.
2) arguments passed fully on the stack.
3) arguments passed via registers.

After the change here, in all three cases, the stack slot for the local
parameter copy is aligned by the data type.
The stack slot is the DECL_RTL of the parameter. All the references
thereafter in the function will refer to this RTL.

To populate the local copy on the stack,
For case 1) and 2), there are operations to move data from the caller's
stack (from incoming rtl) into callee's stack.
For case 3), the registers are directly saved into the stack slot.

In all cases, the destination address is properly aligned.
But for case 1) and case 2), the source address is not aligned by the type.
It is defined by the PCS how the arguments are prepared.
The block move operation is fulfilled by emit_block_move (). As far as I can
see,
it will use the smaller alignment of source and destination.
This looks fine as long as we don't use instructions which requires a strict
larger alignment than the address actually has.

Here, it only changes receiving parameters.
The function assign_stack_local_1 will be called in various places.
Usually, the caller will constraint the ALIGN parameter. For example via
STACK_SLOT_ALIGNMENT macro.
assign_parm_setup_block will call assign_stack_local () with alignment from
the parameter type which in this case could be
larger than MAX_SUPPORTED_STACK_ALIGNMENT.

The alignment operation for parameter copy on the stack is similar to stack
vars.
First, enough space is reserved on the stack. The size is fixed at compile
time.
Instructions are emitted to dynamically get an aligned address at runtime
within this piece of memory.

This will unavoidably increase the usage of stack. However, it really
depends on
how many over-aligned parameters are passed by value.

x86-linux, arm-none-eabi, aarch64-one-elf regression test Okay.
linux-armhf bootstrap Okay.

I assume there are other targets which will be affected by the change.
But I don't have environment to test.

Okay the commit?


Regards,
Renlin

gcc/

2018-03-22  Renlin Li  <renlin.li@arm.com>

         PR middle-end/84877
         * explow.h (get_dynamic_stack_size): Declare it as external.
         * explow.c (record_new_stack_level): Remove function static
attribute.
         * function.c (assign_stack_local_1): Dynamically align the stack
slot
         addr for parameter copy on the stack.

gcc/testsuite/

2018-03-22  Renlin Li  <renlin.li@arm.com>

         PR middle-end/84877
         * gcc.dg/pr84877.c: New.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

But your patch has

diff --git a/gcc/testsuite/gcc.target/arm/pr84877.c
b/gcc/testsuite/gcc.target/arm/pr84877.c
new file mode 100644
index 0000000000000000000000000000000000000000..b2df8a954f566dea3a4f1ed70572b43de39dda82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr84877.c

Why is this ARM specific?
I attached the wrong patch. The testcase should really be gcc/testsuite/gcc.dg/pr84877.c


+#include <inttypes.h>
+
+struct U {
+    int M0;
+    int M1;
+} __attribute ((aligned(16)));
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Some targets align stack to 16 bytes by default.
You need 32 byte alignment to better test stack alignment.

yes, you are right here.

IIUIC, as for x86 and aarch64, the STRICT_ALIGNMNET marco is false and the STACK_BOUNDARY is 128.
For this particular test case here, the patch didn't make any difference.

As the code here indicates, for parameters passed on the stack fully or partially, a local copy
will be allocated when the following condition is true. Otherwise, it will use the stack space prepared by the caller.
And for that part, it is ABI specific.

  /* If we can't trust the parm stack slot to be aligned enough for its
     ultimate type, don't use that slot after entry.  We'll make another
     stack slot, if we need one.  */
  if (stack_parm
      && ((STRICT_ALIGNMENT
	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
	  || (data->nominal_type
	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
    stack_parm = NULL;

So even I made the alignment 32 byte, in this case, the value will be passed on the stack.
the stack slot from caller will be used. No local copy will be created.
It won't shown any code-gen difference in x86 target.

The alignment test will test the stack address prepared by the caller.


Regards,
Renlin



+



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]