[Patch, AArch64, ILP32] 3/5 Minor change in function.c:assign_parm_find_data_types()

Yufeng Zhang Yufeng.Zhang@arm.com
Thu Jun 27 00:51:00 GMT 2013


On 06/27/13 00:57, Andrew Pinski wrote:
> On Wed, Jun 26, 2013 at 4:51 PM, Andrew Pinski<pinskia@gmail.com>  wrote:
>> On Wed, Jun 26, 2013 at 4:41 PM, Yufeng Zhang<Yufeng.Zhang@arm.com>  wrote:
>>> On 06/27/13 00:04, Andrew Pinski wrote:
>>>>
>>>> On Wed, Jun 26, 2013 at 3:39 PM, Yufeng Zhang<Yufeng.Zhang@arm.com>
>>>> wrote:
>>>>>
>>>>> This patch updates assign_parm_find_data_types to assign passed_mode and
>>>>> nominal_mode with the mode of the built pointer type instead of the
>>>>> hard-coded Pmode in the case of pass-by-reference.  This is in line with
>>>>> the
>>>>> assignment to passed_mode and nominal_mode in other cases inside the
>>>>> function.
>>>>>
>>>>> assign_parm_find_data_types generally uses TYPE_MODE to calculate
>>>>> passed_mode and nominal_mode:
>>>>>
>>>>>     /* Find mode of arg as it is passed, and mode of arg as it should be
>>>>>        during execution of this function.  */
>>>>>     passed_mode = TYPE_MODE (passed_type);
>>>>>     nominal_mode = TYPE_MODE (nominal_type);
>>>>>
>>>>> this includes the case when the passed argument is a pointer by itself.
>>>>>
>>>>> However there is a discrepancy when it deals with argument passed by
>>>>> invisible reference; it builds the argument's corresponding pointer type,
>>>>> but sets passed_mode and nominal_mode with Pmode directly.
>>>>>
>>>>> This is OK for targets where Pmode == ptr_mode, but on AArch64 with ILP32
>>>>> they are different with Pmode as DImode and ptr_mode as SImode. When such
>>>>> a
>>>>> reference is passed on stack, the reference is prepared by the caller in
>>>>> the
>>>>> lower 4 bytes of an 8-byte slot but is fetched by the callee as an 8-byte
>>>>> datum, of which the higher 4 bytes may contain junk.  It is probably the
>>>>> combination of Pmode != ptr_mode and the particular ABI specification
>>>>> that
>>>>> make the AArch64 ILP32 the first target on which the issue manifests
>>>>> itself.
>>>>>
>>>>> Bootstrapped on x86_64-none-linux-gnu.
>>>>>
>>>>> OK for the trunk?
>>>>
>>>>
>>>>
>>>> IA64-hpux also uses Pmode != ptr_mode, can you provide the testcase
>>>> which fails without this change?
>>>> I used a powerpc64 target where Pmode != ptr_mode which did not hit
>>>> this bug either.
>>>
>>>
>>> The issue was firstly observed in one of the compat tests which passes a
>>> large number of non-small structures.  The following is a trimmed-down
>>> reproducible code snippet (although not runnable but shall be easy to be
>>> make runnable):
>>>
>>> struct s5
>>> {
>>>    double a;
>>>    double b;
>>>    double c;
>>>    double d;
>>>    double e;
>>> } gS;
>>>
>>> double foo (struct s5 p1, struct s5 p2,struct s5 p3,struct s5 p4,struct s5
>>> p5,struct s5 p6,struct s5 p7,struct s5 p8, struct s5 p9)
>>> {
>>>    return p9.c;
>>> }
>>> --------------- CUT ---------------
>>>
>>> The code-gen (-O2) without the patch is:
>>>
>>>          .text
>>>          .align  2
>>>          .global foo
>>>          .type   foo, %function
>>> foo:
>>>          ldr     x0, [sp]<<=== here!
>>>          ldr     d0, [x0,16]
>>>          ret
>>>          .size   foo, .-foo
>>>
>>> Where the arrow points is the load of the pointer to 'p9' that is passed on
>>> stack.  The instruction really should be ldr w0, [sp], i.e. the pointer mode
>>> is SImode rather than DImode.
>>>
>>> It needs a number of conditions for the issue to manifest:
>>>
>>> 1. pass-by-reference; on aarch64 one example is a struct that is larger than
>>> 16 bytes.
>>> 2. the reference is passed on stack; on aarch64, this usually only happens
>>> after registers x0 - x7 are used.
>>> 3. the size of stack slot for passing pointer is larger than the pointer
>>> size; on aarch64, it is 8-byte vs. 4-byte
>>> 4. the unused part of the stack slot is not zeroed out, i.e. undefined by
>>> the ABI
>>
>> This is the real issue.  I think it is better if we change the ABI to
>> say they are zero'd.  It really makes things like this a mess.
>>
>>> 5. in the runtime, the unused part of such a stack slot contains junk.
>>>
>>> The runtime segmentation fault may only be generated when all the above
>>> conditions are met.  I'm not familiar with IA64-hpux or powerpc64 procedure
>>> call ABIs, but I guess those targets are just being lucky?
>>
>> Or rather their ABIs all say are zero or sign extended for values less
>> than 8 byte wide.
>
> One more thing, it looks like your change will not work correctly for
> big-endian ILP32 AARCH64 either as the least significant word is
> offsetted by 4.

Ah, this is definitely a bug and I can confirm that it only happens on 
the 32-bit pointer-type parameter passed on stack.  I'll bring up a 
patch today to fix it.

> Did you test big-endian ILP32 AARCH64?

I started the big-endian testing fairly recently, so there is limited 
testing done so far but I am still working on it and will prepare 
patches if more issues are found.

Thanks,
Yufeng



More information about the Gcc-patches mailing list