This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)
- From: Richard Earnshaw <Richard dot Earnshaw at foss dot arm dot com>
- To: Jakub Jelinek <jakub at redhat dot com>, Richard Biener <rguenther at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org, Martin Jambor <mjambor at suse dot de>, Alan Lawrence <alan dot lawrence at arm dot com>
- Date: Tue, 05 May 2015 11:54:44 +0100
- Subject: Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)
- Authentication-results: sourceware.org; auth=none
- References: <20150502082437 dot GW1751 at tucnak dot redhat dot com> <alpine dot LSU dot 2 dot 11 dot 1505040950230 dot 20496 at zhemvz dot fhfr dot qr> <20150504150011 dot GD1751 at tucnak dot redhat dot com> <20150505073228 dot GH1751 at tucnak dot redhat dot com>
On 05/05/15 08:32, Jakub Jelinek wrote:
> On Mon, May 04, 2015 at 05:00:11PM +0200, Jakub Jelinek wrote:
>> So at least changing arm_needs_doubleword_align for non-aggregates would
>> likely not break anything that hasn't been broken already and would unbreak
>> the majority of cases.
>
> Attached (untested so far). It indeed changes code generated for
> over-aligned va_arg, but as I believe you can't properly pass those in the
> ... caller, this should just fix it so that va_arg handling matches the
> caller (and likewise for callees for named argument passing).
>
>> The following testcase shows that eipa_sra changes alignment even for the
>> aggregates. Change aligned (8) to aligned (4) to see another possibility.
>
> Actually I misread it, for the aggregates esra actually doesn't change
> anything, which is the reason why the testcase doesn't fail.
> The problem with the scalars is that esra first changed it to the
> over-aligned MEM_REFs and then later on eipa_sra used the types of the
> MEM_REFs created by esra.
>
> 2015-05-05 Jakub Jelinek <jakub@redhat.com>
>
> PR target/65956
> * config/arm/arm.c (arm_needs_doubleword_align): For non-aggregate
> types check TYPE_ALIGN of TYPE_MAIN_VARIANT rather than type itself.
>
> * gcc.c-torture/execute/pr65956.c: New test.
>
> --- gcc/config/arm/arm.c.jj 2015-05-04 21:51:42.000000000 +0200
> +++ gcc/config/arm/arm.c 2015-05-05 09:20:52.481693337 +0200
> @@ -6063,8 +6063,13 @@ arm_init_cumulative_args (CUMULATIVE_ARG
> static bool
> arm_needs_doubleword_align (machine_mode mode, const_tree type)
> {
> - return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
> - || (type && TYPE_ALIGN (type) > PARM_BOUNDARY));
> + if (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY)
> + return true;
I don't think this is right (though I suspect the existing code has the
same problem). We should only look at mode if there is no type
information. The problem is that GCC has a nasty habit of assigning
real machine modes to things that are really BLKmode and we've run into
several cases where this has royally screwed things up. So for
consistency in the ARM back-end we are careful to only use mode when
type is NULL (=> it's a libcall).
> + if (type == NULL_TREE)
> + return false;
> + if (AGGREGATE_TYPE_P (type))
> + return TYPE_ALIGN (type) > PARM_BOUNDARY;
> + return TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY;
> }
>
It ought to be possible to re-order this, though, to
static bool
arm_needs_doubleword_align (machine_mode mode, const_tree type)
{
- return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
- || (type && TYPE_ALIGN (type) > PARM_BOUNDARY));
+ if (type != NULL_TREE)
+ {
+ if (AGGREGATE_TYPE_P (type))
+ return TYPE_ALIGN (type) > PARM_BOUNDARY;
+ return TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY;
+ }
+ return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY);
}
Either way, this would need careful cross-testing against an existing
compiler.
R.
>
> --- gcc/testsuite/gcc.c-torture/execute/pr65956.c.jj 2015-05-01 10:32:34.730150257 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr65956.c 2015-05-01 10:32:13.000000000 +0200
> @@ -0,0 +1,67 @@
> +/* PR target/65956 */
> +
> +struct A { char *a; int b; long long c; };
> +char v[3];
> +
> +__attribute__((noinline, noclone)) void
> +fn1 (char *x, char *y)
> +{
> + if (x != &v[1] || y != &v[2])
> + __builtin_abort ();
> + v[1]++;
> +}
> +
> +__attribute__((noinline, noclone)) int
> +fn2 (char *x)
> +{
> + asm volatile ("" : "+g" (x) : : "memory");
> + return x == &v[0];
> +}
> +
> +__attribute__((noinline, noclone)) void
> +fn3 (const char *x)
> +{
> + if (x[0] != 0)
> + __builtin_abort ();
> +}
> +
> +static struct A
> +foo (const char *x, struct A y, struct A z)
> +{
> + struct A r = { 0, 0, 0 };
> + if (y.b && z.b)
> + {
> + if (fn2 (y.a) && fn2 (z.a))
> + switch (x[0])
> + {
> + case '|':
> + break;
> + default:
> + fn3 (x);
> + }
> + fn1 (y.a, z.a);
> + }
> + return r;
> +}
> +
> +__attribute__((noinline, noclone)) int
> +bar (int x, struct A *y)
> +{
> + switch (x)
> + {
> + case 219:
> + foo ("+", y[-2], y[0]);
> + case 220:
> + foo ("-", y[-2], y[0]);
> + }
> +}
> +
> +int
> +main ()
> +{
> + struct A a[3] = { { &v[1], 1, 1LL }, { &v[0], 0, 0LL }, { &v[2], 2, 2LL } };
> + bar (220, a + 2);
> + if (v[1] != 1)
> + __builtin_abort ();
> + return 0;
> +}
>
>
> Jakub
>