[RFC, PR 80689] Copy small aggregates element-wise

Martin Jambor mjambor@suse.cz
Fri Nov 24 12:04:00 GMT 2017


Hi Richi,

On Fri, Nov 24 2017, Richard Biener wrote:
> On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener

..

>>>>> And yes, I've been worried about SRA as well here...  it _does_
>>>>> have some early outs when seeing VIEW_CONVERT_EXPR but
>>>>> appearantly not for the above.  Testcase that aborts with SRA but
>>>>> not without:
>>>>>
>>>>> struct A { short s; long i; long j; };
>>>>> struct A a, b;
>>>>> void foo ()
>>>>> {
>>>>>   struct A c;
>>>>>   __builtin_memcpy (&c, &b, sizeof (struct A));
>>>>>   __builtin_memcpy (&a, &c, sizeof (struct A));
>>>>> }
>>>>> int main()
>>>>> {
>>>>>   __builtin_memset (&b, 0, sizeof (struct A));
>>>>>   b.s = 1;
>>>>>   __builtin_memcpy ((char *)&b+2, &b, 2);
>>>>>   foo ();
>>>>>   __builtin_memcpy (&a, (char *)&a+2, 2);
>>>>>   if (a.s != 1)
>>>>>     __builtin_abort ();
>>>>>   return 0;
>>>>> }
>>>>
>>>> Thanks for the testcase, I agree that is a fairly big problem.  Do you
>>>> think that the following (untested) patch is an appropriate way of
>>>> fixing it and generally of extending gimple to capture that a statement
>>>> is a bit-copy?
>>>
>>> I think the place to fix is the memcpy folding.  That is, we'd say that
>>> aggregate assignments are not bit-copies but do element-wise assignments.
>>> For memcpy folding we'd then need to use a type that doesn't contain
>>> padding.  Which effectively means char[].
>>>
>>> Of course we need to stop SRA from decomposing that copy to
>>> individual characters then ;)
>>>
>>> So iff we decide that all aggregate copies are element copies,
>>> maybe only those where TYPE_MAIN_VARIANT of lhs and rhs match
>>> (currently we allow TYPE_CANONICAL compatibility and thus there
>>> might be some mismatches), then we have to fix nothign but
>>> the memcpy folding.
>>>
>>>> If so, I'll add the testcase, bootstrap it and formally propose it.
>>>> Subsequently I will of course make sure that any element-wise copying
>>>> patch would test the predicate.
>>>
>>> I don't think the alias-set should determine whether a copy is
>>> bit-wise or not.
>>
>> Like the attached.  At least FAILs
>>
>> FAIL: gcc.dg/tree-ssa/ssa-ccp-27.c scan-tree-dump-times ccp1
>> "memcpy[^\n]*123456" 2 (found 0 times)
>>
>> not sure why we have this test.
>
> Hum.  And SRA still decomposes the copy to struct elements w/o padding
> even though the access is done using char[].  So somehow it ignores
> VIEW_CONVERT_EXPRs
> (well, those implicitely present on MEM_REFs).

Yes.  SRA is not even too afraid of top-level V_C_Es.  It really bails
out only if they are buried under a handled_component.  And it does not
remove aggregate assignments containing them.

>
> Looks like this is because total scalarization is done on the decls
> and not at all
> honoring how the variable is accessed?  The following seems to fix
> that, otherwise untested.


>
> Index: gcc/tree-sra.c
> ===================================================================
> --- gcc/tree-sra.c      (revision 255137)
> +++ gcc/tree-sra.c      (working copy)
> @@ -1338,7 +1338,9 @@ build_accesses_from_assign (gimple *stmt
>      {
>        racc->grp_assignment_read = 1;
>        if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
> -         && !is_gimple_reg_type (racc->type))
> +         && !is_gimple_reg_type (racc->type)
> +         && (TYPE_MAIN_VARIANT (racc->type)
> +             == TYPE_MAIN_VARIANT (TREE_TYPE (racc->base))))
>         bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
>        if (storage_order_barrier_p (lhs))
>         racc->grp_unscalarizable_region = 1;

I believe that the added condition is not what you want, this seems
to trigger also for ordinary:

  s1 = s2.field

Where racc->type is type of the field but racc->base is s2 and its type
is type of the structure.   

I also think you want to be setting a bit in
cannot_scalarize_away_bitmap in order to guarantee that total
scalarization will not happen for the given candidate.  Otherwise some
other regular assignment might trigger it ...except if we then also
checked the statement for bit-copying types in sra_modify_assign (in the
condition after the big comment), which I suppose is actually the
correct thing to do.

Thanks a lot for the folding patch, I can take over the SRA bits if you
want to.

Martin



More information about the Gcc-patches mailing list