This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: increase alignment of global structs in increase_alignment pass
- From: Prathamesh Kulkarni <prathamesh dot kulkarni at linaro dot org>
- To: Richard Biener <rguenther at suse dot de>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Tue, 17 May 2016 18:24:28 +0530
- Subject: Re: increase alignment of global structs in increase_alignment pass
- Authentication-results: sourceware.org; auth=none
- References: <CAAgBjMnzo0xM_+JAiRS7J3P9sDF-Uoe-k6Q_7fL1dZD+W+x3qA at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1602221304130 dot 31547 at t29 dot fhfr dot qr> <CAAgBjM=Oa6Ex9D7qjDq71DsaPvSqNgzM5xpxbsai0Eyuy+PfGg at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1602231300130 dot 31547 at t29 dot fhfr dot qr> <CAAgBjMkWYk9XKkugc1SXPqW2cWVWJ2MsT5Pb=QEGCxLgw9rC7A at mail dot gmail dot com> <CAAgBjMnxqw74MKL0vuFNdv1M3mpDo-FAW4ocWEv-rZnDgSPjTw at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1605061337470 dot 13384 at t29 dot fhfr dot qr> <CAAgBjM=2o1Hr5Np8UQ=731JTHeqbfx7LrD2mzhN4Uxq98wbmYw at mail dot gmail dot com>
ping https://gcc.gnu.org/ml/gcc/2016-05/msg00120.html
Thanks,
Prathamesh
On 11 May 2016 at 15:39, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> On 6 May 2016 at 17:20, Richard Biener <rguenther@suse.de> wrote:
>> On Wed, 4 May 2016, Prathamesh Kulkarni wrote:
>>
>>> On 23 February 2016 at 21:49, Prathamesh Kulkarni
>>> <prathamesh.kulkarni@linaro.org> wrote:
>>> > On 23 February 2016 at 17:31, Richard Biener <rguenther@suse.de> wrote:
>>> >> On Tue, 23 Feb 2016, Prathamesh Kulkarni wrote:
>>> >>
>>> >>> On 22 February 2016 at 17:36, Richard Biener <rguenther@suse.de> wrote:
>>> >>> > On Mon, 22 Feb 2016, Prathamesh Kulkarni wrote:
>>> >>> >
>>> >>> >> Hi Richard,
>>> >>> >> As discussed in private mail, this version of patch attempts to
>>> >>> >> increase alignment
>>> >>> >> of global struct decl if it contains an an array field(s) and array's
>>> >>> >> offset is a multiple of the alignment of vector type corresponding to
>>> >>> >> it's scalar type and recursively checks for nested structs.
>>> >>> >> eg:
>>> >>> >> static struct
>>> >>> >> {
>>> >>> >> int a, b, c, d;
>>> >>> >> int k[4];
>>> >>> >> float f[10];
>>> >>> >> };
>>> >>> >> k is a candidate array since it's offset is 16 and alignment of
>>> >>> >> "vector (4) int" is 8.
>>> >>> >> Similarly for f.
>>> >>> >>
>>> >>> >> I haven't been able to create a test-case where there are
>>> >>> >> multiple candidate arrays and vector alignment of arrays are different.
>>> >>> >> I suppose in this case we will have to increase alignment
>>> >>> >> of the struct by the max alignment ?
>>> >>> >> eg:
>>> >>> >> static struct
>>> >>> >> {
>>> >>> >> <fields>
>>> >>> >> T1 k[S1]
>>> >>> >> <fields>
>>> >>> >> T2 f[S2]
>>> >>> >> <fields>
>>> >>> >> };
>>> >>> >>
>>> >>> >> if V1 is vector type corresponding to T1, and V2 corresponding vector
>>> >>> >> type to T2,
>>> >>> >> offset (k) % align(V1) == 0 and offset (f) % align(V2) == 0
>>> >>> >> and align (V1) > align(V2) then we will increase alignment of struct
>>> >>> >> by align(V1).
>>> >>> >>
>>> >>> >> Testing showed FAIL for g++.dg/torture/pr31863.C due to program timeout.
>>> >>> >> Initially it appeared to me, it went in infinite loop. However
>>> >>> >> on second thoughts I think it's probably not an infinite loop, rather
>>> >>> >> taking (extraordinarily) large amount of time
>>> >>> >> to compile the test-case with the patch.
>>> >>> >> The test-case builds quickly for only 2 instantiations of ClassSpec
>>> >>> >> (ClassSpec<Key, A001, 1>,
>>> >>> >> ClassSpec<Key, A002, 2>)
>>> >>> >> Building with 22 instantiations (upto ClassSpec<Key, A0023, 22>) takes up
>>> >>> >> to ~1m to compile.
>>> >>> >> with:
>>> >>> >> 23 instantiations: ~2m
>>> >>> >> 24 instantiations: ~5m
>>> >>> >> For 30 instantiations I terminated cc1plus after 13m (by SIGKILL).
>>> >>> >>
>>> >>> >> I guess it shouldn't go in an infinite loop because:
>>> >>> >> a) structs cannot have circular references.
>>> >>> >> b) works for lower number of instantiations
>>> >>> >> However I have no sound evidence that it cannot be in infinite loop.
>>> >>> >> I don't understand why a decl node is getting visited more than once
>>> >>> >> for that test-case.
>>> >>> >>
>>> >>> >> Using a hash_map to store alignments of decl's so that decl node gets visited
>>> >>> >> only once prevents the issue.
>>> >>> >
>>> >>> > Maybe aliases. Try not walking vnode->alias == true vars.
>>> >>> Hi,
>>> >>> I have a hypothesis why decl node gets visited multiple times.
>>> >>>
>>> >>> Consider the test-case:
>>> >>> template <typename T, unsigned N>
>>> >>> struct X
>>> >>> {
>>> >>> T a;
>>> >>> virtual int foo() { return N; }
>>> >>> };
>>> >>>
>>> >>> typedef struct X<int, 1> x_1;
>>> >>> typedef struct X<int ,2> x_2;
>>> >>>
>>> >>> static x_1 obj1 __attribute__((used));
>>> >>> static x_2 obj2 __attribute__((used));
>>> >>>
>>> >>> Two additional structs are created by C++FE, c++filt shows:
>>> >>> _ZTI1XIiLj1EE -> typeinfo for X<int, 1u>
>>> >>> _ZTI1XIiLj2EE -> typeinfo for X<int, 2u>
>>> >>>
>>> >>> Both of these structs have only one field D.2991 and it appears it's
>>> >>> *shared* between them:
>>> >>> struct D.2991;
>>> >>> const void * D.2980;
>>> >>> const char * D.2981;
>>> >>>
>>> >>> Hence the decl node D.2991 and it's fields (D.2890, D.2981) get visited twice:
>>> >>> once when walking _ZTI1XIiLj1EE and 2nd time when walking _ZTI1XIiLj2EE
>>> >>>
>>> >>> Dump of walking over the global structs for above test-case:
>>> >>> http://pastebin.com/R5SABW0c
>>> >>>
>>> >>> So it appears to me to me a DAG (interior node == struct decl, leaf ==
>>> >>> non-struct field,
>>> >>> edge from node1 -> node2 if node2 is field of node1) is getting
>>> >>> created when struct decl
>>> >>> is a type-info object.
>>> >>>
>>> >>> I am not really clear on how we should proceed:
>>> >>> a) Keep using hash_map to store alignments so that every decl gets
>>> >>> visited only once.
>>> >>> b) Skip walking artificial record decls.
>>> >>> I am not sure if skipping all artificial struct decls would be a good
>>> >>> idea, but I don't
>>> >>> think it's possible to identify if a struct-decl is typeinfo struct at
>>> >>> middle-end ?
>>> >>
>>> >> You shouldn't end up walking those when walking the type of
>>> >> global decls. That is, don't walk typeinfo decls - yes, practically
>>> >> that means just not walking DECL_ARTIFICIAL things.
>>> > Hi,
>>> > I have done the changes in the patch (attached) and cross-tested
>>> > on arm*-*-* and aarch64*-*-* without failures.
>>> > Is it OK for stage-1 ?
>>> Hi,
>>> Is the attached patch OK for trunk ?
>>> Bootstrapped and tested on aarch64-linux-gnu, ppc64le-linux-gnu.
>>> Cross-tested on arm*-*-* and aarch64*-*-*.
>>
>> You can't simply use
>>
>> + offset = int_byte_position (field);
>>
>> as it can be placed at variable offset which will make int_byte_position
>> ICE. Note it also returns a truncated byte position (bit position
>> stripped) which may be undesirable here. I think you want to use
>> bit_position and if and only if DECL_FIELD_OFFSET and
>> DECL_FIELD_BIT_OFFSET are INTEGER_CST.
> oops, I didn't realize offsets could be variable.
> Will that be the case only for VLA member inside struct ?
>>
>> Your observation about the expensiveness of the walk still stands I guess
>> and eventually you should at least cache the
>> get_vec_alignment_for_record_decl cases. Please make those workers
>> _type rather than _decl helpers.
> Done
>>
>> You seem to simply get at the maximum vectorized field/array element
>> alignment possible for all arrays - you could restrict that to
>> arrays with at least vector size (as followup).
> Um sorry, I didn't understand this part.
>>
>> + /* Skip artificial decls like typeinfo decls or if
>> + record is packed. */
>> + if (DECL_ARTIFICIAL (record_decl) || TYPE_PACKED (type))
>> + return 0;
>>
>> I think we should honor DECL_USER_ALIGN as well and not mess with those
>> decls.
> Done
>>
>> Given the patch now does quite some extra work it might make sense
>> to split the symtab part out of the vect_can_force_dr_alignment_p
>> predicate and call that early.
> In the patch I call symtab_node::can_increase_alignment_p early.
> I tried moving that to it's callers - vect_compute_data_ref_alignment and
> increase_alignment::execute, however that failed some tests in vect,
> and hence I didn't add the following hunk in the patch. Did I miss some check ?
>
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 7652e21..2c1acee 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -795,7 +795,10 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> base = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
>
> - if (!vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype)))
> + if (!(TREE_CODE (base) == VAR_DECL
> + && decl_in_symtab_p (base)
> + && symtab_node::get (base)->can_increase_alignment_p ()
> + && vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype))))
> {
> if (dump_enabled_p ())
> {
>
> Thanks,
> Prathamesh
>>
>> Richard.