[PATCH V3] PR88497 - Extend reassoc for vector bit_field_ref

Kewen.Lin linkw@linux.ibm.com
Wed Jul 3 03:20:00 GMT 2019


Hi Richard,

Thanks very much for reviewing my patch.  I'll update it as your comments.
Before sending the next version, I've several questions embedded for further check.

on 2019/7/2 下午8:43, Richard Biener wrote:
> On Wed, 20 Mar 2019, Kewen.Lin wrote:
> 
>> +/* { dg-require-effective-target vect_double } */
>> +/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } } } */
>> +/* { dg-options "-O2 -ffast-math" } */
>> +/* { dg-options "-O2 -ffast-math -mvsx -fdump-tree-reassoc1" { target { powerpc*-*-* } } } */
> 
> Use
> 
> /* { dg-additional-options "-mvsx" { target { powerpc...
> 
> that saves duplicate typing.  I guess that x86_64/i?86 also works
> if you enable SSE2 - can you check that and do adjustments accordingly?
> 

OK, I'll learn SSE2 and update it. 

>> +/* Hold the information of one specific VECTOR_TYPE SSA_NAME.
>> +    - offsets: for different BIT_FIELD_REF offsets accessing same VECTOR.
>> +    - ops_indexes: the index of vec ops* for each relavant BIT_FIELD_REF.  */
>> +struct v_info
>> +{
>> +  auto_vec<unsigned HOST_WIDE_INT, 32> offsets;
> 
> with SVE this probably needs to be poly_int64 or so
> 

OK, will extend it to poly_int64 (can it be negative? or poly_uint64 better?)

>> +  auto_vec<unsigned, 32> ops_indexes;
>> +};
> 
> To have less allocations you could use a
> 
>      auto_vec<std::pair<unsigned HOST_WIDE_INT, unsigned>, 32> elements;
> 
> or even
> 
>  hash_map<tree, vec<std::pair<unsigned HOST_WIDE_INT, unsigned> > >
> 
> where you use .release() in the cleanup and .create (TYPE_VECTOR_SUBPARTS 
> (vector_type)) during collecting and then can use quick_push ()
> (ah, no - duplicates...).
> 

Good idea!

>> +
>> +typedef struct v_info *v_info_ptr;
>> +
>> +/* Comparison function for qsort on unsigned BIT_FIELD_REF offsets.  */
>> +static int
>> +unsigned_cmp (const void *p_i, const void *p_j)
>> +{
>> +  if (*(const unsigned HOST_WIDE_INT *) p_i
>> +      >= *(const unsigned HOST_WIDE_INT *) p_j)
>> +    return 1;
>> +  else
>> +    return -1;
> 
> That's an issue with some qsort implementations comparing
> an element against itself.
> 
> I think so you should add the equality case.
> 

The equality case seems already involved in ">=".
Do you mean that I need to make it equality case explicitly? 
Or return zero for "=="? like:

   
   const unsigned HOST_WIDE_INT val_i = *(const unsigned HOST_WIDE_INT *) p_i;
   const unsigned HOST_WIDE_INT val_j = *(const unsigned HOST_WIDE_INT *) p_j;
   if (val_i != val_j)
     return val_i > val_j? 1: -1;
   else
     return 0;

>> +
>> +   TODO:
>> +    1) The current implementation restrict all candidate VECTORs should have
>> +       the same VECTOR type, but it can be extended into different groups by
>> +       VECTOR types in future if any profitable cases found.
>> +    2) The current implementation requires the whole VECTORs should be fully
>> +       covered, but it can be extended to support partial, checking adjacent
>> +       but not fill the whole, it may need some cost model to define the
>> +       boundary to do or not.
>> +

>> +      tree elem_type = TREE_TYPE (vec_type);
>> +      unsigned HOST_WIDE_INT size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
>> +      if (size != TREE_INT_CST_LOW (op1))
> 
>   if (!tree_int_cst_equal (TYPE_SIZE (elem_type), op1))
> 
> avoids some of the TREE_INT_CST_LOW we like to avoid.
> 
> You miss a check for op2 % op1 being zero.  Given you store a 
> HOST_WIDE_INT offset you also want to check for INTEGER_CST op1/op2
> (beware of SVE...).

OK, thanks!  For op2 % op1 == zero, I thought it's a must for granted, I'll fix it.
I think it can be constructed in IR artificially, but since I have almost none knowledge 
on other targets vector support, I'm curious that does it exist in real world codes?

btw, BIT_FIELD_REF in tree.def says the op1/op2 is constant, it looks need to be updated
due to SVE?

> 
> There's also a poly-int friendly multiple_p predicate so you could
> have the initial checks SVE friendly but bail out on non-INTEGER_CST
> offset later.
> 

Got it, thanks!

> 
> Since you are using a hashtable keyed off SSA name pointers code
> generation will depend on host addresses here if you consider
> there's one mismatching vector type that might become ref_vec
> dependent on the order of elements in the hashtable.  That's
> a no-no.
> 
> Even if doing it like above looks to possibly save compile-time
> by avoiding qsort calls I think the appropriate thing to do is
> to partition the vector candidates into sets of compatible
> vectors (consider summing two v2df and two v4df vectors for example)
> and then take out ones you disqualify because they do not cover
> the vector in full.
> 

You are absolutely right, the randomness of hashtable keys order could 
be a problem here.  The partition part is TODO 1).  Since Power has only
one kind whole vector width now, doesn't have v2df and v4df co-existence,
it's put into TODO.  I will extend it in the next version of patch, add
one more hashtable from vector type mode to v_info_map, feel free to
correct me if you have any concerns.

> I think whether the vector is fully covered can be done way cheaper
> as well by using a sbitmap and clearing a bit whenever its 
> corresponding lane is in the vector (and bailing out if the bit
> was already clear since you then got a duplicate).  So start
> with bitmap_ones () and at the end check bitmap_empty_p ().
> I don't think you depend on the vector being sorted later?
> 

Good idea, yes, it doesn't depend on sorted or not.

>> +    {
>> +      sum = build_and_add_sum (TREE_TYPE (ref_vec), sum_vec, tr, opcode);
>> +      info = *(v_info_map.get (tr));
>> +      unsigned j;
>> +      FOR_EACH_VEC_ELT (info->ops_indexes, j, idx)
>> +	{
>> +	  gimple *def = SSA_NAME_DEF_STMT ((*ops)[idx]->op);
>> +	  gimple_set_visited (def, true);
> 
> you set the visited flag to get the ops definition DCEd eventually, right?
> Note this in a comment.
> 

Yes, it's for that.  Will add the comment, thanks!




More information about the Gcc-patches mailing list