This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible


On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> Sorry, Should have replied to gcc-patches list.
>
> Thanks,
> bin
>
> ---------- Forwarded message ----------
> From: "Bin.Cheng" <amker.cheng@gmail.com>
> Date: Tue, 29 Mar 2016 03:55:04 +0800
> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
> DR against its innermost loop bahavior if possible
> To: Richard Biener <richard.guenther@gmail.com>
>
> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>>
>>>> Hmm.
>>> Hi,
>>> Thanks for reviewing.
>>>>
>>>> +  equal_p = true;
>>>> +  if (e1->base_address && e2->base_address)
>>>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>>> +  if (e1->offset && e2->offset)
>>>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>>
>>>> surely better to return false early.
>>>>
>>>> I think we don't want this in tree-data-refs.h also because of ...
>>>>
>>>> @@ -615,15 +619,29 @@
>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>>> (data_reference_p a)
>>>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>>> creating the DR (or adjust the equality function
>>> and hashing
>>>>    tree ref = DR_REF (a);
>>>>    tree base_ref = DR_BASE_OBJECT (a);
>>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>>    bool exist1, exist2;
>>>>
>>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>>>> -        || TREE_CODE (ref) == REALPART_EXPR)
>>>> -    ref = TREE_OPERAND (ref, 0);
>>>> +  /* If reference in DR has innermost loop behavior and it is not
>>>> +     a compound memory reference, we store it to innermost_DR_map,
>>>> +     otherwise to ref_DR_map.  */
>>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>>>> +      || TREE_CODE (ref) == REALPART_EXPR
>>>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>>> +    {
>>>> +      while (TREE_CODE (ref) == COMPONENT_REF
>>>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>>>> +            || TREE_CODE (ref) == REALPART_EXPR)
>>>> +       ref = TREE_OPERAND (ref, 0);
>>>> +
>>>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>>> +    }
>>>> +  else
>>>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>>
>>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>>> need to
>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>>> and REALPART) before creating the DR (or adjust the equality function
>>>> and hashing
>>>> to disregard them which means subtracting their offset from DR_INIT.
>>> I am not sure if I understand correctly.  But for component reference,
>>> it is the base object that we want to record/track.  For example,
>>>
>>>   for (i = 0; i < N; i++) {
>>>     m = *data++;
>>>
>>>     m1 = p1->x - m;
>>>     m2 = p2->x + m;
>>>
>>>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>
>>>     p1++;
>>>     p2++;
>>>     p3++;
>>>   }
>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>> because there are unconditional reads of the structures, though the
>>> unconditional reads are actual of other sub-objects.  Here it is the
>>> invariant part of address that we want to track.
>>
>> Well, the variant parts - we want to strip invariant parts as far as we can
>> (offsetof (x) and offsetof (y))
>>
>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>> here.  Because in gathering/scattering cases, the address could be not
>>> affine at all.
>>
>> Sure, but that's a different issue.
>>
>>>>
>>>> To adjust the references we collect you'd maybe could use a callback
>>>> to get_references_in_stmt
>>>> to adjust them.
>>>>
>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>>> as
>>> Is this a part of the method you suggested above, or is it an
>>> alternative one?  If it's the latter, then I have below questions
>>> embedded.
>>
>> It is an alternative to adding a hook to get_references_in_stmt and
>> probably "easier".
>>
>>>>
>>>> Index: tree-if-conv.c
>>>> ===================================================================
>>>> --- tree-if-conv.c      (revision 234215)
>>>> +++ tree-if-conv.c      (working copy)
>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>
>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>      {
>>>> +      tree *refp = &DR_REF (dr);
>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>> +      if (refp != &DR_REF (dr))
>>>> +       {
>>>> +         tree saved_base = *refp;
>>>> +         *refp = integer_zero_node;
>>>> +
>>>> +         if (DR_INIT (dr))
>>>> +           {
>>>> +             tree poffset;
>>>> +             int punsignedp, preversep, pvolatilep;
>>>> +             machine_mode pmode;
>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>> &poffset,
>>>> +                                  &pmode, &punsignedp, &preversep,
>>>> &pvolatilep,
>>>> +                                  false);
>>>> +             gcc_assert (poffset == NULL_TREE);
>>>> +
>>>> +             DR_INIT (dr)
>>>> +               = wide_int_to_tree (ssizetype,
>>>> +                                   wi::sub (DR_INIT (dr),
>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>> +           }
>>>> +
>>>> +         *refp = saved_base;
>>>> +         DR_REF (dr) = *refp;
>>>> +       }
>>> Looks to me the code is trying to resolve difference between two (or
>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>> is not the only thing needs to be handled.  For a structure containing
>>> two sub-arrays, DR_OFFSET may be different too.
>>
>> Yes, but we can't say that if
>>
>>   a->a[i]
>>
>> doesn't trap that then
>>
>>   a->b[i]
>>
>> doesn't trap either.  We can only "strip" outermost
>> non-variable-offset components.
>>
>> But maybe I'm missing what example you are thinking of.
> Hmm, this was the case I meant.  What I don't understand is current
> code logic does infer trap information for a.b[i] from a.a[i].  Given
> below example:
> struct str
> {
>   int a[10];
>   int b[20];
>   char c;
> };
>
> void bar (struct str *);
> int foo (int x, int n)
> {
>   int i;
>   struct str s;
>   bar (&s);
>   for (i = 0; i < n; i++)
>     {
>       s.a[i] = s.b[i];
>       if (x > i)
>         s.b[i] = 0;
>     }
>   bar (&s);
>   return 0;
> }
> The loop is convertible because of below code in function
> ifcvt_memrefs_wont_trap:
>
>   /* If a is unconditionally accessed then ... */
>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>     {
>       /* an unconditional read won't trap.  */
>       if (DR_IS_READ (a))
>         return true;
>
>       /* an unconditionaly write won't trap if the base is written
>          to unconditionally.  */
>       if (base_master_dr
>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>       else
>         {
>           /* or the base is know to be not readonly.  */
>           tree base_tree = get_base_address (DR_REF (a));
>           if (DECL_P (base_tree)
>               && decl_binds_to_current_def_p (base_tree)
>               && ! TREE_READONLY (base_tree))
>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>         }
>     }
> It is the main object '&s' that is recorded in base_master_dr, so
> s.b[i] is considered not trap.  Even it's not, I suppose
> get_base_address will give same result?

Well, for this case it sees that s.b[i] is read from so it can't be an
out-of-bound
access.  And s.a[i] is written to unconditionally so 's' cannot be a
readonly object.
With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.

Richard.

>
> Thanks,
> bin
>
>
>
> --
> Best Regards.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]