This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [WIP] Possible Bug in vect_bb_slp_scalar_cost?


On Mon, Aug 28, 2017 at 10:57 AM, Dominik Inführ
<dominik.infuehr@theobroma-systems.com> wrote:
> Hi,
>
> As discussed on IRC: This patches fixes the calculation of the scalar costs of SLP vectorization. I’ve added a test case and the auto_vec allocation is now reused for all children of a node.

Looks good to me and thanks for the testcase.  I'll include this in my
next round of testing and make sure it's fixed for GCC 8.

Thanks,
Richard.

> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c
> new file mode 100644
> index 0000000..5121a88
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fdump-tree-slp-details" } */
> +
> +#define N 4
> +
> +int s1[N], s2[N], s3[N];
> +void escape(int, int, int, int);
> +
> +void
> +foo ()
> +{
> +  int t1, t2, t3, t4;
> +
> +  t1 = s1[0] + s2[0] + s3[0];
> +  t2 = s1[1] + s2[1] + s3[1];
> +  t3 = s1[2] + s2[2] + s3[2];
> +  t4 = s1[3] + s2[3] + s3[3];
> +
> +  s3[0] = t1 - s1[0] * s2[0];
> +  s3[1] = t2 - s1[1] * s2[1];
> +  s3[2] = t3 - s1[2] * s2[2];
> +  s3[3] = t4 - s1[3] * s2[3];
> +
> +  escape (t1, t2, t3, t4);
> +}
> +
> +/* { dg-final { scan-tree-dump-not "vectorization is not profitable" "slp2" } } */
> +/* { dg-final { scan-tree-dump "basic block vectorized" "slp2" } } */
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index 04ecaab..cffd19b 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -2690,9 +2690,17 @@ vect_bb_slp_scalar_cost (basic_block bb,
>       scalar_cost += stmt_cost;
>     }
>
> +  auto_vec<bool, 20> subtree_life;
> +
>   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
>     if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
> -      scalar_cost += vect_bb_slp_scalar_cost (bb, child, life);
> +      {
> +       /* Do not directly pass LIFE to the recursive call, copy it to
> +          confine changes in the callee to the current child/subtree.  */
> +       subtree_life.safe_splice (*life);
> +       scalar_cost += vect_bb_slp_scalar_cost (bb, child, &subtree_life);
> +       subtree_life.truncate (0);
> +      }
>
>   return scalar_cost;
> }
>
>> On 04 Aug 2017, at 12:19, Richard Biener <richard.guenther@gmail.com> wrote:
>>
>> On Fri, Aug 4, 2017 at 12:08 PM, Dominik Inführ
>> <dominik.infuehr@theobroma-systems.com> wrote:
>>> Hi,
>>>
>>> vect_bb_slp_scalar_cost computes the scalar cost of a SLP node. If there are non-scalar uses of a definition, the costs for it and its operands (children) are ignored. The vector LIFE is used to keep track of this and an element is set to true, such that the def and its children are ignored. But as soon as an element is set to true it is never undone, that means the following sibling and parent nodes of the current node also stay ignored.
>>
>> Yes, that's intended.  They are live as well because they are needed
>> to compute the live scalar.
>>
>> This seems wrong to me, a simple fix would be to clone LIFE for every
>> vector, such that changes to LIFE stay in the subtree:
>>>
>>> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
>>> index 032a9444a5a..f919645f28b 100644
>>> --- a/gcc/tree-vect-slp.c
>>> +++ b/gcc/tree-vect-slp.c
>>> @@ -2590,6 +2590,11 @@ vect_bb_slp_scalar_cost (basic_block bb,
>>>   gimple *stmt;
>>>   slp_tree child;
>>>
>>> +  auto_vec<bool, 20> subtree_life;
>>> +  subtree_life.safe_splice (*life);
>>> +
>>> +  life = &subtree_life;
>>> +
>>>   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
>>>     {
>>>       unsigned stmt_cost;
>>>
>>>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]