This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR82255 (vectorizer cost model overcounts some vector load costs)


On Sep 19, 2017, at 4:21 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
> 
> On Sep 19, 2017, at 3:58 PM, Richard Sandiford <richard.sandiford@linaro.org> wrote:
>> 
>> Bill Schmidt <wschmidt@linux.vnet.ibm.com> writes:
>>> Index: gcc/tree-vect-stmts.c
>>> ===================================================================
>>> --- gcc/tree-vect-stmts.c	(revision 252760)
>>> +++ gcc/tree-vect-stmts.c	(working copy)
>>> @@ -1091,8 +1091,19 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
>>> 			prologue_cost_vec, body_cost_vec, true);
>>>  if (memory_access_type == VMAT_ELEMENTWISE
>>>      || memory_access_type == VMAT_STRIDED_SLP)
>>> -    inside_cost += record_stmt_cost (body_cost_vec, ncopies, vec_construct,
>>> -				     stmt_info, 0, vect_body);
>>> +    {
>>> +      int group_size = GROUP_SIZE (stmt_info);
>>> +      int nunits = TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
>>> +      if (group_size < nunits)
>>> +	{
>>> +	  if (dump_enabled_p ())
>>> +	    dump_printf_loc (MSG_NOTE, vect_location,
>>> +			     "vect_model_load_cost: vec_construct required");
>>> +	  inside_cost += record_stmt_cost (body_cost_vec, ncopies,
>>> +					   vec_construct, stmt_info, 0,
>>> +					   vect_body);
>>> +	}
>>> +    }
>>> 
>>>  if (dump_enabled_p ())
>>>    dump_printf_loc (MSG_NOTE, vect_location,
>> 
>> This feels like we've probably got the wrong memory_access_type.
>> If it's a just a contiguous load then it should be VMAT_CONTIGUOUS
>> instead.
> 
> I tend to agree -- that will take more surgery to the code that detects strided loads in
> both the cost model analysis and in vectorizable_load.
> 
It's just a little less clear how to clean this up in vect_analyze_data_refs.  The
too-simplistic test there now is:

      else if (is_a <loop_vec_info> (vinfo)
               && TREE_CODE (DR_STEP (dr)) != INTEGER_CST)
        {
          if (nested_in_vect_loop_p (loop, stmt))
            {
              if (dump_enabled_p ())
		{
                  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                   "not vectorized: not suitable for strided "
                                   "load ");
                  dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
                }
              return false;
            }
          STMT_VINFO_STRIDED_P (stmt_info) = true;
        }

Furthermore, this is being set while processing each data reference, not after
all information has been gathered and analyzed.  So this is too early.

Next place to fix this would be in vect_analyze_slp_cost_1, where

      vect_memory_access_type memory_access_type
        = (STMT_VINFO_STRIDED_P (stmt_info)
           ? VMAT_STRIDED_SLP
           : VMAT_CONTIGUOUS);

is too simplistic.  I guess we could do something here, but it would require unpacking
all the grouped accesses and duplicating all the logic from scratch to determine 
whether it's a single load or not.

So it's going to be a little painful to do it "right."  I submitted this patch because I felt
it would be the simplest solution.

Thanks,
Bill


> Bill
> 
>> 
>> Thanks,
>> Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]