[Bug tree-optimization/46785] New: Doesn't vectorize reduction x += y*y

Fri Dec 3 14:54:00 GMT 2010

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46785

           Summary: Doesn't vectorize reduction x += y*y
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: rguenth@gcc.gnu.org
                CC: irar@gcc.gnu.org

While looking into why GCC is so slow on the himeno benchmark in the usual
Phoronix testing, I noticed that we do not vectorize the reduction in

float x[1024];
float
test (void)
{
  int i;
  float gosa = 0.0;
  for (i = 0; i < 1024; ++i)
    {
      float tem = x[i];
      gosa += tem * tem;
    }
  return gosa;
}

because at analysis time the multiplication has been replaced by a call, so we have

D.3171_6 = __builtin_powf (tem_5, 2.0e+0);

as the def for the addition, which satisfies neither is_gimple_assign
nor any of the vinfo tests:

$3 = {type = undef_vec_info_type, live = 0 '\000', in_pattern_p = 0 '\000', 
  read_write_dep = 0 '\000', stmt = 0x7ffff7edc908, loop_vinfo = 0x18f77e0, 
  vectype = 0x0, vectorized_stmt = 0x0, data_ref_info = 0x0, 
  dr_base_address = 0x0, dr_init = 0x0, dr_offset = 0x0, dr_step = 0x0, 
  dr_aligned_to = 0x0, related_stmt = 0x0, same_align_refs = 0x18cf7f0, 
  def_type = vect_internal_def, slp_type = loop_vect, first_dr = 0x0, 
  next_dr = 0x0, same_dr_stmt = 0x0, size = 0, store_count = 0, gap = 0, 
  relevant = vect_unused_in_scope, cost = {outside_of_loop = 0, 
    inside_of_loop = 0}, bb_vinfo = 0x0, vectorizable = 1 '\001'}

As we want to allow internal defs, we can also just let calls slip through
here (so that we vectorize reductions with veclib-vectorized calls as well).

Ira?
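
For illustration only, here is a rough sketch of what a vectorized form of
the reduction above computes, written as plain C rather than as anything the
vectorizer actually emits. The 4-lane width, the N macro and the function
names sum_sq_scalar and sum_sq_4lane are assumptions made up for this example;
they do not come from GCC.

```c
#include <stddef.h>

#define N 1024  /* hypothetical name; matches the array size in the test case */

/* Scalar reduction, same shape as the loop in the test case.  */
float
sum_sq_scalar (const float *x)
{
  float gosa = 0.0f;
  for (size_t i = 0; i < N; ++i)
    gosa += x[i] * x[i];
  return gosa;
}

/* Hand-written stand-in for a 4-lane vectorized reduction: each lane
   keeps its own partial sum, and the lanes are combined after the loop.
   Assumes N is a multiple of 4.  */
float
sum_sq_4lane (const float *x)
{
  float acc[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
  for (size_t i = 0; i < N; i += 4)
    for (size_t lane = 0; lane < 4; ++lane)
      acc[lane] += x[i + lane] * x[i + lane];
  /* Final horizontal add of the four partial sums.  */
  return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

Note that the second form adds the elements in a different order than the
scalar loop, so for general float input the two results can differ in the
last bits. That is why a float reduction like this is only a candidate for
vectorization when reassociation is permitted (for example under -ffast-math).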