[Bug tree-optimization/46785] New: Doesn't vectorize reduction x += y*y
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Fri Dec 3 14:54:00 GMT 2010
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46785
Summary: Doesn't vectorize reduction x += y*y
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: rguenth@gcc.gnu.org
CC: irar@gcc.gnu.org
When looking at why GCC is so slow with the himeno benchmark in the usual
Phoronix testing, I noticed that we do not vectorize the reduction in

float x[1024];

float
test (void)
{
  int i;
  float gosa = 0.0;
  for (i = 0; i < 1024; ++i)
    {
      float tem = x[i];
      gosa += tem * tem;
    }
  return gosa;
}

because at analysis time we have

  D.3171_6 = __builtin_powf (tem_5, 2.0e+0);

as the def for the addition, which satisfies neither is_gimple_assign
nor any of the vinfo tests:
$3 = {type = undef_vec_info_type, live = 0 '\000', in_pattern_p = 0 '\000',
read_write_dep = 0 '\000', stmt = 0x7ffff7edc908, loop_vinfo = 0x18f77e0,
vectype = 0x0, vectorized_stmt = 0x0, data_ref_info = 0x0,
dr_base_address = 0x0, dr_init = 0x0, dr_offset = 0x0, dr_step = 0x0,
dr_aligned_to = 0x0, related_stmt = 0x0, same_align_refs = 0x18cf7f0,
def_type = vect_internal_def, slp_type = loop_vect, first_dr = 0x0,
next_dr = 0x0, same_dr_stmt = 0x0, size = 0, store_count = 0, gap = 0,
relevant = vect_unused_in_scope, cost = {outside_of_loop = 0,
inside_of_loop = 0}, bb_vinfo = 0x0, vectorizable = 1 '\001'}
As we want to allow internal defs, we can also just let calls slip through
here (so that we vectorize reductions with veclib vectorized calls as well).
Ira?
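
For reference, a rough hand-written sketch of what vectorizing this
reduction amounts to: several independent partial sums, combined once
after the loop. This is not GCC output. The lane count of 4 and the names
test_scalar, test_lanes, fill_input and check_reductions_agree are
assumptions made up for illustration. Splitting the sum reassociates the
float additions, so in general the two forms can differ in the last bits,
which is why GCC only does this under -ffast-math (-fassociative-math).
The input here is chosen so that every intermediate value is exactly
representable and both forms agree.

```c
#include <assert.h>

#define N 1024

float x[N];

/* Scalar form of the loop from the report.  */
float
test_scalar (void)
{
  int i;
  float gosa = 0.0f;
  for (i = 0; i < N; ++i)
    {
      float tem = x[i];
      gosa += tem * tem;
    }
  return gosa;
}

/* Hypothetical stand-in for a 4-lane vectorized reduction: four
   independent partial sums, combined after the loop.  */
float
test_lanes (void)
{
  int i, l;
  float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
  for (i = 0; i < N; i += 4)
    for (l = 0; l < 4; ++l)
      {
        float tem = x[i + l];
        lane[l] += tem * tem;
      }
  return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}

/* Fill x with multiples of 0.5 so that every square and every partial
   sum is exactly representable in float.  */
void
fill_input (void)
{
  int i;
  for (i = 0; i < N; ++i)
    x[i] = (float) (i % 8) * 0.5f;
}

/* With the exact input above, both summation orders give the same
   result.  */
void
check_reductions_agree (void)
{
  fill_input ();
  assert (test_scalar () == test_lanes ());
}
```

Calling check_reductions_agree () from a small main is enough to try it.
With arbitrary input the two results need not be bit-identical.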