[Bug tree-optimization/91573] New: Vectorization failure for a loop to do multiply-add
hliu at amperecomputing dot com
gcc-bugzilla@gcc.gnu.org
Wed Aug 28 07:41:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91573
Bug ID: 91573
Summary: Vectorization failure for a loop to do multiply-add
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hliu at amperecomputing dot com
Target Milestone: ---
The following code cannot be vectorized (compiled with gcc -O3):
=== begin code ===
char src[512];
char dst[512];
#define WIDTH 8
void foo(int height, int a, int b, int c, int d, int dst_stride) {
  char * ptr_src = src;
  char * ptr_dst = dst;
  for( int y = 0; y < height; y++ )
  {
    for( int x = 0; x < WIDTH; x++ )
      ptr_dst[x] = ( a*ptr_src[x] + b*ptr_src[x+1] + c*ptr_src[x] +
                     d*ptr_src[x+1]) >> 6;
    ptr_dst += dst_stride;
    ptr_src += 32;
  }
}
=== end code ===
However, it can be vectorized with either of the following modifications:
1) If the calculation is simpler, e.g.
ptr_dst[x] = ( a*ptr_src[x] + c*ptr_src[x] ) >> 6;
2) If WIDTH is larger, e.g.
#define WIDTH 16
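For side-by-side comparison, the original loop and the two modifications can be placed in one translation unit, roughly as in the sketch below. The names foo_orig, foo_simple, foo_wide, WIDTH_NARROW and WIDTH_WIDE are hypothetical and exist only so the three variants can coexist; the loop bodies follow the snippets above. Compiling such a file with -O3 together with GCC's -fopt-info-vec or -fopt-info-vec-missed options should show which loops the vectorizer handles, though the exact output depends on the GCC version and target.

```c
/* Sketch only: function and macro names are hypothetical, chosen so that
   the original loop and both modifications fit in one file. */
char src[512];
char dst[512];

#define WIDTH_NARROW 8   /* WIDTH from the original report */
#define WIDTH_WIDE   16  /* WIDTH from modification 2 */

/* Original form: reported as not vectorized at -O3. */
void foo_orig(int height, int a, int b, int c, int d, int dst_stride)
{
    char *ptr_src = src;
    char *ptr_dst = dst;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < WIDTH_NARROW; x++)
            ptr_dst[x] = (a * ptr_src[x] + b * ptr_src[x + 1] +
                          c * ptr_src[x] + d * ptr_src[x + 1]) >> 6;
        ptr_dst += dst_stride;
        ptr_src += 32;
    }
}

/* Modification 1: simpler calculation, same width.
   b and d are kept only to preserve the original signature. */
void foo_simple(int height, int a, int b, int c, int d, int dst_stride)
{
    char *ptr_src = src;
    char *ptr_dst = dst;
    (void)b;
    (void)d;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < WIDTH_NARROW; x++)
            ptr_dst[x] = (a * ptr_src[x] + c * ptr_src[x]) >> 6;
        ptr_dst += dst_stride;
        ptr_src += 32;
    }
}

/* Modification 2: original calculation, larger width. */
void foo_wide(int height, int a, int b, int c, int d, int dst_stride)
{
    char *ptr_src = src;
    char *ptr_dst = dst;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < WIDTH_WIDE; x++)
            ptr_dst[x] = (a * ptr_src[x] + b * ptr_src[x + 1] +
                          c * ptr_src[x] + d * ptr_src[x + 1]) >> 6;
        ptr_dst += dst_stride;
        ptr_src += 32;
    }
}
```

Keeping the same signature and globals in all three variants means any difference in the vectorizer's behavior should come from the loop body or the trip count rather than from the surrounding code.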
This case is a hot loop from a real application. It can be reproduced on both
AArch64 and x86-64 platforms.