double x[1024], y[1024], z[1024];

void foo (double w)
{
  int i;
  for (i = 0; i < 1023; i+=2)
    {
      z[i] = x[i] + 3.;
      z[i+1] = x[i+1] + -3.;
    }
}

void bar (double w)
{
  int i;
  for (i = 0; i < 1023; i+=2)
    {
      z[i] = x[i] + w;
      z[i+1] = x[i+1] + -3.;
    }
}

void baz (double w)
{
  int i;
  for (i = 0; i < 1023; i+=2)
    {
      z[i] = x[i] - w;
      z[i+1] = x[i+1] + 3.;
    }
}
I just made a patch that supports limited non-isomorphic operations for SLP (the operations on even/odd elements must still be isomorphic). With it, the three loops you listed can be vectorized by SLP using the new VEC_ADDSUB_EXPR or VEC_SUBADD_EXPR. On x86, SSE3 provides the ADDSUBPD/ADDSUBPS instructions, which do the job directly; I also emulate them for SSE (use a mask to negate the even/odd elements and then add). I think we will need to support more general non-isomorphic operations eventually, which is more difficult and challenging, but the limited support in this patch is already useful. I will send the patch later.
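The SSE emulation mentioned above (negate selected lanes with a sign mask, then add) can be illustrated with a small scalar sketch in portable C. This is not code from the patch: `xor_sign` and `addsub_emulated` are hypothetical names, and the lane convention (even lanes subtract, odd lanes add) is chosen to match ADDSUBPD.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: flip the sign bit of d where mask has it set.
   With mask == 0 the value is returned unchanged.  */
double xor_sign (double d, uint64_t mask)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  bits ^= mask & UINT64_C (0x8000000000000000);
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Hypothetical helper: ADDSUBPD-style operation over n lanes.
   Even lanes compute a - b, odd lanes compute a + b, using only
   a sign-bit XOR followed by a plain addition.  */
void addsub_emulated (double *dst, const double *a, const double *b, int n)
{
  int i;
  for (i = 0; i < n; i++)
    {
      /* All-ones mask on even lanes negates b there; zero mask on odd lanes.  */
      uint64_t mask = (i & 1) ? 0 : ~UINT64_C (0);
      dst[i] = a[i] + xor_sign (b[i], mask);
    }
}
```

A real vector version would presumably apply the same XOR with a constant mask vector to a whole register at once; the per-lane loop here only shows the arithmetic.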
Created attachment 31209
hack

Btw, I also had a patch^Whack, see attached. There are also further patches, never merged, that take care of vectorizing PR37021 better.
How do you generate the final operations in the vectorized code?

I just submitted a patch for this issue. The patch supports non-isomorphic operations, with the restriction that all operations on even/odd elements must still be isomorphic. Please comment on the patch.

Thank you!

Cong
This is now fixed. Note that + 3. vs. - 3. should be handled more optimally (PR68050).