Bug 56902 - Fails to SLP with mismatched +/- and negatable constants
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.9.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer 37021
Reported: 2013-04-10 09:46 UTC by Richard Biener
Modified: 2015-10-22 10:03 UTC (History)
CC List: 0 users

See Also:
Host:
Target:
Build:
Known to work: 6.0
Known to fail:
Last reconfirmed:


Attachments
hack (1.31 KB, text/plain)
2013-11-13 15:39 UTC, Richard Biener
Description Richard Biener 2013-04-10 09:46:44 UTC
double x[1024], y[1024], z[1024];
void foo (double w)
{
  int i;
  for (i = 0; i < 1023; i+=2)
    {
      z[i] = x[i] + 3.;
      z[i+1] = x[i+1] + -3.;
    }
}
void bar (double w)
{
  int i;
  for (i = 0; i < 1023; i+=2)
    {
      z[i] = x[i] + w;
      z[i+1] = x[i+1] + -3.;
    }
}
void baz (double w)
{
  int i;
  for (i = 0; i < 1023; i+=2)
    {
      z[i] = x[i] - w;
      z[i+1] = x[i+1] + 3.;
    }
}
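
The "negatable constants" in the title can be illustrated with baz: because x - w gives the same result as x + (-w) in IEEE arithmetic, negating the addend in one lane turns the mismatched -/+ pair into two additions that share a single operation. The following is only a hand-written sketch of that source-level rewrite, not a description of what the vectorizer does internally; baz_ref, baz_iso, z_ref, z_iso and addend are hypothetical names introduced for the illustration.

```c
#define N 1024

double x[N], z_ref[N], z_iso[N];

/* Original shape of baz: the even lane subtracts, the odd lane adds. */
void baz_ref (double w)
{
  int i;
  for (i = 0; i < N - 1; i += 2)
    {
      z_ref[i] = x[i] - w;
      z_ref[i + 1] = x[i + 1] + 3.;
    }
}

/* Hand-rewritten shape: w is negated once, so both lanes are plain
   additions with a two-element addend { -w, 3. }.  */
void baz_iso (double w)
{
  const double addend[2] = { -w, 3. };
  int i, lane;
  for (i = 0; i < N - 1; i += 2)
    for (lane = 0; lane < 2; lane++)
      z_iso[i + lane] = x[i + lane] + addend[lane];
}
```

Filling x with arbitrary values and calling both functions with the same w should leave z_ref and z_iso element-wise equal.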
Comment 1 Cong Hou 2013-11-11 18:44:13 UTC
I have written a patch that adds limited SLP support for non-isomorphic operations: the operation may differ between even and odd elements, as long as all even elements share one operation and all odd elements share another. With it, the three loops you listed can be SLP-vectorized using the new VEC_ADDSUB_EXPR or VEC_SUBADD_EXPR. On x86, SSE3 provides the ADDSUBPD/ADDSUBPS instructions, which do this directly; for plain SSE I emulate them by using a mask to negate the even or odd elements and then adding.
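
The mask-based emulation mentioned above can be sketched in scalar C for a single two-lane vector. This is only an illustration under stated assumptions, not code from the patch: it assumes double is IEEE-754 binary64, it takes the ADDSUBPD convention of subtracting in lane 0 and adding in lane 1, and the names xor_sign, addsub_ref, addsub_masked and sign_mask are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* XOR the bit pattern of d with mask; a mask with only the top bit set
   flips the sign, a zero mask leaves d unchanged.  */
double xor_sign (double d, uint64_t mask)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  bits ^= mask;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Reference semantics: subtract in lane 0, add in lane 1.  */
void addsub_ref (const double a[2], const double b[2], double out[2])
{
  out[0] = a[0] - b[0];
  out[1] = a[1] + b[1];
}

/* Emulation: negate lane 0 of b with a sign mask, then use one plain
   addition for both lanes.  */
void addsub_masked (const double a[2], const double b[2], double out[2])
{
  static const uint64_t sign_mask[2] = { UINT64_C (0x8000000000000000), 0 };
  int lane;
  for (lane = 0; lane < 2; lane++)
    out[lane] = a[lane] + xor_sign (b[lane], sign_mask[lane]);
}
```

Swapping the two entries of sign_mask would give the opposite pattern (add in lane 0, subtract in lane 1).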

We will eventually need to support more general non-isomorphic operations, which is considerably harder, but I think the limited support in this patch is already useful.

I will send the patch later.
Comment 2 Richard Biener 2013-11-13 15:39:21 UTC
Created attachment 31209
hack

Btw, I also had a patch^Whack, see attached.  Also further patches that didn't
get merged to take care of vectorizing PR37021 better.
Comment 3 Cong Hou 2013-11-15 02:09:48 UTC
How do you generate the final operations in vectorized code?

I just submitted a patch for this issue. It supports non-isomorphic operations, with the restriction that all operations on even elements must be isomorphic to each other, and likewise for odd elements. Please let me know your comments on the patch.

Thank you!


Cong
Comment 4 Richard Biener 2015-10-22 10:03:28 UTC
This is now fixed.  Note that + 3. vs. - 3. should be handled more optimally (PR68050).