This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/78164] New: SLP vectorizer: prologue cost biased by redundancies

From: "glisse at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Sun, 30 Oct 2016 17:17:21 +0000
Subject: [Bug tree-optimization/78164] New: SLP vectorizer: prologue cost biased by redundancies
Auto-submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78164

            Bug ID: 78164
           Summary: SLP vectorizer: prologue cost biased by redundancies
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

From http://stackoverflow.com/q/39947582/1918193

void testfunc_flat(double a, double b, double* dst)
{
  dst[0] = 0.1 + ( a)*(1.0 + 0.5*( a));
  dst[1] = 0.1 + ( b)*(1.0 + 0.5*( b));
  dst[2] = 0.1 + (-a)*(1.0 + 0.5*(-a));
  dst[3] = 0.1 + (-b)*(1.0 + 0.5*(-b));
}

We fail to vectorize with AVX, which is understandable because the operations
are different. More surprising is that we also reject SSE vectorization, with
these costs:

  Vector inside of basic block cost: 14
  Vector prologue cost: 10
  Vector epilogue cost: 0
  Scalar cost of basic block: 22
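
For reference, the 2-lane form that is being rejected can be sketched by hand
with the GCC vector extensions. This is only an illustration of the intended
shape, not the vectorizer's actual output: the name testfunc_slp2 and the
typedef v2df are made up for this example, and memcpy is used for the stores so
that no alignment of dst is assumed.

```c
#include <string.h>

/* Two doubles per vector, i.e. the width of one SSE2 register.
   v2df is a name chosen for this sketch. */
typedef double v2df __attribute__((vector_size(16)));

/* Hand-written 2-lane equivalent of testfunc_flat (hypothetical name). */
void testfunc_slp2(double a, double b, double *dst)
{
  /* Each distinct "prologue" vector is built exactly once. */
  const v2df tenth = { 0.1, 0.1 };
  const v2df one   = { 1.0, 1.0 };
  const v2df half  = { 0.5, 0.5 };
  const v2df ab    = { a, b };

  /* dst[0..1] use {a, b}; dst[2..3] use {-a, -b}. */
  const v2df neg_ab = -ab;
  const v2df lo = tenth + ab * (one + half * ab);
  const v2df hi = tenth + neg_ab * (one + half * neg_ab);

  /* Unaligned-safe stores of the two result vectors. */
  memcpy(dst, &lo, sizeof lo);
  memcpy(dst + 2, &hi, sizeof hi);
}
```

Only four vectors are set up before the arithmetic starts here, whereas the
cost above charges 10 for the prologue.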

However, if I disable the cost model, I can see this prologue that is supposed
to have cost 10:

  vect_cst__47 = { 1.000000000000000055511151231257827021181583404541015625e-1,
1.000000000000000055511151231257827021181583404541015625e-1 };
  vect_cst__44 = { 1.0e+0, 1.0e+0 };
  vect_cst__42 = { 5.0e-1, 5.0e-1 };
  vect_cst__40 = {a_19(D), b_23(D)};
  vect_cst__38 = {a_19(D), b_23(D)};
  vect_cst__34 = { 1.000000000000000055511151231257827021181583404541015625e-1,
1.000000000000000055511151231257827021181583404541015625e-1 };
  vect_cst__32 = {a_19(D), b_23(D)};
  vect_cst__30 = { 1.0e+0, 1.0e+0 };
  vect_cst__28 = { 5.0e-1, 5.0e-1 };
  vect_cst__27 = {a_19(D), b_23(D)};

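Only 4 of these 10 vectors are distinct, so some very basic CSE would bring the
prologue cost down to 4. The total vector cost would then be 14 + 4 = 18, below
the scalar cost of 22, and we would vectorize as LLVM does.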
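
The arithmetic can be checked with a small sketch. It assumes a simplified
model in which every prologue vector costs 1 (inferred from 10 vectors having
cost 10) and in which vectorization is accepted when the summed vector cost is
strictly below the scalar cost. The helper names are made up for this example
and are not GCC internals.

```c
#include <stddef.h>
#include <string.h>

/* The ten prologue vectors from the dump above, in the same order,
   written as strings so that duplicates can be detected. */
static const char *const prologue_vectors[] = {
  "{0.1, 0.1}", "{1.0, 1.0}", "{0.5, 0.5}", "{a, b}", "{a, b}",
  "{0.1, 0.1}", "{a, b}", "{1.0, 1.0}", "{0.5, 0.5}", "{a, b}",
};

/* Count distinct entries; quadratic, which is fine for ten items. */
static size_t count_distinct(const char *const *v, size_t n)
{
  size_t distinct = 0;
  for (size_t i = 0; i < n; i++) {
    int seen = 0;
    for (size_t j = 0; j < i; j++) {
      if (strcmp(v[i], v[j]) == 0) {
        seen = 1;
        break;
      }
    }
    if (!seen)
      distinct++;
  }
  return distinct;
}

/* Assumed acceptance rule: total vector cost strictly below scalar cost. */
static int vector_is_cheaper(int inside, int prologue, int epilogue, int scalar)
{
  return inside + prologue + epilogue < scalar;
}
```

With the numbers from the dump, vector_is_cheaper(14, 10, 0, 22) is false
(24 vs 22), while vector_is_cheaper(14, 4, 0, 22) is true (18 vs 22).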
