[Bug tree-optimization/51074] New: No constant folding performed for VEC_PERM_EXPR, VEC_INTERLEAVE*EXPR, VEC_EXTRACT*EXPR

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Nov 10 09:37:00 GMT 2011


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51074

             Bug #: 51074
           Summary: No constant folding performed for VEC_PERM_EXPR,
                    VEC_INTERLEAVE*EXPR, VEC_EXTRACT*EXPR
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jakub@gcc.gnu.org
                CC: irar@gcc.gnu.org, rth@gcc.gnu.org


We don't constant fold the above mentioned permutation trees even when all of
their operands are VECTOR_CSTs, although the result is then known at compile
time:

-O2:

#define vector(type, count) type __attribute__((vector_size (sizeof (type) * count)))

vector (short, 8) d;

void
foo ()
{
  vector (short, 8) a = { 0, 1, 2, 3, 4, 5, 6, 7 };
  vector (short, 8) b = { 8, 9, 10, 11, 12, 13, 14, 15 };
  vector (short, 8) c = { 0, 8, 1, 9, 2, 10, 3, 11 };
  d = __builtin_shuffle (a, b, c);
}

void
bar ()
{
  vector (short, 8) a = { 0, 1, 2, 3, 4, 5, 6, 7 };
  vector (short, 8) b = { 8, 9, 10, 11, 12, 13, 14, 15 };
  vector (short, 8) c = { 4, 12, 5, 13, 6, 14, 7, 15 };
  d = __builtin_shuffle (a, b, c);
}

or:

-O3 -fno-vect-cost-model -mavx:

char *a[1024];
extern char b[];

void
foo ()
{
  int i;
  for (i = 0; i < 1024; i += 16)
    {
      a[i] = b + 1;
      a[i + 15] = b + 2;
      a[i + 1] = b + 3;
      a[i + 14] = b + 4;
      a[i + 2] = b + 5;
      a[i + 13] = b + 6;
      a[i + 3] = b + 7;
      a[i + 12] = b + 8;
      a[i + 4] = b + 9;
      a[i + 11] = b + 10;
      a[i + 5] = b + 11;
      a[i + 10] = b + 12;
      a[i + 6] = b + 13;
      a[i + 9] = b + 14;
      a[i + 7] = b + 15;
      a[i + 8] = b + 16;
    }
}
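
To illustrate what folding would produce for the first testcase, here is a
rough scalar model of the two-operand __builtin_shuffle (mask elements taken
modulo 2*N, indexing into the concatenation of the two inputs). The helper
name fold_shuffle is made up for this example and is not an existing GCC
function; it only shows the value a folder could compute once all three
operands are constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: scalar model of the two-operand
   __builtin_shuffle for vectors of N shorts.  Element i of the result is
   element (mask[i] mod 2N) of the concatenation of a and b.  */
static void
fold_shuffle (const short *a, const short *b, const short *mask,
              short *result, size_t n)
{
  size_t i;

  /* Vector element counts are powers of two, so the modulo is a mask.  */
  assert (n != 0 && (n & (n - 1)) == 0);

  for (i = 0; i < n; i++)
    {
      size_t idx = (size_t) (unsigned short) mask[i] & (2 * n - 1);
      result[i] = idx < n ? a[idx] : b[idx - n];
    }
}
```

Since a and b together hold the values 0 to 15 in order, the folded result in
both foo and bar is simply equal to the mask c, so each function could reduce
to a single store of a VECTOR_CST to d.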

I wonder whether e.g. expand_vector_operations could handle those (if all the
arguments are either VECTOR_CSTs or SSA_NAMEs initialized to VECTOR_CSTs).
There is of course a risk: if we create many different VECTOR_CSTs from very
few VECTOR_CSTs in a loop, register pressure increases. So perhaps we'd want
to count how many VECTOR_CSTs we've created vs. how many we've got rid of, and
allow the number to grow only by some small constant or something similar.
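
A minimal sketch of that bookkeeping, purely as an illustration: the struct,
its field names and try_fold are all hypothetical and do not exist in GCC, and
the slack value is an arbitrary assumption. Each candidate fold reports how
many VECTOR_CSTs it would introduce and how many it would make dead, and the
fold is rejected once the net growth would exceed the allowed slack.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping, not existing GCC code.  */
struct vec_cst_budget
{
  int created;  /* VECTOR_CSTs introduced by folding so far */
  int removed;  /* VECTOR_CSTs made dead by folding so far */
  int slack;    /* allowed net growth, some small constant */
};

/* Record the fold and return true if it keeps the net growth within
   the slack; otherwise leave the budget untouched and return false.  */
static bool
try_fold (struct vec_cst_budget *budget, int would_create, int would_remove)
{
  int net;

  assert (would_create >= 0 && would_remove >= 0);

  net = (budget->created + would_create)
        - (budget->removed + would_remove);
  if (net > budget->slack)
    return false;

  budget->created += would_create;
  budget->removed += would_remove;
  return true;
}
```

In the first testcase each fold would replace three VECTOR_CSTs by one, so it
would always pass such a check; the limit only matters where a few constants
fan out into many distinct ones inside a loop.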

Plus, there is the question of whether the vectorizer shouldn't be aware of
that too (e.g. in the second testcase the vectorizer could take it into
account when computing costs), and whether, e.g. for interleaved constant
stores, it couldn't just do it right away.


