[Bug rtl-optimization/65078] [5 Regression] 4.9 and 5.0 generate more spill-fill in comparison with 4.8.2

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Mar 17 12:22:00 GMT 2015


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65078

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
During expansion, we don't try vec_extract because we are trying to extract the
low DImode (64 bits) out of a V16QImode pseudo, which is not really a vector
element extraction, and the middle end doesn't know that on this target it is
beneficial to just subreg the V16QImode pseudo to an identically sized vector
with differently sized elements (V2DImode in this case).
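
As a source-level illustration of why the subreg is attractive: element 0 of
the V2DImode view occupies the same eight bytes as the first eight elements of
the V16QImode value, so no real computation is needed. This is only a minimal
sketch using GCC's vector extensions; the helper names low64_via_cast and
low64_via_memcpy are hypothetical and are not part of GCC or of the testcase.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char V __attribute__((vector_size (16)));
typedef unsigned long long W __attribute__((vector_size (16)));

/* Hypothetical helper: reinterpret the 16 x 8-bit vector as a 2 x 64-bit
   vector (same size, so the cast only changes the type) and take
   element 0.  */
static unsigned long long
low64_via_cast (V y)
{
  return ((W) y)[0];
}

/* Hypothetical helper: copy the first 8 bytes of the vector's memory
   image, which is where element 0 of the 64-bit view lives.  */
static unsigned long long
low64_via_memcpy (V y)
{
  unsigned long long r;
  memcpy (&r, &y, sizeof r);
  return r;
}
```

Both helpers should return the same value for any input, which is what makes
a plain mode change followed by an element-0 extraction sufficient.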

So, in order to handle this at the expansion level, we would probably need to
add some new optab like vec_extract that would depend not just on the source
mode, but also on the target mode (a conversion optab?), or some target hook or
macro that would instruct the middle end to also try to subreg the vector mode
to another identically sized vector mode before trying vec_extract.

Immediately after the vec_extract check, we already convert the V16QImode to
TImode and force_reg it, so the vec_extract check is the last spot where
expansion can do something about it.
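
For comparison, a rough source-level analogue of the TImode route: view the
whole vector as one 128-bit integer and then take its low 64 bits. This is a
sketch under stated assumptions, not what the expander literally emits; it
assumes a little-endian 64-bit target where GCC provides unsigned __int128,
and the helper name low64_via_timode is hypothetical.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char V __attribute__((vector_size (16)));

/* Hypothetical helper: reinterpret the vector as a single 128-bit integer
   (the TImode view) and truncate it to its low 64 bits.  */
static unsigned long long
low64_via_timode (V y)
{
  unsigned __int128 t;
  memcpy (&t, &y, sizeof t);      /* V16QImode bits viewed as TImode */
  return (unsigned long long) t;  /* low part of the 128-bit value */
}
```

On a little-endian target the result is the same eight bytes that element 0
of a 64-bit-element view would give; the difference is only in which
intermediate mode the value passes through.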

To fix this up before reload, we have the option of either a !reload_completed
splitter or some combiner pattern(s).

A short testcase that shows hopefully optimal (or close to optimal) output for
f5-f8 and really bad code for f1-f4, both with -O2 -m64 and with
-O2 -msse2 -m32:

typedef unsigned char V __attribute__((vector_size (16)));
typedef unsigned long long W __attribute__((vector_size (16)));
typedef unsigned int T __attribute__((vector_size (16)));

void
f1 (unsigned long long *x, V y)
{
  *x = ((W)y)[0];
}

unsigned long long
f2 (V y)
{
  return ((W)y)[0];
}

void
f3 (unsigned int *x, V y)
{
  *x = ((T)y)[0];
}

unsigned int
f4 (V y)
{
  return ((T)y)[0];
}

void
f5 (unsigned long long *x, W y)
{
  *x = ((W)y)[0];
}

unsigned long long
f6 (W y)
{
  return ((W)y)[0];
}

void
f7 (unsigned int *x, T y)
{
  *x = ((T)y)[0];
}

unsigned int
f8 (T y)
{
  return ((T)y)[0];
}


