This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/55147] x86: wrong code for 64-bit load


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55147

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-10-31 16:07:11 UTC ---
For the testcase from this PR it actually creates better assembly than with
the #c1 patch (without that patch the code is both longer and wrong).  That is
because when bswapdi is split too late, nothing optimizes for the fact that
only 32 bits of the result are used.
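
To illustrate why the late split loses an optimization, here is a minimal
sketch, assuming the split of a 64-bit bswap on a 32-bit target amounts to two
32-bit bswaps with the halves exchanged.  The helper names bswap64_split and
bswap64_low32 are hypothetical and are not part of GCC or of the patch under
discussion.

```c
#include <stdint.h>

/* Hypothetical helper: a 64-bit byte swap written as two 32-bit swaps
   with the halves exchanged.  This is an assumption about the shape of
   the split code, not the actual RTL.  */
static uint64_t
bswap64_split (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) __builtin_bswap32 (lo) << 32) | __builtin_bswap32 (hi);
}

/* Hypothetical helper: if only the low 32 bits of the swapped value are
   used, a single 32-bit swap of the original high half is enough.  */
static uint32_t
bswap64_low32 (uint64_t x)
{
  return __builtin_bswap32 ((uint32_t) (x >> 32));
}
```

Once the operation is visible as two independent 32-bit swaps, the one whose
result is unused can be dropped as dead code; if the split happens too late,
no later pass is left to do that.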

For
unsigned long long
f1 (unsigned long long *p, int i)
{
  return __builtin_bswap64 (p[i]);
}

unsigned long long
f2 (unsigned long long p)
{
  return __builtin_bswap64 (p);
}

void
f3 (unsigned long long *p, int i, unsigned long long q)
{
  p[i] = __builtin_bswap64 (q);
}

void
f4 (unsigned long long *p, int i, unsigned long long *q)
{
  p[i] = __builtin_bswap64 (q[i]);
}

it creates the same number of insns/same quality (just slightly different RA
decisions/scheduling) for f1-f3, but for f4 without bswapdi2 it creates
slightly worse code (with bswapdi2 f4 needs just one call-saved register,
without it two, presumably because both bswap insns are scheduled together).

