This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/65709] [5 Regression] Bad code for LZ4 decompression with -O3 on x86_64


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65709

--- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Jeffrey Walton from comment #16)
> OK, so you'll have to forgive my ignorance again.
> 
> So you are saying that it may be a bug to use vmovdqa if the source and/or
> destination are not 16-byte aligned; but all the user code you have seen has
> undefined behavior so you're not going to answer. Is that correct?
> 
> (My apologies if it's too sharp a point. I'm just trying to figure out what
> the preconditions are for vmovdqa, and if it should be used when source or
> destination is 8-byte aligned).

I'm saying that we, as the compiler writers, know what we are doing, and that
the various cases (using unaligned accesses, peeling for alignment, versioning
for alignment, realigning arrays) are handled in the compiler.
These transformations do assume that the source is valid and does not trigger
undefined behavior.
For example, if you compile the following on x86_64 with -O3 -mavx2:
void
foo (int *a, int *b)
{
  int i;
  for (i = 0; i < 64; i++)
    a[i] = 2 * b[i];
}
you'll see that the compiler decided to peel for alignment of the b pointer: an
(unrolled) scalar loop first handles the first few iterations to align b if it
is not already aligned, and then the main vector loop uses vmovdqa for loads
and vmovups for stores (because the a pointer modulo 32 might not be the same
as the b pointer modulo 32).  If you compile with -O2 -ftree-vectorize -mavx2,
you'll see that peeling for alignment isn't performed, as it enlarges the
code, and vmovdqu is used for the loads instead.
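As a rough illustration of the -O3 case, the shape of the peeled loop can be
written out by hand in C.  This is only a sketch, not GCC's actual output: the
name foo_peeled is hypothetical, the inner 8-element loop stands in for a
single 32-byte vector operation, and the assert merely spells out the
precondition the transformation relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hand-written analogue of peeling for alignment on foo ().
   The real transformation happens inside the compiler.  */
void
foo_peeled (int *a, int *b)
{
  int i = 0;

  /* Precondition assumed by the transformation: b is int-aligned.
     Otherwise b + i can never reach a 32-byte boundary.  */
  assert ((uintptr_t) b % sizeof (int) == 0);

  /* Prologue: scalar iterations until b + i is 32-byte aligned.  */
  while (i < 64 && (uintptr_t) (b + i) % 32 != 0)
    {
      a[i] = 2 * b[i];
      i++;
    }

  /* Main loop: 8 ints (32 bytes) per step.  Loads from b are aligned here;
     stores to a may still be unaligned.  */
  for (; i + 8 <= 64; i += 8)
    for (int j = 0; j < 8; j++)
      a[i + j] = 2 * b[i + j];

  /* Epilogue: whatever is left after the last full 32-byte step.  */
  for (; i < 64; i++)
    a[i] = 2 * b[i];
}
```

The result is the same as for the plain loop; only the order in which the work
is grouped changes.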
The peeling for alignment assumes that there is no undefined behavior
originally, so if you call this with (uintptr_t) b % sizeof (int) != 0, it will
not work properly, but that is a bug in the code, not in the compiler.
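A minimal sketch of how calling code could stay within defined behavior when
its buffers might not be int-aligned.  The names is_int_aligned and foo_bytes
are hypothetical and not part of this PR or of GCC.  The check uses
sizeof (int) as above, which matches the alignment of int on x86_64, and the
byte-wise variant goes through memcpy so that no misaligned int pointer is
ever dereferenced.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p satisfies the alignment check above.  */
static int
is_int_aligned (const void *p)
{
  return (uintptr_t) p % sizeof (int) == 0;
}

/* Hypothetical variant of foo () for buffers of arbitrary alignment.
   Each element is copied into a properly aligned local with memcpy,
   doubled, and copied back out.  */
void
foo_bytes (unsigned char *a, const unsigned char *b)
{
  for (int i = 0; i < 64; i++)
    {
      int v;
      memcpy (&v, b + i * sizeof (int), sizeof v);
      v *= 2;
      memcpy (a + i * sizeof (int), &v, sizeof v);
    }
}
```

A caller could use is_int_aligned to decide whether passing the buffers to
foo () as int pointers is valid at all, and fall back to foo_bytes otherwise.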
So, if you have some testcase where there is no undefined behavior triggered
(try various sanitizers, inspect the code yourself, read the C standard), and
you are convinced the compiler introduces a bug where there isn't originally
(i.e. miscompiles the code), feel free to file a new bug report.
Nothing like that has been presented in this PR.

