Created attachment 47427 [details]
Test-case that reproduces the bug with gcc 7.4 and -Os

We've seen a miscompilation of Firefox with GCC 7.4 and -Os when we pass an `enum class : uint8_t` to an extern function. See https://bugzilla.mozilla.org/show_bug.cgi?id=1600735 for all the details.

It seems fixed in GCC 8. Bisection by redi (thanks!) points to https://gcc.gnu.org/PR82260 as the culprit for the fix, which seems to be just a missed-optimization fix (thus me filing this bug).

The basic setup is just calling an extern function with an `enum class : uint8_t` argument, where that extern function is compiled by LLVM (in this case, by the Rust compiler). LLVM seems to rely on the caller zero-extending the enum to 32 bits. Per bug 46942 it seems gcc usually does something like this too (and it does happen to be the case for non -Os builds).

In the bad case, GCC uses a single `mov sil,bl` to set up the argument, which leaves garbage in the upper bits of the register that LLVM assumes to be zeroed.

It seems either GCC should always zero-extend when calling an extern function, or LLVM should never rely on this zero-extension... I don't know what the relevant ABI specifications say, but I'm happy to file an LLVM bug if this is deemed not to be a GCC bug per se.
CC jakub who fixed it (or just made it latent) with r253050
I believe it is an LLVM bug. At least, reading https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf, I can't find anything in the Parameter Passing section that would say arguments smaller than 64 bits are passed sign- or zero-extended to 64 bits, the way some other psABIs require.

The only related thing is "When a value of type _Bool is returned or passed in a register or on the stack, bit 0 contains the truth value and bits 1 to 7 shall be zero." with a footnote: "Other bits are left unspecified, hence the consumer side of those values can rely on it being 0 or 1 when truncated to 8 bit." In other words, only the low 8 bits of a _Bool are significant, and the rest is unspecified.
If my reading of https://godbolt.org/z/BNHxEY is correct (sorry, still a bit of a noob with disassembly), it looks like GCC correctly loads only the low byte in the equivalent implementation of the Rust function, but clang does assume zero-extension to 32 bits... :/
I filed https://bugs.llvm.org/show_bug.cgi?id=44228 to get some input from LLVM folks.
Yes, we (intentionally) haven't required any extensions to happen for arguments or return values smaller than 64 bits (e.g. we haven't even specified that arguments <= 32 bits would be zero-extended in the high bits, as would have been natural with the instruction set). If LLVM relies on that, it is a bug.
Note that for most types one ends up with zero or sign extension already due to C/C++ promotion rules; the ABI still can't use that as a guarantee, though, and C++ typed enums aren't promoted in the first place.
Related links: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46942 https://groups.google.com/forum/?hl=en#!topic/x86-64-abi/E8O33onbnGQ
(In reply to Jakub Jelinek from comment #6)
> Note that for most types one ends up with zero or sign extension already due
> to C/C++ promotion rules, though in the ABI still can't use that as
> guarantee, but C++ typed enums aren't promoted.

I see. Another point that someone made in the Mozilla bug tracker is that the ABI, in the "Scalar Types" table, specifies that enums are represented as a `signed fourbyte`. I'm not sure that applies to typed enums, but if it does, the ABI may require sign/zero-extending these too?

Looking at https://bugs.llvm.org/show_bug.cgi?id=12207, which might be a dupe of the LLVM bug I just filed, doesn't make me very hopeful of LLVM changing their behavior here.

Would it make sense to make gcc always zero/sign-extend on the caller side, even if not required by the ABI or the language? This makes a bunch of calls from gcc-compiled to llvm-compiled binaries potentially unsafe.
(In reply to Emilio Cobos Álvarez (:emilio) from comment #8)
> Another point that someone made in the Mozilla bug tracker is that the ABI,
> in the "Scalar Types" table, specifies that enums are represented as a
> `signed fourbyte`.

It doesn't apply to typed enums, nor to normal enums that don't fit into the 4 bytes, such as enum E { E1 = -9223372036854775807LL, E2 = 9223372036854775807LL }; etc. Typed enums have a fixed underlying type, and that is what they use; in your case uint8_t.

> I'm not sure that applies to typed enums. But just in case, if it does, the
> ABI may require sign/zero-extending these too?
>
> Looking at https://bugs.llvm.org/show_bug.cgi?id=12207, which might be a
> dupe of the LLVM bug I just filed, doesn't make me very hopeful of LLVM
> changing their behavior here.

What really matters is what the psABI says, and as clarified above by one of the ABI authors, there is intentionally no zero or sign extension of shorter arguments. So LLVM has a bug, and that bug should be fixed there, not worked around in other compilers.
(In reply to Jakub Jelinek from comment #9)
> What really matters is what does the psABI say, and as you can see above
> clarified from one of the ABI authors, there is intentionally no zero or
> sign extension of shorter arguments.
> So LLVM has a bug and that bug should be fixed in there, not worked around
> in other compilers.

Well, we do make changes in Gecko to improve interop with other browser engines even when the spec doesn't explicitly specify a given behavior... But yeah, I totally get that, and I hate when we have to do that too. It was worth a shot though :)

At this point I guess this should be closed as invalid, unless someone or further evidence shows there's something to fix in GCC.

Thanks for all your work on GCC :)