With -O2, this testcase aborts on x86_64:

extern "C" void abort ();

enum E { A, B, C, D };

void CheckE(const E value)
{
  long v = value;
  if (v <= D)
    abort ();
}

int main()
{
  CheckE(static_cast<E>(5));
}

Presumably something in the middle- or back-end is believing what the C++ front end sets TYPE_MAX_VALUE to and optimizing based on the assumption that "value" cannot be greater than 3. I think the C++ standard can definitely be read to allow this optimization, but it means optimizing away bounds checking, which is a serious security problem. The standard should be fixed.
(In reply to comment #0)
> I think the C++ standard can definitely be read to allow this optimization

I would most definitely think so. 7.2/6 specifically says that the values an enum variable can take on are, in your example, 0...3. See, in particular, footnote 81 in C++98. Everything outside would invoke undefined behavior, and I like that the compiler can make inferences from that.

Don't we already have a warning that warns about assigning values outside the range of an enum?

W.
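(For reference, how that paragraph applies to the enum in comment #0, as I read it; an illustration, not normative text:)

enum E { A, B, C, D };   /* enumerators 0, 1, 2, 3 (repeated from comment #0) */

/* The smallest bit-field that can hold all enumerators is 2 bits wide, so
   the range of E under 7.2/6 is 0..3; static_cast<E>(5) therefore produces
   a value outside that range, which is what this thread is arguing about.  */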
Except that the conversion is defined to produce an unspecified value, not undefined behavior. A strict reading of that suggests to me that the compiler is supposed to truncate on conversion so the resulting value is one of the values of the enumeration, but that's certainly not what we want. I also raised this issue as http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1022
Subject: Re: G++ is too aggressive in optimizing away bounds checking with enums

> Except that the conversion is defined to produce an unspecified
> value, not undefined behavior. A strict reading of that suggests to me
> that the compiler is supposed to truncate on conversion so the resulting
> value is one of the values of the enumeration, but that's certainly not
> what we want.

I'm really not in a position to argue standardese with you, but my reading based on the paragraph I cited is that the enumeration has a certain range of values it can represent (which may be larger than the set of values in the enumeration). In my interpretation, any assignment of values outside the range must necessarily produce (unspecified) values within this range. I can see no reason to prohibit a compiler from applying a mask every time.

W.
Yes, that's my interpretation as well. But it's not what we do now, and requiring a mask on conversion seems like it would hurt performance more than just doing the comparison.
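(For concreteness, a minimal sketch of what "applying a mask every time" would amount to for the enum from comment #0; the helper name to_E is made up for illustration, nothing the front end actually emits:)

/* E's enumerators fit in 2 bits, so truncation on conversion would reduce
   any source value to 0..3.  */
inline E to_E (long v)
{
  return static_cast<E> (v & 0x3);   /* to_E (5) would yield 1, i.e. B */
}

Under that rule the comparison in CheckE really would be redundant, but every conversion would pay for the AND, which is the performance concern above.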
The middle-end trusts the C++ front end's TYPE_MIN/MAX_VALUE, which, per the C++ standard, is the range of an (unsigned) integer type with a precision just wide enough to represent all values in the enum. I did want to remove that optimization at some point, but people complained that it was highly useful (the C++ FE can still set TYPE_PRECISION accordingly if it wants to, but TYPE_MIN/MAX_VALUE cannot really be trusted in the middle-end, as conversions between integer types of the same precision but different TYPE_MIN/MAX_VALUE are 1) useless and 2) do not produce code to truncate the values). It's VRP, btw, that does this optimization. If I have your support I'll rip this optimization out of VRP ... (I didn't yet manage to create a wrong-code bug due to the useless-conversion thing, but I believe that is possible). I also believe that the TYPE_MIN/MAX_VALUE on enums might be needed for proper debug information.
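(To make the effect concrete, here is roughly what the testcase reduces to once VRP takes TYPE_MIN/MAX_VALUE at face value; an illustration of the described behavior, not actual compiler output, and the function name is made up:)

/* With "value" assumed to lie in [A, D] = [0, 3], the test "v <= D" is
   folded to true, so the function effectively collapses to an
   unconditional abort.  */
void CheckE_after_vrp (const E)
{
  abort ();
}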
Oh, and actually I don't think this bug is a wrong-code bug but at most a quality of implementation issue.
(In reply to comment #1)
> Don't we already have a warning that warns about assigning values outside the
> range of an enum?

pr43680.C:14:26: warning: the result of the conversion is unspecified because ‘5’ is outside the range of type ‘E’ [-Wconversion]

Implemented by yours truly in GCC 4.5 (or perhaps already in 4.4).
> I did want to remove that optimization at some point, but people
> complained that it was highly useful (the C++ FE can still set
> TYPE_PRECISION accordingly if it wants to, but TYPE_MIN/MAX_VALUE cannot
> really be trusted in the middle-end, as conversions between integer types
> of the same precision but different TYPE_MIN/MAX_VALUE are 1) useless and
> 2) do not produce code to truncate the values).

We had the exact same problem in Ada until 4.4; it was solved by setting TYPE_MIN/MAX_VALUE to their canonical values...

> I also believe that the TYPE_MIN/MAX_VALUE on enums might be needed for
> proper debug information.

...and this one by providing a get_subrange_bounds lang hook.
I'm not saying we *should* apply a mask (in fact, I think that would be silly). But we *could*, and if we did then VRP's actions might lead to faster but not different code. All I want to say is that VRP is a powerful tool. Let's not just throw the baby out with the bath water. W.
@7: But -Wconversion only warns if it knows the constant value; it doesn't warn about converting from an arbitrary int variable such as (importantly) a loop counter.

@5: It seems appropriate to me for VRP to optimize based on TYPE_MIN/MAX_VALUE; the problem is that the front end is lying. Either we need to apply a mask on conversion to the enum, or we should set TYPE_MIN/MAX_VALUE to the appropriate values for the mode. My preference is for the latter, and it sounds like Ada made the same choice. The debugging backends don't seem to care about TYPE_MIN/MAX_VALUE for enums.
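(A minimal illustration of the loop-counter case, reusing E and CheckE from comment #0; my example, not from the report:)

/* No -Wconversion diagnostic here: i is not a compile-time constant, yet
   it still reaches values (4..7) outside E's declared range.  */
void walk ()
{
  for (int i = 0; i < 8; ++i)
    CheckE (static_cast<E> (i));
}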
See also:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42810
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41425
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37281
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36170

There might be others which have already been closed as invalid/duplicate too, but I cannot find them with a very simple search :).
Created attachment 20343 [details]
patch

This patch causes the compiler to avoid this optimization while preserving all the related warnings (except pr33738.C, which specifically tests that this optimization is done).
I think this optimization is valuable in some cases, so I think this is a question of defaults, rather than of behavior per se. While it may be useful for some security-related applications not to eliminate the checks, it is clearly useful in other contexts to eliminate them. "Optimizing away bounds checking" is certainly not in and of itself a bug.

In some sense, this is similar to strict-aliasing, or signed-does-not-overflow optimization. There's a lot of code out there that depends on something that's not portable. It's not useful to break that code, but it is useful to be able to optimize. (One case where I think this optimization is likely to be valuable is in switch/if statements where you're cascading through values; this optimization can allow you to eliminate the catch-all case at the end, which is certainly a win for code size.)

I think the standard should say that you get an indeterminate value when you convert an out-of-range input to an enum (and that such a value may be outside the bounds of the enum range), but that the program is not allowed to go and reformat your disk. Ada has a notion of errors like this; the damage is bounded. This doesn't need to be in the dereference-NULL-pointer category of behavior, but neither should it have any guaranteed semantics, including that the value is masked down.

So, I think we should have a C++ front-end option that says "interpret enums strictly". I think it should be off by default for compatibility with existing code.
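(A hypothetical example of the cascading-switch case described in the comment above, using the enum from comment #0; illustrative only:)

/* Under strict enum bounds, e can only be A, B, C or D, so the compiler
   may drop the final catch-all branch entirely -- a code-size win.  */
int to_int (E e)
{
  switch (e)
    {
    case A: return 10;
    case B: return 20;
    case C: return 30;
    case D: return 40;
    }
  return -1;   /* dead if e is assumed to stay within [A, D] */
}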
Certainly optimizing away bounds checking is good when it is provably redundant, but that clearly doesn't apply to this case. That said, I'll go ahead and add the option.
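(For contrast, a sketch of a check that is provably redundant from the surrounding code alone, without relying on the enum's declared bounds; my example, not from the report:)

/* The second test is dominated by the first one: flow analysis alone shows
   v <= D at that point, so removing it does not depend on trusting
   TYPE_MIN/MAX_VALUE.  */
void checked_use (long v)
{
  if (v < A || v > D)
    return;
  if (v > D)      /* provably false here -- safe to optimize away */
    abort ();
}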
Subject: Re: [DR 1022] G++ is too aggressive in optimizing away bounds checking with enums

jason at gcc dot gnu dot org wrote:
> Certainly optimizing away bounds checking is good when it is provably
> redundant, but that clearly doesn't apply to this case.

Do you think this is different from signed integer overflow in loops? To me, it seems quite similar. That's a situation where the compiler will now optimize away the check in something like "for (int i = 0; i >= 0; ++i)", leaving us with an infinite loop. And, of course, that can hit you in a security context too.

  /* Here we know that "i" is positive.  */
  ...
  if (i + 100 <= 0)
    abort ();
  /* The check above will make sure this never overflows ...
     <scaryvoice>or will it?</scaryvoice>  */
  i += 100;

> That said, I'll go ahead and add the option.

Thanks,
I agree that it's similar to signed integer overflow. One significant difference is that this issue doesn't affect C. One strange thing here is that the VRP optimization only happens when the enum variable is converted to a different type; it doesn't happen if we just say "if (value <= D)." That seems like a bug.
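(Spelling out the two variants being contrasted, again with E from comment #0; the function names are made up and the behavior is as described above, not re-verified here:)

void CheckE_as_enum (const E value)
{
  if (value <= D)     /* compared as E: the check currently survives */
    abort ();
}

void CheckE_as_long (const E value)
{
  long v = value;     /* converted to long: VRP folds the test to "always true" */
  if (v <= D)
    abort ();
}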
Subject: Bug 43680

Author: jason
Date: Mon May 3 21:16:40 2010
New Revision: 159006

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=159006
Log:
	PR c++/43680
gcc:
	* c.opt (-fstrict-enums): New.
	* doc/invoke.texi (C++ Dialect Options): Document -fstrict-enums.
gcc/cp:
	* decl.c (finish_enum): Use the TYPE_MIN_VALUE and TYPE_MAX_VALUE
	from the selected underlying type unless -fstrict-enums.  Set
	ENUM_UNDERLYING_TYPE to have the restricted range.
	* cvt.c (type_promotes_to): Use ENUM_UNDERLYING_TYPE.
	* class.c (check_bitfield_decl): Likewise.

Added:
    trunk/gcc/testsuite/g++.dg/opt/enum2.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/c.opt
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/class.c
    trunk/gcc/cp/cvt.c
    trunk/gcc/cp/decl.c
    trunk/gcc/doc/invoke.texi
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.dg/warn/Wswitch-1.C
    trunk/gcc/testsuite/g++.dg/warn/pr33738.C
Fixed for 4.6.0.
It looks like the committee is making this code undefined again: http://open-std.org/jtc1/sc22/wg21/docs/cwg_toc.html#1766