| Summary: | Document trap representations of _Bool | | |
|---|---|---|---|
| Product: | gcc | Reporter: | gnzlbg <gonzalo.gadeschi> |
| Component: | c | Assignee: | Not yet assigned to anyone <unassigned> |
| Status: | UNCONFIRMED --- | | |
| Severity: | normal | CC: | jsm28, msebor, vstinner |
| Priority: | P3 | Keywords: | documentation |
| Version: | 9.0 | | |
| Target Milestone: | --- | | |
| Host: | | Target: | |
| Build: | | Known to work: | |
| Known to fail: | | Last reconfirmed: | |
Description
gnzlbg
2019-01-02 13:51:44 UTC
(In reply to gnzlbg from comment #0)
> Compiling
>
> unsigned int foo(unsigned int x, _Bool b) {
>     return x - (unsigned int)b;
> }
>
> only produces correct results if the value of `_Bool` is either `0` or `1`

Because (unsigned int)b is undefined otherwise.

> [0], see https://gcc.godbolt.org/z/l0DPjc:
>
> foo:
>     movzx esi, sil
>     mov eax, edi
>     sub eax, esi
>     ret
>
> This probably means that all other representations of `_Bool` are trap
> representations, but this does not appear to be documented anywhere.

The representation of _Bool is unspecified, not implementation-defined, so doesn't need to be documented.

> Because (unsigned int)b is undefined otherwise.
AFAICT this is only undefined behavior iff `b` has a trap representation.
Yes, and an implementation is not required to document which object representations are trap representations.

Without that information, how does one know which values a valid program can write to a `_Bool` via a `char*`? AFAIK the C standard guarantees that 0x0 must be a valid representation of _Bool, but there are no guarantees about the bit-pattern of true beyond that such a value must exist.

You can copy the bit-pattern from any _Bool with true value, e.g. one initialized with 'true' or an expression like '0==0'. Why do you need more than that?

> Why do you need more than that?

I'm reading raw data from a file which supposedly contains _Bool's and I'd like to validate it (the _Bools could have been written to the file by a program compiled with a different C toolchain).

> You can copy the bit-pattern from any _Bool with true value,

The standard does not guarantee that only one such bit-pattern exists AFAICT, i.e., there might be multiple bit-patterns representing true and false, e.g., if only the first bit is used to represent true and false, and all other bits are ignored (e.g., as opposed to just being zero, like the SysV AMD64 ABI requires).

(In reply to gnzlbg from comment #2)
> > Because (unsigned int)b is undefined otherwise.
>
> AFAICT this is only undefined behavior iff `b` has a trap representation.

Not necessarily. It's undefined if b's value is indeterminate, whether or not it's a trap representation, or whether or not b's type even has a trap representation. See C Defect Report 451 for some background: http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_451.htm

I don't think copying arbitrary bits into an object changes that, unless those bits come from an initialized object of the same type in the same program execution.

That said, there has been a lot of confusion about padding bits and trap representations so I'm not completely unsympathetic to the request, even though, as Jonathan says, those aspects of types are unspecified.
But rather than documenting which bits are padding bits I think it should be sufficient to either mention which types have padding bits, or expose some additional Common Predefined Macros to make it possible to determine which ones do (and perhaps even compute how many).

> I think it should be sufficient to either mention which types have padding bits,

I am not sure. An intrinsic that tells me that _Bool has 7 padding bits does not provide me with any new information. The C standard guarantees that _Bool has 1 value bit, so if `sizeof(_Bool)` returns N, then _Bool must have N * CHAR_BIT - 1 padding bits AFAICT. My question is which values those padding bits are allowed to take, which is unspecified in the C standard AFAICT.

N1356 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1356.htm) stated:

> GCC defines it to have one value bit with the other bits being padding bits and undefined behavior if you access a _Bool representation with any of the padding bits having a nonzero value (such representations being trap representations)

Documenting that this is how GCC defines the values that the padding bits in _Bool are allowed to take would be a useful guarantee, even if the standard does not require GCC to make this guarantee.

But it constrains GCC in future, which leaving it unspecified does not.

> But it constrains GCC in future, which leaving it unspecified does not.
Documenting whether GCC's C _Bool has the same valid and trap representations as the target platform's ABI specifies is a trade-off: it does have a cost as you said, but it also adds value.
The question is whether this trade-off is worth it.
I am not a compiler expert, but using the same representation of _Bool as the platform's ABI allows GCC to avoid conversions on function arguments, return values, and when passing _Bools through memory. It appears to me that GCC would want to avoid doing these conversions anyways. An alternative here would be to, instead of guaranteeing this behavior, document the current behavior with a disclaimer that the behavior can change. So the cost of documenting this could be kept fairly small.
Value-wise, if I want to cast an array of char to an array of _Bool, this guarantee allows me to check whether doing so will introduce undefined behavior, which I think is very valuable.
So from my pov, documenting current behavior without guaranteeing it has almost zero cost, and adds a lot of value.
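The check described above could look roughly like the following. This is only a sketch under a stated assumption: that the producer of the data wrote each _Bool as a single byte holding 0x00 or 0x01, which matches the behavior N1356 attributes to GCC and what the SysV AMD64 ABI requires, but which GCC does not document. The helper names `bytes_are_valid_bools` and `load_bools` are hypothetical. The bytes are inspected as `unsigned char` and converted by value, so no possibly invalid _Bool object is ever read.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte is 0x00 or 0x01, else 0.
 * ASSUMPTION (not documented by GCC): these are the only two valid
 * one-byte representations of _Bool in the input data. */
static int bytes_are_valid_bools(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (buf[i] > 1)
            return 0;   /* some bit other than the lowest one is set */
    }
    return 1;
}

/* Hypothetical helper: validates n raw bytes, then converts them.
 * Returns 0 on success, or -1 (leaving out untouched) if any byte
 * is rejected by the check above. */
static int load_bools(_Bool *out, const unsigned char *buf, size_t n)
{
    if (!bytes_are_valid_bools(buf, n))
        return -1;
    for (size_t i = 0; i < n; i++) {
        /* Conversion by value: well defined for any unsigned char,
         * independent of how _Bool is represented in memory. */
        out[i] = (_Bool)buf[i];
    }
    return 0;
}
```

Converting by value costs a copy, but it avoids relying on the object representation at all. Reinterpreting the buffer in place as an array of _Bool would additionally depend on the undocumented representation discussed in this report.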
I disagree. Once it's documented, people will rely on it and scream if it changes. Caveats about something maybe changing in future don't help. If it's documented to behave one way today, people will depend on that.

It seems you already know what the behaviour is today, so how would documenting it but saying "this might change tomorrow!" help you? It tells you nothing you don't already know.

> I disagree. Once it's documented, people will rely on it and scream if it changes. Caveats about something maybe changing in future don't help. If it's documented to behave one way today, people will depend on that.

That's fair.

> It seems you already know what the behaviour is today

If you tell me that my thoughts about how this currently works are correct then that documents current behavior, and my code will depend on this.

> so how would documenting it but saying "this might change tomorrow!" help you? It tells you nothing you don't already know.

If this was documented somewhere for a particular version of GCC, when my code is compiled with that particular GCC version, I could check inputs for invalid _Bools in my programs and abort reliably without triggering undefined behavior. If this is not documented anywhere, I can at best write code that "maybe aborts or maybe has undefined behavior". I find the difference very significant.

*** Bug 98190 has been marked as a duplicate of this bug. ***
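For comparison, the portable approach suggested earlier in the thread (copying the bit-pattern from a _Bool known to be true) might be sketched as follows. The helper name `classify_bool_bytes` is hypothetical. It compares raw bytes against the representations the current implementation produces for `true` and `false`, and never reads the input through a _Bool lvalue.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: classifies sizeof(_Bool) raw bytes.
 * Returns 1 if they match this implementation's representation of true,
 * 0 if they match its representation of false, and -1 otherwise. */
static int classify_bool_bytes(const unsigned char *raw)
{
    const _Bool t = true;
    const _Bool f = false;

    if (memcmp(raw, &t, sizeof t) == 0)
        return 1;
    if (memcmp(raw, &f, sizeof f) == 0)
        return 0;
    /* Not one of the two observed patterns. As noted in the thread, the
     * standard does not rule out further valid patterns, so this result
     * is conservative rather than proof of a trap representation. */
    return -1;
}
```

This only recognizes the patterns produced by the toolchain that compiles the check, so it does not settle the case raised above of data written by a different toolchain.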