Bug 110348 - [C++26] P2741R3 - User-generated static_assert messages
Summary: [C++26] P2741R3 - User-generated static_assert messages
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 14.0
: P3 normal
Target Milestone: ---
Assignee: Jakub Jelinek
URL:
Keywords:
Depends on:
Blocks: c++26-core
  Show dependency treegraph
 
Reported: 2023-06-21 16:25 UTC by Marek Polacek
Modified: 2023-09-12 15:21 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2023-06-21 00:00:00


Attachments
gcc14-pr110348-wip.patch (2.39 KB, patch)
2023-08-23 16:46 UTC, Jakub Jelinek
Details | Diff
gcc14-pr110348.patch (5.17 KB, patch)
2023-08-24 11:22 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Marek Polacek 2023-06-21 16:25:10 UTC
See <https://wg21.link/P2741R3>.
Comment 1 Andrew Pinski 2023-06-21 22:47:11 UTC
Confirmed.
Comment 2 Jakub Jelinek 2023-08-23 16:46:01 UTC
Created attachment 55780 [details]
gcc14-pr110348-wip.patch

Untested WIP on this.  Now need to figure out which usual cases I'm going to handle an easy way (I think STRIP_NOPS of ADDR_EXPR of STRING_CST, POINTER_PLUS of that plus INTEGER_CST, ADDR_EXPR of a VAR_DECL with STRING_CST DECL_INITIAL,
ADDR_EXPR of ARRAY_REF of a VAR_DECL with STRING_CST DECL_INITIAL/STRING_CST
for start, for others I'll need to build ARRAY_REF for each char and evaluate).
Plus testcase coverage.
It isn't clear what will or should happen if one uses some special execution
character set, because then strings and literals will be translated into that
execution character set and writing that as message might be weird.
Comment 3 Jakub Jelinek 2023-08-23 20:23:06 UTC
I wonder if the paper wording isn't incorrect, or at least comparing the clang++
implementation vs. the paper gives some differences.

One (minor) is that they emit errors when the size () and/or data () members aren't constexpr, while the paper as voted in only requires that
"- the expression M.size() is implicitly convertible to std::size_t, and
- the expression M.data() is implicitly convertible to ”pointer to const char”."
unless the static assertion fails.  The WIP patch doesn't do that, only effectively diagnoses it during constant evaluation of those when the static assertion fails.

More important, they have in the testcase something similar to what I filed in PR111122, but let's use what works also in GCC:
struct T {
  const char *d = init ();
  constexpr int size () const { return 2; }
  constexpr const char *data () const { return d; }
  constexpr const char *init () const { return new char[2] { 'o', 'k' }; }
  constexpr ~T () { delete[] d; }
};
constexpr int a = T{}.size (); // Ok, a = 2
constexpr int b = T{}.data ()[0]; // Ok, b = 'o'
constexpr const char *c = T{}.data (); // Not constant expression, because it returns
// address from new which is later in the dtor deleted.
static_assert (false, T{}); // Valid?

"- M.data(), implicitly converted to the type ”pointer to const char”, shall be a core
constant expression and let D denote the converted expression,

– for each i where 0 ≤ i < N , D[i] shall be an integral constant expression, and"

Now, I believe T{}.data () is not a core constant expression exactly because it returns address of later deleted heap object, but sure, both T{}.data ()[0] and T{}.data ()[1]
are integral constant expressions.

I don't know how std::format if constexpr (is it?) or string_view etc. work, do they
need M.data () not be actual constant expression and only M.data ()[0] through M.data ()[M.size () - 1] constant expressions?  In the patch I can surely try to constant expr evaluate M.data () quietly and if it isn't constant expression, just use a slower way which will ask for each character individually.  More important question is what is the intention for the standard...
Comment 4 Jakub Jelinek 2023-08-24 11:22:23 UTC
Created attachment 55785 [details]
gcc14-pr110348.patch

Full untested patch.
Comment 5 Jonathan Wakely 2023-08-24 11:56:31 UTC
(In reply to Jakub Jelinek from comment #3)
> I wonder if the paper wording isn't incorrect, or at least comparing the
> clang++
> implementation vs. the paper gives some differences.
> 
> One (minor) is that they emit errors when the size () and/or data () members
> aren't constexpr, while the paper as voted in only requires that
> "- the expression M.size() is implicitly convertible to std::size_t, and
> - the expression M.data() is implicitly convertible to ”pointer to const
> char”."
> unless the static assertion fails.  The WIP patch doesn't do that, only
> effectively diagnoses it during constant evaluation of those when the static
> assertion fails.

I agree with your WIP patch. The requirements for data() and size() to be constant expressions are in p11 (11.2) which only apply if the static assertions fails.

 
> More important, they have in the testcase something similar to what I filed
> in PR111122, but let's use what works also in GCC:
> struct T {
>   const char *d = init ();
>   constexpr int size () const { return 2; }
>   constexpr const char *data () const { return d; }
>   constexpr const char *init () const { return new char[2] { 'o', 'k' }; }
>   constexpr ~T () { delete[] d; }
> };
> constexpr int a = T{}.size (); // Ok, a = 2
> constexpr int b = T{}.data ()[0]; // Ok, b = 'o'
> constexpr const char *c = T{}.data (); // Not constant expression, because
> it returns
> // address from new which is later in the dtor deleted.
> static_assert (false, T{}); // Valid?

Interesting one. The call to .data() occurs before the destructor, so I would naively expect it to be valid. I think it should be equivalent to:

constexpr int i = (T{}.data(), 0);

The data() pointer is valid during the lifetime of the T prvalue. I would expect that a failed static assertion creates some kind of scope for the constant expression M, and evaluation of the data() string occurs before the destruction of any objects created in that scope.

> I don't know how std::format if constexpr (is it?) or string_view etc. work,

std::format isn't a constexpr function. The "with this proposal" example in P2741R3 doesn't actually work with that proposal, it would require changes to std::format that haven't been proposed for the standard.

> do they
> need M.data () not be actual constant expression and only M.data ()[0]
> through M.data ()[M.size () - 1] constant expressions?

string_view doesn't require a data() function in any way, constexpr or not. It just takes a pointer and optional length.

> In the patch I can
> surely try to constant expr evaluate M.data () quietly and if it isn't
> constant expression, just use a slower way which will ask for each character
> individually.  More important question is what is the intention for the
> standard...

I think it has to be a constant expression, the slower fallback shouldn't be needed IMHO.
Comment 6 Jonathan Wakely 2023-08-24 12:10:38 UTC
(In reply to Jonathan Wakely from comment #5)
> I agree with your WIP patch. The requirements for data() and size() to be
> constant expressions are in p11 (11.2) which only apply if the static
> assertions fails.

In other words, I think the paper is clear and clang is wrong here.

Although arguably what clang does is more useful. I'm not sure why you'd want to use a non-constexpr size() or data() that only compiles as long as the static assertion passes. It means you won't find out that your user-generated message can't actually be printed until you run on a target where the assertion fails.

I suppose there could be a case where those functions are only constexpr sometimes, and that happens to coincide with exactly the conditions where the assertion fails. That seems unlikely though. It seems more likely that you would use an assertion to terminate compilation when your code is *missing* features, not make it fail only when features are present.

The wording voted into the draft also seems counter to the design expressed in the paper that says "The message-producing expression is intended to be always instantiated, but only evaluated if the assertion failed." Does instantiating it require it to be a valid constant expression, even if not evaluated?
Comment 7 Jonathan Wakely 2023-08-24 12:19:14 UTC
(In reply to Jonathan Wakely from comment #6)
> Although arguably what clang does is more useful. I'm not sure why you'd
> want to use a non-constexpr size() or data() that only compiles as long as
> the static assertion passes. It means you won't find out that your
> user-generated message can't actually be printed until you run on a target
> where the assertion fails.

In the general case, the user-generated string might only be a constant expression for some inputs, which is why it's unevaluated unless the assertion fails. And the member functions being non-constexpr is only one of many ways in which those functions could fail to be usable in constant expressions. So just checking for constexpr members isn't sufficient to avoid the problem of a message that can never be printed.
Comment 8 Jakub Jelinek 2023-08-24 12:28:04 UTC
(In reply to Jonathan Wakely from comment #6)
> (In reply to Jonathan Wakely from comment #5)
> > I agree with your WIP patch. The requirements for data() and size() to be
> > constant expressions are in p11 (11.2) which only apply if the static
> > assertions fails.
> 
> In other words, I think the paper is clear and clang is wrong here.
> 
> Although arguably what clang does is more useful. I'm not sure why you'd
> want to use a non-constexpr size() or data() that only compiles as long as
> the static assertion passes. It means you won't find out that your
> user-generated message can't actually be printed until you run on a target
> where the assertion fails.

Sure, if the standard is changed such that size() and data() must be constexpr,
it would be nice to check it.  Without full evaluation one can't guarantee it will
be a constant expression and there could be tons of other reasons why it isn't constant expression (say it returns some class and conversion operator isn't constexpr, or it throws, or has asm and many other reasons).

For the M.data () not being a constant expression, it depends on the exact wording as well.  Either the standard could drop the requirement that it is a core constant expression altogether, then e.g. nothing will require that (M.data (), 0) is constant expression when M.size () is 0.  Or it could say that (M.data (), 0) is an integer constant expression, then one can verify that, and then quietly try if M.data () is a constant expression; if it is, it can grab the message from it using say GCC's c_getstr if it works.  If it isn't, it would need to evaluate it character by character.

BTW, there is a third difference between my latest patch and clang++ implementation.
They reject static_assert (false, "foo"_myd); which I have in the testcase.  IMHO
"foo"_myd doesn't match the syntactic requirements of unevaluated-string as the https://eel.is/c++draft/dcl.pre#10 wording says, because unevaluated-string non-terminal is string-literal with some extra rules, while user-defined-literal is some other non-terminal.  And as I've tried to show in the testcase, a constexpr operator ""
can return something on which .size () / .data () can be called and can satisfy the requirements.
Comment 9 corentinjabot 2023-09-12 11:30:08 UTC
(In reply to Jakub Jelinek from comment #3)
> I wonder if the paper wording isn't incorrect, or at least comparing the
> clang++
> implementation vs. the paper gives some differences.
> 
> One (minor) is that they emit errors when the size () and/or data () members
> aren't constexpr, while the paper as voted in only requires that
> "- the expression M.size() is implicitly convertible to std::size_t, and
> - the expression M.data() is implicitly convertible to ”pointer to const
> char”."
> unless the static assertion fails.  The WIP patch doesn't do that, only
> effectively diagnoses it during constant evaluation of those when the static
> assertion fails.

During review in clang we felt that it diagnosing it it all cases
would be preferable to our users, as otherwise errors only manifest when the static assertion fails,
likely at a point where the person getting the diagnostic would not be able to act on it.
So we made it a warning that defaults to an error.

Note that core felt strongly we should not check for constant expressions at the time, 
but maybe opinions changed?

> 
> More important, they have in the testcase something similar to what I filed
> in PR111122, but let's use what works also in GCC:
> struct T {
>   const char *d = init ();
>   constexpr int size () const { return 2; }
>   constexpr const char *data () const { return d; }
>   constexpr const char *init () const { return new char[2] { 'o', 'k' }; }
>   constexpr ~T () { delete[] d; }
> };
> constexpr int a = T{}.size (); // Ok, a = 2
> constexpr int b = T{}.data ()[0]; // Ok, b = 'o'
> constexpr const char *c = T{}.data (); // Not constant expression, because
> it returns
> // address from new which is later in the dtor deleted.
> static_assert (false, T{}); // Valid?


See https://github.com/cplusplus/CWG/issues/350, because i was confused too.
`data()` is a core constant expression. the implementation should behave _as if_ `T{}.data ()[N]` is evaluated for each `N`
even if that would be pretty bad implementation strategy.

> 
> "- M.data(), implicitly converted to the type ”pointer to const char”, shall
> be a core
> constant expression and let D denote the converted expression,
> 
> – for each i where 0 ≤ i < N , D[i] shall be an integral constant
> expression, and"
> 
> Now, I believe T{}.data () is not a core constant expression exactly because
> it returns address of later deleted heap object, but sure, both T{}.data
> ()[0] and T{}.data ()[1]
> are integral constant expressions.
> 
> I don't know how std::format if constexpr (is it?) or string_view etc. work,
> do they
> need M.data () not be actual constant expression and only M.data ()[0]
> through M.data ()[M.size () - 1] constant expressions?  In the patch I can
> surely try to constant expr evaluate M.data () quietly and if it isn't
> constant expression, just use a slower way which will ask for each character
> individually.  More important question is what is the intention for the
> standard...
Comment 10 Jakub Jelinek 2023-09-12 12:03:11 UTC
(In reply to corentinjabot from comment #9)
> During review in clang we felt that it diagnosing it it all cases
> would be preferable to our users, as otherwise errors only manifest when the
> static assertion fails,
> likely at a point where the person getting the diagnostic would not be able
> to act on it.
> So we made it a warning that defaults to an error.

Then it should be a warning rather than error IMHO.  Because it isn't invalid, just
likely unintended.

> See https://github.com/cplusplus/CWG/issues/350, because i was confused too.
> `data()` is a core constant expression. the implementation should behave _as
> if_ `T{}.data ()[N]` is evaluated for each `N`
> even if that would be pretty bad implementation strategy.

Jason, do we have a way to test whether something is a core constant expression in the FE?  Seems the https://eel.is/c++draft/expr.const#13 checks are done in cxx_eval_outermost_constant_expression and I don't see a way to ignore them.

Because for N > 0, I think we can as well check it just by evaluating T{}.data ()[0]
etc., but for N == 0 we can't.
Comment 11 Jakub Jelinek 2023-09-12 12:23:18 UTC
(In reply to Jakub Jelinek from comment #10)
> Jason, do we have a way to test whether something is a core constant
> expression in the FE?  Seems the https://eel.is/c++draft/expr.const#13
> checks are done in cxx_eval_outermost_constant_expression and I don't see a
> way to ignore them.
> 
> Because for N > 0, I think we can as well check it just by evaluating
> T{}.data ()[0]
> etc., but for N == 0 we can't.

Maybe that is what Jonathan was suggesting, see if (T{}.data (), 0) is a constant expression.
Comment 12 Jakub Jelinek 2023-09-12 13:14:35 UTC
BTW, shall size() and data() be manifestly constant-evaluated?
I think it doesn't satisfy any of the https://eel.is/c++draft/expr.const#19
bullets (unlike first static_assert argument).
Comment 13 Jason Merrill 2023-09-12 15:20:41 UTC
(In reply to Jakub Jelinek from comment #12)
> BTW, shall size() and data() be manifestly constant-evaluated?
> I think it doesn't satisfy any of the https://eel.is/c++draft/expr.const#19
> bullets (unlike first static_assert argument).

Good point, I think we're missing some wording to make that all manifestly constant-evaluated; it absolutely should be.

(In reply to Jakub Jelinek from comment #10)
> Then it should be a warning rather than error IMHO.  Because it isn't
> invalid, just likely unintended.

Agreed.