Bug 45437 - Loses reference during update
Summary: Loses reference during update
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 4.4.1
: P3 normal
Target Milestone: 4.7.0
Assignee: Jason Merrill
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2010-08-28 03:12 UTC by Ivan Godard
Modified: 2011-07-09 03:34 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-09-02 15:50:13


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ivan Godard 2010-08-28 03:12:43 UTC
This code:
#include <iostream>
struct s{
        s(bool& bb) : i(0), b(bb), j(0) {}
    int i;
    bool& b;
    int j;
    };

bool    f(s s1, s s2, s s3, int k) {
            s3.b = true;
            return false;
            }
int main() {
        bool bz = false;
        s sz(bz);
        sz.b |= f(sz, sz, sz, 3);
        std::cerr << sz.b << "\n";;
        return 0;
        }

prints this:
s3:~/ootbc$ g++ foo.cc
s3:~/ootbc$ a.out
0

It appears that "a |= f()" is being compiled as if it were the same as "a = a | F", with the compiler free to order the evaluation of a and f() in "a | f()" in any way it pleases. The order is exposed when f() has a side effect on a. However, my understanding of "a |= f()" is that it must be evaluated as if it were:
    bool& c = a;
    bool d = f();
    operator|=(c, d)
That is, the first argument to "|=" is a reference, not a value. Thus, both arguments (the reference and the function call) must be evaluated *before* "|=" is called and the reference will have already seen the side effect before "|=.
Comment 1 Andrew Pinski 2010-08-28 03:34:40 UTC
This code is undefined.

        sz.b |= f(sz, sz, sz, 3);

Has two setting of sz.b without a sequence point.
That is it can be interrupted as either:
bool temp = sz.b;
bool temp1 = f(sz, sz, sz, 3);
sz.b = temp | temp1;
or:

bool temp1 = sz.b;
bool temp = f(sz, sz, sz, 3);
sz.b = temp1 | temp;

>is that it must be evaluated as if it were:

No it must be evaluated as if it was:
    operator|=(a, f())
Comment 2 Ivan Godard 2010-08-28 04:11:53 UTC
"No it must be evaluated as if it was:
    operator|=(a, f())"

Exactly. The arguments (a and f()) must be evaluated to their parameter types (bool& and bool) before the call to |=. There *is* a sequence point; it's the call of |=, and the compiler is not permitted to move the side effect of f() into the body of |=, nor is it permitted to dereference "a" before the call. To quote Wikipedia: "A sequence point in C or C++ is ... Before a function is entered in a function call. The order in which the arguments are evaluated is not specified, but this sequence point means that all of their side effects are complete before the function is entered." citing 1998 standard section 1.9, paragraphs 16–18.

Please consult your colleagues on this.
Comment 3 Andrew Pinski 2010-08-28 04:15:15 UTC
>There *is* a sequence point.

No, the comma for function arguments is not a sequence point.  So a or f() could be evaluated first.  And then really |= is not a real function :).  It is an operator which the operator does not have a sequence point related to it.  This is why overriding || does not provide a sequence point.
Comment 4 Ivan Godard 2010-08-28 04:32:36 UTC
Yes, I understand that the comma is not a sequence point, and a may be evaluated (to a&) in any order w/r/t f() (to bool). But it is not legal to evaluate a to bool before the call of |=, because |= takes <emp>bool&</emp>, not <emp>bool</emp>.

If you still don't get my point here please check with someone else. This isn't the usual side-effect idiocy that you usually get. :-)

Suppose we have:
void   g(bool& bref, bool b) { std::cerr << bref << "\n"; }
bool   val = false;
bool   f(bool& bf) { bf = true; return false; }
g(val, f(val));

you would agree that the language requires that f be called before g and it prints true because val is never dereferenced as an argument, yes?
Comment 5 Richard Biener 2010-08-28 11:40:33 UTC
Note that internally there is no such thing as an operator|= for fundamental
types, but things are treated like in C.  If you were in C,

 sz.b |= f (&sz, &sz, &sz, 3);

there is no sequence point before |= as it's not a function call - and I am
not sure your reading of C++ is correct that there is a function call
to operator|= and thus a sequence point before it.  In fact 5.17/7 says
that E1 op= E2 is equal to E1 = E1 op E2 except that E1 is evaluated only
once.

I can confirm though that EDG returns 1 for your example (which by our
reading would be an as valid result).
Comment 6 Ivan Godard 2010-08-28 17:49:49 UTC
Thank you. Don't know about C, but this is C++ in which operators are function. BTW, even in C the standard goes to some lengths in places to make things that look like functions but have odd semantics be defined as macros so as to avoid the semantic guarantees.

If the standard says that C++ argument evaluation semantics is different for the arguments of a pre-defined function vs the arguments of a user-declared overload of the same function then this report is invalid. And I would be very surprised :-)

Is Gabe still with you guys? I'd guess he'd know.
Comment 7 Jonathan Wakely 2010-08-28 23:48:17 UTC
(In reply to comment #6)
> Thank you. Don't know about C, but this is C++ in which operators are function.

Builtin operators are not functions.

See e.g. footnote 12 on 1.9p18 in C++ 2003 which makes it clear that builtin operators have very different effects wrt sequence points from user-defined functions:

12) The operators indicated in this paragraph are the built-in operators, as described in clause 5. When one of these operators is over-loaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation, and the operands form an argument list, without an implied sequence point between them.


Comment 8 Jonathan Wakely 2010-08-29 00:55:24 UTC
The sequencing rules have changed in C++0x, but G++ doesn't implement them yet AFAIK, and I'm not sure if the new rules affect this example
Comment 9 Jonathan Wakely 2010-08-29 11:28:55 UTC
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1944 proposed the changes to sequencing wording, revised in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2239.html

The new wording makes it clear that a compound assignment such as |= has this sequence:
- evaluation of operands
- assignment
- evaluation of assignment expression i.e. evaluating result of (a|=b) for use in another expression

However I don't think that affects this case.

The builtin |= operator is not a function (see 5p2) and does not obey the rules of operand types of function calls, so it is not appropriate to describe the effects in terms of 
    bool& c = a;
    bool d = f();
    operator|=(c, d)
The operand to the builtin |= is an lvalue expression, not a reference. 

The new wording also makes it clear it is only evaluated once, before the assignment.  It is not specified whether evaluation of the left operand is sequenced before evaluation of the right operand, therefore the compiler is still permitted to evaluate sz.b (as false) then evaluate f(...) ( also false) then perform the assignment of sz.b=(false|false)

Thus, invalid.
Comment 10 Ivan Godard 2010-08-29 18:00:52 UTC
I agree that Nelson's proposal (in particular 5.17p1 -assignment and compound assignment operators) defines the ordering as:
- evaluation of operands
- assignment
- evaluation of assignment expression i.e. evaluating result of (a|=b) for use
in another expression

However you beg the question because you assume that "evaluation of operands" means "evaluation of rvalues derived from the operands". It does not; it means evaluation of an lvalue (i.e. a reference) and an rvalue respectively, because otherwise the assignment would not have an lvalue to assign to. Hence the function f() must be called before "|=" (the assignment) is performed.

I have left the report as "Resolved Invalid" because I'm sure that you are sick of me reopening it. However, that there is such a disagreement on the language semantics suggests that Nelson's wording is unclear at best and ambiguous at worse, and I have sent him a note to that effect.
Comment 11 Ivan Godard 2010-08-29 18:24:44 UTC
Note to Nelson, for the record here:
There is a disagreement about C++ sequence semantics happening in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45437

The gist is the following code:
    bool b = false;
    bool f() { b = true; return false; }
    b |= f();
The question is whether this is well defined (with b having the value true after the assignment) or undefined. Your proposal states:

5.17p1 (assignment and compound assignment operators):

    The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue with the type and value of the left operand after the assignment has taken place an lvalue referring to the left operand. The result in all cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single compound assignment operator. 容nd note ]

All agree that this governs; the question is what "the value computation of the right and left operands" of the |= assignment means.

In one interpretation, it means the rvalue evaluation of b and f(); in this interpretation the ordering of the operand evaluations is undefined and hence the value of b at and after the assignment is also undefined. It is unclear how the assignment receives an lvalue to assign to in this interpretation.

In the second interpretation, it means the lvalue evaluation of b and the rvalue evaluation of f(); in this interpretation the lvalue-to-rvalue conversion of b takes place within the assignment, so consequently f() must be evaluated before the conversion takes place. Hence, the assignment receives the (lvalue) b with a well-defined rvalue of true, and the value after the assignment is also well-defined and true.

Whichever of these interpretations you had intended, the extent of the disagreement suggests that the proposed wording, while fine for simple assignment, might be clarified for the compound assignment operators.

Respectfully,

Ivan Godard

Comment 12 Jonathan Wakely 2010-08-29 20:50:34 UTC
(In reply to comment #10)
> 
> However you beg the question because you assume that "evaluation of operands"
> means "evaluation of rvalues derived from the operands".

I assume nothing of the sort.

> It does not; it means
> evaluation of an lvalue (i.e. a reference)

Of course it means an lvalue, but that does not imply a reference.  If you read Clause 5 you'll see that the requirements on operands are in terms of lvalues, rvalues etc. and not references.  

You're still assuming that invocation of a builtin operator results in a function call, and reasoning in terms of the requirements of a function call.

5p2 makes it clear that the requirements on operands of builtin operators are different to the requirements on function calls. If that was not the case, why would the standard say that replacing a builtin operator with a user-defined oeprator uses the requirements of a function call?

> and an rvalue respectively, because
> otherwise the assignment would not have an lvalue to assign to. Hence the
> function f() must be called before "|=" (the assignment) is performed.

Yes, f() is called before the assignment, but it is indeterminately sequenced w.r.t the evaluation of sz.b

Comment 13 Jonathan Wakely 2010-08-29 22:39:51 UTC
Here's a reduced testcase, struct s is not relevant:

bool f(bool& b) {
  b = true;
  return false;
}

int main() {
  bool b = false;
  b |= f(b);
  return b;
}
Comment 14 Ivan Godard 2010-08-31 02:20:39 UTC
Reopened, based on following communication from Clark Nelson
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

> > In one interpretation, it means the rvalue evaluation of b and f(); in
> > this interpretation the ordering of the operand evaluations is undefined
> > and hence the value of b at and after the assignment is also undefined.
> > It is unclear how the assignment receives an lvalue to assign to in this interpretation.

I don't see how this interpretation is viable. As you point out, if you evaluate the LHS as an rvalue, there's no way to do the assignment.

> > In the second interpretation, it means the lvalue evaluation of b and the
> > rvalue evaluation of f(); in this interpretation the lvalue-to-rvalue
> > conversion of b takes place within the assignment, so consequently f()
> > must be evaluated before the conversion takes place. Hence, the assignment
> > receives the (lvalue) b with a well-defined rvalue of true, and the value
> > after the assignment is also well-defined and true.

I agree with this. (Except the part where you describe lvalue-to-rvalue conversion of b happening within the assignment. That conversion doesn't actually happen at all. Not that that matters to the conclusion.)


I looked at the discussion in the bug. The issue has nothing to do with whether the evaluation of |= acts like a function call (which of course it doesn't).

Instead, it has to do with whether the evaluation of |= is allowed to overlap with the execution of f.

Even before the turn of the century, function calls were described as having sequence points at their beginning and end, and that was always intended to be interpreted as disallowing overlapping execution between caller and callee.

But under the new formulation, the non-overlapping requirement is stated even more clearly. In C++, see 5.17p1:

With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.

In other words, it is not allowed to implement |= by fetching b, then calling f, evaluating the result of |, and storing that result. The evaluation of |= has to happen either entirely before or entirely after the execution of the called function, and in this case it clearly can't happen before. So this code has perfectly well-defined behavior, and it's what Ivan expects.

Now, all that being said, the "new rules" were intended primarily just to be a clearer formulation of what the old rules always were. It could be argued that this is just a "substantive clarification" of the old rules. But, for example, I didn't put it in of my own initiative; I was specifically asked to make things work this way.

Clark

Comment 15 Jason Merrill 2010-09-02 15:50:13 UTC
Indeed, C++0x 5.17p1 is quite clear.
Comment 16 Patrick Horgan 2011-03-23 07:03:09 UTC
Confirming that this is still occurring in gcc (GCC) 4.6.0 20110113 (experimental) where this code:

int
main()
{
    bool b1, b2, b3;
    b1 = b2 = b3 = true;
    b1 |= b2 |= b1 |=b3;
}

results in the error message:

testit.cpp:10:24: warning: operation on ‘b1’ may be undefined [-Wsequence-point]
Comment 17 Jason Merrill 2011-07-09 03:33:56 UTC
Author: jason
Date: Sat Jul  9 03:33:54 2011
New Revision: 176072

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176072
Log:
	PR c++/45437
gcc/
	* gimplify.c (goa_stabilize_expr): Handle RHS preevaluation in
	compound assignment.
gcc/c-family/
	* c-omp.c (check_omp_for_incr_expr): Handle preevaluation.
gcc/cp/
	* typeck.c (cp_build_modify_expr): Preevaluate RHS.

Added:
    trunk/gcc/testsuite/g++.dg/expr/compound-asn1.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/c-family/ChangeLog
    trunk/gcc/c-family/c-omp.c
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/typeck.c
    trunk/gcc/gimplify.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.dg/warn/sequence-pt-1.C
Comment 18 Jason Merrill 2011-07-09 03:34:53 UTC
Fixed for 4.7.0.