Volatile MEMs in statement expressions and functions inlined as trees

Sun Dec 16 10:14:00 GMT 2001

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

> On Sun, 16 Dec 2001, Jason Merrill wrote:

>> [conv.lval]: The value contained in the object indicated by the lvalue
>> is the rvalue result.

> An assignment -sets- that value.

> So, by implication an assignment inherently _knows_ what the "value
> contained in the object" is at the time of the assignment.

It knows what value we just set.  But since the object is volatile, the
value contained in the object might have changed out from under us
immediately after the store.  Or, for all we know, stores to that address
might have no connection to loads.

>> I don't see anything in the text to suggest that assignment is special
>> in this way.

> Why not? It's certainly one tempting way to read the standard, and I still
> don't see anything that forbids my reading. Even the discussion you quoted
> really seemed to _prefer_ my reading, would you not say?

Several people seemed to want things to work your way (the C way), but in
the end there was general agreement that the actual semantics are
different.  The last note from Andy Koenig was fairly definitive, and was
what we all agreed on at the October meeting.

> So the only thing you need to buy into my world-view is to just buy into
> the semantic conversion above: "on a semantic level the value we store to
> a volatile object _is_ the value it contains immediately after the
> assignment".

I don't, sorry.  If the lvalue is volatile, the only way we can determine
the value it contains is to read it.  The whole point of volatile is to
suppress optimizations based on what we think we know about the contents of
an object; there is no special exception for assignment expressions.

> [ Ok, on to the important part ]

> What is important is being able to clearly say that assignments make
> _one_ and _exactly_ one access to a volatile object. That's a really
> useful thing to have.

I can imagine that it would be.

> Let's say that I have a debugging macro that basically looks like

> 	#ifdef DEBUG
> 	#define ASSERT(cond, string, ...) \
> 		do { if (unlikely(!(cond))) printf(string , ##__VA_ARGS__); } while (0)
> 	#else
> 	#define ASSERT(cond, string, ...) \
> 		do { (cond); } while (0)
> 	#endif

> The way _I_ expect things to work, it should not _matter_ whether a value
> is used or not. The side effects of that expression take place, and the
> value is either discarded or not.

> The currently suggested gcc semantics are bad. If I write

> 	ASSERT( x = result , "What? X was zero? Punt punt punt!\n");

> and "x" is volatile, then the behaviour _changes_ depending on whether the
> value of the assignment is actually used or not. So with DEBUG enabled,
> gcc would load from the volatile "x" (possibly causing side effects),
> while with DEBUG disabled, gcc would only store to it.

> See how the current suggested semantics have very non-obvious and
> non-consistent behaviour?

I see why you don't like them.  But I still don't think your suggested
semantics are conforming.  And again, nobody should be writing anything
like that with volatile variables; they should write

  x = result;

by itself.

Nor do I think the behavior I've been describing is inconsistent.  Yes,
there's a difference between 'x = y' by itself and 'x = y' used in an
rvalue context, but this is simply a consequence of C++ lvalue semantics.
It's a simple rule: If only the address of an lvalue is needed, no load is
necessary.  If the contents are needed, a load is necessary.  This is the
same for all types of lvalues, no matter where they come from, whether from
naming a variable, pointer dereference, a function returning by reference,
or an assignment expression.
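
To make the rule concrete, here is a minimal sketch, assuming a C++11 compiler
for decltype and static_assert.  The names x, store_only, store_then_use and
demo are hypothetical, chosen only for illustration.  It shows that a built-in
assignment to a volatile int is itself an lvalue of that type, so whether a
load is needed depends only on whether the contents of the result are used.

```cpp
#include <cassert>
#include <type_traits>

volatile int x;

// In C++ a built-in assignment yields an lvalue naming its left operand,
// so the expression has type 'volatile int&' rather than plain 'int'.
static_assert(std::is_same<decltype(x = 1), volatile int&>::value,
              "assignment to a volatile int is a volatile int lvalue");

// Only the store is requested: the lvalue result is discarded unused.
void store_only(int result) { x = result; }

// The contents of the result are needed, so an lvalue-to-rvalue
// conversion is applied to the assignment's result.  Under the reading
// argued for above, that conversion is a read of x after the store.
int store_then_use(int result) { return x = result; }

// Self-check on ordinary memory, where a read after the store returns
// the value just stored, so both readings give the same answer.
void demo() {
    store_only(3);
    assert(x == 3);
    assert(store_then_use(7) == 7);
}
```

On ordinary memory the two functions are indistinguishable from the outside;
the difference only becomes observable when x names something like a device
register, which this sketch does not model.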

The same rule means that, as you have noted, no load is necessary for the
statements 'x;' or '*p;'; both have lvalue results, and no rvalue
conversion is done.  GCC has decided to make both of these access the
stored result for programmer convenience, but the access is not required by
the language.  More correct would be to write, say, '(int)x;'.
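
As a sketch of the distinction, under the wording discussed in this thread
(other revisions of the standard may treat discarded volatile lvalues
differently), the following contrasts naming a volatile object with
explicitly asking for its value.  The names x, name_only, forced_read,
read_value and demo are hypothetical.

```cpp
#include <cassert>

volatile int x = 5;

// Expression statement that merely names the lvalue.  Under the wording
// discussed above the language does not require a read here; GCC chooses
// to perform one anyway.  Compilers may warn about the unused expression.
void name_only() { x; }

// The cast asks for an rvalue of type int, which requires the
// lvalue-to-rvalue conversion and therefore a read of x.
void forced_read() { (int)x; }

// Using the value has the same effect and also hands it back.
int read_value() { return static_cast<int>(x); }

// Self-check on ordinary memory.
void demo() {
    name_only();
    forced_read();
    assert(read_value() == 5);
}
```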

>>>>> "Linus" == Linus Torvalds <torvalds@transmeta.com> writes:

> On 16 Dec 2001, Alexandre Oliva wrote:
>> So do you say that, in the following chunk of code:
>> 
>> volatile int i;
>> 
>> volatile int& f() { return i; }
>> 
>> i should be dereferenced, even though all we want is a reference to
>> it?

> Not in f() itself (we explicitly _ask_ for a "volatile int &"), but for
> example:

> 	volatile int i;

> 	volatile int& f() { return i; }

> 	int main()
> 	{
> 		f();
> 	}

> Then I think we are semantically and syntactically on the exact same
> ground. Inside main, we _should_ dereference "i", for all the same reasons
> that most people would think that we dereference "i" when it stands on its
> own.

I disagree; the statement already has non-trivial semantics, it calls a
function.  'i;' by itself would be a complete no-op without the GCC
implementation choice.
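
A small sketch of the case being argued, assuming a C++11 compiler for
decltype and static_assert; call_and_discard, call_and_read and demo are
hypothetical names added for illustration.  The call expression 'f()' is an
lvalue of type volatile int, and the statement 'f();' already does something
(it calls f) whether or not i is then read.

```cpp
#include <cassert>
#include <type_traits>

volatile int i;

// Returning by reference only binds a reference to i; nothing here asks
// for the value of i.
volatile int& f() { return i; }

// The call expression is an lvalue of type volatile int.
static_assert(std::is_same<decltype(f()), volatile int&>::value,
              "f() is a volatile int lvalue");

// The call is made and its lvalue result is discarded.  Whether i is
// also read is exactly the point in dispute; GCC may warn that the
// implicit dereference will not access the object.
void call_and_discard() { f(); }

// Here the contents are needed, so i is read.
int call_and_read() { return f(); }

// Self-check on ordinary memory.
void demo() {
    call_and_discard();
    f() = 4;                       // store through the returned reference
    assert(call_and_read() == 4);
}
```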

>> If you step back on this one, you're admitting that you can't just say
>> `every lvalue always decays to rvalue, be it by using the value last
>> assigned to it, be it by accessing it'

> That's not what I'm saying at all.

> I'm saying

> 	Every statement-expression decays to a rvalue.

Nope.

  6.2  Expression statement                                  [stmt.expr]

  The expression is evaluated and its value is discarded.  The
  lvalue-to-rvalue (_conv.lval_), array-to-pointer (_conv.array_), and
  function-to-pointer (_conv.func_) standard conversions are not applied
  to the expression.

Incidentally, Alexandre wondered earlier if a STMT_EXPR should ever be able
to have an lvalue result.  Yes, in the sense that it should be able to have
reference type.  But I think not in the sense that he meant.

Jason