This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Assignment operator

To: nathan at cs dot bris dot ac dot uk
Subject: Re: Assignment operator
From: Zoltan Kocsi <zoltan at bendor dot com dot au>
Date: Tue, 12 Oct 1999 10:50:28 +1000 (EST)
Cc: gcc at gcc dot gnu dot org
References: <14337.51260.355044.733335@tade.bendor.com.au><3801DCD8.FAAC783C@acm.org>
Nathan Sidwell writes:

 > > In the light of the above, would that be a possibility that the
 > > compiler behaviour could be changed (with a command line flag) so that
 > > the value of an assignment can be either the value of the left hand
 > > side read back (this is the current interpretation) or the value that
 > > is written to the left hand side
 > the value of an assignment is the value of the left hand operand after the
 > assignment. 
 
Yes, this is the wording of the standard. I suggested changing it to
"the value written to the LHS", but was told that you have to read
the standard as a whole.
The standard committee says that, according to the standard,
either interpretation (value written or value read back) is valid.

The read-back is, IMHO, against the intended semantics of the value of
the assignment operator. The K&R books implicitly assume that any
assignment expression's value is the value *written* to the left hand
side, not the one read back. (It is actually spelled out fairly clearly
on page 105 of the second edition.) Either they assumed that all C
compilers optimise the read-back out *and* that all left-hand operands
live in regular memory (unlikely, given C's close-to-iron nature), or
they intended the value of the assignment to be the value written to
the LHS.

If this were not the case, then of course the usual strcpy construct

   while ( *a++ = *b++ );
   
would not work, since strcpy is defined to terminate on the source
string. Egcs, for example, generates a read-back of '*a' even if it
is not volatile; therefore, if you try to copy the string to a
write-only or volatile buffer, egcs will blow up in your face.
Actually, Linux kernel/device driver developers had better be aware of
the fact that their strcpy() is written using the above-mentioned
loop and thus does not conform to the C library standard if compiled
by egcs. Embedded system developers are in trouble too. The strcpy()
in the C library as distributed by Cygnus is this:

    /*
    FUNCTION
    	<<strcpy>>---copy string
    
    INDEX
    	strcpy
    
    ANSI_SYNOPSIS
    	#include <string.h>
    	char *strcpy(char *<[dst]>, const char *<[src]>);
    
    TRAD_SYNOPSIS
    	#include <string.h>
    	char *strcpy(<[dst]>, <[src]>)
    	char *<[dst]>;
    	char *<[src]>;
    
    DESCRIPTION
    	<<strcpy>> copies the string pointed to by <[src]>
    	(including the terminating null character) to the array
    	pointed to by <[dst]>.
    
    RETURNS
    	This function returns the initial value of <[dst]>.
    
    PORTABILITY
    <<strcpy>> is ANSI C.
    
    <<strcpy>> requires no supporting OS subroutines.
    
    QUICKREF
    	strcpy ansi pure
    */
    
    #include <string.h>
    
    /*SUPPRESS 560*/
    /*SUPPRESS 530*/
    
    char *
    _DEFUN (strcpy, (s1, s2),
    	char *s1 _AND
    	_CONST char *s2)
    {
      char *s = s1;
    
      while (*s1++ = *s2++)
        ;
    
      return s;
    }

I hope we can agree that this, despite the claims in its header, is
a non-conforming implementation, because if you compile it with egcs,
then egcs will generate a re-fetch of *s1 and terminate the loop on
that value. On a volatile destination it may never terminate, to
the amusement of the system developer.
The standard, however, says:

	The strcpy function copies the string pointed to by s2 (including 
	the terminating null character) into the array pointed to by s1. 
	If copying takes place between objects that overlap, the behavior 
	is undefined.

so the developer rightfully expected it to stop after processing the
'\0' in the source string. Either every implementation of strcpy() in
the past 30 years is broken because their implementors (including the
designers of the C language) did not grasp the correct interpretation
of the value of an assignment expression, or something is wrong with
the wording of the standard or the implementation of the compiler or
both.
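
For illustration, here is a minimal sketch of a copy loop whose
termination cannot depend on a read-back of the destination, whichever
interpretation of the assignment's value the compiler takes. The name
copy_string_no_readback is hypothetical and not taken from any library;
the loop tests a non-volatile temporary that holds the source character.

```c
/* Hypothetical helper, not part of any C library: copies src to dst and
   decides when to stop from the source character only, never from *dst. */
void copy_string_no_readback(volatile char *dst, const char *src)
{
    char ch;

    do {
        ch = *src++;        /* one read of the source */
        *dst++ = ch;        /* store; the value of this assignment is unused */
    } while (ch != '\0');   /* test the temporary, not the destination */
}
```

This only removes the dependency on the value of the assignment
expression. What counts as an access to a volatile object is still up
to the implementation, so it is not a guarantee about access counts.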

 > After assigning to a volatile what is its value? A reasonable
 > interpretation of that phrase would be to read the left hand object to
 > determine its value. Adding the switch you suggest would not aid portable
 > programming. 
 
Not having the switch does not support portable programming either.
For the above-mentioned strcpy() while loop (without any qualifiers)
egcs generates a read-back; gcc does not.

In addition, consider the following cases:

  char a, c;
  volatile char * volatile b; 

  a = *b = c;

You get 'c', fetch 'b', store the value of 'c' to '*b'. Now the
standard says "... the value of the left hand operand ...". The left
hand operand is *b. Since b is volatile, you should re-fetch 'b' as
well, shouldn't you? If you don't, then you interpret the standard as
'the value of the memory location at the address which was calculated
as the address of the left hand operand when the write took place'.
Mind you, this is exactly what gcc/egcs do. Other compilers can do as
they please: re-read 'b' or not, re-fetch '*b' or not.
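
As a sketch of what the "value written" reading would look like if
spelled out by hand, the chained assignment can be split so that 'b' is
fetched once and 'a' receives the value written rather than a value
read back through '*b'. The function name assign_without_readback is
hypothetical; the declarations are the ones from the example above.

```c
char a, c;
volatile char * volatile b;

/* Hypothetical rewrite of "a = *b = c;" with each access written out. */
void assign_without_readback(void)
{
    volatile char *p = b;   /* fetch 'b' once and keep the address */

    *p = c;                 /* store 'c'; the assignment's value is unused */
    a = c;                  /* 'a' gets the value written, not a re-read of *p */
}
```

Whether the compiler emits exactly one fetch of 'b' and one store to
'*p' here is still its own decision; the split only makes the intent
visible in the source.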

  volatile int a, b;

  (void) a;
  a = (b=0);
  (void) (b=0);
  ! (b=0);
  
The first expression generates a fetch of 'a' because its value is
referenced although it is thrown away. The second generates a read
of b, because it is the value of the b=0 expression and this value is
referenced. The third and fourth ones, however, do not generate a read
of 'b' despite the fact that, just like in the second expression, it is
the value of the b=0 expression, which, just like in the first
expression, is implicitly referenced (by the cast or by the ! operator)
and then thrown away.
Yes, the standard allows gcc to do this, but where's the consistency?
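
One way to sidestep the question in source code, shown here only as a
sketch, is to write each intended access as its own statement so that
no statement uses the value of an assignment to a volatile object. The
function name explicit_accesses is hypothetical.

```c
volatile int a, b;

/* Hypothetical rewrite of "(void) a; a = (b=0);" with no statement
   relying on the value of an assignment to a volatile object. */
int explicit_accesses(void)
{
    int seen = a;   /* the intended read of 'a' */

    b = 0;          /* plain store to 'b' */
    a = 0;          /* store the same constant instead of the value of (b=0) */

    return seen;    /* value observed by the read of 'a' */
}
```

This makes the source unambiguous about which reads are wanted, but it
gives up exactly the compact C idioms that the chained form allows.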

 > No. the way to avoid these traps it to avoid the kind of construct you
 > are having problems with. When dealing with volatile objects, copy them
 > into non-volatile temporaries, do the work and then write the results
 > into the volatile objects. Then you will avoid pitfalls.

No, you don't. The compiler has every right to generate as many write
and read accesses to volatile objects as it likes, even in the simple
x = 0; construct.
On the other hand, having explicit control over the way the compiler
treats volatile objects, and some consistency in the treatment, would
allow you to use those C constructs which make C a great language
while getting predictable results. Currently, the only way to
explicitly control accesses to volatile objects is through assembly
routines. As a matter of fact, everyone should re-code a few of their
standard lib str...() and mem...() functions in assembly if they want
a conforming implementation. (If you restrict yourself to egcs, then a
re-coding in C is sufficient.)

Alternatively, you can write test cases for your accesses and
test them with your current compiler version and with every other
version thereof, to see which C idioms compile to how many accesses,
and of which kind, for the various kinds of objects. You can't even
trust a single fetch or store as in x = a; or a = x; . You can *assume*
that it will compile to a single read or write but you can't be sure.
I wouldn't call that very portable programming ...

In addition, I use gcc in part for its non-standard extensions. 
Those make code written for gcc *really* unportable. 
My suggestion, on the other hand, does not change gcc's conformance, 
or the portability of the code. According to the standard, no code 
that accesses a volatile object, declared so or otherwise, is
portable. 

What I suggest, however, would allow those who use gcc to program
systems full of volatile things to write code using
constructs which match the spirit of the K&R books (and thus the
language, IMHO). It would also eliminate a handful of tricky
situations (such as a non-conforming strcpy()).
Providing a flag for that would make these people happy and cause no
disturbance to others. The side effect is a more predictable and
controllable compiler behaviour.

 > If this results in code that is too slow, you have entered the realm of
 > machine specific optimizations.

Well, I think that the optimisation issues are worth a thread of
their own. This re-fetch issue has nothing to do with code
speed; it has to do with unpredictable access patterns. It doesn't
hurt you if you don't touch HW or shared memory, but it will burn you
if you do.

Regards,

Zoltan
Follow-Ups:
- Re: Assignment operator
  - From: Rask Ingemann Lambertsen
References:
- Assignment operator
  - From: Zoltan Kocsi
- Re: Assignment operator
  - From: Nathan Sidwell