This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: char should be signed by default


> -----Original Message-----
> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of
> devils_advocate@austin.rr.com
> Sent: Wednesday, January 24, 2007 12:19 AM
> To: gcc@gcc.gnu.org
> Subject: char should be signed by default
> 
> GCC should treat plain char in the same fashion on all types of
> machines (by default).

No.  GCC should fit within the environment it is running in.  That's
the whole point of ABIs.  Even in the case of GNU/Linux, where you had a
clean slate at the beginning, there are now existing ABIs that you need
to adhere to.
 
> The ISO C standard leaves it up to the implementation whether a char
> declared plain char is signed or not. This in effect creates two
> alternative dialects of C.

During the standards process we called those "don't chars".  But there
are other places where the standard explicitly doesn't say which
alternative an implementation should choose (whether plain bitfields
sign extend or not, whether ints are 32 or 64 bits, etc.).

> The GNU C compiler supports both dialects; you can specify the signed
> dialect with -fsigned-char and the unsigned dialect with
> -funsigned-char. However, this leaves open the question of which
> dialect to use by default.

You use the ABI, which specifies whether chars and plain bitfields sign
extend or not.
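
As a rough illustration of how a program can find out which choice its
ABI (or a -fsigned-char/-funsigned-char override) made, here is a
minimal sketch using CHAR_MIN from <limits.h>.  The function names are
made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is signed in this build, 0 if it is unsigned.
   CHAR_MIN is 0 exactly when plain char is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The same question answered by conversion: -1 stored in an unsigned
   plain char becomes a large positive value, so the test is false. */
int plain_char_is_signed_by_conversion(void)
{
    char c = (char)-1;
    return c < 0;
}

/* Both checks describe the same property, so they must agree. */
void check_plain_char_consistency(void)
{
    assert(plain_char_is_signed() == plain_char_is_signed_by_conversion());
}
```

Which value comes back depends on the target and the flags in use, so
no particular result is claimed here.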
 
> The preferred dialect makes plain char signed, because this is
> simplest. Since int is the same as signed int, short is the same as
> signed short, etc., it is cleanest for char to be the same.

However, I've worked on machines that did not have a sign-extending
character load instruction, so you had to generate about 3 instructions
to sign extend a char.
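
To give a feel for the extra work, here is a sketch of sign extension
done by hand in C (mask, xor, subtract).  It is only one common idiom,
not necessarily the sequence any particular machine's compiler emitted,
and the function names are invented for the example.

```c
#include <assert.h>

/* Sign-extend the low 8 bits of value without a signed-byte load.
   Values 0..127 are unchanged; 128..255 map to -128..-1. */
int sign_extend_byte(unsigned int value)
{
    int low = (int)(value & 0xFFu);  /* keep only the low byte: 0..255 */
    return (low ^ 0x80) - 0x80;      /* flip the sign bit, then re-bias */
}

/* A few spot checks at the boundaries. */
void sign_extend_self_check(void)
{
    assert(sign_extend_byte(0x00u) == 0);
    assert(sign_extend_byte(0x7Fu) == 127);
    assert(sign_extend_byte(0x80u) == -128);
    assert(sign_extend_byte(0xFFu) == -1);
}
```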

During the standards process of the original C standard (ANSI C89),
Dennis Ritchie expressed an opinion that, in hindsight, making chars
signed was a bad idea, and that logically chars should be unsigned.
This is because outside of the USA, people use 8-bit character sets, and
you want to be able to index into arrays with character values.
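
The indexing problem can be sketched as follows: with signed chars, a
byte such as 0xE9 becomes a negative index, so the usual defence is to
convert through unsigned char first.  The function name and the
256-entry table are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Count how often each byte value occurs in a NUL-terminated string.
   counts must point to 256 entries. */
void count_bytes(const char *s, size_t counts[256])
{
    assert(s != NULL && counts != NULL);

    for (size_t i = 0; i < 256; i++)
        counts[i] = 0;

    for (; *s != '\0'; s++) {
        /* Converting to unsigned char makes bytes >= 0x80 index slots
           128..255 instead of a negative offset where char is signed. */
        counts[(unsigned char)*s]++;
    }
}
```

Written this way the table lookup behaves the same whether plain char
is signed or unsigned.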
 
> Some computer manufacturers have published Application Binary
> Interface standards which specify that plain char should be unsigned.
> It is a mistake, however, to say anything about this issue in an ABI.
> This is because the handling of plain char distinguishes two dialects
> of C. Both dialects are meaningful on every type of machine. Whether a
> particular object file was compiled using signed char or unsigned is
> of no concern to other object files, even if they access the same
> chars in the same data structures.

No, this is the whole purpose of an ABI, to nail down all of these
niggling details.  If you use either -fsigned-char or -funsigned-char,
you are essentially breaking the ABI.  Now in the case of chars, usually
it won't bite you, but it can if you include header files with structure
fields written for the ABI.
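
A small sketch of how a header can bite: the struct and its "-1 means
error" convention below are hypothetical, and the two accessor
functions stand in for code compiled under each convention.  It assumes
8-bit chars and the usual wrap-around conversion that GCC documents.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header written for an ABI where plain char is signed:
   status is documented as -1 on error. */
struct reply {
    char status;
};

/* What code built with signed plain char reads from the field. */
int status_signed_view(const struct reply *r)
{
    assert(r != NULL);
    return (signed char)r->status;
}

/* What code built with unsigned plain char reads from the same byte. */
int status_unsigned_view(const struct reply *r)
{
    assert(r != NULL);
    return (unsigned char)r->status;
}
```

The same stored byte reads as -1 in one view and 255 in the other, so a
test like `status == -1` written against the header's convention would
stop matching under the other setting.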
 
> A given program is written in one or the other of these two dialects.
> The program stands a chance to work on most any machine if it is
> compiled with the proper dialect. It is unlikely to work at all if
> compiled with the wrong dialect.

It depends on the program, and on whether or not chars in the user's
character set are sign extended (i.e., in the USA, you likely won't
notice a difference between the two if chars just hold character values).
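
One place the difference does show up is ordering.  The sketch below
compares two bytes the way each dialect would; the function names are
invented for the example, and it assumes 8-bit chars with wrap-around
conversion to signed char (as GCC does).

```c
#include <assert.h>

/* Compare two bytes as a build with signed plain char would. */
int compare_as_signed(unsigned char a, unsigned char b)
{
    int x = (signed char)a;
    int y = (signed char)b;
    return (x > y) - (x < y);
}

/* Compare two bytes as a build with unsigned plain char would. */
int compare_as_unsigned(unsigned char a, unsigned char b)
{
    return (a > b) - (a < b);
}

/* 7-bit values order identically; a byte with the high bit set does not. */
void compare_self_check(void)
{
    assert(compare_as_signed('a', 'z') == compare_as_unsigned('a', 'z'));
    assert(compare_as_signed(0xE9, 'z') == -1);
    assert(compare_as_unsigned(0xE9, 'z') == 1);
}
```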
 
> Many users appreciate the GNU C compiler because it provides an
> environment that is uniform across machines. These users would be
> inconvenienced if the compiler treated plain char differently on
> certain machines.

And many users appreciate that GNU C fits in with the accepted practices
on their machine.
 
> Occasionally users write programs intended only for a particular
> machine type. On these occasions, the users would benefit if the GNU C
> compiler were to support by default the same dialect as the other
> compilers on that machine. But such applications are rare. And users
> writing a program to run on more than one type of machine cannot
> possibly benefit from this kind of compatibility.
> 
> There are some arguments for making char unsigned by default on all
> machines. If, for example, this becomes a universal de facto standard,
> it would make sense for GCC to go along with it. This is something to
> be considered in the future.

Unfortunately you are usually limited by the choices you made at the
original implementation.  Any change involves a massive flag day.
 
> (Of course, users strongly concerned about portability should indicate
> explicitly whether each char is signed or not. In this way, they write
> programs which have the same meaning in both C dialects.)
> 
> 
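
As a sketch of what stating the signedness explicitly looks like, the
routine below takes unsigned char data, so its result does not depend
on how plain char is treated.  The function name is made up for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the bytes of a buffer.  The element type is spelled out as
   unsigned char, so each byte contributes 0..255 in either dialect. */
unsigned long byte_sum(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);

    unsigned long total = 0;
    for (size_t i = 0; i < len; i++)
        total += data[i];
    return total;
}
```

Had the parameter been plain char, bytes of 0x80 and above could have
been added as negative values on some targets and positive on others.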


--
Michael Meissner
AMD, MS 83-29
90 Central Street
Boxborough, MA 01719



