This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.

Re: how gcc thinks `char' as signed char or unsigned char ?

From: Tom St Denis <tstdenis at ellipticsemi dot com>
To: John Love-Jensen <eljay at adobe dot com>
Cc: PRC <panruochen at gmail dot com>, GCC-help <gcc-help at gcc dot gnu dot org>
Date: Wed, 05 Mar 2008 08:33:21 -0500
Subject: Re: how gcc thinks `char' as signed char or unsigned char ?
References: <C3F3F73A.2D860%eljay@adobe.com>

John Love-Jensen wrote:

typedef char byte;
byte b = GetByte();
if ((b & 0xFF) < 200)
{
  std::cout << "byte is under 200" << std::endl;
}

Presumably GetByte() would be responsible for ensuring its range is limited to 0..255 or -128..127, so the &0xFF is not required.

The explicit (b & 0xFF) converts the byte to an int, and ensures that it is between 0 and 255.

b < 200 would work just fine if GetByte() were spec'd properly.

Also, using 'byte' instead of 'char' is a way to convey, in code (rather than in comment) that you are working with byte information and not character information.

It's more apt to think of "char" as a small integer type, not a "character type." You could have a platform where char and int are the same size for instance.

Some may advocate using 'unsigned char' for byte. I used to advocate that, too (via: typedef unsigned char byte;). After a stint doing Java development, I've changed my mind and now much prefer using an unspecified 'char' for byte (via: typedef char byte;), and employ the (b & 0xFF) paradigm when/where needed. More typing, but -- in my opinion -- much better code clarity, better self-documenting code, and less obfuscation.

In two's complement it doesn't really matter, unless you multiply values of the type. You'd get a signed multiplication instead of unsigned. Which probably won't matter, but it's good to be explicit. Also, shifting works differently: unsigned right shifts fill with zeros, while signed right shifts may fill with zeros OR the sign bit.

It's IMO a better idea to use the unsigned type (that's why it exists) and write your functions so that their domains and co-domains are well understood. If you can't figure out what the inputs to a function should be, it's clearly not clearly documented. ;-)

Tom

Follow-Ups:
- Re: how gcc thinks `char' as signed char or unsigned char ?
  - From: John Love-Jensen

References:
- Re: how gcc thinks `char' as signed char or unsigned char ?
  - From: John Love-Jensen
