Thoughts on memcmp expansion (PR43052)

Joseph Myers joseph@codesourcery.com
Fri May 13 13:53:00 GMT 2016


On Fri, 13 May 2016, Bernd Schmidt wrote:

> On 05/13/2016 03:07 PM, Richard Biener wrote:
> > On Fri, May 13, 2016 at 3:05 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> > > Huh? Can you elaborate?
> > 
> > When you have a builtin taking a size in bytes then a byte is 8 bits,
> > not BITS_PER_UNIT bits.
> 
> That makes no sense to me. I think the definition of a byte depends on the
> machine (hence the term "octet" was coined to be unambiguous). Also, such a
> definition would seem to imply that machines with 10-bit bytes cannot
> implement memcpy or memcmp.
> 
> Joseph, can you clarify the standard's meaning here?

* In C: a byte is the minimal addressable unit; an N-byte object is made 
up of N "unsigned char" objects, whose successive addresses are reached by 
incrementing an "unsigned char *" pointer.  A byte is at least 8 bits.

* In GCC, at the level of GNU C APIs on the target, which generally 
includes built-in functions: a byte (on the target) is made of 
CHAR_TYPE_SIZE bits.  In theory CHAR_TYPE_SIZE could be more than 
BITS_PER_UNIT, or BITS_PER_UNIT more than 8, though support for either of 
those cases would be very bit-rotten (and I'm not sure there have ever 
been targets with CHAR_TYPE_SIZE > BITS_PER_UNIT).  Sizes passed to 
memcpy and memcmp are
sizes in units of CHAR_TYPE_SIZE bits.

* In GCC, at the RTL level: a byte (on the target) is a QImode object, 
which is made of BITS_PER_UNIT bits.  (HImode is always two bytes, SImode 
four, etc., if those modes exist.)  Support for BITS_PER_UNIT being more 
than 8 is very bit-rotten.
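
The same invariants can be sketched in terms of GCC's own RTL-level 
macros.  This fragment is illustration only: the names (gcc_assert, 
GET_MODE_BITSIZE, GET_MODE_SIZE, BITS_PER_UNIT) are real GCC internals, 
but it only makes sense inside a GCC source tree of roughly this era, 
not as a standalone program.

```c
/* Non-runnable sketch of the RTL-level byte invariants.  */
gcc_assert (GET_MODE_BITSIZE (QImode) == BITS_PER_UNIT);
gcc_assert (GET_MODE_SIZE (HImode) == 2);  /* two QImode bytes */
gcc_assert (GET_MODE_SIZE (SImode) == 4);  /* four QImode bytes */
```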

* In GCC, on the host: GCC only supports hosts (and $build) where bytes 
are 8-bit (though writing it as CHAR_BIT makes it clear that this 8 means 
the number of bits in a host byte).

Internal interfaces, e.g. those representing the contents of strings or 
other memory on the target, may not currently be well-defined except when 
BITS_PER_UNIT is 8.  Cf. e.g. 
<https://gcc.gnu.org/ml/gcc/2003-06/msg01159.html>.  But the above should 
at least give guidance as to whether BITS_PER_UNIT, CHAR_TYPE_SIZE (or 
TYPE_PRECISION (char_type_node), preferred where possible to minimize 
usage of target macros) or CHAR_BIT is logically right in a particular 
place.

-- 
Joseph S. Myers
joseph@codesourcery.com


