This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: Big-endian Gcc on Intel IA32

From: Linus Torvalds <torvalds at transmeta dot com>
To: <dewar at gnat dot com>
Cc: <gcc at gcc dot gnu dot org>
Date: Mon, 17 Dec 2001 13:40:04 -0800 (PST)
Subject: Re: Big-endian Gcc on Intel IA32

On Mon, 17 Dec 2001 dewar@gnat.com wrote:
>
> This is a much trickier language feature to design than you would imagine.
> We have been struggling with this in Ada for a while.

Hmm.. It sounds like one of those "obvious in principle" things, but I can
imagine that it falls afoul of a lot of the gcc optimizations (ie x86.md
has a pattern for doing "load + and $255" with a "movzbl" instruction,
which is legal only on little-endian data: on big-endian you can still do
it, but you have to modify the address).
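
A minimal sketch of the addressing point above, written in portable C rather than as a machine-description pattern. The helper names (load_le32, load_be32, low8_le, low8_be) and the sample arrays are hypothetical and exist only for illustration; the idea is that "load a 32-bit word and mask with 255" collapses to a single byte load at the same address for little-endian data, but at address + 3 for big-endian data.

```c
#include <stdint.h>

/* Read a 32-bit value stored at p in little-endian byte order. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a 32-bit value stored at p in big-endian byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* "load + and 255" on little-endian data: one byte load, same address. */
static uint32_t low8_le(const unsigned char *p) { return p[0]; }

/* Same operation on big-endian data: the address must be adjusted by 3. */
static uint32_t low8_be(const unsigned char *p) { return p[3]; }

/* The value 0x11223344 laid out in each byte order (illustrative data). */
static const unsigned char sample_le[4] = { 0x44, 0x33, 0x22, 0x11 };
static const unsigned char sample_be[4] = { 0x11, 0x22, 0x33, 0x44 };
```

Applying the little-endian shortcut to sample_be would return 0x11, the high byte, which is the kind of silent miscompile an optimizer pattern would have to avoid.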

That's just the _really_ obvious kind of problem I can imagine off-hand. I
assume you've seen many many more..

However, I think that the most _fundamental_ problem is completely
independent of whether a simple and good implementation for gcc is even
feasible: it's not even clear that a byte-order attribute necessarily
helps porting of legacy applications all that much.

The problem is pointers to data - you must _never_ lose the byte-order
attribute by mistake, and you must never mix them. And a compiler (and
particularly a C compiler) has a really hard time asserting that people
don't mis-use pointers, with "void *" often being used as a "whatever".

So I realize that a lot of code is byte-order dependent exactly because
the code itself uses the same pointer in different ways (ie what happens
when you pass a byte-order-aware pointer to something like "memcpy()"?
It's ok if _both_ pointers are of the same byte order and the same type,
but not in general. And that's the _easy_ case, with a standard function
that the compiler could check for).
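
The memcpy() case can be sketched as follows. This is an illustration under stated assumptions, not anything from gcc: the struct names (wire_msg, disk_rec) and helpers (store_be32, load_le32, copy_and_misread) are hypothetical. The byte order is modelled by hand with byte arrays, so the result does not depend on the host.

```c
#include <stdint.h>
#include <string.h>

/* Two hypothetical records whose length fields use opposite byte orders. */
struct wire_msg { unsigned char len_be[4]; };   /* big-endian field    */
struct disk_rec { unsigned char len_le[4]; };   /* little-endian field */

/* Store v at p in big-endian byte order. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Read a 32-bit value stored at p in little-endian byte order. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* memcpy() moves raw bytes through void pointers, so whatever byte order
 * was attached to the source is gone by the time the destination is read. */
static uint32_t copy_and_misread(uint32_t v)
{
    struct wire_msg m;
    struct disk_rec r;

    store_be32(m.len_be, v);
    memcpy(r.len_le, m.len_be, sizeof r.len_le);
    return load_le32(r.len_le);   /* bytes come back in reversed order */
}
```

Here copy_and_misread(0x11223344) yields 0x44332211: the copy itself is fine, but nothing tells the compiler that the two fields disagree about byte order.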

So it may be that the feature itself is simply not very helpful, simply
because it's so hard to retrofit existing programs even if you had some
compiler support for the notion.

So the actual _implementation_ on a gcc level might be the least of your
troubles.

That said, it still sounds like one of those dangerously "simple and
clever" ideas.

On a tangential issue:

I actually think that it might be equally powerful to just have a way of
"tainting" certain pointers, and disallowing their use at compile-time
unless the recipient claims to accept the specific form of "tainting".
This is, in fact, more-or-less what the "const" qualifier does, but it
might be useful to allow user-defined "taints".

The reason this is tangential is that byte-order would be one such
potential use of "tainting" - not so much for compiler-assisted code
generation, but simply for compiler-assisted type-checking: allowing the
person who gets stuck with the job of fixing byte-order problems to
"taint" the pointers with byte-order information, and make the compiler
warn about it when a pointer is ever passed into any function that doesn't
expect that byte-order.

So the byte-order-attribute thing doesn't actually have to affect code
generation to be potentially useful.
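
One way to approximate this kind of type-checking-only "taint" in plain C, without any compiler extension, is to wrap the value in a single-member struct so that it cannot be passed where a plain integer is expected. This is a rough sketch, not the proposed feature: the type and function names (be32_t, be32_from_host, be32_to_host, header_length) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical "tainted" type: a 32-bit value held in big-endian order.
 * Being a distinct struct type, it cannot be mixed with uint32_t by accident. */
typedef struct { unsigned char b[4]; } be32_t;

/* Convert a host-order value into the tainted big-endian representation. */
static be32_t be32_from_host(uint32_t v)
{
    be32_t r;
    r.b[0] = (unsigned char)(v >> 24);
    r.b[1] = (unsigned char)(v >> 16);
    r.b[2] = (unsigned char)(v >> 8);
    r.b[3] = (unsigned char)v;
    return r;
}

/* The only sanctioned way back to a host-order integer. */
static uint32_t be32_to_host(be32_t x)
{
    return ((uint32_t)x.b[0] << 24) | ((uint32_t)x.b[1] << 16) |
           ((uint32_t)x.b[2] << 8) | (uint32_t)x.b[3];
}

/* A recipient that declares it accepts big-endian data. Passing a plain
 * uint32_t here is a compile-time error, which is the desired check. */
static uint32_t header_length(be32_t len)
{
    return be32_to_host(len);
}
```

A call such as header_length(1234) is rejected by the compiler, while header_length(be32_from_host(1234)) is accepted. The struct wrapper only covers values, not arbitrary pointers cast through "void *", so it is weaker than the qualifier-style tainting described above.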

(Inside the kernel, I'd love to be able to taint pointers and data that
came from user space, for example, to make sure that the compiler will
refuse to even _compile_ code that uses such data without the proper
safety checks. This is not all that different from keeping track of what
byte-order a specific datum has).
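
The user-space case can be sketched the same way in ordinary C. Everything here is a hypothetical stand-in, not kernel code: user_ptr_t is an opaque handle that cannot be dereferenced directly, user_region plays the role of the user address range, and fetch_from_user is an assumed name for the single function that performs the safety check before touching the data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tainted pointer: holds an address but offers no way to
 * dereference it except through fetch_from_user() below. */
typedef struct { uintptr_t addr; } user_ptr_t;

/* Stand-in for the range of addresses that count as "user space". */
static unsigned char user_region[64];

/* Destination buffer used by the usage example. */
static unsigned char kernel_buf[16];

/* Taint an ordinary pointer. */
static user_ptr_t make_user_ptr(const void *p)
{
    user_ptr_t u;
    u.addr = (uintptr_t)p;
    return u;
}

/* Copy n bytes from a tainted pointer after checking that the whole range
 * lies inside user_region. Returns 0 on success, -1 if the check fails. */
static int fetch_from_user(void *dst, user_ptr_t src, size_t n)
{
    uintptr_t lo = (uintptr_t)user_region;
    uintptr_t hi = lo + sizeof user_region;

    if (src.addr < lo || src.addr > hi || n > hi - src.addr)
        return -1;
    memcpy(dst, (const void *)src.addr, n);
    return 0;
}
```

Code that receives a user_ptr_t cannot use the data without going through the checked copy, because the struct has no pointer member to dereference. This enforces the check by convention and type mismatch only; a determined cast can still defeat it.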

Ehh?

		Linus

Follow-Ups:
- Re: Big-endian Gcc on Intel IA32
  - From: guerby
- Re: Big-endian Gcc on Intel IA32
  - From: Richard Henderson
- Re: Big-endian Gcc on Intel IA32
  - From: Ross Smith

References:
- Re: Big-endian Gcc on Intel IA32
  - From: dewar
