This is the mail archive of the egcs@egcs.cygnus.com mailing list for the EGCS project. See the EGCS home page for more information.
[People tired of my explanations can skip to the bottom for a question
that is actually *pertinent* to the underlying issue. And, no, this
issue has not been "solved" per se, though some of the specifics have
been, for the time being, I gather. Search for "Getting back".]
>Indeed. Right now, to me, it seems like there is _no_ sane way to do
>what I want to do, and rely on what gcc will actually do. This is the
>result of having a bad definition of what things do.
Well...I look at it as the result of having a product (gcc) that
is trying to serve one kind of need and thus isn't really a good
fit for other kinds of needs.
>This is why I objected to the early suggestion to just make the
>documentation even more uncertain on exactly what "inline" did. I think
>the problem in the first place was documentation that said "almost" and
>"mostly" and "maybe" and "in a blue moon".
It should have been, and will become, even more vague, and intentionally
so. The "as if" rule should be more clearly alluded to in the gcc
docs; meanwhile, programmers reading docs about extensions to any
compilers should *assume* the "as if" rule still applies (in the sense
that it applies to the underlying standardized language) unless clearly
documented otherwise.
At least, I think this is the direction the gcc developers would choose
to go, as it avoids the problems resulting from gcc overcommitting
to a particular implementation of "inline" that will be clearly
suboptimal someday, speaking from a SWYM perspective.
>When you have documentation like that, you can't but rely on
>implementation, because the documentation doesn't actually give you
>anything to depend on. And then we end up in a situation where people
>like me depend on implementation and implementors point fingers at bad
>documentation.
The idea is to not rely on particular implementations compilers choose
*at all*, or, if you must for some extra-linguistic reason, you have
to take on all the responsibilities that entails. In other words,
if the compiler developers have not *guaranteed* you a particular
behavior given a particular combination of language constructs, now
and forever, don't rely on it, or, if you do, plan for the day when
it stops working as you expect.
I'm not saying that's easy. :)
>Testability goes to hell too.
Um...no, it's actually much easier in the end, because the compiler
is considered to "work" as long as it meets the specifications for
the language (which equals standard + specified extensions), rather
than do that *plus* meet various, somewhat arbitrary, expectations
regarding implementation choices.
What makes testability worse is any combination of too large a
language standard ("large" being not just language features, but
excess specifications of how the language is processed) and
too many extensions.
What would make testability of gcc easier in the DWIS-like direction
you favor is to rededicate it to conform to some newly specified
model of the underlying abstract state machine and of how compilers
generate code, and permanently eliminate all optimization phases that
would violate that model.
That would make things like loop unrolling go away, of course, but
clearly, with lots less "fancy" optimization, testing gcc would
be easier.
And, for a given amount of complexity, a DWIS system is, I think,
quite a bit easier to test (and validate) than a SWYM one, indeed.
The problem for gcc is that it's SWYM+DWIS right now, every
DWIS addition makes the problem worse, not better, and it's
almost *guaranteed* we'll be making SWYM additions as well anyway.
(It occurred to me, late last night, that part of the SWYM/DWIS
conflict here is that "inline", loosely speaking, has a pretty
clear meaning for compiler *developers* that is somewhat different
than what it means to the *users* of compilers. By "exporting" this
term to the outside world via constructs like "inline", that
conflict has been exposed. Do the compiler developers then promise
to forever give up their ideas of what "inline" means and the
optimizations they could possibly achieve, or do their users give
up their ad-hoc ideas of what it means? Pertinent language
standards will almost certainly evolve in the former direction,
and I expect gcc will too.)
>And yes, it's made worse by the fact that there _are_ actually sane ways
>to do _exactly_ what I want to do, but they don't work with older
>versions. People suggest I use #ifdef's etc, but that doesn't exactly
>help testability either, and makes one of the points of using inline
>functions go away completely - readability.
Right. I think it's just a bad fit, using gcc (or almost any C
compiler) for that purpose. I think a DWIS-like C compiler probably
exists out there, or, if not, it would be a great project to
undertake, as something using an almost entirely different code
base than gcc (though, offhand, I think it could start out sharing
the front ends, but that's about as far as it goes).
>I guess I count as a DWIS person - I think that just is what C is
>designed for. In systems programming, you just can't let the compiler
>do too many of the decisions for you.
Right. C was probably viewed as DWIS by its developers early on.
Of course, the concept of SWYM was considered pretty hard to
practically implement, so in effect, *everything* was DWIS,
except some things like the early Fortran compilers and such.
>Obviously, C++ is very much a SWYM environment, and the compiler copes
>best it can. That's one of the reasons we very quickly discarded C++ as
>a kernel language (we used to compile the Linux kernel with C++ a
>_loong_ time ago, back when people still thought OO was a sexy
>"must-have"). The compiler was just too unpredictable in many ways.
I'd say that's a pretty solid judgement, and would posit that what
you're running into is a combination of "feature creep" of SWYM-like
characteristics into the C language and especially the gcc (vs. g++)
compiler, and of optimizations (put into gcc) that rely more and more
on assuming C programmers are really SWYM'ers, not DWIS'ers. (Actually,
you're not really running into this *yet*; I'm getting ahead of myself,
as my main concern over these posts is to help everyone understand
this problem is going to get worse, not better, in the future, though
*which* of these problems we decide to accept has yet to be determined.)
And, of course, the reason C is a bad language for DWIS'ers is that
they're not permitted to fully express their DWIS constraints (which
would amount to SWYM regarding DWIS). `volatile' is an example
of that ability, but is nowhere near enough, and already it has
been overloaded somewhat.
It occurs to me that, in a sense, what you want in combining `extern
inline' with an address of a label is akin to making a particular
function *invocation* `volatile', and being able to report the address
of where that *invocation* occurs, in memory. One way to look at
`volatile' in this context is, "I want to be able to tell the CPU,
which supports hardware breakpointing on a PC value, to break exactly
when it gets to this *particular* invocation as it appears in the
source code, which means there must be a one-to-one mapping between
the source-code expression of an invocation and the machine-code
invocation itself, and I must have a reliable way of extracting the
address of that invocation so I can provide it to the hardware using
some extra-linguistic method."
Not that I'm proposing this as an extension (it'd still be feature
creep), but, if you had *that* feature, in that instance, you wouldn't
really care if the function invocation itself was inlined, right?
I mean, if the compiler *knew* it'd go faster if it was inlined, or
if it wasn't, and thus actively ignored "inline", that wouldn't
bother you for the case I think I'm talking about?
Still, it's overspecified for your purposes anyway: you don't really
*need* a one-to-one mapping. You wouldn't mind it being loop-unrolled
or similar, into multiple copies, as long as you could identify any
of *several* possible PC values as belonging to one particular
source invocation. (Well, I'm guessing about what you need here,
which is risky.)
So, if you did use `volatile' here, you'd be constraining the compiler
to *not* produce, perhaps, more optimal code, even though it might
be able to do so and still meet your *actual* objectives, which
are not communicable in C. For DWIS'ers, that's often acceptable,
while, for SWYM'ers, it isn't, and they tend to fight over keyword
or namespace "territory" the way people fight over `.com' addresses.
>>In the meantime, it'd be a good idea to decide which direction
>>gcc should go, which I think the industry as a whole has concluded is
>>"SWYM", since that's where C's going, and stick with it, completely,
>>or nearly, ignoring proposed extensions representing the other side,
>>because C, as a language, cannot support both paradigms (it can
>>barely support one).
>
>I would hope you don't propose to go too far. In many areas, the main
>strength of C is still as a reasonably portable low-level programming
>tool that doesn't get in your face.
Yes. I've seen the collision coming for a *long* time now, and
I'm sure others have too. I think the train has rolled out of
the station already, long ago, once the C language standardization
effort decided to become less DWIS-like and more SWYM-like, e.g.
by trying to "reach out" to the Fortran audience (Fortran being
*much* more SWYM like, at the lower levels, than C).
But that doesn't mean a group can't form to specify a DWIS-like
subset+extensions of C and fork off a DWIS-like implementation
of gcc. It'd be a lot of work to do, and I'd worry about whether
they'd be going down a technological cul-de-sac, but I think the
sheer mass of need for such a thing would make that less of a
concern, and perhaps less likely, especially in the context of the
C language, which will probably never offer enough SWYM-like
facilities to obviate their need for DWIS-like ones.
(I've gotten the impression that, in addition to Linux and other
OSes compiled by gcc, there are *lots* of embedded developers out
there who would prefer to use a more DWIS-like gcc.)
>For SWYM, we already have things like visual basic,
Heh, well, I don't really know that language; maybe. I don't know
Haskell either, but I gather it's much more SWYM-like than most
anything we're discussing. Ditto for Prolog, where the cut operator
is one of the closest things to DWIS-like behavior the original
language had.
The way I look at the evolution of the compiler "beast" over the
decades is that it started out as basically an insect-like robot --
very predictable, but not very useful, and very DWIS-like.
As it evolved, it became slowly more SWYM-like, because people wanted
to be able to say things more like "here's what I want to accomplish"
rather than "here's what you do". Kind of like how it's sometimes
easier to get a 15-year-old to clean up his room by saying "clean up
your room", and he knows what a clean room looks like, etc., than
a 3-year-old, who doesn't know from "clean" and must therefore be
told, step by step, exactly what to do: "put this toy *there*, put
that toy up *there*". (And, yes, the more SWYM-like entity can be
more resistant to doing things the *way* you want. :)
At the same time, as compilers *multiplied*, more and more people
became dependent on exactly *how* they happened to do things -- i.e.,
though they were *evolving* to be more SWYM-like, these people
were relying on their *particular* copies to retain certain DWIS-like
characteristics. Then, when they decide to upgrade, they've suddenly
got an *evolved* copy that doesn't behave the way they expect,
even though it might be behaving exactly as the developers want and
expect it to.
Here's where we are right now: gcc is kinda like the Arnold Schwarzenegger
robot in the movie Terminator. You tell it something to do, and you
can pretty much guess how it'll do it, and be right most of the time,
so it's still fairly DWIS-like. But it also has lots of SWYM-like
characteristics, of course -- for a sufficiently complicated task,
you *can't* predict exactly how it'll accomplish it. You might think you
have *some* idea of how reliably you can predict some tasks, but you're
almost certainly wrong, even if you're its original programmer. (The
Asimov series starting with "I, Robot" is, IIRC, an excellent introduction
to the sorts of problems that result from using automatons, including
assuming too much about how they will accomplish their tasks.)
So, if you've got a friend you want to intimidate and a Terminator
robot at your command, you can say "Terminator, go strangle my friend".
You *know* the robot will have to walk across the room to accomplish
this...that it'll make *lots* of noise and look *very* menacing doing
so...take a certain amount of time to do so...and that your friend
will get *real* scared. At the last moment, you can say "Terminator,
cancel command" (or whatever), and you can all enjoy a good laugh.
But Terminator has some pretty serious limitations in terms of what
it can accomplish, and many of its users expect more of it. In that
sense, it needs to become even more flexible -- more able to do tasks
in ways that would seem completely unpredictable.
Now, you hear about this *new* robot (in "Terminator II") and upgrade,
because you hear it'll do things faster, better, etc. than your
current robot, and you *gotta* have that, because while you love
the old robot, you've learned a lot about its limitations.
You try the same thing -- "Terminator, go strangle my friend" -- and
within about 50 milliseconds it has slithered small liquid-metallic
tendrils across the room, which crushed your friend's windpipe, all
without most of it apparently moving an inch. Not so much laughter,
though you end up with more pizza and beer for yourself.
Now, is either robot completely SWYM? No, nor completely DWIS. Did
both robots do what you told them? Yes. Whose fault was it that
your friend got strangled? Yours.
Why? Because you failed to explicitly *constrain* the T2 robot to
meet your requirements -- requirements that were *implicitly* met
by the design limitations of the T1 robot, which the T2 robot neither
had, nor was designed to emulate. You didn't say "*pretend* you're
going to strangle my friend, so he has time to see it coming and
get real scared, but don't actually do it", or something like that.
The designers of the C language, and of computer languages in general
(excluding Java, perhaps), just like the fictional designers of the
Terminator robots, know that The Future is in more SWYM-like designs.
They *also* know that, to keep complexity down, newer designs must
jettison archaic, DWIS-like constructs that existed primarily to work
around old limitations that they were superseding in the new designs
via more SWYM-like things. (E.g. Fortran is likely to jettison COMMON,
EQUIVALENCE, and other "old favorites" someday, though it'll be some
time after that before *compilers* actually do.)
So, as long as gcc is following the general path of implementing
standards like the C, C++, and Fortran languages, which are evolving,
almost exclusively, along SWYM lines, gcc will *have* to evolve, itself,
along SWYM lines.
And, to keep complexity down, it'll perhaps have to drop DWIS features,
and it certainly will have to stop accumulating new ones. Heck,
we're *already* spending way too much time dealing with configury
issues (the Makefiles plus the configurations), largely because
make, originally intended to be SWYM-like, has accumulated features
and a culture that are too DWIS-like and thus too brittle, and because
of other DWIS-like stuff in that area (use of shell scripts, configure
scripts, etc.). That doesn't even involve the code to do things
like read C programs, optimize them, and generate assembly, which
*should* be where we're spending the vast majority of our efforts!
As an OS developer, I've *long* appreciated the advantages of a more
DWIS-like compiler, and, in my experience, OSes are almost always
compiled by something *other* than the primary compiler being vended
for application code. (At Prime, PRIMOS was written in assembler,
a FORTRAN-66 subset, and a small PL/1 subset called PL/P, and compiled
by the corresponding archaic compilers *long* after Prime offered
new, "better", FORTRAN 77, PL1/G, and full PL/1 compilers. I recall
hearing that Sun used *gcc*, not its vended, unbundled compiler,
to compile either SunOS or Solaris for quite a while, though I
think they started switching away from that, last I heard around a
year ago. Then there's BLISS at Digital, et al., but I don't have
much personal experience with these companies' OS design efforts.)
So, I have *no problem* with someone meeting the need of OS developers
and embedded-software developers by providing a much more predictable
DWIS-like design. It could even be an offshoot of gcc.
I just can't see any way gcc will be able to continue to evolve if it
takes on both roles, one SWYM-like, the other DWIS-like. Heck,
look at the problems we're already having accommodating the two
*audiences*! That is, there's *often* a clash here among the
SWYM-like audience that wants gcc to just "do the right thing"
to make their code faster, smaller, etc, the DWIS-like audience
that wants gcc to just do what they think they've told it to, and
the gcc developers, usually somewhere in between, who say "we're
trying to make it fast [a la SWYM] within the constraints of what
people actually expect beyond and above what the standard specifies
[a la DWIS]".
Also, I've long had high hopes that a much more comprehensive SWYM-like
language and compiler could actually *eliminate* the need for DWIS-like
tools in areas such as operating systems and embedded software,
by permitting SWYM expressions of external timing constraints,
for example.
That language won't be C, though, and probably not any other imperative
language currently fashionable (if it's even imperative at all).
Getting back to the proposal for a "force inline" directive, let me
ask a simple question, and see how you and others respond:
Would it *ever* be permissible for gcc to generate *different* code
from sources containing normal "inline", without warning at all
given `-Winline' (meaning it succeeds at all the requested inlining),
as compared to the exact same sources compiled by the exact same
copy of the compiler, but with any of those normal "inline"s modified,
in the source, to be "force inline"?
In other words, does the simple act of specifying "force inline"
where "inline" used to *succeed* (as defined by `-Winline')
permit gcc to change the code it produces, perhaps radically?
tq vm, (burley)