This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: [egcs 1.1] Fix gcc.c handling of %O vis-a-vis temp files


>  > The bug I'm fixing is that the handling of `%g%O' is broken.  It is
>  > *supposed* to be exactly as if the proper target-specific suffix
>  > (normally `.o') is present, as in `%g.o'.  But, unlike when `.o' is
>  > explicitly used, do_spec_1 currently generates a *new* filename per
>  > *occurrence* of `%g%O' in the specs it is processing.
>OK.  Is it %g or %O that is causing us to create a new filename?

It's %g's handling of %O.  The code is clearly trying to *avoid*
creating a new filename when the suffix is one for which a
filename has already been created.  The code to do this is simply
broken for the case of %O.

>I won't claim to understand all the spec junk.  It's a horribly ugly
>mess.  The explaination in this message has helped me much more than
>your previous message in understanding the problem.  When submitters
>fail to explain patches, it's quite likely I'll mis-understand the
>problem being solved -- I don't know everything about the compiler.

I still feel my explanation was, while short, sufficient such that,
in combination with looking at the docs for the relevant bits,
it could be clearly understood as a fix rather than a change in
the design.  Perhaps the fact that whoever changed the behavior of
%g didn't also change the commentary documenting %g would cause
confusion -- however, looking at the code for just a couple of
seconds makes it clear how %g is supposed to work, as compared to
how it's documented (especially since the differences are quite
minor, almost irrelevant, to most any user of %g -- an exception
would be someone supplying two %g's where one instance had an
explicit suffix, the other an implicit one, as required by the tool,
in which case `... %g.foo ... %g ...' no longer names a single
file, even though the second %g is implicitly given the `.foo'
suffix, though that's quite unlikely and in any case basically
irrelevant vis-a-vis my patch).

>I would claim that using %g is a bad thing because it creates easily
>predictable names (since it's supposed to be choosen once per
>compilation) which leads to a denial of service attack.

Actually, %g no longer chooses a name once per *compilation*.  It
chooses a single name *per suffix* per compilation.  That's the
point of the recent changes (that broke `%g%O', unfortunately).

And, clearly, after you write assembly code to a file named `foo.s',
you can read that file (that is, get the assembler to process it)
*only* by opening a file named `foo.s'.

If that's a security breach, then, well, tough -- there's nothing
you can do about this.

Note that this is a *different* thing than naming the corresponding
preprocessor file `bletch.i' and the corresponding object file `bar.o'
-- my patch doesn't affect that at all.  It merely makes the code
*work*, as specified, and as it should.

So, the recent changes -- that broke `%g%O' -- don't IMO make %g
prone to denial-of-service attacks, because once you see a file
named /tmp/foo.i, you can no longer correctly guess that soon there
will be an attempt to create a file named /tmp/foo.s, /tmp/foo.o,
and so on.

And my patch, which fixes `%g%O', doesn't affect that at all.

In summary: you're complaining that *my* patch wasn't sufficiently
described, but IMO it was, and the fault was that somebody, somewhere,
applied patches changing the behavior of %g without sufficiently
describing *and documenting* them.  I'm always willing to accept blame
for inadequately documenting things (my stint as a tech writer doesn't
immunize me from such mistakes), but this isn't one of those cases.
I just made existing code work at least as well as it was supposed
to vis-a-vis the *existing* documentation, coping with the fact that
the docs weren't exactly "correct" in a fairly minor way.

So I suggest you take up the argument of inadequately documented patches
with whoever accepted the patches to change the behavior of %g without
ensuring that the docs be changed accordingly, as there's clearly some
confusion now.  (IMO, not on my part, but it does seem that the docs
for %g are confusing, or at least troubling, you in a way that the code
perhaps would not.)

>We should be moving away from using %g, which I guess was the ultimate
>point of my previous message.  Why are you using %g?

Because it's the right thing to use.  Why does gcc.c use %g?  Why
does cp/lang-specs.h use %g?

Moreoever, what alternative do you have to offer?  %u/%U seems
close, but can't cope with "cpp -o %u.i source.c; cc1 -o %u.s %U.i;
as -o %u.o %U.s; ld -o a.out %U.o" because of the ordering problems
(probably due to POSIX requiring options before filenames; otherwise the
middle commands could be "cc1 %U.i -o %u.s; as %U.s -o %u.o", but I
don't think it's wise to assume such changes would be safe on every
system).

However, upon looking at the code more closely, it seems this, too,
differs from the docs, in that the former command line above might
actually work -- %U.i in the second command would probably match the
previous %u.i even though there's an intervening %u.s, and so on.

In any case, AFAICT, %U is nearly identical, functionally, to the way
%g behaves *now*, in terms of reusing a previously obtained temp-file
name and being prone to denial-of-service attacks.

Put another way: %u/%U "keys" on the %U string combined with the
order in which these items appear in specs to determine whether to
obtain a new temp-file name.  Whereas, %g now "keys" on the suffix
that follows (which may be nil -- I checked this).  IMO %g is actually
less prone to abuses that could lead to denial-of-service, because
it disallows "foo -o %u.s source.foo; as -o %U.o %U.s", etc.

Now, I *could* make `g77 -v' use %u/%U instead of %g in this
situation, especially given my "However" paragraph above.

However (yet again), that won't work since %u%O and %U%O are, I think,
just as broken as %g%O.  That is, `... %u%O ... %U%O ...' is currently
likely to turn into two *different* names, despite what the docs
say should happen (though they're slightly wrong in other ways).

And, amazingly enough, my patch is likely to fix that just as it
fixes %g%O.  Which isn't so amazing, given that the ChangeLog entry
doesn't discriminate between %g and %u/%U either.

In fact, at this point, I don't think there's any real difference
between %g and %u/%U, except the latter allows creation of a whole
new temp-file name for a previously-used suffix -- a difference I don't
think is ever exploited.

>[ Note this is separate from whether or not %g correctly implements
>  its documented behavior, which I presume it does not, unless it
>  is %O that is causing us to create a new temporary.  ]

It is %O that is causing us to create a new temporary *each* time
it follows %g (or %U); %g by itself does *not* necessarily create a new
temporary, but its *implementation* does based *entirely* on whether
the *suffix* that follows it is "new" (unique).  (That amounts to
the same thing; the distinction matters in case you're looking at
the code that handles %O when it isn't preceded by %g, %u, or %U.
All *that* code does is substitute `.o' or whatever.  %g/%u/%U now
slurp up whatever suffix info follows, and are just broken when it
comes to handling `%O' compared to `.o'.)

So given a specs string like "... %g.i ... %g.s ... %g.i ... %g.o ...
%g.s ... %g.o ...", the sample temp file names should be "... bletch.i
... foo.s ... bletch.i ... bar.o ... foo.s ... bar.o ...".

The bug is that when %O is used instead of .o in such a specs string,
do_spec_1 mistakenly assigns a new temp name to the final bar.o, making
it `frob.o' or similar.

That `g77 -v' happens to be the only way I know of offhand to expose
this bugs is irrelevant -- the code is broken, my patch fixes it, end
of story.  It's nice it makes `g77 -v' work again, though.

>Can you give more details why you can't use the existing link mechanisms?  
>Maybe that is the core issue -- g77 is doing something "different" from
>what most front-ends do, and as a result we seem to run into lots of
>problems (particularly with -v).

IIRC, g77 uses all the canonical stuff the canonical way *except* for the
`g77 -v' case.  For that case, it not only needs to link an executable,
but *run* it afterward.  To my knowledge, the current mechanisms do
not allow this -- notice they are *outside* the `specs' setup that
language-independent front ends use, which suggests the specs setup
is inadequate or incomplete in some way.

And, I believe the only times `g77 -v' has been broken have been the
result of outright *bugs* created in gcc.c, that would have been bugs
regardless of whether `g77 -v' exposed them.  But I might be forgetting
something.  I'm pretty sure `g77 -v' is implemented right along the
guidelines implied by the docs for the specs stuff in gcc, so the fact
that it "stresses" gcc's spec handling should, IMO, be seen as a
feature, not a bug -- especially if we stop taking every failure
exposed by `g77 -v' not working as yet another indication of what a
lousy thing `g77 -v' is, and instead take such failures as what they've
almost always turned to be, namely, indications of bugs in gcc.c.

>  > P.S. Why do I feel I'm having to defend the workability of `g77 -v'
>  > every couple of weeks?  Couldn't we just decide "it works, it's
>  > useful, g77 users and developers have been depending on it for a
>  > couple of years, so anything that breaks will be *assumed* to be
>  > broken until proven otherwise"?
>Please don't look at things this way.
>
>Just because it's the way g77 has done things for years and its the way
>g77 is documented does not make it right.  gcc has had conventions in
>use for much longer and for the sake of consistency those conventions
>should be used in all the languages.  Deviating from those conventions
>for any language should not be taken lightly -- even if it's the way
>that language has worked in the past.

Yet, that's exactly what gcc does with its support for `gcc -v',
which g77 expands upon.  I've pointed this out over a very long time,
I believe, going back to ancient days on gcc2.

And, "for the sake of consistency", the convention of `g77 -v' reporting
all the info pertinent to a product with as long and difficult a
history as g77 should not be lightly deviated from.  (To wit: g77 not
only has long been separate from gcc, and still is, it also has,
at times, patched the gcc back end, and further has been prone to
misinstallation such that the run-time library on which it depends
is not the one that ships with it, etc.  So getting reliable info on
the run-time-library version info has long been quite helpful.)

The question is, are *you* willing to ignore a perfectly good patch
that happens to fix `g77 -v', and thus eliminate `g77 -v' as *the*
established way to get complete information on a given installation of
g77, whether it came from the FSF with gcc, or egcs?  It's bad enough
that egcs 1.0 didn't handle this, but egcs 1.1 is likely to be much
more widely used, IMO.  (At least on the g77 side, egcs 1.1 offers
a fully-modern g77, which egcs 1.0 of course does not, compared to g77
0.5.23.)

The other question is, given that we have a *feature freeze* on for
1.1 (for weeks now), why is it you're willing to consider tossing `g77 -v'
overboard, when the reason it is broken is that the code in gcc.c
I've already fixed is *itself* broken according to the documentation
*and* to common sense?

>This is part of the price of moving from a separate add-on language
>to being part of the main distribution -- you have to conform to existing
>practice more :-)

I'd be more impressed with that argument if *gcc* conformed to
existing practice.  ARAIK, it doesn't.  I need to see a *much*
better argument to eliminate support for `g77 -v' than what you're
presently offering, to even consider it, especially during a "feature
freeze".

>Note that the same applies to gcc itself, but to a lessor extent since
>it defines the conventions.  In some cases we may choose to move towards
>a different convention found in one of the languages because it is better
>than existing conventions (ie g77 -v).

I agree, and if I'd seen solid movement towards that goal awhile ago
when you last claimed the very concept `g77 -v' was "horribly broken"
(if I recall your words correctly), I'd be more willing to discuss
this now vis-a-vis 1.1.  E.g. I'd *much* rather make `gcc -v', `g77 -v',
and so on all simply return "No input files", which I suspect is what
the GNU Coding Guidelines would say; have `--version' be the way to
get basic version info (for g77, this would be both the front end
and gcc version numbers); and have `--verbose --version' (`-v --version')
be the way to get what `g77 -v' now provides, to be somewhat similar
to how `gcc --verbose --help' now works, and have all front ends
(including gcc) support that equally.  (Just as `gcc --help' and others
now support what `g77 --help' has provided for a long time, plus more.)

But, IMO, the decision has been made, via inaction as much as anything,
so `g77 -v' is a feature that *is* frozen, must work, and is easily made
to do so by simply fixing gcc.c.  IMO, if you're willing to toss `g77 -v',
you should be just as willing to allow introducing the kinds of changes
I mention in the above paragraph in the same release timeframe.  I'd
be willing to pitch in and help out to make that happen, but seeing
as we're where we are in the schedule, I understand why that probably
shouldn't be allowed, so IMO dropping `g77 -v' during the feature-freeze
time shouldn't be allowed.

What I don't understand is why you seem to need me to justify things
like `g77 -v', but yet you seem to be:

  -  Unwilling to learn about how %g is supposed to work on your own,
     to thus determine whether my patch fixes it

  -  Unwilling to accept my repeated promises that I *have* done the
     necessary learning and that my patch fixes the `%g%O' bug

I'm not sure what more I can do to convince you to apply the patch.
I appear to be caught in a Catch-22: if I say "this fixes g77 -v",
you say "that doesn't matter, g77 -v is a Bad Thing anyway"; if I
say "fixing g77 -v isn't important" (which is effectively what my
original patch said, as the ChangeLog entry did not mention it,
and stood on its own as a valid bug-fix), you say "it's not worth
doing this" or "this isn't even a bug" (even though `g77 -v', among
other things, clearly demonstrates that it is); if I say too little about
my fix, that's a problem, too, but now that I'm saying much more, you're
asking me to explain the entire workings of specs entries like %g, which
is hardly something I can claim to do better than by suggesting you read
the docs and code yourself, which you still seem to be unwilling to do.

And all this time I'm spending writing these long emails trying to explain
why what is, IMO, a trivially obvious bug-fix is time I'm not spending
trying to take care of much more important stuff (mainly, the stuff
I need to do to get CVS access).

Perhaps you should just do what I already did: try a few compiles with
`-v' present, as in `gcc foo.c -v', and look carefully at the temporary
names get generated, and which ones match.  Then try to understand why
some names are the same as others, and why not all names are the same
(that is, they don't even have the same base names).

Also, if you're willing to finally apply my patch, I'm willing to
submit another patch to fix the commentary that describes %g/%u/%U/%O.
That should be done anyway; but my patch would clarify that way I
believe it is *supposed* to work, so there's no point in my submitting
it unless you first accept my previous patch.  (Otherwise somebody
should fix the commentary to document the IMO-buggy way of doing things.)

        tq vm, (burley)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]