This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Compiling GCC with g++: a report


Zack Weinberg <zack@codesourcery.com> writes:

| On Mon, 2005-05-23 at 01:15 -0500, Gabriel Dos Reis wrote:
| > Hi,
| > 
| >   I spent the week-end trying to get GCC -- mainline -- compilable
| > (i.e. those components written in C) with a C++ compiler (e.g. g++).
| 
| These results are very interesting.
| 
| As a general observation: A lot of the things you have found to be
| problematic, are in fact preferred idioms for C code.  For instance,
| no standard-C programmer would ever write an explicit cast on malloc's
| return value.  I think that we are losing something, if only in
| readability, if we restrict our code to the subset of C which is also
| correct C++. 

I think opinions vary a lot here.  If, for example, you take a
look at the examples in The C Programming Language (all editions),
you'll find explicit casts on malloc's return value.  Yet I would
refrain from calling Dennis Ritchie a non-standard-C programmer, or
from saying that his book, TCPL2, does not describe standard C.  Most
of the C programmers I've met learnt from his book.  (Yes, I've also
read some C programmers comment that nobody should cast the return
value of malloc, but for large-scale software, I have not seen their
opinions dominating.)

The cast you're talking about is buried deep in XNEWVEC, XRESIZEVEC
and such.  It is not anything you'll find in the code directly.  So,
in fact we do not lose readability as you claim.
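
To illustrate the point about the cast living inside a macro, here is
a minimal sketch.  MY_XNEWVEC and make_iota are hypothetical names
invented for this example; the real libiberty macro is built on
xmalloc, while plain malloc is assumed here to keep the sketch
self-contained.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an XNEWVEC-style macro: the cast that a
   C++ compiler requires is written once, here, and nowhere else.  */
#define MY_XNEWVEC(T, N) ((T *) malloc (sizeof (T) * (N)))

/* Allocate an array holding 0 .. n-1.  Returns NULL if allocation
   fails; the caller frees the result.  */
int *
make_iota (size_t n)
{
  int *v = MY_XNEWVEC (int, n);   /* no cast visible at the call site */
  size_t i;

  if (v == NULL)
    return NULL;
  for (i = 0; i < n; i++)
    v[i] = (int) i;
  return v;
}
```

The call site reads the same whether the file is compiled as C or as
C++, which is the sense in which readability is not lost.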

| Now, if we were migrating to C++, that would be okay,
| because we would (eventually) get all of the additional expressive power
| of C++ in exchange.  However, if we're not migrating to C++, I'm opposed
| to the inclusion of patches that restrict our C code to the subset which
| is correct C++.

The patches are aligning us to our coding standards.  I don't think it
is reasonable to throw roadblocks in the way, especially when they are
contrary to our current coding standards.  The claim that the cast
will obscure the code is unjustified, as the use of the libiberty
macros relieves us of sprinkling casts in the code.  See my previous
patches to libiberty and fixincludes.

I don't think your suggestion of moving to C++ is workable at this
point.  The patches aligning us to the common subset of C90 and
C++ follow the consensus we developed as our coding standards.

|  Furthermore, as I've said before, I support migrating
| to C++ -- but only if the C++ ABI and libstdc++ soname are first
| permanently frozen.  If we do not do that first, we risk being trapped
| into a situation where we need specific versions of GCC to compile
| specific newer versions of GCC, which would be a Bad Thing.

Throwing roadblocks in the way is not going to help the GCC project.
It is unreasonable to do that at this time.

| The C++ ABI seems to be stable at this point, but there is not yet
| consensus that it will never again be changed.  The libstdc++ team is
| currently developing yet another new, incompatible version, so I see no
| hope for a permanent freeze of its soname in the near future.  Thus,
| while you've discovered some interesting things by trying this, I don't
| think C++ compatibility patches should be applied now.

The issue of moving to C++ is independent of our aligning ourselves to
our coding standards.  I don't believe it is reasonable to block these
patches on the grounds that we could conceive of moving to C++ (which
is a controversial issue).  The decision to code at the intersection
of C90 and C++ is a consensus we reached after repeated debates.

| Having said that, some comments on the problems you have found:
| 
| > Third, there is some "type-punning" with enums, int and unsigned int,
| > where the middle-end (mostly) relies on implicit conversion from int
| > to enums.  
| 
| Being allowed to do this is very important.  Some enumerated types are
| to be treated as opaque outside a very narrow context; the only way to
| do that in C is to have (a typedef of) unsigned int as the visible type,
| and only declare the enumerated type in the context where it's allowed
| to be used. 

I have looked at every one of those uses -- since I went through
editing almost every file needed for compiling the GNU C and GNU C++
compilers.  None of the cases appears important.  The only compelling
cases are when front-ends (e.g. C or C++) extend them (e.g.
c_tree_code or cplus_tree_code).  However, the current approach is not
necessary even there.  As RTH pointed out in the past, front-ends
should define those enumerators as a whole by appropriately #including
the file.  We can arrange for that -- in fact I've tested variants of
that in my experiments.  No cast is needed when done properly.

| I want to see more use of this idiom, not less; for
| example, 'enum machine_mode' ought to be a black box to almost the
| entire compiler. 

Me too, but the way to make it a black box is not to cast it to
unsigned int and back willy-nilly -- that does not make it a black
box; on the contrary.  For example, we should be using EXPAND_NORMAL
instead of plain "0".
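
A small sketch of the difference follows.  The enum and function names
are hypothetical, only loosely modelled on the EXPAND_NORMAL case
above, and are not taken from the GCC sources.

```c
/* Hypothetical modifier enum for illustration only.  */
enum my_expand_modifier
{
  MY_EXPAND_NORMAL = 0,
  MY_EXPAND_SUM,
  MY_EXPAND_INITIALIZER
};

/* Return a short description of the modifier.  */
const char *
describe_modifier (enum my_expand_modifier m)
{
  switch (m)
    {
    case MY_EXPAND_NORMAL:      return "normal";
    case MY_EXPAND_SUM:         return "sum";
    case MY_EXPAND_INITIALIZER: return "initializer";
    }
  return "unknown";
}

const char *
describe_default (void)
{
  /* Writing describe_modifier (0) is accepted by a C compiler, but a
     C++ compiler rejects it because int does not convert implicitly
     to an enumeration type.  Naming the enumerator works in both.  */
  return describe_modifier (MY_EXPAND_NORMAL);
}
```

Using the enumerator also documents the intent at the call site, which
a bare 0 does not.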

| I'd be delighted to hear of a more C++-friendly way to
| code this. 

See above.

| Naturally, where the constant is _not_ opaque outside of a
| defined context, but is part of an interface (as your examples seemed to
| be), not using it is just sloppy.
| 
| > Fourth, it appears that we're implicitly using C99's semantics of 
| > "extern inline" in our source -- when we have a pure C90 compiler that
| > does not understand "inline", we just #define inline to nothing so we
| > don't get into trouble.  With a C++ compiler, we're in trouble because
| > an inline function needs to be defined in every translation where it
| > is used.  So, I either move the affected functions to "static inline"
| > or just make them non-inline (cases are in hashtable.c and toplev.c).
| 
| Use of bare 'inline' is just plain wrong in our source code; this has
| nothing to do with C++, no two C compilers implement bare 'inline'
| alike.  

Well, the way I figured it out was by running the source code through
a C++ compiler.  I'm aware that inline is absent from C90 and that
many of the current compilers that claim to implement C99 have their
own opinions on the matter.  However, what I was reporting is an
*actual* experiment, not a thought experiment.  And it popped up only
because I ran the source code through g++, which I think I should
mention.
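
For concreteness, the "static inline" form looks roughly like the
sketch below.  The function names and the hash formula are
hypothetical and are not the ones in hashtable.c or toplev.c.  With a
pure C90 compiler, inline would still have to be #defined to nothing,
as described in the quoted text.

```c
/* With "static inline", every translation unit that includes this
   definition gets its own internal-linkage copy, so both C99-style
   compilers and C++ compilers accept it without needing a separate
   out-of-line definition elsewhere.  */
static inline unsigned int
my_hash_step (unsigned int h, unsigned char c)
{
  return h * 31u + c;
}

/* Fold the helper over a NUL-terminated string.  */
unsigned int
my_hash_string (const char *s)
{
  unsigned int h = 0;

  while (*s)
    h = my_hash_step (h, (unsigned char) *s++);
  return h;
}
```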

| Patches to add 'static' to such functions (AND MAKING NO OTHER
| CHANGES) are preapproved, post-slush.
| 
| > Fifth, there is a slight difference between "const" in C and in C++.
| > In C++, a const variable implicitly has an internal linkage; so a
| > C++ compiler tends to optimize it out when its address is not taken
| > (so no storage is wasted).  This is an issue for the objects
| > automatically generated by the gengtype support machinery.  They are
| > supposed to have external linkage, so we need to explicitly say
| > "extern" in their definitions. 
| 
| Presumably such constants are declared in some header file, with
| external linkage.  It would be better to make that declaration visible
| at the point of definition, rather than marking up the declarations with
| 'extern'.

I'm talking about the various gt_* objects created by gengtype.
Please do have a look at the actual contents of the file and re-read
what I wrote. 
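
As a general illustration of the linkage rule under discussion (and
not a claim about what gengtype actually emits), the sketch below
shows a const object that keeps external linkage in both C and C++
because a declaration with "extern" is visible before its definition.
The names my_gt_sizes and my_gt_total are hypothetical.

```c
/* Declaration with external linkage.  In C++ a namespace-scope const
   object has internal linkage unless it is declared extern or has a
   prior declaration with external linkage, such as this one.  */
extern const int my_gt_sizes[3];

/* Definition.  Because the declaration above is visible, this has
   external linkage under both C and C++ rules.  */
const int my_gt_sizes[3] = { 4, 8, 16 };

/* Sum the table so the example has something to check.  */
int
my_gt_total (void)
{
  int total = 0;
  int i;

  for (i = 0; i < 3; i++)
    total += my_gt_sizes[i];
  return total;
}
```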

| > Sixth, there is a real "mess" about name spaces.  It is true that
| > every C programmer knows the rule saying tags inhabit a different name
| > space than variables or functions.  However, all the C coding standards
| > I've read so far usually suggest 
| > 
| >    typedef struct foo foo;
| > 
| > but *not*
| > 
| >    typedef struct foo *foo;
| > 
| > i.e. "bringing" the tag-name into normal name space to name the type
| > structure or enumeration is OK, but not naming a different type!
| 
| Ugh.  Where do we do that?

In our source code. :-)  To name one that comes to mind,

    alias.c:96:typedef struct alias_set_entry *alias_set_entry;

I've also found that we have hash_table from libcpp as a typedef-name
and hash_table as global (static variable) in cselib.c.

| I will suggest, when you find these, that
| you tack "_s" on the end of the tag-name; 

That is what I did in my local tree.  But I believe we need to
standardize on a coherent coding standard, which is why I brought up
the issue.
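
A minimal sketch of the "_s" renaming follows.  The struct layout and
all names are hypothetical and do not reproduce the declaration in
alias.c.

```c
#include <stddef.h>

/* The tag carries an "_s" suffix, so the typedef-name below no longer
   reuses the tag's spelling for a different type.  */
struct my_entry_s
{
  int alias_set;
  int has_zero_child;
};
typedef struct my_entry_s *my_entry;

/* By contrast, "typedef struct my_entry *my_entry;" is valid C, since
   tags live in their own name space, but a C++ compiler rejects it:
   there the struct name and the typedef-name share a scope and would
   name two different types.  */

/* Return the alias set of an entry, or -1 for a null entry.  */
int
my_entry_set (my_entry e)
{
  return e != NULL ? e->alias_set : -1;
}
```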

| that doesn't conflict with
| POSIX, and should require fewer changes elsewhere in the code.

-- Gaby

