This is the mail archive of the mailing list for the GCC project.


Re: How to understand gcc source code?

Hello All.

Denys Vlasenko wrote:
On Saturday 22 March 2008 11:14, Basile STARYNKEVITCH wrote:

* on the positive side, GCC is still doing well and alive

I did not mean that everything is perfect. But I would not describe GCC as a sick or dead project. This is why I wrote "doing well & alive"!
I am well aware of some of GCC's problems. I'm not sure the total effort contributed to GCC is decreasing (I feel the opposite, at least judging by the traffic on the gcc mailing lists). And GCC could be much worse!

Then why does LLVM (or so it is said) surpass gcc in the quality of its generated code?

AFAIK, the LLVM C front end (called clang) is still in alpha stage (their previous front end is GCC based!). Of course the LLVM generator (back-end & middle-end) is very good. By the way, the best feature of LLVM2 is IMHO its JIT abilities.

Why are Intel and MS compilers surpassing it?
Honestly, I have not coded on any Microsoft system in recent years (not since the MSDOS 3 days). I know some colleagues, very familiar with Microsoft, who still occasionally use GCC (in particular, they seem to find GCC's diagnostic messages better than MS Studio's).

Sorry for "negative" comment, but burying your head in the sand and pretending that everything is ok does not seem like a good strategy to me.

I was not pretending that everything is ok within GCC. However, I do understand that, at least on most Linux systems, it is the most used compiler. For example, I know of no major Linux distribution whose user-land software (coded in a GCC-compilable language like C, C++, Ada) is not compiled with GCC. And I find this fact significant: Mandriva, Debian, Redhat, SuSE could compile some of their software with something other than GCC, and they don't! Actually, few Linux distributions even ship non-GCC compilers (and such beasts exist: tinycc, nwcc, ...) [of course I'm talking about GCC-supported languages].

Actually, I would very much welcome a better balanced ecosystem for opensource compilers. I do wish that other (non-GCC) opensource compilers become much more widely used! I'm a bit annoyed by GCC's quasi-monopoly within Linux distributions.

My current impression is that the main GCC languages (i.e. C, C++, Ada) are perhaps less used than a dozen or even five years ago. Scripting languages (like Perl, Ruby, PHP) are increasingly important, Java (and perhaps C#) has a major place in the market, and better niche compiled languages (like Ocaml -which I like a lot-, Haskell, CommonLisp, ...) are perhaps more used than 5 years ago.

Actually, this discussion raises very interesting *quantitative* questions: who uses GCC, why, how? I dream of quantitative answers to questions like:

* about X thousand developers ran GCC at least once in the last month (I would imagine X is more than 100, i.e. more than 100K developers are using GCC)

* about Y hundred people compiled GCC from its source last month. I would imagine Y to be in the dozens (so that GCC, in some version, branch, or trunk, is compiled each month by several thousand people)

* what is the proportion of source languages? I would guess that a majority of developers using GCC compile some code in C, then a significant minority in C++, and the other languages (Fortran, Java, ObjectiveC, Ada ...) are only marginally used. I might be completely wrong!

* what host system is used? I would guess that linux x86, linux AMD64, mingw x86 (i.e. some gcc used on some Windows machines) are perhaps the favorite ones. I even believe that hosts which are neither [quasi] POSIX.2001 (i.e. Linux, recent *BSD, Solaris, AIX or HPUX) nor Windows are very rare! (In practice, this would mean that the common set of functions usable within GCC is really much bigger than what libiberty offers; my hypothesis is that dlopen or libltdl's lt_dlopenext is very commonly available. Plugins could be implemented, and I believe they should be somehow.)

* what is the proportion of cross-compilation? I have no real clue. I would suppose that of all the GCC runs in the last month, only a minority were cross-compilations (for some embedded systems). Of these, what are the favorite target machines & systems? I really don't know this one (maybe ARM or PowerPC32?).

* what is the distribution of compilation unit sizes? What is the mean size of a compilation unit (I would imagine a few thousand lines of C or C++)? What is the median size? The first and last decile?

* what is the distribution of object code size (for a given compilation unit)?

* what is the distribution of whole (GCC compiled) binary program size?

* what is the distribution of function sizes?

* what are the frequently used optimization flags? I would guess that developers actually coding compile with -g -O0, and sometimes build with -O2. I would guess -O3 (and perhaps also -O1) is rarely used (this seems true at least within Linux distributions). I would imagine that individual optimization tuning flags (e.g. -funsafe-loop-optimizations ...) are very rarely used.

* what is the qualification (e.g. formal education) of the "typical" GCC user? of the "typical" GCC developer?

* how much labor goes into GCC development each year? I would guess several hundred person-years each year (but maybe I am optimistic).
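
To make the "distribution" questions above a bit more concrete, here is a minimal sketch in C of how the mean, the median and the first and last deciles could be computed from a sample of sizes (for instance, lines per compilation unit). The names size_summary, percentile_sorted and summarize_sizes are hypothetical, and linear interpolation between neighbouring ranks is only one of several common percentile conventions; no real measurements are implied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical summary of a sample of sizes (e.g. lines per compilation unit). */
struct size_summary {
    double mean;
    double median;
    double first_decile;  /* 10th percentile */
    double last_decile;   /* 90th percentile */
};

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Percentile of an already sorted array, p in [0,1], using linear
   interpolation between the two closest ranks (an assumed convention). */
static double percentile_sorted(const double *sorted, size_t n, double p)
{
    double pos = p * (double)(n - 1);
    size_t lo = (size_t)pos;
    size_t hi = (lo + 1 < n) ? lo + 1 : lo;
    double frac = pos - (double)lo;
    return sorted[lo] + frac * (sorted[hi] - sorted[lo]);
}

/* Sorts `sizes` in place, then fills `out` with the summary figures. */
void summarize_sizes(double *sizes, size_t n, struct size_summary *out)
{
    assert(sizes != NULL && out != NULL && n > 0);
    qsort(sizes, n, sizeof sizes[0], cmp_double);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += sizes[i];

    out->mean = sum / (double)n;
    out->median = percentile_sorted(sizes, n, 0.5);
    out->first_decile = percentile_sorted(sizes, n, 0.1);
    out->last_decile = percentile_sorted(sizes, n, 0.9);
}
```

The same function would apply unchanged to object code sizes, whole-program sizes or function sizes; only the way the sample is collected differs.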

I would actually be extremely interested in such figures (because they would help me to get funded)....

Perhaps we might add some "spying" code in GCC.
* technically, this would be quite easy. For example, we could have the gcc build machinery send an email (or an HTTP request) somewhere, to measure who is compiling the GCC code. We could also have gcc write some tiny statistics (e.g. size of compilation unit, flags used, ...) in some log file, and ask some system administrators to send them.
* I tend to believe that the major drawback is social. I would imagine that most GCC users would be unhappy to have some numbers sent outside.
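
As an illustration of the log-file variant only, here is a minimal sketch in C of appending one tab-separated record per compilation to a local file. Everything in it is an assumption made for the example: the compile_stats structure, its fields, and the functions format_stats and append_stats are hypothetical and do not exist in GCC. Nothing is sent anywhere; the file stays on the machine until an administrator chooses to share it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-compilation record; the field names are illustrative only. */
struct compile_stats {
    const char *language;   /* e.g. "c" */
    const char *flags;      /* e.g. "-g -O0" */
    unsigned long lines;    /* size of the compilation unit, in lines */
};

/* Format one record as "language<TAB>lines<TAB>flags<NEWLINE>".
   Returns 0 on success, -1 if a field contains a tab or newline
   or if the buffer is too small. */
int format_stats(const struct compile_stats *st, char *buf, size_t buflen)
{
    assert(st != NULL && buf != NULL);
    /* Keep the log at one record per line, one field per column. */
    if (strpbrk(st->language, "\t\n") != NULL || strpbrk(st->flags, "\t\n") != NULL)
        return -1;
    int n = snprintf(buf, buflen, "%s\t%lu\t%s\n",
                     st->language, st->lines, st->flags);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Append one record to a local log file. Returns 0 on success, -1 on error. */
int append_stats(const char *path, const struct compile_stats *st)
{
    char line[512];
    if (format_stats(st, line, sizeof line) != 0)
        return -1;
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;
    int ok = (fputs(line, f) != EOF);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Keeping the record to a few coarse fields, and leaving the decision to send the file to a human, is one way to soften the social objection, though it obviously does not remove it.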

Regards (and happy Easter to those celebrating it tomorrow).

email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
