This is the mail archive of the
mailing list for the GCC project.
Re: GSoC proposal: Provide optimizations feedback through post-compilation messages
- From: Tomasz Borowik <timon37 at lavabit dot com>
- To: gcc at gcc dot gnu dot org
- Date: Fri, 30 Mar 2012 02:00:58 +0200
- Subject: Re: GSoC proposal: Provide optimizations feedback through post-compilation messages
- References: <47466B24D5352C4AA77737202B165839227F5535@EXDB1.ug.kth.se>
On Tue, 27 Mar 2012 22:33:39 +0000
Thibault Raffaillac <email@example.com> wrote:
> Hello all,
> My name is Thibault Raffaillac, CS degree student at Kungliga Tekniska Högskolan,
> Stockholm, Sweden (in double-degree partnership with Ecole Centrale Marseille).
> GCC currently provides no concise way to inform the user whether it applied an
> expected optimization (ie, it "understood" the code). As a result, some will do
> premature optimizations when they do not trust the compiler, and some others
> will create overly convoluted code with blind belief in the compiler. This is
> especially relevant for users non-initiated to the internals of GCC.
> The project I would like to propose is a feedback for the optimizations
> performed by GCC. To avoid binding users to the compiler, I would focus on some
> very standard optimizations across vendors, or for some specific yet nice
> features I would indicate their specificity to GCC/an architecture.
> The feedback would be triggered when compilation is successful, and display a
> couple of different messages each time it is run:
> gcc --feedback test.c
> test.c:xx:x: info: All operands being constant, constant folding was applied to assign '2560' to 'a'
> test.c:xx:x: info: GCC could not fold constants here because...
> test.c:xx:x: info: As integers are stored in binary format, strength reduction was applied to replace '* 8' by '<< 3'
> test.c:xx:x: info: Basic block vectorization was applied to pack the 3 independent additions into a single SIMD instruction
> test.c:xx:x: info: GCC implements unordered_map as open-addressed hash tables, with double hashing probing
> As a difference with the internal verbose messages, here they would form a set,
> and the system would remember those already displayed and decrease their
> frequency of occurrence between compilations. All messages would explain what
> triggered them, cite the optimization name, and describe the consequence.
> As for the work plan, it would consist in:
> _ Enumerating all possible messages in the messages set.
> _ Implementing a function receiving feedback from each optimization unit and
> choosing whether to display it: info_printf(enum INFO_INDEX, const char*, ...);
> _ Write a formatting guide for adding messages in the set.
> My academic background includes compiler construction, C programming and Human-
> Computer Interactions. I am very much interested in the usability of compilers
> (on which I am currently carrying my degree thesis -
> http://www.csc.kth.se/~traf/traf-sketch.pdf) and thus would be glad to
> contribute to GCC.
> If this can be of interest, suggestions are welcome!
> Best regards,
> Thibault (http://www.csc.kth.se/~traf/)
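To make the quoted work plan a bit more concrete, here is a minimal sketch of what an info_printf along those lines might look like. Only the name and the parameter list (enum INFO_INDEX, const char*, ...) come from the proposal. The enum members, the int return value and the "show on the 1st, 2nd, 4th, 8th... request" policy are assumptions made for illustration, and the counters live in memory only, whereas the proposal wants them remembered between compilations.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message identifiers: the proposal does not enumerate them. */
enum INFO_INDEX {
    INFO_CONSTANT_FOLDING,
    INFO_STRENGTH_REDUCTION,
    INFO_BB_VECTORIZATION,
    INFO_COUNT              /* number of messages in the set */
};

/* How often each message has been requested so far.  Kept in memory only
   in this sketch; persisting it between compilations is left out. */
static unsigned times_requested[INFO_COUNT];

/* Assumed policy: display a message on its 1st, 2nd, 4th, 8th... request,
   so that a frequently triggered message shows up less and less often. */
static int should_display(enum INFO_INDEX idx)
{
    unsigned n = ++times_requested[idx];
    return (n & (n - 1)) == 0;      /* true when n is a power of two */
}

/* Called by an optimization pass.  Returns 1 if the message was printed
   and 0 if it was suppressed (the return value is an assumption; the
   proposal gives no return type). */
int info_printf(enum INFO_INDEX idx, const char *fmt, ...)
{
    va_list ap;

    if ((unsigned)idx >= INFO_COUNT || !should_display(idx))
        return 0;

    fputs("info: ", stderr);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    return 1;
}
```

A pass would then call something like info_printf(INFO_STRENGTH_REDUCTION, "replaced '* %d' by '<< %d'", 8, 3); and leave the decision whether to show it to the central function.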
I completely agree, and it's actually part of what I'm targeting in the long term, so I think we might be able to join forces. I'm also thinking of a GSoC project, though in different areas (there's an email on the list about them from 19.03), so maybe we could do separate parts that combine into something even more awesome ;)
I think a huge part of the issue is in the medium of communication between the programmer and compiler. I'm targeting an environment where the source code editor practically becomes the compiler's front-end. My project allows extremely dynamic presentation of the source code, so I can, e.g.:
- easily inform the programmer about anything in an unobtrusive manner within the code,
- give him different perspectives of the same code,
- allow him to give precise and detailed information to the compiler about possible code optimizations without making the code unreadable.
The first two points may seem already solved by Eclipse, Xcode or whatever other gigantic IDE, but I'm talking about a much larger scale of feedback, presented instantly: explicit/implicit and inferred typing info, constant folds, dead code, unfolded loops, data flow, vector operations, tree views of expressions.
The first issue is that for any non-trivial amount of code you'll end up with thousands of messages, 90% of which are probably not very interesting (similarly to warnings in a certain style of objective programming in C). As long as the output is not interleaved with the code at the right place, and the delay from writing to getting feedback is too long, the feature will lose much of its usefulness. Don't misunderstand me though, I think it's still better to have the info in any form than not at all.
The last point is probably the most important, as there are often many optimizations that cannot be done due to, for example, pointer aliasing rules, even though the programmer knows the optimization is safe. I can easily add literally hundreds of markers like "this expression is volatile", "the result of this function call will not change within this loop", "these two pointers don't alias" and it wouldn't obfuscate the code as much as it would in normal languages. Furthermore, my editor can easily list only the meaningful options for a given expression, with full descriptions of what they do.
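For comparison, two of those markers already have rough equivalents in plain C as compiled by GCC, which gives an idea of how they look when written directly in the source. The sketch below uses the C99 restrict qualifier for "these two pointers don't alias" and GCC's pure function attribute for "the result of this function call will not change within this loop". The function names are made up for illustration, and whether a given optimization is actually applied depends on the compiler version and flags.

```c
#include <stddef.h>

/* "These two pointers don't alias": restrict (C99) promises that dst and
   src do not overlap, so the compiler may reorder or vectorize the loads
   and stores without a runtime overlap check. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* "The result of this call will not change within this loop": GCC's pure
   attribute states that the function has no side effects and that its
   result depends only on its arguments and on memory it reads. */
__attribute__((pure))
size_t count_nonzero(const int *v, size_t n)
{
    size_t c = 0;
    for (size_t i = 0; i < n; i++)
        c += (v[i] != 0);
    return c;
}

/* The loop body never writes to v, so the compiler is allowed to evaluate
   count_nonzero(v, n) once instead of on every iteration. */
long weighted_sum(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (long)v[i] * (long)count_nonzero(v, n);
    return sum;
}
```

Even with only two such annotations the declarations get noticeably heavier, which is the readability cost the markers in the editor are meant to avoid.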
Tomasz Borowik <firstname.lastname@example.org>