This is the mail archive of the gcc@gcc.gnu.org
mailing list for the GCC project.
GSoC ideas: sc frontend, multi output compilation, constant path swap runtime optimization
- From: Tomasz Borowik <timon37 at lavabit dot com>
- To: gcc at gcc dot gnu dot org
- Date: Mon, 19 Mar 2012 00:53:09 +0100
- Subject: GSoC ideas: sc frontend, multi output compilation, constant path swap runtime optimization
I'm sending the email again as my connection seems to be having strange issues and it doesn't look like it got through the first time (hope it doesn't get duplicated).
I'm thinking of applying for GSoC, and I've got three main ideas for gcc based around my project.
I've been working on a language and programming environment named sc: http://sf.net/projects/sclang - short description, http://sclang.sf.net - main website (there are even some outdated screencasts there).
The most beneficial task (for me) would be to just bring the front-end I've already written up to mainline quality (though not necessarily inclusion), and in the process update some of the documentation or maybe even clean up some gcc code.
The main argument against this, I'm guessing, is that another front-end is a maintenance burden (though that would of course become my duty), and the benefits to the world are dubious.
Of course I can elaborate on why I think sc is valuable and not "just another language" no one would want to use.
As for the argument that a front-end is too much work for GSoC, that probably doesn't apply in my case: the main part is atm about 3.5k lines and it supports most of sc, which is equivalent to most of C.
As for how far along the language is, I've recently switched to developing it in itself (that includes a fairly advanced source code editing widget for qt) and I've written some critical pieces like just-in-time compilation (using libjit) purely in it. So it's working to at least some extent, though I'm afraid it may be hard for anyone else to use atm due to the unstable state of the editor.
Some other ideas revolve around what I need for sc.
The first one is that I need to be able to compile one source file into multiple output files. Currently I've hacked it with an extra argument taking a list of symbols to output, but that requires running gcc once per output file, and the overhead of that is fairly large. It's also unsafe to recompile just a few output files due to inlining. Still, since it can work on multiple cores it's way worth it.
The perfect solution would be to abandon the standard model and actually support a kind of on-demand recompilation, where the editor tells the compiler (running in the background) what has changed and the compiler (having a function inlining map) recompiles only the necessary pieces and replaces them in the ELF files. Though that's probably too far-fetched (there's also the question of how much of gimple/generic could be kept between recompilations).
Simplifying, we could skip the "working in the background" part and the last step, and just generate separate .s files. I've almost hacked this together, but I had a lot of trouble with missing or duplicate variables, labels, etc.
Either way it would be fairly important to allow that one instance to work on multiple threads, but I'm guessing that's a fairly daunting task, even if the front-end could guarantee that no objects overlap (maybe even make duplicates so they don't).
Regardless of the scope of the task it would also have to include extracting the function inlining map and probably writing it into a file.
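To make the role of the inlining map a bit more concrete, here is a minimal sketch in C of how it could decide what needs recompiling after one function changes. The `inline_map` layout, `MAX_FUNCS` and `mark_dirty` are all made up for illustration; gcc does not export such a table today, and a real version would be keyed by symbol rather than by small integer ids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8 /* hypothetical fixed capacity, just for the sketch */

/* inlined_into[callee][caller] is true when the body of `callee` was
   inlined into `caller`. This layout is an assumption for illustration. */
typedef struct {
    size_t n; /* number of functions actually in use */
    bool inlined_into[MAX_FUNCS][MAX_FUNCS];
} inline_map;

/* Mark every function whose generated code may contain `changed`:
   the function itself plus, transitively, everything it was inlined into. */
void mark_dirty(const inline_map *map, size_t changed, bool dirty[MAX_FUNCS])
{
    size_t stack[MAX_FUNCS]; /* each function is pushed at most once */
    size_t top = 0;

    assert(map->n <= MAX_FUNCS && changed < map->n);
    for (size_t i = 0; i < MAX_FUNCS; i++)
        dirty[i] = false;

    dirty[changed] = true;
    stack[top++] = changed;
    while (top > 0) {
        size_t callee = stack[--top];
        for (size_t caller = 0; caller < map->n; caller++) {
            if (map->inlined_into[callee][caller] && !dirty[caller]) {
                dirty[caller] = true;
                stack[top++] = caller;
            }
        }
    }
}
```

With function 0 inlined into 1 and 1 inlined into 2, a change to 0 marks 0, 1 and 2 as dirty, while a change to 2 marks only 2. Only the output files holding dirty functions would then need to be regenerated.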
A more middle-back-end task is that I see a need for a certain fairly simple kind of self-modifying code.
Basically a lot of applications have lots of configuration of their behavior that doesn't get changed through most of run-time, yet there are lots of branches, variable references and pointer dereferences associated with it.
It should be possible to create something like a global variable that can be read like one (from the programmer's perspective) but gets inlined everywhere like a constant, with a generated function for changing its value that patches the value at every point in the output binary where it was inlined.
Now unfortunately that would still leave a lot of branches in the code so something a bit more complex would be nice, like leaving enough space for either the then or else block, but putting only one in and padding with nops or a jump.
This isn't really much like SMC imo, it's more like templating binary code... or template-based JIT compilation, or let's call it something like "constant path swapping" ;p
Either way the benefit is getting a lot of branches, de/references, etc. out of the hot-path, at the cost of significantly slowing down the cold path (assuming people use it for what it's meant for).
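For illustration, the same trade-off can be approximated in plain C today by swapping a function pointer between pre-specialized versions. This is not the mechanism proposed above (which would patch the instructions in place, so even the indirect call would disappear), just a portable stand-in for the "cheap hot path, expensive cold path" idea. The setting `verbose` and all function names are hypothetical.

```c
#include <stdbool.h>

/* Two specialized versions of the same hot function, one per value of the
   hypothetical setting `verbose`. Neither one tests the setting at run time. */
static int process_quiet(int x)
{
    return x * 2;
}

static int process_verbose(int x)
{
    return x * 2 + 1; /* the +1 stands in for the extra work of the other path */
}

/* Currently selected version; starts out as the "verbose off" path. */
static int (*process_impl)(int) = process_quiet;

/* Cold path: the only place that pays for a change of the setting. */
void set_verbose(bool on)
{
    process_impl = on ? process_verbose : process_quiet;
}

/* Hot path: a single indirect call, with no branch on the setting. */
int process(int x)
{
    return process_impl(x);
}
```

Here `process(3)` gives 6 by default and 7 after `set_verbose(true)`. The proposed feature would go further by removing the pointer load and indirect call as well, and would also apply to plain inlined values rather than only whole functions.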
If a given backend does not support the feature it just outputs normal code. There are probably a few more aspects/possibilities to this, I'm just loosely throwing it out for comment atm.
I probably haven't given enough thought to this since I don't have a good idea of the issues associated. LTO would definitely break this, but if you ask me LTO is a bit misguided, as the previously mentioned idea coupled with the way sc and its editor are designed makes it obsolete (unless I'm misunderstanding something).
I've got one and a half more ideas, and possibly more that slipped my mind, but these should be enough for now.
Any feedback and/or questions are very welcome.
Tomasz "timon" Borowik