This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Git and GCC
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, 6 Dec 2007, Jon Loeliger wrote:
>> I guess one question I posit is, would it be more accurate
>> to think of this as a "delta net" in a weighted graph rather
>> than a "delta chain"?
>
> It's certainly not a simple chain, it's more of a set of acyclic directed
> graphs in the object list. And yes, it's weigted by the size of the delta
> between objects, and the optimization problem is kind of akin to finding
> the smallest spanning tree (well, forest - since you do *not* want to
> create one large graph, you also want to make the individual trees shallow
> enough that you don't have excessive delta depth).
>
> There are good algorithms for finding minimum spanning trees, but this one
> is complicated by the fact that the biggest cost (by far!) is the
> calculation of the weights itself. So rather than really worry about
> finding the minimal tree/forest, the code needs to worry about not having
> to even calculate all the weights!
>
> (That, btw, is a common theme. A lot of git is about traversing graphs,
> like the revision graph. And most of the trivial graph problems all assume
> that you have the whole graph, but since the "whole graph" is the whole
> history of the repository, those algorithms are totally worthless, since
> they are fundamentally much too expensive - if we have to generate the
> whole history, we're already screwed for a big project. So things like
> revision graph calculation, the main performance issue is to avoid having
> to even *look* at parts of the graph that we don't need to see!)
Hmmm...
I think that these two problems (find minimal spanning forest with
limited depth and traverse graph) with the additional constraint to
avoid calculating weights / avoid calculating whole graph would be
a good problem to present at CompSci course.
Just a thought...
--
Jakub Narebski
Poland
ShadeHawk on #git