Hi!
In the gcc manual, p.276 (p.288 of the PDF version), the entry for
-funroll-loops says: "This option makes code larger, and may or may not
make it run faster". My first thought is that if you unroll the loops
that have a known number of iterations, you execute far fewer jumps, and
you can replace the loop variable with several constants and
consequently optimize more. Another thing is that the variables that
control loops are often kept in registers, so when they disappear a
register is freed for another variable, which can only improve the
speed, I think.
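
To make the transformation concrete, here is a rough sketch in C of what
unrolling by hand looks like. The function names sum_rolled and
sum_unrolled4 and the unroll factor of 4 are my own choices for
illustration; this is not necessarily what gcc generates with
-funroll-loops, only the general idea.

```c
#include <stddef.h>

/* Rolled version: one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 (factor chosen only for illustration):
   one compare-and-branch per four elements, followed by a
   remainder loop for the 0 to 3 elements left over. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        s += a[i];
    return s;
}
```

Both functions return the same sum. The second one takes fewer jumps per
element, but its body is clearly larger, which matches the "makes code
larger" part of the manual's sentence.
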
Finally, I would like to know some reasons why unrolling loops could
make the code slower.
And maybe we (I) could then write to the people who maintain the manual
and ask them to add what is said here, to improve the manual, because I
find the "may or may not" quite weak for a manual.