... I should also mention that I experimented with teaching the loop unroller not to give up, but instead to unroll the loop fewer times, so that the expected number of iterations is roughly 3*n_unrollings. This did not score well on SPEC (it produced both slower code and bigger binaries), so I went with this patch instead.

Honza