As Uros says in bug 54507, the reflect test from libgo is another example.
Without var tracking it takes 20s on my x86_64 box (tested with a 32-bit
compiler). With var tracking it takes 46m 40s.
It only happens with -m32. It takes less than 20 seconds for
-m64 and -mx32. With LRA turned off, it takes only 13 seconds to finish.
It isn't LRA that doesn't scale; it is var-tracking's handling of memory clobbers that needs improvement. Still, it would be nice to analyze which changes LRA makes compared to reload that cause var-tracking to take that much longer, and whether the generated code is better or worse. If it is better, this is just a dup of PR54402.
LRA reuses stack memory much better than reload (in all modes, but especially at -O0). Maybe that is the reason for the var-tracking problem.
The testcase is compiled with -O2, not -O0.
(In reply to comment #2)
> LRA reuses stack memory much better than reload (in all modes, but especially
> at -O0). Maybe that is the reason for the var-tracking problem.
I forgot to say that LRA understands -fno-ira-share-spill-slots. In this case, each pseudo gets its own stack slot.
I think it is worth trying.
-fno-ira-share-spill-slots doesn't make a difference.
I believe that on that testcase it was because without LRA the function didn't use a frame pointer, while with LRA, for some reason, it does.
(In reply to comment #6)
> I believe on that testcase it was because without LRA the function didn't use a
> frame pointer, while with LRA for some reason it does.
Can we make this bug a dup or is this now about IRA vs. LRA and the frame
GCC 4.8.0 is being released, adjusting target milestone.
GCC 4.8.1 has been released.
GCC 4.8.2 has been released.
*** This bug has been marked as a duplicate of bug 54402 ***