This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
gcc optimizes template recursion better - Thank You!!!
- From: "Hite, Christopher" <Christopher dot Hite at partner dot commerzbank dot com>
- To: "'gcc-help at gcc dot gnu dot org'" <gcc-help at gcc dot gnu dot org>
- Cc: 'Jonathan Wakely' <jwakely dot gcc at gmail dot com>
- Date: Wed, 8 Aug 2012 13:11:36 +0200
- Subject: gcc optimizes template recursion better - Thank You!!!
The following code is a simplified version of what someone doing C++ metaprogramming would hand the optimizer, for example via boost::mpl::for_each().
#include <iostream>
void foo(int i){
    std::cout<<i<<std::endl;
}
template<int N>
void bar(){
    bar<N-1>();
    foo(N);
}
template<>
void bar<-1>(){}
int main(){
    bar<1024>();
    return 0;
}
gcc used to emit a bunch of out-of-line bar<N>() instantiations that could have been inlined, but weren't.
g++ -O3 -ftemplate-depth=1034 template_recurision_ctest.cpp
objdump -f -d -C a.out | grep bar |grep ">:"
08048700 <void bar<-1>()>:
08048720 <void bar<19>()>:
08048fb0 <void bar<39>()>:
08049840 <void bar<59>()>:
0804a0d0 <void bar<79>()>:
0804a960 <void bar<99>()>:
0804b1f0 <void bar<159>()>:
0804cb60 <void bar<179>()>:
0804d3f0 <void bar<319>()>:
0804fc00 <void bar<339>()>:
08050490 <void bar<959>()>:
080535a0 <void bar<979>()>:
08053e30 <void bar<1019>()>:
The only way to prevent this was to turn on LTO, which then realized that each of these functions was called from only one place and could be inlined.
g++ 4.7.1 gets it right and inlines all calls to foo() into main(). Awesome!!! What changed? Who do I thank?
Even cooler, the debug information lets gdb attribute the code to a stack of 1000 bar()s.
Chris