This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Peeling loops at tree level?
- From: Richard Guenther <rguenth at tat dot physik dot uni-tuebingen dot de>
- To: gcc at gcc dot gnu dot org
- Date: Mon, 13 Sep 2004 16:09:03 +0200 (CEST)
- Subject: Peeling loops at tree level?
Hi!
Do we peel small loops at the tree level yet? I suspect not. The lack of
peeling seems to inhibit further tree-level optimization of inlined,
dimension-unaware code like:
template <int Dim>
struct Vector
{
  int operator[](int i) const { return val[i]; }
  int val[Dim];
};

template <int Dim>
inline int foo(const Vector<Dim>& x)
{
  int res;
  for (int i=0; i<Dim; ++i)
    res += x[i];
  return res;
}

int bar(const Vector<3>& x)
{
  return foo(x);
}
where the optimized tree dump for -O2 -funroll-loops looks like
;; Function int bar(const Vector<3>&) (_Z3barRK6VectorILi3EE)

int bar(const Vector<3>&) (x)
{
  struct Vector<3> & x.15;
  <unnamed type> D.1651;
  <unnamed type> D.1652;
  const int * ivtmp.8;
  int i.2;
  int D.1634;
  struct Vector<3> * const this;
  int i;
  int i;
  int res;
  int D.1625;
  int retval.1;
  int D.1623;
  bool retval.0;
  struct Vector<3> & x;
  int D.1591;
  int D.1590;

<bb 0>:
  ivtmp.8 = &x->val[0];
  i = 0;
Invalid sum of incoming frequencies 12233, should be 10000

<L0>:;
  res = *ivtmp.8 + res;
  D.1652 = (<unnamed type>) i + 1;
  i = (int) D.1652;
  ivtmp.8 = ivtmp.8 + 4B;
  if (D.1652 != 3) goto <L0>; else goto <L3>;
Invalid sum of incoming frequencies 1100, should be 3333

<L3>:;
  return res;

}
and the loop is unrolled only in the assembler dump (g++-3.5 (GCC) 4.0.0
20040913 (experimental)).
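To illustrate what complete peeling would mean here, the following is a
hand-written sketch of the code I would hope to see for Dim = 3 once the
loop is gone. The names foo_peeled and bar_peeled are made up for this
example, and res is initialized to 0 as an assumption (the snippet above
leaves it uninitialized). This is not compiler output, only an
approximation of the intended result:

```cpp
#include <cassert>

template <int Dim>
struct Vector
{
  int operator[](int i) const { return val[i]; }
  int val[Dim];
};

// Hypothetical hand-peeled equivalent of foo<3>: the three iterations are
// written out, so no loop, induction variable, or exit test remains.
// Assumption: res starts at 0 (the posted code leaves it uninitialized).
inline int foo_peeled(const Vector<3>& x)
{
  int res = 0;
  res += x[0];  // iteration i = 0
  res += x[1];  // iteration i = 1
  res += x[2];  // iteration i = 2
  return res;
}

// Hypothetical counterpart of bar() that calls the peeled version.
int bar_peeled(const Vector<3>& x)
{
  return foo_peeled(x);
}
```

With the accesses in straight-line form like this, later tree passes
could presumably treat them as three independent loads and adds rather
than a pointer induction variable.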
Is there some magic option to tell the tree-level loop optimizer to
completely peel loops with a constant iteration count?
Thanks,
Richard.
--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/