This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
empty loop elimination (lack of) ?
- From: Cyrille Chepelov <cyrille at chepelov dot org>
- To: gcc at gcc dot gnu dot org
- Date: Mon, 28 Jan 2002 22:03:17 +0100
- Subject: empty loop elimination (lack of) ?
Hi,
while testing some code on Borland C++, and as a matter of cross-checking,
I've found that there is an opportunty for optimisation that g++ misses
(bcc32 also, but that is off-topic).
Consider the attached loop-destroy.cpp file, in particular the compilation
of List<int>::isValid(). gcc (2.95.3 and 3.0.3 on ix86 give me the following
results (with -O3):
_isValid__Ct4List1Zi:
pushl %ebp
xorl %eax,%eax
movl %esp,%ebp
.align 4
L15:
incl %eax
cmpl $3,%eax
jbe L15
movl %ebp,%esp
movb $1,%al
popl %ebp
ret
... including an empty, useless stunt with %eax, which is finally replaced
with the contents of the last movb. (There is also some games with %esp and
%ebp, which don't look totally useful, but 3.0.3 looks better than 2.95.3
with that respect).
(in case you wonder, bcc32 makes even more stupid things).
Compiling with -O3 -funroll-loops gives this:
_isValid__Ct4List1Zi:
pushl %ebp
movb $1,%al
movl %esp,%ebp
movl %ebp,%esp
popl %ebp
ret
.. somewhat better, still some stuff which probably ought to be optimised.
As a point of comparison, VC++ 6.0 is completely inlining this stuff (with
fairly default optimisations) into a single (true) constant. Just as what
g++ does when the original function is simply a "return true;"
2.95.3 and 3.0.3 give the same results, give or take a movl %ebp,%esp.
1) How hard would it be to notice that the loop is empty and produces no
useful results (without unrolling it), and then to remove it ?
2) How hard would it be to re-do an inlining pass once the loop
unrolling/empty loop removal pass has been done ?
3) How hard would it be to omit stack frames whenever there is no
dereference in a tail function ? Wouldn't that be a sensible default ?
I'd be more than happy to give that a try, if someone gently shows me where
to fiddle with (no warranties, though).
-- Cyrille
For comparison's sake, there is a tarball at
http://www.chepelov.org/cyrille/loop-destroy.tar.gz, which contains this
.cpp file, and assembly outputs of a few compilers.
--
Grumpf.
#include <cstdio>
template<typename T>
class Validator {
public:
static bool isValid(const T* t) { return true; }
};
template<typename T> class List {
private:
T a_values[4];
Validator<T> a_validator;
public:
List();
bool isValid() const;
};
template<typename T> bool List<T>::isValid() const {
for (unsigned int i = 0; i < 4; ++i) {
if (!a_validator.isValid(&(a_values[i]))) return false;
}
return true;
}
int main (int argc, char **argv) {
List<int> foo;
std::printf("Hello: %d\n",foo.isValid());
return 0;
}