A commonly used benchmark contains a hot loop which calls one of two virtual functions via a static variable which is set just before the loop. A reduced example is:

    int f1(int x) { return x + 1; }
    int f2(int x) { return x + 2; }

    static int (*virt)(int);

    int f(int *p, int n, int x)
    {
      int i;
      if (x > 10)
        virt = f1;
      else
        virt = f2;
      for (i = 0; i < n; i++)
        p[i] = virt (i);
    }

This is obviously very stupid code, however GCC could do better. virt is not address-taken and neither f1 nor f2 makes a call, so within the loop virt is effectively a local variable. So the loop could be transformed to

    p[i] = (x > 10) ? f1 (i) : f2 (i);

to enable inlining, after which the if can be lifted out of the loop:

    int f_opt(int *p, int n, int x)
    {
      int i;
      if (x > 10)
        virt = f1;
      else
        virt = f2;
      if (x > 10)
        for (i = 0; i < n; i++)
          p[i] = f1 (i);
      else
        for (i = 0; i < n; i++)
          p[i] = f2 (i);
    }
To some extent this is loop versioning. There is another bug where we don't need the versioning and GCC does not devirtualize the loop, but I can't find it right now; this bug should depend on that bug. Anyway, confirmed.
Note that for inlining to kick in, all of this has to be done during early optimization, which is somewhat against its intent (performing transforms that reduce code size). Putting x > 10 ? f1 : f2 into the loop also goes against any reasonable heuristic (move invariant computations out of the loop). Thus it's going to be tricky overall to teach GCC to handle this situation.

Can you share preprocessed source of that "commonly used benchmark" (or at least name it)?
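To make the ordering problem concrete, here is a minimal sketch of the three stages the transformation would have to pass through, written out by hand. The names fill_indirect, fill_select, fill_versioned, stages_agree and check_stages are hypothetical and only serve this illustration; f1 and f2 are made static here (unlike in the report) to keep the example in one translation unit. It does not show what GCC does today, only that the three forms compute the same result.

```c
#include <assert.h>
#include <string.h>

/* Same callees as in the report, made static for this sketch (assumption). */
static int f1(int x) { return x + 1; }
static int f2(int x) { return x + 2; }

static int (*virt)(int);

/* Stage A: the original form. The call is indirect through a
   loop-invariant pointer, so the inliner has nothing to work with. */
static void fill_indirect(int *p, int n, int x)
{
  virt = (x > 10) ? f1 : f2;
  for (int i = 0; i < n; i++)
    p[i] = virt(i);
}

/* Stage B: the invariant condition is rematerialized inside the loop.
   Both calls are now direct and can be inlined, but the loop body
   contains a branch that invariant motion would normally avoid. */
static void fill_select(int *p, int n, int x)
{
  virt = (x > 10) ? f1 : f2;  /* store kept, virt may be read elsewhere */
  for (int i = 0; i < n; i++)
    p[i] = (x > 10) ? f1(i) : f2(i);
}

/* Stage C: the condition is hoisted again (loop versioning) and the
   callees are inlined by hand. */
static void fill_versioned(int *p, int n, int x)
{
  virt = (x > 10) ? f1 : f2;
  if (x > 10)
    for (int i = 0; i < n; i++)
      p[i] = i + 1;
  else
    for (int i = 0; i < n; i++)
      p[i] = i + 2;
}

/* Returns 1 if all three stages produce identical output for this x. */
int stages_agree(int x)
{
  enum { N = 64 };
  int a[N], b[N], c[N];
  fill_indirect(a, N, x);
  fill_select(b, N, x);
  fill_versioned(c, N, x);
  return memcmp(a, b, sizeof a) == 0 && memcmp(a, c, sizeof a) == 0;
}

/* Exercises both sides of the x > 10 condition, including the boundary. */
void check_stages(void)
{
  assert(stages_agree(5));
  assert(stages_agree(10));
  assert(stages_agree(11));
}
```

Calling check_stages() from a small main is enough to run it. Stage B is the form that has to exist, at least briefly, before the inliner runs, and it is the one that conflicts with the usual invariant-motion heuristics.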
I think this shows up in the h264 code in SPEC2006.