This is the mail archive of the mailing list for the GCC project.


[Bug c++/11131] New: Unrelated declaration removes inline flag from function


           Summary: Unrelated declaration removes inline flag from function
           Product: gcc
           Version: 3.4
            Status: UNCONFIRMED
          Keywords: pessimizes-code
          Severity: normal
          Priority: P2
         Component: c++

This is a weird failure: for this code, compiling with FOO=foo1
yields worse results than with FOO=foo2 (despite the throw() specification)
if and only if the forward declaration of the specialization of UNRELATED
is present:
template <int>
struct A {
    void foo1 () throw ();
    void foo2 ();

    void UNRELATED ();
};

template <> void A<0>::UNRELATED ();

template <int dim> inline void A<dim>::foo1 () throw () {}
template <int dim> inline void A<dim>::foo2 ()          {}

void bar (A<0> &a) {
  a.FOO ();
}

Compiling on an x86 box with present mainline like this
    c++ -O2 -fPIC -S -o 1.s -DFOO=foo1
    c++ -O2 -fPIC -S -o 2.s -DFOO=foo2
yields two assembler files for the cases where bar either
calls A<0>::foo1 or A<0>::foo2. One would expect the
output to be the same, since neither foo1 nor foo2 can
throw exceptions, so the exception specification seems
redundant to what the compiler can infer anyway.

Alas, this is not the case. The output for calling foo2 is optimal
(labels etc deleted): 
	pushl	%ebp
	movl	%esp, %ebp
	popl	%ebp
The output for calling foo1, on the other hand, is outrageously bad:
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	subl	$4, %esp
	movl	8(%ebp), %eax
	call	.LPR3
	addl	$_GLOBAL_OFFSET_TABLE_, %ebx
	movl	%eax, (%esp)
	call	_ZN1AILi0EE4foo1Ev@PLT
	popl	%eax
	popl	%ebx
	popl	%ebp

	pushl	%ebp
	movl	%esp, %ebp
	popl	%ebp

	movl	(%esp), %ebx

First, A<0>::foo1 is not inlined, and the call to LPR3 is really atrocious!
The really mysterious thing is that calling foo1 produces exactly the
same (optimal) code as for foo2 when the specialization declaration of
UNRELATED is not present, or if it is for some other template value
than the one that is also used in bar(). In other words, the right
question seems to be not why foo2 is better than foo1, but: what does
UNRELATED have to do with it?

Now, admittedly the problem also goes away if one uses -O3 instead of
-O2, or more precisely -finline-functions, but that flag is somewhat
unpopular due to its compile time implications. The generated code is so
obviously bad that I think it would be worthwhile to figure out what
the specialization has to do with it. Something's really not right
with that, and it might lead to problems with -O3 in other contexts.

I tried to poke a little at what goes on in the compiler, but the
output for the two cases generated by -da differs right from the first
phase. Note that UNRELATED does not appear even once in
the output files of -da, so the problem really seems to originate
in the front-end.

The obvious guess, that UNRELATED strips the "inline" attribute from
the _next_ function, is also not true, since exchanging the definitions
of foo[12] does not change the result. On the other hand, placing the
declaration of the specialization _after_ foo[12] _does_ make the
problem go away. I fear this is about as far as I can help...
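
For reference, the reordering described in the last paragraph can be
sketched as below. This is only an illustration of that variant, not a
fix: the NOTHROW macro and the default value for FOO are additions of
this sketch (assumptions so that the file also compiles with newer
language standards and without -DFOO=...); the original test case spells
out throw () directly and always passes FOO on the command line.

```cpp
// Sketch of the variant with the specialization declaration moved
// _after_ the inline definitions of foo1/foo2.

// Assumption of this sketch: use noexcept where throw () is deprecated
// or no longer accepted.
#if __cplusplus >= 201103L
#  define NOTHROW noexcept
#else
#  define NOTHROW throw ()
#endif

// Assumption of this sketch: default to foo1 when -DFOO=... is not given.
#ifndef FOO
#  define FOO foo1
#endif

template <int>
struct A {
    void foo1 () NOTHROW;
    void foo2 ();

    void UNRELATED ();
};

template <int dim> inline void A<dim>::foo1 () NOTHROW {}
template <int dim> inline void A<dim>::foo2 ()         {}

// Declaration of the specialization, now placed after foo1/foo2
// but still before the first use of A<0> in bar().
template <> void A<0>::UNRELATED ();

void bar (A<0> &a) {
  a.FOO ();
}
```

Compiling this once with -DFOO=foo1 and once with -DFOO=foo2, using the
same flags as above, and comparing the two assembler files should show
whether the ordering still matters for a given compiler version.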

