This code: ---- (example.cpp) class Fred { public: Fred(int x) : x_(x) { } private: int x_; }; class Barney : public Fred { public: Barney(int x, int y) : Fred(x), y_(y) { } private: int y_; }; extern const Barney abarney(5, 6); extern const Fred afred(2); ---- when compiled liked this: g++ -pipe -march=athlon -O2 -S example.cpp results in this assembly: ---- (pessimized example.s) .file "example.cpp" .version "01.01" gcc2_compiled.: .globl abarney .bss .align 4 .type abarney,@object .size abarney,8 abarney: .zero 8 .globl afred .align 4 .type afred,@object .size afred,4 afred: .zero 4 .text .align 16 .type __static_initialization_and_destruction_0,@function __static_initialization_and_destruction_0: .LFB1: pushl %ebp .LCFI0: movl %esp, %ebp .LCFI1: movl 12(%ebp), %eax movl 8(%ebp), %edx cmpl $65535, %eax jne .L3 cmpl $1, %edx jne .L3 movl $5, abarney movl $6, abarney+4 .L3: cmpl $65535, %eax jne .L11 decl %edx jne .L11 movl $2, afred .L11: popl %ebp ret .LFE1: .Lfe1: .size __static_initialization_and_destruction_0,.Lfe1-__static_initialization_and_destruction_0 .align 16 .type _GLOBAL_.I.abarney,@function _GLOBAL_.I.abarney: .LFB2: pushl %ebp .LCFI2: movl %esp, %ebp .LCFI3: subl $16, %esp .LCFI4: pushl $65535 pushl $1 .LCFI5: call __static_initialization_and_destruction_0 addl $16, %esp leave ret .LFE2: .Lfe2: .size _GLOBAL_.I.abarney,.Lfe2-_GLOBAL_.I.abarney .section .ctors,"aw" .long _GLOBAL_.I.abarney .ident "GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.1 2.96-81)" ---- when it should've resulted in something like this: ---- .file "example2.cpp" .version "01.01" gcc2_compiled.: .globl abarney .section .rodata .align 4 .type abarney,@object .size abarney,8 abarney: .long 5 .long 6 .globl afred .align 4 .type afred,@object .size afred,4 afred: .long 2 .ident "GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.1 2.96-81)" ---- The results of all these initializers can be statically determined at compile time. There is no reason I can think of they shouldn't be stuck in the .rodata section. If they were not const, they could be stuck in the .data section. Even if the initializer is run, the initializer itself will make repeated comparisons of %eax against $65535 instead of simply jumping to the end if the first such comparison registers as being equal. This situation is also easily determined by simple analysis, and jump optimization in ordinary code would catch it. Actually, the repeated compare jump sequence can be created with non-initializer code too. Here's an example program that I've been told produced sub-optimal assembly: int abarney[2]; int afred[1]; void foo(int edx, int eax) { if (eax == 65535) { if (edx == 1) { abarney[0] = 5; abarney[1] = 6; } } if (eax == 65535) { if (--edx == 0) afred[0] = 2; } } Release: gcc 2.96 (Yeah, yeah, I know, RedHat's, but I bet gcc 2.95.3 does it too) Environment: RH Linux 7.0
State-Changed-From-To: open->analyzed State-Changed-Why: yes, it bugs me too sometimes :-)
Still happens on the mainline (20030615) and changing the symmary to really what this bug is about, the initializers are just that simplified which is caused by this.
Well, actually, the bug is two issues. The first is that trivial static initializers of const objects like this should be analyzed at compile time, and put in the .rodata segment, provided, of course, there are equally trivial destructors. If it's non-const or there are trivial constructors, but not trivial destructors, then it should be put into the .data segment. There is no reason those things need to have any run time code at all. The second issue is the "if statements are not combined together" issue.
In fact if the if are combined together before emitting to rtl some of the problems in bug 10060 would fixed as the number of rtl's would be greatly reduced.
Suspending as the combining of the if is fixed on the tree-ssa branch. On PPC the code now looks like: _foo: li r2,0 cmpwi cr6,r3,1 ori r0,r2,65535 cmpw cr7,r4,r0 bnelr+ cr7 bnelr+ cr6 lis r7,ha16(L_abarney$non_lazy_ptr) lis r5,ha16(L_afred$non_lazy_ptr) lwz r9,lo16(L_abarney$non_lazy_ptr)(r7) li r6,6 li r11,2 lwz r4,lo16(L_afred$non_lazy_ptr)(r5) li r3,5 stw r6,4(r9) stw r11,0(r4) stw r3,0(r9) blr
Closing as fixed for the tree-ssa.
Reopening the bugs that are fixed on the tree-ssa (but not reported against the branch).
Suspending based on this is fixed on the tree-ssa.
We are now missing one jump threading for some reason: (From .DOM3): { <bb 0>: if (eax_1 == 65535) goto <L0>; else goto <L6>; <L0>:; if (edx_2 == 1) goto <L1>; else goto <L4>; <L1>:; abarney[0] = 5; abarney[1] = 6; <L4>:; edx_3 = edx_2 - 1; if (edx_3 == 0) goto <L5>; else goto <L6>; <L5>:; afred[0] = 2; <L6>:; return; }
Subject: Re: if's are not combined together In message <20040304030605.4965.qmail@sources.redhat.com>, "pinskia at gcc dot gnu dot org" writes: > >------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-04 0 >3:06 ------- >We are now missing one jump threading for some reason: >(From .DOM3): >{ ><bb 0>: > if (eax_1 == 65535) goto <L0>; else goto <L6>; > ><L0>:; > if (edx_2 == 1) goto <L1>; else goto <L4>; > ><L1>:; > abarney[0] = 5; > abarney[1] = 6; > ><L4>:; > edx_3 = edx_2 - 1; > if (edx_3 == 0) goto <L5>; else goto <L6>; > ><L5>:; > afred[0] = 2; > ><L6>:; > return; What jump do you think should be threaded? I don't see any jump threading opportunities here -- at least not without knowing that edx_3 is only used once. jeff
Subject: Re: if's are not combined together In message <20040304030605.4965.qmail@sources.redhat.com>, "pinskia at gcc dot gnu dot org" writes: > >------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-04 0 >3:06 ------- >We are now missing one jump threading for some reason: >(From .DOM3): >{ ><bb 0>: > if (eax_1 == 65535) goto <L0>; else goto <L6>; > ><L0>:; > if (edx_2 == 1) goto <L1>; else goto <L4>; > ><L1>:; > abarney[0] = 5; > abarney[1] = 6; > ><L4>:; > edx_3 = edx_2 - 1; > if (edx_3 == 0) goto <L5>; else goto <L6>; > ><L5>:; > afred[0] = 2; > ><L6>:; > return; OK. edx_3 is a single use variable and we can (in theory) propagate it into the conditional and adjust the conditional. ie if (edx_3 == 0) can be turned into if (edx_2 == 1) Once that's done, then the jump threader can thread through that block. I've got a patch which does that as a straightforward extension to tree-ssa-forwprop. I'm still evaluating whether or not I want to risk introducing it now or wait until after we merge. On a positive note, with that improvement to tree-ssa-forwprop tree-ssa generates noticeably better code than the mainline GCC sources. jeff
We now fully thread the IF conditionals in both examples.