Bug 2806 - if's are not combined together
Summary: if's are not combined together
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 2.95.3
: P3 enhancement
Target Milestone: tree-ssa
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2001-05-11 15:06 UTC by eric-gcc
Modified: 2004-03-05 20:54 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work: tree-ssa
Known to fail:
Last reconfirmed: 2004-03-04 03:06:04


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description eric-gcc 2001-05-11 15:06:01 UTC
This code:
----  (example.cpp)
class Fred {
   public:
      Fred(int x) : x_(x) { }

   private:
      int x_;
};

class Barney : public Fred {
   public:
      Barney(int x, int y) : Fred(x), y_(y) { }

   private:
      int y_;
};

extern const Barney abarney(5, 6);
extern const Fred afred(2);
----

when compiled liked this: g++ -pipe -march=athlon -O2 -S example.cpp

results in this assembly:
---- (pessimized example.s)
        .file	"example.cpp"
	.version	"01.01"
gcc2_compiled.:
.globl abarney
.bss
	.align 4
	.type	 abarney,@object
	.size	 abarney,8
abarney:
	.zero	8
.globl afred
	.align 4
	.type	 afred,@object
	.size	 afred,4
afred:
	.zero	4
.text
	.align 16
	.type	 __static_initialization_and_destruction_0,@function
__static_initialization_and_destruction_0:
.LFB1:
	pushl	%ebp
.LCFI0:
	movl	%esp, %ebp
.LCFI1:
	movl	12(%ebp), %eax
	movl	8(%ebp), %edx
	cmpl	$65535, %eax
	jne	.L3
	cmpl	$1, %edx
	jne	.L3
	movl	$5, abarney
	movl	$6, abarney+4
.L3:
	cmpl	$65535, %eax
	jne	.L11
	decl	%edx
	jne	.L11
	movl	$2, afred
.L11:
	popl	%ebp
	ret
.LFE1:
.Lfe1:
	.size	
__static_initialization_and_destruction_0,.Lfe1-__static_initialization_and_destruction_0
	.align 16
	.type	 _GLOBAL_.I.abarney,@function
_GLOBAL_.I.abarney:
.LFB2:
	pushl	%ebp
.LCFI2:
	movl	%esp, %ebp
.LCFI3:
	subl	$16, %esp
.LCFI4:
	pushl	$65535
	pushl	$1
.LCFI5:
	call	__static_initialization_and_destruction_0
	addl	$16, %esp
	leave
	ret
.LFE2:
.Lfe2:
	.size	 _GLOBAL_.I.abarney,.Lfe2-_GLOBAL_.I.abarney
		.section	.ctors,"aw"
	.long	 _GLOBAL_.I.abarney
	.ident	"GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.1 2.96-81)"
----

when it should've resulted in something like this:
---- .file	"example2.cpp"
	.version	"01.01"
gcc2_compiled.:
.globl abarney
		.section	.rodata
	.align 4
	.type	 abarney,@object
	.size	 abarney,8
abarney:
	.long	5
	.long	6
.globl afred
	.align 4
	.type	 afred,@object
	.size	 afred,4
afred:
	.long	2
	.ident	"GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.1 2.96-81)"
----

The results of all these initializers can be statically determined at compile time.  There is no reason I can think of they shouldn't be stuck in the .rodata section.  If they were not const, they could be stuck in the .data section.

Even if the initializer is run, the initializer itself will make repeated
comparisons of %eax against $65535 instead of simply jumping to the end if
the first such comparison registers as being equal.  This situation is also
easily determined by simple analysis, and jump optimization in ordinary
code would catch it.

Actually, the repeated compare jump sequence can be created with non-initializer code too.  Here's an example program that I've been told produced sub-optimal assembly:

int abarney[2];
int afred[1];

void foo(int edx, int eax)
{
  if (eax == 65535)
    {
      if (edx == 1)
        {
          abarney[0] = 5;
          abarney[1] = 6;
        }
    }
  if (eax == 65535)
    {
      if (--edx == 0)
        afred[0] = 2;
    }
}

Release:
gcc 2.96  (Yeah, yeah, I know, RedHat's, but I bet gcc 2.95.3 does it too)

Environment:
RH Linux 7.0
Comment 1 Nathan Sidwell 2002-04-26 02:49:15 UTC
State-Changed-From-To: open->analyzed
State-Changed-Why: yes, it bugs me too sometimes :-)
Comment 2 Andrew Pinski 2003-06-15 19:01:53 UTC
Still happens on the mainline (20030615) and changing the symmary to really what this 
bug is about, the initializers are just that simplified which is caused by this.
Comment 3 eric-gcc 2003-06-16 16:34:38 UTC
Well, actually, the bug is two issues.

The first is that trivial static initializers of const objects like this should
be analyzed at compile time, and put in the .rodata segment, provided, of
course, there are equally trivial destructors.

If it's non-const or there are trivial constructors, but not trivial
destructors, then it should be put into the .data segment.  There is no reason
those things need to have any run time code at all.

The second issue is the "if statements are not combined together" issue.
Comment 4 Andrew Pinski 2003-07-21 02:56:44 UTC
In fact if the if are combined together before emitting to rtl some of the problems in bug 
10060 	would fixed as the number of rtl's would be greatly reduced.
Comment 5 Andrew Pinski 2003-11-17 15:41:02 UTC
Suspending as the combining of the if is fixed on the tree-ssa branch.
On PPC the code now looks like:
_foo:
        li r2,0
        cmpwi cr6,r3,1
        ori r0,r2,65535
        cmpw cr7,r4,r0
        bnelr+ cr7
        bnelr+ cr6
        lis r7,ha16(L_abarney$non_lazy_ptr)
        lis r5,ha16(L_afred$non_lazy_ptr)
        lwz r9,lo16(L_abarney$non_lazy_ptr)(r7)
        li r6,6
        li r11,2
        lwz r4,lo16(L_afred$non_lazy_ptr)(r5)
        li r3,5
        stw r6,4(r9)
        stw r11,0(r4)
        stw r3,0(r9)
        blr
Comment 6 Andrew Pinski 2003-12-17 00:53:16 UTC
Closing as fixed for the tree-ssa.
Comment 7 Andrew Pinski 2003-12-17 17:05:49 UTC
Reopening the bugs that are fixed on the tree-ssa (but not reported against the branch).
Comment 8 Andrew Pinski 2003-12-17 17:10:08 UTC
Suspending based on this is fixed on the tree-ssa.
Comment 9 Andrew Pinski 2004-03-04 03:06:03 UTC
We are now missing one jump threading for some reason:
(From .DOM3):
{
<bb 0>:
  if (eax_1 == 65535) goto <L0>; else goto <L6>;

<L0>:;
  if (edx_2 == 1) goto <L1>; else goto <L4>;

<L1>:;
  abarney[0] = 5;
  abarney[1] = 6;

<L4>:;
  edx_3 = edx_2 - 1;
  if (edx_3 == 0) goto <L5>; else goto <L6>;

<L5>:;
  afred[0] = 2;

<L6>:;
  return;

}
Comment 10 Jeffrey A. Law 2004-03-04 05:00:26 UTC
Subject: Re:  if's are not combined together 

In message <20040304030605.4965.qmail@sources.redhat.com>, "pinskia at gcc dot 
gnu dot org" writes:
 >
 >------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-04 0
 >3:06 -------
 >We are now missing one jump threading for some reason:
 >(From .DOM3):
 >{
 ><bb 0>:
 >  if (eax_1 == 65535) goto <L0>; else goto <L6>;
 >
 ><L0>:;
 >  if (edx_2 == 1) goto <L1>; else goto <L4>;
 >
 ><L1>:;
 >  abarney[0] = 5;
 >  abarney[1] = 6;
 >
 ><L4>:;
 >  edx_3 = edx_2 - 1;
 >  if (edx_3 == 0) goto <L5>; else goto <L6>;
 >
 ><L5>:;
 >  afred[0] = 2;
 >
 ><L6>:;
 >  return;
What jump do you think should be threaded?   I don't see any jump threading
opportunities here -- at least not without knowing that edx_3 is only used
once.

jeff

Comment 11 Jeffrey A. Law 2004-03-05 03:53:24 UTC
Subject: Re:  if's are not combined together 

In message <20040304030605.4965.qmail@sources.redhat.com>, "pinskia at gcc dot 
gnu dot org" writes:
 >
 >------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-04 0
 >3:06 -------
 >We are now missing one jump threading for some reason:
 >(From .DOM3):
 >{
 ><bb 0>:
 >  if (eax_1 == 65535) goto <L0>; else goto <L6>;
 >
 ><L0>:;
 >  if (edx_2 == 1) goto <L1>; else goto <L4>;
 >
 ><L1>:;
 >  abarney[0] = 5;
 >  abarney[1] = 6;
 >
 ><L4>:;
 >  edx_3 = edx_2 - 1;
 >  if (edx_3 == 0) goto <L5>; else goto <L6>;
 >
 ><L5>:;
 >  afred[0] = 2;
 >
 ><L6>:;
 >  return;
OK.  edx_3 is a single use variable and we can (in theory) propagate it into
the conditional and adjust the conditional.  ie

if (edx_3 == 0)

can be turned into

  if (edx_2 == 1)

Once that's done, then the jump threader can thread through that block.

I've got a patch which does that as a straightforward extension to
tree-ssa-forwprop.  I'm still evaluating whether or not I want to risk
introducing it now or wait until after we merge.

On a positive note, with that improvement to tree-ssa-forwprop tree-ssa
generates noticeably better code than the mainline GCC sources.


jeff

Comment 12 Jeffrey A. Law 2004-03-05 20:54:44 UTC
We now fully thread the IF conditionals in both examples.