18076 – Missed jump threading optimization

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization
Version:	unknown

Importance:	P2 enhancement
Target Milestone:	4.1.0
Assignee:	Diego Novillo

URL:
Keywords:	alias, missed-optimization, TREE

Depends on:
Blocks:	jumpthreading

Reported:	2004-10-20 12:03 UTC by Steven Bosscher
Modified:	2005-06-05 07:46 UTC
CC List:	3 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2004-10-20 14:35:09

Description Steven Bosscher 2004-10-20 12:03:38 UTC

On AMD64, the following piece of code still triggers one jump 
threading opportunity in the RTL threader that the tree threader 
misses on the tree-cleanup-branch: 
 
===================================================== 
extern int x; 
extern int y; 
 
void 
foo (void) 
{ 
  if ((x & 0x00000001) || (x & 0x00004000)) 
    y = 0; 
  if ((x & 0x00000001) || (x & 0x00004000)) 
    y = 1; 
  if ((x & 0x00000001) || (x & 0x00004000)) 
    y = 2; 
  if ((x & 0x00000001) || (x & 0x00004000)) 
    y = 3; 
  if ((x & 0x00000001) || (x & 0x00004000)) 
    y = 4; 
} 
===================================================== 
 
This test case was reduced from insn-opinit.c where we're still 
threading 405 jumps, most of them of the kind shown in the test 
case, with the RTL threader. 
 
We do catch this on mainline. 
 
The assembly for the tree-cleanup-branch is worse than mainline: 
 
MAINLINE: -O2                           TCB: -O2 
        .file   "t.c"                           .file   "t.c" 
        .text                                   .text 
        .p2align 4,,15                          .p2align 4,,15 
.globl foo                                      .globl foo 
        .type   foo, @function                  .type   foo, @function 
foo:                                    foo: 
.LFB2:                                  .LFB2: 
        movl    x(%rip), %eax                   movl    x(%rip), %eax 
        movl    %eax, %edx                      movl    %eax, %edx 
        andl    $1, %edx                        andl    $1, %edx 
        jne     .L2                             jne     .L2 
        testb   $64, %ah                        testb   $64, %ah 
        je      .L19                  |         je      .L4 
.L2:                                    .L2: 
        testb   %dl, %dl                        testb   %dl, %dl 
        movl    $0, y(%rip)                     movl    $0, y(%rip) 
        je      .L23                  |         jne     .L5 
                                      >         testb   $64, %ah 
                                      >         je      .L4 
                                      > .L5: 
        testb   %dl, %dl                        testb   %dl, %dl 
        movl    $1, y(%rip)                     movl    $1, y(%rip) 
        je      .L24                  |         je      .L4 
.L10:                                 < 
        testb   %dl, %dl                        testb   %dl, %dl 
        movl    $2, y(%rip)                     movl    $2, y(%rip) 
        je      .L25                  |         je      .L16 
.L14:                                 | .L9: 
        testb   %dl, %dl                        testb   %dl, %dl 
        movl    $3, y(%rip)                     movl    $3, y(%rip) 
        je      .L26                  |         je      .L17 
.L17:                                 | .L12: 
        movl    $4, y(%rip)                     movl    $4, y(%rip) 
.L19:                                 | .L14: 
        rep ; ret                               rep ; ret 
        .p2align 4,,7                           .p2align 4,,7 
.L23:                                 | .L4: 
        testb   $64, %ah                        testb   $64, %ah 
        je      .L19                  |         je      .L14 
        testb   %dl, %dl              < 
        movl    $1, y(%rip)           < 
        jne     .L10                  < 
.L24:                                 < 
        testb   $64, %ah              < 
        je      .L19                  < 
        testb   %dl, %dl                        testb   %dl, %dl 
        movl    $2, y(%rip)                     movl    $2, y(%rip) 
        jne     .L14                  |         jne     .L9 
.L25:                                 |         jmp     .L16 
                                      >         .p2align 4,,7 
                                      > .L17: 
        testb   $64, %ah                        testb   $64, %ah 
        je      .L19                  |         .p2align 4,,2 
                                      >         je      .L14 
                                      >         movl    $4, y(%rip) 
                                      >         .p2align 4,,4 
                                      >         jmp     .L14 
                                      >         .p2align 4,,7 
                                      > .L16: 
                                      >         testb   $64, %ah 
                                      >         .p2align 4,,2 
                                      >         je      .L14 
        testb   %dl, %dl                        testb   %dl, %dl 
        movl    $3, y(%rip)                     movl    $3, y(%rip) 
        jne     .L17                  |         jne     .L12 
.L26:                                 < 
        testb   $64, %ah              < 
        jne     .L17                  < 
        .p2align 4,,2                           .p2align 4,,2 
        ret                           |         jmp     .L17 
.LFE2:                                  .LFE2: 
        .size   foo, .-foo                      .size   foo, .-foo 
 
I guess this could come from no longer iterating DOM (the dominator optimizer)?
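
For reference, the threading opportunity in the test case can be written out by hand. This is a minimal sketch, not compiler output: it assumes x and y are distinct objects, so the stores to y cannot change x, and once the first test has been evaluated the outcome of the remaining four is already known. The names foo_original, foo_threaded, final_y and check_threading_model are made up for illustration.

```c
#include <assert.h>

/* Stand-ins for the test case's extern globals; assumed to be distinct objects. */
int x;
int y;

/* The function from the test case, unchanged apart from its name. */
void foo_original(void)
{
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 0;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 1;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 2;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 3;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 4;
}

/* Hand-threaded model: the condition is tested once, and the four
   later tests are replaced by straight-line code on the taken path.  */
void foo_threaded(void)
{
  if ((x & 0x00000001) || (x & 0x00004000))
    {
      y = 0;
      y = 1;
      y = 2;
      y = 3;
      y = 4;
    }
}

/* Hypothetical helper: the final value of y after running fn with x == value. */
int final_y(void (*fn)(void), int value)
{
  x = value;
  y = -1;
  fn();
  return y;
}

/* Compares both versions for the four combinations of the two tested bits. */
void check_threading_model(void)
{
  static const int values[] = { 0x0, 0x1, 0x4000, 0x4001 };
  unsigned i;

  for (i = 0; i < sizeof values / sizeof values[0]; i++)
    assert(final_y(foo_original, values[i]) == final_y(foo_threaded, values[i]));
}
```

Calling check_threading_model() from a test driver exercises all four bit combinations; the model only captures the control flow, not what any particular pass actually emits.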

Comment 1 Andrew Pinski 2004-10-20 14:35:08 UTC

If I make x a parameter and y a local variable (returning y so that it is still used), then it works at the 
tree level, so this is a missed optimization caused by aliasing.
That is, this works:
int
foo (int x)
{
  int y = -1;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 0;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 1;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 2;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 3;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 4;
  return y;
}
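
As a rough illustration of what "works" can mean for the local-variable variant: with no memory involved, all five tests depend only on x, so the whole function is equivalent to a single mask test. The sketch below shows that equivalence in plain C. It is not a dump of what GCC produces, and the names foo_local, foo_local_folded and check_folded_model are hypothetical.

```c
#include <assert.h>

/* The variant from this comment, renamed so both versions can coexist. */
int foo_local(int x)
{
  int y = -1;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 0;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 1;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 2;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 3;
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 4;
  return y;
}

/* Equivalent form: (x & 1) || (x & 0x4000) is the same as
   (x & 0x4001) != 0, and only the last assignment is observable.  */
int foo_local_folded(int x)
{
  return (x & 0x00004001) ? 4 : -1;
}

/* Compares both versions on a few representative inputs. */
void check_folded_model(void)
{
  static const int values[] = { 0, 1, 2, 0x4000, 0x4001, -1 };
  unsigned i;

  for (i = 0; i < sizeof values / sizeof values[0]; i++)
    assert(foo_local(values[i]) == foo_local_folded(values[i]));
}
```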

Comment 2 Jeffrey A. Law 2005-02-14 20:43:52 UTC

I'll note the updated jump threading selection code will catch all these
threading opportunities.  I get something like this:


foo:
        pushl   %ebp
        movl    %esp, %ebp
        movl    x, %eax
        testb   $1, %al
        jne     .L2
        testb   $64, %ah
        je      .L7
.L2:
        movl    $3, y
        movl    $4, y
.L7:
        leave
        ret


I'll note we still have dead stores.   :(  They are missed by the tree-ssa
optimizers because we don't handle V_MUST_DEF, and by the RTL optimizers for
reasons unknown.
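
To make the remaining problem concrete, here is a C-level reading of the assembly above next to the form one would expect once dead store elimination also removes the store of 3. This is a sketch under the assumption that x and y are ordinary, non-volatile, distinct globals. The names foo_as_emitted, foo_after_dse and final_y are hypothetical and used only for illustration.

```c
#include <assert.h>

/* Stand-ins for the extern globals of the test case. */
int x;
int y;

/* Roughly what the assembly above does: one combined test, then two
   back-to-back stores.  "testb $64, %ah" checks bit 14 of x, which is
   the 0x00004000 mask from the source.  */
void foo_as_emitted(void)
{
  if ((x & 0x00000001) || (x & 0x00004000))
    {
      y = 3;  /* dead: overwritten immediately below */
      y = 4;
    }
}

/* Expected shape once the dead store is removed as well. */
void foo_after_dse(void)
{
  if ((x & 0x00000001) || (x & 0x00004000))
    y = 4;
}

/* Hypothetical helper: the final value of y after running fn with x == value. */
int final_y(void (*fn)(void), int value)
{
  x = value;
  y = -1;
  fn();
  return y;
}
```

Both functions leave the same final value in y for every x. The only difference is the extra store, which is why it counts as dead.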

Comment 3 Jeffrey A. Law 2005-04-23 01:01:42 UTC

The threading part of this has been fixed.  Now we just need to fix DSE to
finish cleaning things up.

Comment 4 Steven Bosscher 2005-04-23 16:54:07 UTC

Nice.  And indeed surprising that the RTL DSE doesn't catch that trivially 
dead store.  Should I open a separate bug report for that?

Comment 5 Jeffrey A. Law 2005-04-25 05:02:11 UTC

Subject: Re:  Missed jump threading optimization

On Sat, 2005-04-23 at 16:54 +0000, steven at gcc dot gnu dot org wrote:
> ------- Additional Comments From steven at gcc dot gnu dot org  2005-04-23 16:54 -------
> Nice.  And indeed surprising that the RTL DSE doesn't catch that trivially 
> dead store.  Should I open a separate bug report for that? 
Your call.  Or we could just link this bug into the existing DSE bug.

I think we'd be better off improving the tree DSE rather than the RTL
stuff.  This one is something we really should be catching before
we hand the code off to the RTL expanders.

jeff

Comment 6 Andrew Pinski 2005-05-08 18:04:55 UTC

Fixed both the DSE and the threading issue.