Bug 56631 - duplicated sse code in switch
Summary: duplicated sse code in switch
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: unknown
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2013-03-16 11:34 UTC by Ondrej Bilka
Modified: 2021-12-05 10:37 UTC
CC List: 0 users

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work: 7.1.0
Known to fail: 5.1.0, 6.4.0
Last reconfirmed:


Attachments
testcase (439 bytes, text/x-csrc)
2013-03-16 11:36 UTC, Ondrej Bilka

Description Ondrej Bilka 2013-03-16 11:34:51 UTC
Consider the attached testcase. When compiled with -Os, -O2, or -O3, GCC duplicates the zeroing of the xmm1 register in every branch of the switch. Moving the zeroing before the branches would save space.

The relevant assembly at -Os is:

  jmp *.L19(,%rax,8)
  .section  .rodata
  .align 8
  .align 4
.L19:
  .quad .L21
  .quad .L4
  .quad .L5
snip 

.L21:
  xorps %xmm1, %xmm1
.L38:
  movaps  %xmm0, %xmm2
  pcmpeqb %xmm1, %xmm2
  pmovmskb  %xmm2, %eax
  testl %eax, %eax
  jne .L1
.L2:
  movdqu  %xmm0, (%rdi)
  addq  $64, %rdi
  movups  64(%rsi), %xmm0
  addq  $64, %rsi
  jmp .L38
.L4:
  xorps %xmm1, %xmm1
  incq  %rdi
.L23:
snip
.L5:
  xorps %xmm1, %xmm1
  addq  $2, %rdi
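
For readers without the attachment, the shape of code involved can be illustrated with a hypothetical sketch. This is not the attached testcase: the function name `scan_for_zero` and its `skip` parameter are invented for illustration, and `__builtin_ctz` assumes GCC or Clang. The zero vector is needed on every path through the switch, so the source creates it once before branching. Whether the generated code keeps a single zeroing instruction depends on the compiler version and flags.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical illustration, not the attached testcase.
   Skips `skip` bytes (0..2), then scans 16 bytes at a time and returns the
   offset from s of the first zero byte found. The caller must guarantee
   that at least 16 bytes past the terminator are readable. */
size_t scan_for_zero(const char *s, unsigned skip)
{
    const char *p = s;

    /* Needed on every path below, so it is created once, before the switch. */
    const __m128i zero = _mm_setzero_si128();

    switch (skip) {
    case 1: p += 1; break;
    case 2: p += 2; break;
    default: break;
    }

    for (;;) {
        __m128i v = _mm_loadu_si128((const __m128i *)p);
        /* One mask bit per byte; a set bit marks a zero byte. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(v, zero));
        if (mask != 0)
            return (size_t)(p - s) + (size_t)__builtin_ctz((unsigned)mask);
        p += 16;
    }
}
```

Compiling such a function with `-Os -S` and counting the instructions that zero the vector register is one way to check whether a given GCC version shows the duplication.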
Comment 1 Ondrej Bilka 2013-03-16 11:36:04 UTC
Created attachment 29678
testcase
Comment 2 Andrew Pinski 2021-12-05 10:37:12 UTC
Fixed in GCC 7 by an extra copy loop header pass.
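
For context, loop header copying duplicates a loop's entry test in front of the loop so that the loop itself becomes a bottom-tested (do-while) loop. The sketch below shows the transformation applied by hand to a hypothetical function. Both function names are invented for illustration, and this is not the code GCC generates for the testcase, only the general idea of the transformation.

```c
#include <stddef.h>

/* Original form: the exit test sits at the top of the loop. */
size_t count_nonzero_prefix(const unsigned char *p, size_t n)
{
    size_t i = 0;
    while (i < n && p[i] != 0)
        i++;
    return i;
}

/* Hand-applied header copying: the test is duplicated in front of the
   loop, and the loop is rotated into a do-while with the test at the
   bottom. Behavior is identical to the version above. */
size_t count_nonzero_prefix_rotated(const unsigned char *p, size_t n)
{
    size_t i = 0;
    if (i < n && p[i] != 0) {
        do {
            i++;
        } while (i < n && p[i] != 0);
    }
    return i;
}
```

After rotation, code placed just before the loop runs only when the loop body executes at least once, which tends to make it easier for later passes to place loop-invariant setup in one spot.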