56631 – duplicated sse code in switch

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization
Version:	unknown

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2013-03-16 11:34 UTC by Ondrej Bilka
Modified:	2021-12-05 10:37 UTC
CC List:	0 users

See Also:
Host:
Target:	x86_64--
Build:
Known to work:	7.1.0
Known to fail:	5.1.0, 6.4.0
Last reconfirmed:

Attachments
testcase (439 bytes, text/x-csrc) 2013-03-16 11:36 UTC, Ondrej Bilka

Description Ondrej Bilka 2013-03-16 11:34:51 UTC

Consider the attached testcase. When compiled with -Os, -O2, or -O3, GCC duplicates the zeroing of the xmm1 register in every branch of the switch. Hoisting the zeroing before the branches would save space.

The relevant assembly at -Os is:

  jmp *.L19(,%rax,8)
  .section  .rodata
  .align 8
  .align 4
.L19:
  .quad .L21
  .quad .L4
  .quad .L5
snip 

.L21:
  xorps %xmm1, %xmm1
.L38:
  movaps  %xmm0, %xmm2
  pcmpeqb %xmm1, %xmm2
  pmovmskb  %xmm2, %eax
  testl %eax, %eax
  jne .L1
.L2:
  movdqu  %xmm0, (%rdi)
  addq  $64, %rdi
  movups  64(%rsi), %xmm0
  addq  $64, %rsi
  jmp .L38
.L4:
  xorps %xmm1, %xmm1
  incq  %rdi
.L23:
snip
.L5:
  xorps %xmm1, %xmm1
  addq  $2, %rdi
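
For readers without access to the attachment, the following is a minimal sketch of the kind of code this report is about. It is a hypothetical reduction, not the attached testcase: the function name copy_until_zero, its parameters, and the case values are all assumptions made for illustration. Each switch case adjusts the destination pointer and then enters a shared SSE2 loop that compares every 16-byte block against an all-zero vector. The zero vector is created once, before the switch, which is the placement the description asks the compiler to produce instead of one xorps per case label. Whether a given GCC version emits duplicated zeroing for this exact sketch has not been verified.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical reduction, NOT the attached testcase.
   Copies 16-byte blocks from src to dst + skip until a block that
   contains a zero byte is found, and returns the number of blocks copied.
   The caller must guarantee that src holds such a block and that dst is
   large enough for everything copied before it. */
size_t copy_until_zero(char *dst, const char *src, unsigned skip)
{
    assert(dst != NULL && src != NULL);

    /* Single zeroing shared by every case (the xorps in the listing). */
    const __m128i zero = _mm_setzero_si128();
    size_t blocks = 0;

    /* Each case only adjusts the pointer; none needs its own zero vector. */
    switch (skip) {
    case 1:
        dst += 1;
        break;
    case 2:
        dst += 2;
        break;
    default:
        break;
    }

    for (;;) {
        __m128i v = _mm_loadu_si128((const __m128i *)src);

        /* pcmpeqb + pmovmskb: a nonzero mask means the block has a zero byte. */
        if (_mm_movemask_epi8(_mm_cmpeq_epi8(v, zero)) != 0)
            return blocks;

        _mm_storeu_si128((__m128i *)dst, v);
        dst += 16;
        src += 16;
        blocks++;
    }
}
```

Compiling a file containing this function with gcc -Os -S and looking at the code around the jump table is one way to check where a particular compiler places the zeroing.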

Comment 1 Ondrej Bilka 2013-03-16 11:36:04 UTC

Created attachment 29678
testcase

Comment 2 Andrew Pinski 2021-12-05 10:37:12 UTC

Fixed in GCC 7 by an extra copy loop header.