Summary: | Excessive loop versioning done by vectorization + predictive commoning | ||
---|---|---|---|
Product: | gcc | Reporter: | Richard Biener <rguenth> |
Component: | tree-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | Keywords: | missed-optimization |
Priority: | P3 | ||
Version: | 4.7.0 | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Known to work: | ||
Known to fail: | Last reconfirmed: | ||
Bug Depends on: | |||
Bug Blocks: | 53947 |
Description
Richard Biener
2011-07-27 14:20:09 UTC
On the trunk I only see one copy of the loop:

.L11:
        movups  (%rbx,%rax), %xmm7
        movups  0(%rbp,%rax), %xmm0
        movups  (%r9,%rax), %xmm1
        subps   %xmm7, %xmm0
        movups  (%r11,%rax), %xmm7
        addq    $16, %rax
        addps   %xmm7, %xmm1
        mulps   %xmm6, %xmm0
        mulps   %xmm1, %xmm0
        movaps  %xmm0, %xmm1
        addss   %xmm2, %xmm1
        movaps  %xmm0, %xmm2
        shufps  $85, %xmm0, %xmm2
        addss   %xmm2, %xmm1
        movaps  %xmm0, %xmm2
        unpckhps        %xmm0, %xmm2
        shufps  $255, %xmm0, %xmm0
        addss   %xmm2, %xmm1
        movaps  %xmm1, %xmm2
        addss   %xmm0, %xmm2
        cmpq    %rax, %r10
        jne     .L11

pcom and vect have been exchanged, so we no longer perform predictive commoning. We're also better at identifying single-iteration loops. Let's close the issue.
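For readers without the original test case at hand, here is a rough C sketch of a loop with the same arithmetic shape as the vectorized body quoted in this comment: four unaligned loads per iteration, a subtract, an add, two multiplies, and the four lanes added one after another into a scalar accumulator. It is an assumption reconstructed from the assembly alone, not the code from this report; the function name `weighted_sum` and all parameter names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical reconstruction, NOT the test case from this report.
 * Each element contributes ((a[i] - b[i]) * k) * (c[i] + d[i]),
 * matching the subps / addps / mulps / mulps sequence in the loop.
 * The accumulation into `sum` is done strictly in index order, which
 * corresponds to the chain of addss instructions on the four lanes. */
float weighted_sum(const float *a, const float *b,
                   const float *c, const float *d,
                   float k, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += ((a[i] - b[i]) * k) * (c[i] + d[i]);
    return sum;
}
```

With a = {5, 6, 7, 8}, b = {1, 2, 3, 4}, c and d all ones, and k = 0.5, every element contributes 4 and the function returns 16. Whether a given GCC version and option set turns this into the exact code above is not something the sketch guarantees.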