This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug middle-end/60482] New: Loop optimization regression

From: "yvan.roux at linaro dot org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Mon, 10 Mar 2014 12:10:08 +0000
Subject: [Bug middle-end/60482] New: Loop optimization regression
Auto-submitted: auto-generated

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60482

            Bug ID: 60482
           Summary: Loop optimization regression
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yvan.roux at linaro dot org

Created attachment 32323
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32323&action=edit
trunk.s

Hi,

I haven't had time to investigate further, but I want to quickly report that
the code below was optimized at r204283 using the loop's trip count
information, and is no longer optimized that way on trunk (I spotted the issue
on AArch64 and x86_64).

code:

typedef double adouble __attribute__ ((__aligned__(16)));

double p1(adouble *x, int n)
{
  double p1_ = 0.0;

  (!(n % 128) == 0) ? __builtin_unreachable() : 1 ;

  for (int i=0; i<n; i++)
    p1_ += x[i] ;
  return p1_ ;
}

Compiled with flags: -Ofast -std=c99
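
For readers who find the conditional-expression form of the hint hard to parse, here is a minimal sketch of the same trip count assumption written with a helper macro. ASSUME and p1_assume are hypothetical names that are not part of the report, and this form was not checked against either revision, so it may or may not trigger the same optimization.

```c
/* Hypothetical helper (not from the report): states that `cond` holds by
   marking the opposite case as unreachable. Relies on the GCC builtin
   __builtin_unreachable(), as the original test case does. */
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)

/* Same typedef as in the report: promises 16-byte alignment of the data. */
typedef double adouble __attribute__ ((__aligned__(16)));

/* Hypothetical variant of p1() using the macro above. */
double p1_assume(adouble *x, int n)
{
  double sum = 0.0;

  /* Same assumption as the report: n is a multiple of 128. Calling this
     function with any other n is undefined behavior. */
  ASSUME(n % 128 == 0);

  for (int i = 0; i < n; i++)
    sum += x[i];
  return sum;
}
```

The macro only states the assumption; whether the vectorizer uses it to drop the scalar epilogue is exactly what this report is about.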

x86_64 generated assembly at r204283:

p1:
.LFB0:
        .cfi_startproc
        testl   %esi, %esi
        jle     .L5
        pxor    %xmm1, %xmm1
        shrl    %esi
        xorl    %eax, %eax
.L4:
        movq    %rax, %rdx
        addq    $1, %rax
        salq    $4, %rdx
        cmpl    %eax, %esi
        addpd   (%rdi,%rdx), %xmm1
        ja      .L4
        movapd  %xmm1, %xmm0
        unpckhpd        %xmm1, %xmm1
        addsd   %xmm1, %xmm0
        ret
        .p2align 4,,10
        .p2align 3
.L5:
        pxor    %xmm0, %xmm0
        ret
        .cfi_endproc
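
As a reading aid, the r204283 assembly above can be rendered in scalar C roughly as follows. This is a sketch under the report's assumptions (n a multiple of 128, x 16-byte aligned); p1_two_lanes is a hypothetical name and the mapping to instructions in the comments is my reading of the listing, not compiler output.

```c
/* Hypothetical scalar rendering of the vectorized loop: two running sums
   stand in for the two double lanes of %xmm1. There is no alignment
   prologue and no scalar epilogue, which is only valid because n is
   assumed to be a multiple of 128 (hence even). */
double p1_two_lanes(const double *x, int n)
{
  if (n <= 0)                          /* testl %esi, %esi ; jle .L5 */
    return 0.0;

  double lane0 = 0.0, lane1 = 0.0;     /* pxor %xmm1, %xmm1 */
  unsigned pairs = (unsigned) n >> 1;  /* shrl %esi */
  unsigned i = 0;                      /* xorl %eax, %eax */

  do {
    lane0 += x[2 * i];                 /* addpd (%rdi,%rdx), %xmm1 */
    lane1 += x[2 * i + 1];
    i++;                               /* addq $1, %rax */
  } while (pairs > i);                 /* cmpl %eax, %esi ; ja .L4 */

  return lane0 + lane1;                /* unpckhpd ; addsd */
}
```

Note that the two-lane summation reorders the floating-point additions relative to the source loop, which is permitted here because -Ofast enables -ffast-math.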


The x86_64 assembly generated by trunk is attached.

Thanks,
Yvan

Follow-Ups:
- [Bug middle-end/60482] Loop optimization regression
  - From: jakub at gcc dot gnu.org
- [Bug middle-end/60482] Loop optimization regression
  - From: jakub at gcc dot gnu.org
- [Bug middle-end/60482] Loop optimization regression
  - From: jakub at gcc dot gnu.org
- [Bug middle-end/60482] Loop optimization regression
  - From: jakub at gcc dot gnu.org
- [Bug middle-end/60482] Loop optimization regression
  - From: jakub at gcc dot gnu.org
- [Bug middle-end/60482] Loop optimization regression
  - From: jakub at gcc dot gnu.org
- [Bug middle-end/60482] Loop optimization regression
  - From: yvan.roux at linaro dot org
