This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/78687] New: inefficient code generated for eggs.variant


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78687

            Bug ID: 78687
           Summary: inefficient code generated for eggs.variant
           Product: gcc
           Version: 6.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Created attachment 40254
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40254&action=edit
gcc-eggs-variant-missing-opt.cpp

I have a piece of code that makes heavy use of a library called Eggs.Variant,
which implements a C++17-like variant class. Profiling this code revealed that
the code generated for the most expensive function is quite inefficient. Here
is the generated code (with the percentage of time spent in each instruction
and my comments):

 Percent |      Source code & Disassembly of a.out for cycles:pp
----------------------------------------------------------------
         :      ref_proxy<qual_option, inplace_ref<qual_option> > f()
         :      {
    0.00 :        400710:       sub    $0x140,%rsp
    0.00 :        400717:       mov    %rdi,%rax
    1.54 :        40071a:       movq   $0x2,0x28(%rdi)
    0.00 :        400722:       movl   $0x0,0x128(%rsp)
   22.36 :        40072d:       mov    0x128(%rsp),%rdx
!!! reading of stack memory immediately after writing to it
    1.63 :        400735:       mov    %rdx,0xe8(%rsp)
    0.00 :        40073d:       movl   $0x0,0xe8(%rsp)
!!! writing 0 immediately after writing some other value to it
   22.74 :        400748:       mov    0xe8(%rsp),%rdx
    1.59 :        400750:       mov    %rdx,0xa8(%rsp)
    0.00 :        400758:       movl   $0x0,0xa8(%rsp)
   22.72 :        400763:       mov    0xa8(%rsp),%rdx
!!! writing 0 immediately after writing some other value to it, then reading
from it
    2.16 :        40076b:       mov    %rdx,0x128(%rsp)
    0.00 :        400773:       mov    -0x78(%rsp),%rdx
    0.00 :        400778:       movl   $0x0,0x128(%rsp)
!!! writing some value to stack memory, then writing 0 to it
    0.01 :        400783:       mov    %rdx,(%rdi)
    1.66 :        400786:       mov    -0x70(%rsp),%rdx
    0.00 :        40078b:       mov    %rdx,0x8(%rdi)
    0.00 :        40078f:       mov    -0x68(%rsp),%rdx
    0.01 :        400794:       mov    %rdx,0x10(%rdi)
    1.72 :        400798:       mov    -0x60(%rsp),%rdx
    0.00 :        40079d:       mov    %rdx,0x18(%rdi)
    0.00 :        4007a1:       mov    -0x58(%rsp),%rdx
    0.02 :        4007a6:       mov    %rdx,0x20(%rdi)
   20.15 :        4007aa:       mov    0x128(%rsp),%rdx
!!! again reading stack memory where 0 was written several instructions ago
    1.68 :        4007b2:       mov    %rdx,0x30(%rdi)
    0.00 :        4007b6:       add    $0x140,%rsp
    0.00 :        4007bd:       retq   

Initially I thought that there must be some aliasing issue, but the memory
accessed belongs to local variables, and the program doesn't use the volatile
qualifier either. Aliasing also does not explain why the compiler wrote to the
same memory location twice in a row. As you can see, this function does little
more than copy data from one memory location to another.
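
For readers without the attachment, the general shape of such a function can
be sketched as follows. This is a hypothetical illustration only: it is not
the attached test case, it is not the Eggs.Variant layout, and it is not known
to reproduce the redundant stores. The names mini_variant, proxy, make_inner,
pass_through and self_check are invented for the sketch, and the storage is
zero-initialized here purely to keep the example well-defined.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tagged union: raw storage plus a discriminator.
// Only the general shape of a variant, not the real Eggs.Variant layout.
struct mini_variant {
    alignas(8) unsigned char storage[40]; // bytes for the active alternative
    std::size_t which;                    // index of the active alternative
};

// Hypothetical wrapper standing in for ref_proxy: a variant plus one int.
struct proxy {
    mini_variant value;
    int offset;
};

// Innermost layer: value-initialize, select alternative 2, zero the int.
inline proxy make_inner() {
    proxy p{};          // zero-initialized (an assumption of this sketch)
    p.value.which = 2;
    p.offset = 0;
    return p;
}

// Each wrapper layer only copies its argument by value.
inline proxy pass_through(proxy p) { return p; }

// After inlining, all that should remain is a few stores into the result.
proxy f() {
    return pass_through(pass_through(pass_through(make_inner())));
}

// Sanity check on the values the sketch is expected to produce.
inline void self_check() {
    proxy r = f();
    assert(r.value.which == 2);
    assert(r.offset == 0);
}
```

Compiling a file like this with the command line given below and inspecting
the disassembly of f() is one way to compare how many stack temporaries each
compiler keeps for the intermediate copies.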

It turned out that clang generates much better code for the function:

 Percent |      Source code & Disassembly of a.out for cycles:pp
----------------------------------------------------------------
   31.86 :        400670:       movq   $0x2,0x28(%rdi)
   32.27 :        400678:       movl   $0x0,0x30(%rdi)
    0.23 :        40067f:       mov    %rdi,%rax
   35.65 :        400682:       retq   

Unfortunately I didn't manage to reduce this to a small snippet. The attached
file is quite big (about 500 lines long), but I hope it allows reproducing the
issue.

The command line I used is:
g++ -pthread -Wall -O2 -g -DNDEBUG -fvisibility=hidden -std=gnu++14 gcc-eggs-variant-missing-opt.cpp

The compiler version was 6.2.
