This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug lto/69678] New: Missed function specialization + partial devirtualization opportunity


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69678

            Bug ID: 69678
           Summary: Missed function specialization + partial
                    devirtualization opportunity
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: lto
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wschmidt at gcc dot gnu.org
                CC: dje at gcc dot gnu.org, hubicka at gcc dot gnu.org
  Target Milestone: ---
            Target: powerpc64le-*

Created attachment 37583
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37583&action=edit
Tarball with source, object, executable

This is a missed optimization that requires a combination of LTO, function
specialization, profiling, partial devirtualization, and inlining, so there are
plenty of places where we can go wrong.  However, another compiler manages it,
and GCC is over 2x slower on this simple test as a result.

In the attachment, there are two source files, disp.c and dispf.c.  dispf.c
contains two functions, "one" and "two."  disp.c contains one call to "one" and
an indirect call (within a loop) that can call either "one" or "two."  Both
functions are always called using the value 3 as input.  "two" ignores this
parameter, while "one" has an early exit for the value 3.
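
For readers without the attachment, the shape of the test can be sketched roughly as follows. This is a hypothetical reconstruction, not the attached source: the function bodies, the return values, the name "dispatch", the 3:1 call mix, and the single-file layout are all assumptions made for illustration. In the real test the two parts live in separate files (dispf.c and disp.c), which is why LTO is needed to see across them.

```c
/* Hypothetical sketch of the test layout; the real disp.c and dispf.c
 * are in attachment 37583 and may differ in every detail. */
#include <assert.h>
#include <stddef.h>

/* --- would live in dispf.c --- */

/* "one" has an early exit when its argument is 3. */
long one(int x)
{
    if (x == 3)
        return 1;                 /* fast path, always taken in this test */

    long acc = 0;                 /* assumed slow path, never reached here */
    for (int i = 0; i < x * 1000; i++)
        acc += i % 7;
    return acc;
}

/* "two" ignores its argument entirely. */
long two(int x)
{
    (void)x;
    return 2;
}

/* --- would live in disp.c --- */

typedef long (*fn_t)(int);

/* One direct call to "one", then an indirect call in a loop that can
 * reach either function.  Every call passes the constant 3. */
long dispatch(size_t iterations)
{
    fn_t table[2] = { one, two };
    long sum = one(3);                       /* direct call */

    for (size_t i = 0; i < iterations; i++) {
        fn_t f = table[(i % 4) ? 0 : 1];     /* assumed mix: mostly "one" */
        sum += f(3);                         /* indirect call */
    }
    return sum;
}
```

Split across two files, a sketch like this would be built with -O3 -flto, once with -fprofile-generate to collect the profile and again with -fprofile-use to consume it.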

The desired behavior with options -O3 -flto -fprofile-use would be:

(1) Specialize each of "one" and "two" for a parameter value of 3;
(2) Perform partial devirtualization based on profile data for the indirect
call, resulting in conditional calls to "one-prime" and "two-prime" in that
order prior to falling back to the indirect call; and
(3) Inline the specialized functions at the three direct call sites.

Currently it appears that GCC will do neither step (1) nor step (2), making
step (3) impossible.
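
Written out by hand, the result of steps (1) and (2) might look roughly like the sketch below. It is illustrative only: the bodies of "one" and "two", the names "one_prime", "two_prime" and "dispatch_transformed", and the 3:1 call mix are assumptions, and the code does not claim to match what GCC or any other compiler actually emits. Once the calls are direct and the clones are this small, step (3) becomes straightforward for the inliner.

```c
/* Hypothetical hand-transformed version of the dispatch loop; all names
 * and function bodies are assumptions made for illustration. */
#include <assert.h>
#include <stddef.h>

/* Original functions, kept as targets for the fallback indirect call. */
long one(int x)
{
    if (x == 3)
        return 1;
    long acc = 0;
    for (int i = 0; i < x * 1000; i++)
        acc += i % 7;
    return acc;
}

long two(int x)
{
    (void)x;
    return 2;
}

/* Step (1): clones specialized for the argument value 3.  With the
 * parameter fixed, "one" collapses to its early-exit path. */
static inline long one_prime(void) { return 1; }
static inline long two_prime(void) { return 2; }

typedef long (*fn_t)(int);

long dispatch_transformed(size_t iterations)
{
    fn_t table[2] = { one, two };
    long sum = one_prime();                  /* direct call site */

    for (size_t i = 0; i < iterations; i++) {
        fn_t f = table[(i % 4) ? 0 : 1];

        /* Step (2): test the likely targets first, in profile order,
         * and keep the indirect call as a fallback. */
        if (f == one)
            sum += one_prime();
        else if (f == two)
            sum += two_prime();
        else
            sum += f(3);
    }
    return sum;
}
```

The three direct call sites referred to in step (3) are the initial call and the two guarded calls inside the loop.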
