Bug 82625 - lower-optimization are not inlined with symbol multiversioning
Summary: lower-optimization are not inlined with symbol multiversioning
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa
Version: 7.2.1
Importance: P3 normal
Target Milestone: ---
Assignee: Martin Liška
URL:
Keywords: missed-optimization
Duplicates: 90403
Depends on:
Blocks:
 
Reported: 2017-10-19 20:07 UTC by Maciej Piechotka
Modified: 2019-05-09 12:52 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2017-10-20 00:00:00


Description Maciej Piechotka 2017-10-19 20:07:57 UTC
Consider the following toy example:

#include <stddef.h>
#include <stdint.h>

__attribute__ ((target ("default")))
static uint32_t foo(const char *buf, size_t size) {
  return 1;
}

__attribute__ ((target ("avx")))
static uint32_t foo(const char *buf, size_t size) {
  return 2;
}

__attribute__ ((target ("default")))
uint32_t bar() {
  char buf[4096];
  uint32_t acc = 0;
  for (int i = 0; i < sizeof(buf); i++) {
    acc += foo(&buf[i], 1);
  }
  return acc;
}

__attribute__ ((target ("avx")))
uint32_t bar() {
  char buf[4096];
  uint32_t acc = 0;
  for (int i = 0; i < sizeof(buf); i++) {
    acc += foo(&buf[i], 1);
  }
  return acc;
}

bar.avx is correctly optimized to a single mov:

bar() [clone .avx]:
        movl    $8192, %eax
        ret

However, even though the default bar could be optimized to a single mov as well, it goes through the loop and the ifunc dispatch:

bar():
        pushq   %r12
        pushq   %rbp
        xorl    %ebp, %ebp
        pushq   %rbx
        subq    $4096, %rsp
        leaq    4096(%rsp), %r12
        movq    %rsp, %rbx
.L10:
        movq    %rbx, %rdi
        movl    $1, %esi
        addq    $1, %rbx
        call    _ZL3fooPKcm._GLOBAL____tmp_compiler_explorer_compiler117919_59_b8onwy.b8iqhyqfr_example.cpp_00000000_0x82e640d209aabe90.ifunc(char const*, unsigned long)
        addl    %eax, %ebp
        cmpq    %r12, %rbx
        jne     .L10
        addq    $4096, %rsp
        movl    %ebp, %eax
        popq    %rbx
        popq    %rbp
        popq    %r12
        ret

Possibly overlapping with bug #71990.
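
To make the difference between the two outputs above concrete, here is a minimal portable sketch that models the dispatch by hand. It does not use the target attribute at all: foo_default, foo_avx, pick_foo, cpu_has_avx and self_check are hypothetical names introduced only for illustration, and the function pointer merely stands in for the ifunc-resolved symbol. The first loop has the shape of the default bar as reported (every iteration calls through the dispatcher); the second has the shape one would hope for (the default version calling the default foo directly, which a compiler can inline and fold).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the two versions of foo.
static uint32_t foo_default(const char *, size_t) { return 1; }
static uint32_t foo_avx(const char *, size_t) { return 2; }

using foo_fn = uint32_t (*)(const char *, size_t);

// Hypothetical CPU check. A real resolver would query the CPU; this one is
// hard-coded to false so the sketch stays portable and deterministic.
static bool cpu_has_avx() { return false; }

// Models the resolver that chooses a version at run time.
static foo_fn pick_foo() { return cpu_has_avx() ? foo_avx : foo_default; }

// Shape of the reported default bar(): each call goes through the dispatcher.
uint32_t bar_via_dispatch() {
  char buf[4096] = {};
  foo_fn foo = pick_foo();
  uint32_t acc = 0;
  for (size_t i = 0; i < sizeof(buf); i++) {
    acc += foo(&buf[i], 1);
  }
  return acc;
}

// Hoped-for shape: the default version calls the default foo directly.
uint32_t bar_direct_default() {
  char buf[4096] = {};
  uint32_t acc = 0;
  for (size_t i = 0; i < sizeof(buf); i++) {
    acc += foo_default(&buf[i], 1);
  }
  return acc;
}

// Both loops add 1 per byte of buf while cpu_has_avx() reports false.
void self_check() {
  assert(bar_direct_default() == 4096);
  assert(bar_via_dispatch() == 4096);
}
```

Both functions return the same value here; the point is only that the direct call names its callee statically, while the dispatched call does not.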
Comment 1 Richard Biener 2017-10-20 09:26:37 UTC
I _think_ that's somehow an implementation artifact in that we do not handle calls in "default" implementations differently from calls in random non-versioned functions.

Confirmed.
Comment 2 Martin Liška 2017-10-20 09:30:54 UTC
Let me take a look.
Comment 3 Segher Boessenkool 2018-06-26 15:17:30 UTC
Author: segher
Date: Tue Jun 26 15:16:58 2018
New Revision: 262152

URL: https://gcc.gnu.org/viewcvs?rev=262152&root=gcc&view=rev
Log:
rs6000: Set up ieee128_float_type_node correctly (PR82625)

We shouldn't init __ieee128 to be the same as long double if the
latter is not even a 128-bit type.

This also reorders the nearby __ibm128 code so both types use similar
logic.


	PR target/82625
	* config/rs6000/rs6000.c (rs6000_init_builtins): Do not set
	ieee128_float_type_node to long_double_type_node unless
	TARGET_LONG_DOUBLE_128 is set.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.c
Comment 4 Segher Boessenkool 2018-06-26 16:13:43 UTC
Please ignore comment 3 (wrong PR #).  Sorry.
Comment 5 Martin Liška 2018-10-04 14:37:27 UTC
Author: marxin
Date: Thu Oct  4 14:36:55 2018
New Revision: 264845

URL: https://gcc.gnu.org/viewcvs?rev=264845&root=gcc&view=rev
Log:
Redirect call within specific target attribute among MV clones (PR ipa/82625).

2018-10-04  Martin Liska  <mliska@suse.cz>

	PR ipa/82625
	* multiple_target.c (redirect_to_specific_clone): New function.
	(ipa_target_clone): Use it.
	* tree-inline.c: Fix comment.

2018-10-04  Martin Liska  <mliska@suse.cz>

	PR ipa/82625
	* g++.dg/ext/pr82625.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/ext/pr82625.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/multiple_target.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-inline.c
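
The commit log above describes redirecting a call to the multiversioning clone that has the caller's target attribute. As a rough illustration of that idea only, the following toy model picks, for a caller with a given target string, the callee version with an equal target string. It is not the code in multiple_target.c: Version, pick_callee, demo, the symbol labels and the plain string comparison are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: a hypothetical record for one version of a function.
// This is not one of GCC's call-graph data structures.
struct Version {
  std::string symbol;  // illustrative label, e.g. "foo.avx"
  std::string target;  // target attribute string, e.g. "avx"
};

// Return the callee version whose target equals the caller's target,
// or nullptr when there is none (the call would then keep going
// through the dispatcher).
const Version *pick_callee(const std::string &caller_target,
                           const std::vector<Version> &callee_versions) {
  for (const Version &v : callee_versions) {
    if (v.target == caller_target) {
      return &v;
    }
  }
  return nullptr;
}

// Usage: a "default" caller maps to the default version, an "avx"
// caller to the avx version, and an unmatched target to no version.
void demo() {
  const std::vector<Version> foo_versions = {{"foo", "default"},
                                             {"foo.avx", "avx"}};
  assert(pick_callee("default", foo_versions)->symbol == "foo");
  assert(pick_callee("avx", foo_versions)->symbol == "foo.avx");
  assert(pick_callee("sse4.2", foo_versions) == nullptr);
}
```

Once a call names a specific version rather than the dispatcher, the usual inlining and constant folding can apply to it, which is the result the description asks for.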
Comment 6 Martin Liška 2018-10-04 14:39:08 UTC
Fixed.
Comment 7 Martin Liška 2019-05-09 07:00:20 UTC
*** Bug 90403 has been marked as a duplicate of this bug. ***
Comment 8 Shawn Landden 2019-05-09 12:52:28 UTC
Included in gcc 9