Bug 89929 - __attribute__((target("avx512bw"))) doesn't work on non avx512bw systems
Summary: __attribute__((target("avx512bw"))) doesn't work on non avx512bw systems
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 8.3.1
Importance: P3 normal
Target Milestone: 9.0
Assignee: H.J. Lu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-02 16:48 UTC by Nikolay Bogoychev
Modified: 2024-01-17 07:03 UTC

See Also:
Host:
Target: x86_64-*-*, i?86-*-*
Build:
Known to work: 9.0
Known to fail: 8.3.0
Last reconfirmed: 2019-04-03 00:00:00


Attachments
testcase for attribute avx512bw (185 bytes, text/plain)
2019-04-02 16:48 UTC, Nikolay Bogoychev
multiple attributes weirdnes (170 bytes, text/x-csrc)
2019-04-03 08:59 UTC, Nikolay Bogoychev
target("arch=foo") doesn't work (202 bytes, text/x-csrc)
2019-04-17 11:00 UTC, Nikolay Bogoychev

Description Nikolay Bogoychev 2019-04-02 16:48:42 UTC
Created attachment 46075 [details]
testcase for attribute avx512bw

Hey,

I was trying to use function multi-versioning, and it turns out that if I set the attribute to __attribute__((target("avx512bw"))), I get a compilation error on non-avx512bw systems that reads:

$ g++ test.cpp
test.cpp: In function ‘_Z3fooi.resolver’:
test.cpp:1:41: error: No dispatcher found for avx512bw
 __attribute__((target("avx512bw"))) int foo(int i) {

This works fine if I specify -mavx512f, in the sense that it compiles, but it dispatches to the wrong function at runtime (it probably ignores the avx2 target during compilation).

If I change avx512bw to avx512f, the program compiles correctly and at runtime dispatches to the correct function version for my processor (avx2). The problem doesn't occur in clang v8, not that it is relevant.

In addition, it seems that gcc recognizes this as valid syntax:

__attribute__((target("avx512bw", "avx512f")))

But actually ignores everything after the comma in target's arguments. Not sure if I should open another bug for that. Please find a small testcase attached.
Comment 1 Martin Liška 2019-04-03 07:13:59 UTC
Let me take a look at the issue.
Comment 2 Martin Liška 2019-04-03 08:23:17 UTC
Confirmed, we probably miss all:

      {"avx512vl",F_AVX512VL},
      {"avx512bw",F_AVX512BW},
      {"avx512dq",F_AVX512DQ},
      {"avx512cd",F_AVX512CD},
      {"avx512er",F_AVX512ER},
      {"avx512pf",F_AVX512PF},
      {"avx512vbmi",F_AVX512VBMI},
      {"avx512ifma",F_AVX512IFMA},
      {"avx5124vnniw",F_AVX5124VNNIW},
      {"avx5124fmaps",F_AVX5124FMAPS},
      {"avx512vpopcntdq",F_AVX512VPOPCNTDQ},
      {"avx512vbmi2", F_AVX512VBMI2},
      {"avx512vnni", F_AVX512VNNI},
      {"avx512bitalg", F_AVX512BITALG}

I can add all of these, but I would like to have a comment from an i386 port maintainer. Does it make sense to add all of them? And how should I prioritize among them?
Comment 3 Martin Liška 2019-04-03 08:30:14 UTC
> In addition, it seems that gcc recognizes this as valid syntax:
> 
> __attribute__((target("avx512bw", "avx512f")))
> 
> But actually ignores everything after the comma in target's arguments. Not
> sure if I should open another bug for that. Please find a small testcase
> attached.

No, that's documented behavior:

```
Multiple target back ends implement the target attribute to specify that a function is to be compiled with different target options than specified on the command line. One or more strings can be provided as arguments. Each string consists of one or more comma-separated suffixes to the -m prefix jointly forming the name of a machine-dependent option. See Machine-Dependent Options.
```

It allows you to specify multiple target options.
Comment 4 Nikolay Bogoychev 2019-04-03 08:58:08 UTC
(In reply to Martin Liška from comment #3)

Hey Martin,

Something fishy is going on with multiple attributes. Eg:

__attribute__((target("avx512bw", "avx512f"))) 

Doesn't compile (expected)

__attribute__((target("avx512f", "avx512bw")))

This actually compiles, but only seems to target the 'f'. The only difference is reversing the order of the arguments

__attribute__((target("avx512f"), target("avx512bw")))

This doesn't compile again (as expected).

Please look at the second testcase, but you need to uncomment individual examples to see the behaviour (leave only the default option and the one being tested).

Cheers,

Nick
Comment 5 Nikolay Bogoychev 2019-04-03 08:59:07 UTC
Created attachment 46080 [details]
multiple attributes weirdnes
Comment 6 Martin Liška 2019-04-03 10:15:28 UTC
> 
> Hey Martin,
> 
> Something fishy is going on with multiple attributes. Eg:
> 
> __attribute__((target("avx512bw", "avx512f"))) 

"avx512bw" argument is broken right now (and will be fixed).
Using a different one works, e.g.:

$ cat pr89929-3.cc
__attribute__((target("sse2", "avx512f"))) int foo(int i) {
	return 1;
}

__attribute__((target("sse3"))) int foo(int i) {
	return 2;
}

__attribute__((target("default"))) int foo(int i) {
	return 4;
}

int main()
{
	return foo(2);
}

$ g++ pr89929-3.cc
[no output]
Comment 7 H.J. Lu 2019-04-03 16:28:36 UTC
__attribute__((target("foo"))) can be used in 2 different ways:

1. Enable FOO, which works for both C and C++.
2. Function versioning with FOO, which works only for C++.

2 is a subset of 1. We should improve the error message when the target
is in 1 but outside of 2.
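
The second use (function versioning) can be sketched as follows. This is only an illustration, not the reporter's attached testcase: the function name `which_version` and its return values are made up, and the preprocessor guard is an assumption that restricts the versioned form to x86-64 Linux toolchains, where the compiler can emit a resolver for the dispatch.

```cpp
// Hypothetical example; which_version() does not come from the bug report.
#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__)
// Use 2: several definitions share one name and signature but carry
// different target strings. The compiler generates a resolver that
// selects one of them when the program is loaded.
__attribute__((target("default"))) int which_version() { return 0; }
__attribute__((target("sse4.2")))  int which_version() { return 1; }
__attribute__((target("avx2")))    int which_version() { return 2; }
#else
// Fallback so the sketch still builds elsewhere (assumption of this example).
int which_version() { return 0; }
#endif
```

Calling `which_version()` should return the value of the most specialized version that the running CPU supports. Replacing "avx2" with "avx512bw" is what triggers the error reported in this bug.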
Comment 8 Martin Liška 2019-04-04 07:50:57 UTC
Ok, let me first focus on the functional part of the patch.
If I'm correct, feature_list in the get_builtin_code_for_version function should basically be aligned with isa_names_table in fold_builtin_cpu. The difference is the following:

+"avx5124fmaps"
+"avx5124vnniw"
+"avx512bitalg"
+"avx512bw"
+"avx512cd"
+"avx512dq"
+"avx512er"
+"avx512ifma"
+"avx512pf"
+"avx512vbmi"
+"avx512vbmi2"
+"avx512vl"
+"avx512vnni"
+"avx512vpopcntdq"
+"cmov"
+"gfni"
+"vpclmulqdq"

Adding these should be possible, but one needs to define priorities for them, as seen here:

```
  /* Priority of i386 features, greater value is higher priority.   This is
     used to decide the order in which function dispatch must happen.  For
     instance, a version specialized for SSE4.2 should be checked for dispatch
     before a version for SSE3, as SSE4.2 implies SSE3.  */
  enum feature_priority
```

H.J. can you please help me with the priorities?
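
To make the role of the priorities concrete, here is a toy model of priority-ordered dispatch. It is not GCC's implementation: the `Version` struct, the `pick_version` function, and the numeric priorities are all hypothetical, and the CPU is modelled as a plain set of feature names.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one function version.
struct Version {
    std::string isa;  // feature the version was compiled for
    int priority;     // greater value is checked first
    int id;           // value handed back to the caller
};

// Returns the id of the highest-priority version whose ISA is present in
// `cpu`, or -1 to stand for the "default" version.
int pick_version(std::vector<Version> versions,
                 const std::set<std::string>& cpu) {
    std::sort(versions.begin(), versions.end(),
              [](const Version& a, const Version& b) {
                  return a.priority > b.priority;
              });
    for (const Version& v : versions)
        if (cpu.count(v.isa)) return v.id;
    return -1;
}
```

With SSE3 at priority 1 and SSE4.2 at priority 2, a CPU that has both gets the SSE4.2 version, which matches the comment quoted above. The model only works when a single linear order exists, and that is exactly what is unclear for the AVX512 subsets.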
Comment 9 H.J. Lu 2019-04-04 20:18:04 UTC
(In reply to Martin Liška from comment #8)

What do we gain with these extra target attributes for function
multiversioning?
Comment 10 Nikolay Bogoychev 2019-04-04 21:11:25 UTC
(In reply to H.J. Lu from comment #9)

Hey,

tl;dr: We would be able to target specific processors and not crash on Knights Mill and Knights Landing.

The problem is that AVX-512 has a large number of subsets: https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512 

Some of them completely overlap (e.g. VL, DQ and BW), however others are limited to specific processors. We are developing an application that uses a lot of intrinsics and we are targeting several different architectures. We rely on instructions that are included in AVX512BW, and if we target the closest available working thing (AVX512F), we crash with an illegal instruction on Knights Landing and Knights Mill processors (which should use the AVX2 codepath instead).

We are also about to add some VNNI code for upcoming Intel processors and we would need a function version for those, because AVX512F is too broad.

Cheers,

Nick
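
For comparison, the same tiering can be done by hand, without function multi-versioning. The sketch below is an assumption-laden illustration, not the reporter's code: the function names (`add_i16`, `add_avx512bw`, `add_avx2`, `add_scalar`) are invented, and it relies on the GCC/clang builtin `__builtin_cpu_supports` plus the single-function form of the target attribute.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

// Portable fallback; also handles the tail of the vector loops.
static void add_scalar(const std::int16_t* a, const std::int16_t* b,
                       std::int16_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::int16_t>(a[i] + b[i]);
}

#ifdef HAVE_X86_DISPATCH
// AVX2 tier: 16 lanes of 16-bit integers per iteration.
__attribute__((target("avx2")))
static void add_avx2(const std::int16_t* a, const std::int16_t* b,
                     std::int16_t* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
        __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
                            _mm256_add_epi16(va, vb));
    }
    add_scalar(a + i, b + i, out + i, n - i);
}

// AVX512BW tier: 32 lanes per iteration (_mm512_add_epi16 needs AVX512BW).
__attribute__((target("avx512bw")))
static void add_avx512bw(const std::int16_t* a, const std::int16_t* b,
                         std::int16_t* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        __m512i va = _mm512_loadu_si512(a + i);
        __m512i vb = _mm512_loadu_si512(b + i);
        _mm512_storeu_si512(out + i, _mm512_add_epi16(va, vb));
    }
    add_scalar(a + i, b + i, out + i, n - i);
}
#endif

// Hand-written dispatcher: most specific tier first.
void add_i16(const std::int16_t* a, const std::int16_t* b,
             std::int16_t* out, std::size_t n) {
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx512bw")) { add_avx512bw(a, b, out, n); return; }
    if (__builtin_cpu_supports("avx2"))     { add_avx2(a, b, out, n); return; }
#endif
    add_scalar(a, b, out, n);
}
```

On a Knights Landing machine the first check would be expected to fail and the AVX2 tier to be taken, which is the behaviour asked for here. The cost is a hand-maintained dispatcher and a check on every call.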
Comment 11 Martin Liška 2019-04-05 08:17:36 UTC
(In reply to Nikolay Bogoychev from comment #10)

Agree with Nick, one should be able to have clones with specific AVX512 flavors.
I can prepare a patch for it; the only issue is the priority, as I already mentioned.

Comment 12 H.J. Lu 2019-04-05 12:49:09 UTC
(In reply to Martin Liška from comment #11)
> Agree with Nick, one should be able to have clones with specific AVX512
> flavors.
> I can prepare patch for it, only issues is the priority as I already
> mentioned.
> 

Priorities are used to choose processors for multi-versioned functions.
Please see how to use avx512XXX to distinguish:

const wide_int_bitmask PTA_SKYLAKE_AVX512 = PTA_SKYLAKE | PTA_AVX512F
const wide_int_bitmask PTA_CASCADELAKE = PTA_SKYLAKE_AVX512 | PTA_AVX512VNNI;
const wide_int_bitmask PTA_CANNONLAKE = PTA_SKYLAKE | PTA_AVX512F
const wide_int_bitmask PTA_ICELAKE_CLIENT = PTA_CANNONLAKE | PTA_AVX512VNNI
const wide_int_bitmask PTA_ICELAKE_SERVER = PTA_ICELAKE_CLIENT | PTA_PCONFIG
const wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF | PTA_AVX512ER
const wide_int_bitmask PTA_KNM = PTA_KNL | PTA_AVX5124VNNIW
Comment 13 Martin Liška 2019-04-12 11:42:19 UTC
The situation is more complicated; deferring to GCC 10:
https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00495.html
Comment 14 H.J. Lu 2019-04-13 14:35:57 UTC
Since all AVX512BW processors also have AVX512DQ and AVX512VL, we shouldn't
optimize a function with just  AVX512BW, but without AVX512DQ and AVX512VL.
We should add -misa=AVX512-subset to enable a subset of AVX512XX:

1. PTA_AVX512 = PTA_AVX512F | PTA_AVX512CD
2. PTA_AVX512SKYLAKE = PTA_AVX512 | PTA_AVX512VL | PTA_AVX512BW | PTA_AVX512DQ
....
Comment 15 Martin Liška 2019-04-17 08:47:41 UTC
@Nikolay:

As discussed in the https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00416.html email thread, we reached the following consensus with H.J.:

- As any AVX512 extension (apart from AVX512F) can be enabled individually, it's difficult to come up with priorities in the dispatcher.
- We don't have a syntax for the target_clones attribute where one could say e.g. avx512f+avx512cd+avx512er.
- So we would reject these (AVX512* except AVX512F) in the target_clones attribute, and we recommend using the following instead:
target_clones(arch=skylake,arch=skylake-avx512,arch=cannonlake,arch=icelake-client,arch=icelake-server, ..)
- Using that, one can cover the AVX512 ISA combinations used by existing CPUs.

Does it work for you Nikolay?
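
A minimal sketch of the recommended arch= form, written with the quoted-string syntax that the attribute actually takes. The macro name `ARCH_CLONES`, the function `dot`, and the particular architecture list are illustrative assumptions; the guard assumes an x86-64 Linux toolchain that supports target_clones.

```cpp
// Hypothetical example; ARCH_CLONES and dot() are not from the thread.
#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__)
#define ARCH_CLONES \
    __attribute__((target_clones("arch=skylake-avx512", "arch=haswell", "default")))
#else
#define ARCH_CLONES  // no clones on other targets (assumption of this sketch)
#endif

// One source definition; the compiler emits one clone per listed target
// plus a resolver that picks among them at load time.
ARCH_CLONES
long dot(const int* a, const int* b, int n) {
    long acc = 0;
    for (int i = 0; i < n; ++i)
        acc += static_cast<long>(a[i]) * b[i];
    return acc;
}
```

The result is the same whichever clone runs. Only the generated code differs, so a call such as `dot(a, b, 4)` needs no changes at the call site.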
Comment 16 Nikolay Bogoychev 2019-04-17 11:00:01 UTC
Created attachment 46187 [details]
target("arch=foo") doesn't work

(In reply to Martin Liška from comment #15)

@Martin:

Thank you for the detailed answer. This could work for now. I have a few questions about it:

Wouldn't that create issues in the future if AMD decides to release AVX-512 for their CPUs?

In case we are using C style target annotation (and not function multi-versioning), should we also use target(arch=skylake-avx512) instead of target(avx512bw)? 

Also, it seems that target("arch=foo") fails for my simple example with a "target specific option mismatch" error (but works in clang).

If I change the target to avx2, it compiles again.

Cheers,

Nick
Comment 17 Martin Liška 2019-04-17 11:22:45 UTC
> 
> @Martin:
> 
> Thank you for the detailed answer. This could work for now. I have a few
> questions about it:
> 
> Wouldn't that create issues in the future if AMD decide to release avx512
> for their CPUs?

No, that will only require adding target(arch=amd-name-with-avx512).

> 
> In case we are using C style target annotation (and not function
> multi-versioning), should we also use target(arch=skylake-avx512) instead of
> target(avx512bw)?

Yes. Let me discuss that with H.J.

> 
> Also it seems that target("arch=foo") fails for my simple example with
> target specific option mismatch error (but works in clang).
> 
>  If I change target to avx2 it compiles again.  
> 
> Cheers,
> 
> Nick

Let me investigate that.
Comment 18 Martin Liška 2019-04-17 12:11:05 UTC
(In reply to Martin Liška from comment #17)
> > 
> > @Martin:
> > 
> > Thank you for the detailed answer. This could work for now. I have a few
> > questions about it:
> > 
> > Wouldn't that create issues in the future if AMD decide to release avx512
> > for their CPUs?
> 
> No, that will only require to add target(arch=amd-name-with-avx512).
> 
> > 
> > In case we are using C style target annotation (and not function
> > multi-versioning), should we also use target(arch=skylake-avx512) instead of
> > target(avx512bw)?

For C style, the functionality will be preserved as is.

> 
> Yes. Let me discuss that with H.J.
> 
> > 
> > Also it seems that target("arch=foo") fails for my simple example with
> > target specific option mismatch error (but works in clang).
> > 
> >  If I change target to avx2 it compiles again.  

This looks like a bug to me; I'll create a separate PR for that.

Thanks

> > 
> > Cheers,
> > 
> > Nick
> 
> Let me investigate that.
Comment 19 Nikolay Bogoychev 2019-04-17 12:58:32 UTC
(In reply to Martin Liška from comment #18)
> (In reply to Martin Liška from comment #17)
> > > 
> > > @Martin:
> > > 
> > > Thank you for the detailed answer. This could work for now. I have a few
> > > questions about it:
> > > 
> > > Wouldn't that create issues in the future if AMD decide to release avx512
> > > for their CPUs?
> > 
> > No, that will only require to add target(arch=amd-name-with-avx512).
> > 

Does this mean that if I have an avx512bw+dq function, I'd have to have two identical versions of it that I have to target with arch=cannonlake and arch=amd-something-with-avx512? Seems a bit... inelegant.

> 
> > > 
> > > Cheers,
> > > 
> > > Nick
> > 
> > Let me investigate that.

Thanks for opening the bug

Cheers,

Nick
Comment 20 Martin Liška 2019-04-17 13:32:13 UTC
> 
> Does this mean that if I have an avx512bw+dq function, I'd have to have two
> identical versions of it that I have to target with arch=canonlake and
> arch=amd-something-with-avx512? Seems a bit... unellegant.
> 

If you use the target_clones attribute or the target attribute in C++ (with an automatically generated resolver function), then yes. You'll need 2 functions, but you can use an alias, as seen here:

void xxx () { __builtin_printf ("haswell or skylake CPU\n"); }

void __attribute__ ((target("arch=haswell"),alias("_Z3xxxv"))) foo ();
void __attribute__ ((target("arch=skylake-avx512"),alias("_Z3xxxv"))) foo ();
void __attribute__ ((target("arch=skylake"))) foo () {}
void __attribute__ ((target("default"))) foo () {}

int main()
{
  foo ();
  return 0;
}
Comment 21 Martin Liška 2019-04-25 08:43:11 UTC
I'm assigning that to H.J. as he provided the final version of the patch.
Comment 22 Nikolay Bogoychev 2019-04-25 11:23:26 UTC
Hey,

I was reading through the mailing list discussion ( https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00757.html ) and I want to say that currently code like 

void __attribute__ ((target("avx512dq"))) foo ()

compiles as long as function multi-versioning is not used. Making this syntax invalid puts us in a very annoying position, because we can't use the new recommended syntax due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90129

That would mean we would have to do some hacky ifdefs to match different compiler versions.

Is there no other way?

Cheers,

Nick
Comment 23 Martin Liška 2019-04-25 12:02:01 UTC
(In reply to Nikolay Bogoychev from comment #22)
> Hey,
> 
> I was reading through the mailing list discussion (
> https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00757.html ) and I want to say
> that currently code like 
> 
> void __attribute__ ((target("avx512dq"))) foo ()
> 
> Compiles as long as function multi-versioning is not used. Making this
> syntax invalid, puts us in a very annoying position, because we can't use
> the new recommended syntax due to
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90129

No, this will keep working as it does now. The suggested changes will only touch the C++ features of the target attribute (i.e. having multiple functions of the same name with different target attributes) and the target_clones attribute.

> 
> That would mean we would have to do some hacky ifdefs to match different
> compiler versions.
> 
> Is there no other way.
> 
> Cheers,
> 
> Nick
Comment 24 Nikolay Bogoychev 2019-04-25 12:31:24 UTC
(In reply to Martin Liška from comment #23)

Ok, thank you for the clarification!

Cheers,

Nick
Comment 25 hjl@gcc.gnu.org 2019-04-25 17:01:16 UTC
Author: hjl
Date: Thu Apr 25 17:00:28 2019
New Revision: 270578

URL: https://gcc.gnu.org/viewcvs?rev=270578&root=gcc&view=rev
Log:
x86: Update message for target_clones and unsupported ISAs

Before AVX512F, processors with the newer ISAs also support the older
ISAs, i.e., AVX2 processors also support AVX and SSE4, SSE4 processors
also support SSSE3, ...   After AVX512F, an AVX512XX processor may not
support AVX512YY.  It means AVX512XX features, except for AVX512F, can't
be used to decide priority in target_clones.

This patch updates error message for ISAs with P_ZERO priority.  It also
merges _feature_list into _isa_names_table and marks ISAs, which have
unknown priority, with P_ZERO so that we only need to update one place
to add a new ISA feature.

gcc/

2019-04-25  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89929
	* config/i386/i386.c (feature_priority): Moved to file scope.
	(processor_features): Likewise.
	(processor_model): Likewise.
	(_arch_names_table): Likewise.
	(arch_names_table): Likewise.
	(_feature_list): Removed.
	(feature_list): Likewise.
	(_isa_names_table): Moved to file scope.  Add priority.
	(isa_names_table): Likewise.
	(get_builtin_code_for_version): Replace feature_list with
	isa_names_table.  Update error message for P_ZERO priority.

gcc/testsuite/

2019-04-25  Martin Liska  <mliska@suse.cz>
	    H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89929
	* g++.target/i386/mv28.C: New test.
	* gcc.target/i386/mvc14.c: Likewise.
	* g++.target/i386/pr57362.C: Updated.

Added:
    trunk/gcc/testsuite/g++.target/i386/mv28.C
    trunk/gcc/testsuite/gcc.target/i386/mvc14.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.target/i386/pr57362.C
Comment 26 Martin Liška 2019-04-25 20:25:40 UTC
H.J., are you planning to backport that?
Comment 27 H.J. Lu 2019-04-25 20:28:29 UTC
(In reply to Martin Liška from comment #26)
> Are you planning H.J. to backport that?

Please feel free to backport it.  I have no plan to do it myself.
Comment 28 Martin Liška 2019-04-26 07:36:17 UTC
I'm fine with having that in GCC 9.1 which will be released soon. Thus closing ..
Comment 29 Chris Elrod 2022-05-30 23:29:37 UTC
"RESOLVED FIXED". I haven't tried this with `target`, but avx512bw does not work with target_clones with gcc 11.2, but it does with clang 14.
Comment 30 Chris Elrod 2022-05-31 00:28:52 UTC
> #if defined(__clang__)
> #define MULTIVERSION                                                           \
>     __attribute__((target_clones("avx512dq", "avx2", "default")))
> #else
> #define MULTIVERSION                                                           \
>     __attribute__((target_clones(                                              \
>         "arch=skylake-avx512,arch=cascadelake,arch=icelake-client,arch="       \
>         "tigerlake,"                                                           \
>         "arch=icelake-server,arch=sapphirerapids,arch=cooperlake",             \
>         "avx2", "default")))
> #endif

For example, I can do something like this, but gcc produces a ton of unnecessary duplicates for each of the avx512dq architectures. There must be a better way.
Comment 31 Hongtao.liu 2022-05-31 00:56:50 UTC
(In reply to Chris Elrod from comment #30)
> > #if defined(__clang__)
> > #define MULTIVERSION                                                           \
> >     __attribute__((target_clones("avx512dq", "avx2", "default")))
> > #else
> > #define MULTIVERSION                                                           \
> >     __attribute__((target_clones(                                              \
> >         "arch=skylake-avx512,arch=cascadelake,arch=icelake-client,arch="       \
> >         "tigerlake,"                                                           \
> >         "arch=icelake-server,arch=sapphirerapids,arch=cooperlake",             \
> >         "avx2", "default")))
> > #endif
> 
> For example, I can do something like this, but gcc produces a ton of
> unnecessary duplicates for each of the avx512dq architectures. There must be
> a better way.

Maybe you can use __attribute__((target_clones("arch=x86-64-v4","avx2", "default"))). Note that it works only with GCC 12.1 and trunk, not with GCC 11.2.
Comment 32 Chris Elrod 2022-05-31 01:05:22 UTC
Ha, I accidentally misreported my gcc version. I was already using 12.1.1.

Using x86-64-v4 worked, excellent! Thanks.
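
Putting the last few comments together, the macro could be reduced to something like the sketch below. The compiler and version checks and the function `scale` are assumptions made for illustration. The clang branch reuses the strings from comment 30, and the GCC branch uses the arch=x86-64-v4 spelling that was reported to work with GCC 12.1. Older GCC releases simply get no clones here.

```cpp
// Hypothetical consolidation of the MULTIVERSION macro from this thread.
#if defined(__x86_64__) && defined(__linux__) && defined(__clang__)
#define MULTIVERSION \
    __attribute__((target_clones("avx512dq", "avx2", "default")))
#elif defined(__x86_64__) && defined(__linux__) && defined(__GNUC__) && __GNUC__ >= 12
#define MULTIVERSION \
    __attribute__((target_clones("arch=x86-64-v4", "avx2", "default")))
#else
#define MULTIVERSION  // older GCC or other targets: single plain version
#endif

// Illustrative kernel: scales n floats in place by k.
MULTIVERSION
void scale(float* x, int n, float k) {
    for (int i = 0; i < n; ++i)
        x[i] *= k;
}
```

Compared with listing every AVX-512 capable architecture, this yields one AVX-512 clone instead of one per arch= entry.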