Bug 100005 - undefined reference to `_rdrand64_step'
Summary: undefined reference to `_rdrand64_step'
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 11.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
: 61417 102526 110258 (view as bug list)
Depends on:
Blocks:
 
Reported: 2021-04-09 17:10 UTC by Thiago Macieira
Modified: 2023-06-14 19:11 UTC (History)
6 users (show)

See Also:
Host:
Target: x86_64-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Thiago Macieira 2021-04-09 17:10:37 UTC
$ cat rdrand.c
#include <immintrin.h>

#define NUM_RANDOM_NUMBERS_TO_GENERATE  1024

typedef int (*Generator)(unsigned long long *);

int fill_array(Generator generator, unsigned long long *rand_array)
{
    for (int i = 0; i < NUM_RANDOM_NUMBERS_TO_GENERATE; i++) {
        // fast attempt once:
        if (__builtin_expect(generator(&rand_array[i]), 1))
            continue;

        // retry up to 16 times
        int j;
        for (j = 0; j < 16; ++j) {
            if (generator(&rand_array[i]))
                break;
        }
        if (j == 16) {
            // failed, the RNG is out of entropy
            return -1;
        }
    }

    return 0;
}

int main()
{
    unsigned long long rand_array[NUM_RANDOM_NUMBERS_TO_GENERATE];
    fill_array(_rdrand64_step, rand_array);
}

$ ~/dev/gcc/bin/gcc -march=haswell -O2 rdrand.c 
/usr/bin/ld: /tmp/ccTlQIsV.o: in function `main':
rdrand.c:(.text.startup+0x8): undefined reference to `_rdrand64_step'
collect2: error: ld returned 1 exit status

$ ~/dev/gcc/bin/gcc --version                  
gcc (GCC) 11.0.1 20210325 (experimental)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Happens in C++ too, including passing as a template parameter.
Comment 1 Jonathan Wakely 2021-04-09 17:15:40 UTC
Because it's declared as:

extern __inline int
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_rdrand64_step (unsigned long long *__P)


The artificial attribute prevents taking the address.
Comment 2 Segher Boessenkool 2021-04-09 17:20:12 UTC
So the only bug here is that we should give a better error message?  One
when taking the address, already.
Comment 3 Segher Boessenkool 2021-04-09 17:27:35 UTC
I'm not sure how/why "artificial" should prevent taking the address though?
Comment 4 Thiago Macieira 2021-04-09 18:05:45 UTC
That's an artificial (pun intended) limitation.

In C++:

template <typename Generator>
int fill_array(Generator generator, unsigned long long *rand_array)

Also errors out with the same error, but works if you do:

    fill_array([](auto x) { return _rdrand64_step(x); }, rand_array);

The extra indirection shouldn't be required.

PS: clang compiles the same code just fine.
Comment 5 Jakub Jelinek 2021-04-09 18:07:10 UTC
Neither always_inline nor artificial attribute means that you can't take addresses of those inlines, but
1) I don't think anything implies the intrinsics must be implemented as inline functions, after all, gcc implements hundreds of intrinsics as preprocessor macros e.g. at -O0
2) as those intrinsics that are implemented as inline functions are implemented by gcc as
extern inline __attribute__((gnu_inline, artificial, always_inline)),
they have the GNU extern inline semantics, i.e. the header provides definitions for inlining purposes and when it can't be inline, something different must supply the definitions somewhere else (either the user, or perhaps GCC in some library; but GCC doesn't do that).
Now, GCC could instead define them as static inline __attribute((artificial, always_inline)) and then one would get an out of line copy when taking their address, but it would duplicated in all the TUs that did this.

Anyway, your assumption that intrinsics can be used the way you expect them is just wrong.
Comment 6 Thiago Macieira 2021-04-09 18:11:48 UTC
(In reply to Jakub Jelinek from comment #5)
> then one would get an out of line copy when taking their address, but it would 
> duplicated in all the TUs that did this.

That's not a problem, since that's only for debug mode builds. In release builds, they should get properly inlined.

> Anyway, your assumption that intrinsics can be used the way you expect them
> is just wrong.

If you say so, then please close as WONTFIX or NOTABUG. And indeed the ones that are implemented as macros can't have their address taken anyway, since macros don't have address.

I would suggest a better error message, though, if "just works" is not possible.
Comment 7 Jakub Jelinek 2021-04-09 18:13:34 UTC
CCing H.J. on this.
Comment 8 Jakub Jelinek 2021-04-09 18:23:29 UTC
Looking at clang, they have significantly more intrinsics than GCC implemented as macros (GCC typically only implements those that have to be macros at -O0 for immediates, while I can't find any particular pattern on why some clang intrinsics are macros and others inlines), but they do use static inline rather than extern inline __attribute__((gnu_inline)).  So for some intrinsics you might be lucky and it will work, but for many others it won't work.
Using wrappers, whether lambdas for C++ or something else, is IMNSHO the only portable way for the intrinsics if you want to take their addresses.
Comment 9 H.J. Lu 2021-04-09 18:59:49 UTC
I don't think we need to support taking address of intrinsic.
By definition, there is no intrinsic address to take.
Comment 10 Jakub Jelinek 2021-04-09 20:27:41 UTC
Andrew, this isn't really specific to a single target, rough numbers of
extern inline __gnu_inline__ intrinsics are:
     14 config/s390
    120 config/sparc
    603 config/rs6000
   4044 config/aarch64
   5659 config/i386
   7338 config/arm
and rough numbers of static inline intrinsics are:
     24 config/c6x
     26 config/rs6000
     69 config/arm
     99 config/mips
    112 config/aarch64
Comment 11 Richard Biener 2021-04-12 08:05:08 UTC
Invalid.  Note we can't really diagnose GNU extern inline address-taking since
by definition that's allowed (just the definition needs to come from elsewhere).
Comment 12 Thiago Macieira 2021-04-12 15:19:44 UTC
(In reply to Richard Biener from comment #11)
> Invalid.  Note we can't really diagnose GNU extern inline address-taking
> since
> by definition that's allowed (just the definition needs to come from
> elsewhere).

Understood. Thanks for looking into the report.

Out of curiosity, how does one provide an extern inline's out-of-line copy in C++?
Comment 13 Jakub Jelinek 2021-04-12 15:26:43 UTC
The same like in C.
I.e.
extern inline __attribute__((gnu_inline, always_inline, artificial)) int foo (int x) { return x; }
// The above is typically from some header
int foo (int x) { return x; }
// The above is the out of line function definition
Comment 14 Thiago Macieira 2021-04-12 15:37:26 UTC
(In reply to Jakub Jelinek from comment #13)
> The same like in C.
> I.e.
> extern inline __attribute__((gnu_inline, always_inline, artificial)) int foo
> (int x) { return x; }
> // The above is typically from some header
> int foo (int x) { return x; }
> // The above is the out of line function definition

Thanks, Jakub. At first sight that's not valid C++, but then since it's an extension it doesn't have to be. ICC even accepts the same syntax and generates the same non-weak symbol.

Any way to do that without repeating the body, thus potentially causing an ODR violation? I'm not likely to use this feature, but asking for a rainy day.

https://gcc.godbolt.org/z/96qW9ExcG
Comment 15 Jakub Jelinek 2021-04-12 15:42:13 UTC
No, no way.
It is not an ODR violation, as it is an extension, it is perfectly fine if the inline and out of line definitions differ and they quite often do, e.g. in glibc.
Comment 16 Andrew Pinski 2021-09-29 08:36:43 UTC
*** Bug 102526 has been marked as a duplicate of this bug. ***
Comment 17 Andrew Pinski 2021-09-29 08:36:49 UTC
*** Bug 61417 has been marked as a duplicate of this bug. ***
Comment 18 Andrew Pinski 2023-06-14 19:11:36 UTC
*** Bug 110258 has been marked as a duplicate of this bug. ***