Bug 72736 - warning: switch -mcpu=cortex-a53 conflicts with -march=armv8-a switch
Status: RESOLVED INVALID
Product: gcc
Classification: Unclassified
Component: target
Version: 4.9.2
Importance: P3 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
Duplicates: 72737
Reported: 2016-07-28 07:44 UTC by Jeffrey Walton
Modified: 2016-07-28 10:33 UTC
Description Jeffrey Walton 2016-07-28 07:44:02 UTC
This looks like an issue similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57907. My apologies if it has already been fixed.

The Raspberry Pi Foundation recently released the Raspberry Pi 3. It's an ARMv8 SoC by Broadcom based on the Cortex-A53 core. Raspberry Pi is resisting a 64-bit AArch64 image, so we have to use a 32-bit image and toolchain. Fortunately, AArch32 has the necessary instructions, too (but with a different encoding). Also see https://community.arm.com/groups/android-community/blog/2015/03/27/arm-neon-programming-quick-reference .

Attempting to compile in this configuration results in:

$ cat test.cc
int main(int argc, char* argv[])
{
  return 0;
}

$ gcc -march=armv8-a -mcpu=cortex-a53 test.cc -o test.exe
test.cc:1:0: warning: switch -mcpu=cortex-a53 conflicts with -march=armv8-a switch
 int main(int argc, char* argv[])
 ^

The same reasoning applies as in bug 57907. The manual seems to indicate it's a supported configuration, and the comments there apply as well: some optional instructions may not be present in a baseline A53.

**********

What I am ultimately after here is access to the PMULL, PMULL2, AES, SHA1 and SHA2 intrinsics. But alas, that does not want to compile:

$ gcc -march=armv8-a+crc+crypto -mcpu=cortex-a53 test.cc -o test.exe
gcc: error: unrecognized argument in option ‘-march=armv8-a+crc+crypto’
gcc: note: valid arguments to ‘-march=’ are: armv2 armv2a armv3 armv3m armv4 armv4t armv5 armv5e armv5t armv5te armv6 armv6-m armv6j armv6k armv6s-m armv6t2 armv6z armv6zk armv7 armv7-a armv7-m armv7-r armv7e-m armv7ve armv8-a armv8-a+crc iwmmxt iwmmxt2 native

**********

$ gcc --version
gcc (Raspbian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.

**********

And here's another attempt to enable CRC and Crypto. I'm probably going to have to take this to the GCC mailing list for help.

raspberrypi:cryptopp-gcm$ gcc -D__ARM_FEATURE_CRYPTO -D__ARM_FEATURE_CRC -march=armv8-a -mcpu=cortex-a53 -mfpu=neon test.cc -o test.exe
test.cc:1:0: warning: switch -mcpu=cortex-a53 conflicts with -march=armv8-a switch
 #include <arm_neon.h>
 ^
In file included from test.cc:1:0:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint8x16_t vaeseq_u8(uint8x16_t, uint8x16_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13707:50: error: ‘__builtin_arm_crypto_aese’ was not declared in this scope
   return __builtin_arm_crypto_aese (__data, __key);
                                                  ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint8x16_t vaesdq_u8(uint8x16_t, uint8x16_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13713:50: error: ‘__builtin_arm_crypto_aesd’ was not declared in this scope
   return __builtin_arm_crypto_aesd (__data, __key);
                                                  ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint8x16_t vaesmcq_u8(uint8x16_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13719:44: error: ‘__builtin_arm_crypto_aesmc’ was not declared in this scope
   return __builtin_arm_crypto_aesmc (__data);
                                            ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint8x16_t vaesimcq_u8(uint8x16_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13725:45: error: ‘__builtin_arm_crypto_aesimc’ was not declared in this scope
   return __builtin_arm_crypto_aesimc (__data);
                                             ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32_t vsha1h_u32(uint32_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13733:40: error: ‘__builtin_arm_crypto_sha1h’ was not declared in this scope
   __t = __builtin_arm_crypto_sha1h (__t);
                                        ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha1cq_u32(uint32x4_t, uint32_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13742:60: error: ‘__builtin_arm_crypto_sha1c’ was not declared in this scope
   return __builtin_arm_crypto_sha1c (__hash_abcd, __t, __wk);
                                                            ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha1pq_u32(uint32x4_t, uint32_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13750:60: error: ‘__builtin_arm_crypto_sha1p’ was not declared in this scope
   return __builtin_arm_crypto_sha1p (__hash_abcd, __t, __wk);
                                                            ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha1mq_u32(uint32x4_t, uint32_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13758:60: error: ‘__builtin_arm_crypto_sha1m’ was not declared in this scope
   return __builtin_arm_crypto_sha1m (__hash_abcd, __t, __wk);
                                                            ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha1su0q_u32(uint32x4_t, uint32x4_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13764:63: error: ‘__builtin_arm_crypto_sha1su0’ was not declared in this scope
   return __builtin_arm_crypto_sha1su0 (__w0_3, __w4_7, __w8_11);
                                                               ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha1su1q_u32(uint32x4_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13770:57: error: ‘__builtin_arm_crypto_sha1su1’ was not declared in this scope
   return __builtin_arm_crypto_sha1su1 (__tw0_3, __w12_15);
                                                         ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha256hq_u32(uint32x4_t, uint32x4_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13776:70: error: ‘__builtin_arm_crypto_sha256h’ was not declared in this scope
   return __builtin_arm_crypto_sha256h (__hash_abcd, __hash_efgh, __wk);
                                                                      ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha256h2q_u32(uint32x4_t, uint32x4_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13782:71: error: ‘__builtin_arm_crypto_sha256h2’ was not declared in this scope
   return __builtin_arm_crypto_sha256h2 (__hash_abcd, __hash_efgh, __wk);
                                                                       ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha256su0q_u32(uint32x4_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13788:56: error: ‘__builtin_arm_crypto_sha256su0’ was not declared in this scope
   return __builtin_arm_crypto_sha256su0 (__w0_3, __w4_7);
                                                        ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘uint32x4_t vsha256su1q_u32(uint32x4_t, uint32x4_t, uint32x4_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13794:68: error: ‘__builtin_arm_crypto_sha256su1’ was not declared in this scope
   return __builtin_arm_crypto_sha256su1 (__tw0_3, __w8_11, __w12_15);
                                                                    ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘poly128_t vmull_p64(poly64_t, poly64_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13800:83: error: ‘__builtin_arm_crypto_vmullp64’ was not declared in this scope
   return (poly128_t) __builtin_arm_crypto_vmullp64 ((uint64_t) __a, (uint64_t) __b);
                                                                                   ^
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h: In function ‘poly128_t vmull_high_p64(poly64x2_t, poly64x2_t)’:
/usr/lib/gcc/arm-linux-gnueabihf/4.9/include/arm_neon.h:13809:85: error: ‘__builtin_arm_crypto_vmullp64’ was not declared in this scope
   return (poly128_t) __builtin_arm_crypto_vmullp64 ((uint64_t) __t1, (uint64_t) __t2);
                                                                                     ^
raspberrypi:cryptopp-gcm$
Comment 1 Martin Liška 2016-07-28 08:51:14 UTC
*** Bug 72737 has been marked as a duplicate of this bug. ***
Comment 2 ktkachov 2016-07-28 09:32:06 UTC
The crypto instructions use the Advanced SIMD registers, so on AArch32 (the arm GCC port) they are enabled by the -mfpu option (unlike on AArch64, where you can just add +crypto to an -mcpu or -march argument).

To enable them use -mfpu=crypto-neon-fp-armv8.

Note that the -mcpu option is used as a shorthand for an -march + -mtune option combination.

-mcpu=cortex-a53 is in this case equivalent to -march=armv8-a+crc -mtune=cortex-a53, which is why you get the complaint about the conflict with -march=armv8-a.
Comment 3 James Greenhalgh 2016-07-28 09:49:17 UTC
I see several misunderstandings wrapped up in this bug - but as far as I can see everything is working as documented (even if that documentation may not be the easiest to decipher), so I'm resolving the report as invalid.

In short, try:

  $CC -mcpu=cortex-a53 -mfpu=crypto-neon-fp-armv8 (and -mfloat-abi=hard/softfp if that is not preconfigured in your compiler)

See https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html for options applicable to the ARM port.

  -march=armv8-a+crc+crypto

As I'm sure you've spotted from the error message given by the compiler, this is not a valid value to pass to -march. The closest is -march=armv8-a+crc . This would ask the compiler to generate instructions for the ARMv8-A architecture profile, with the optional CRC32 extensions. That gets you halfway to what you actually want (which is for the compiler to enable both the CRC32 extensions and the Crypto extensions).

  warning: switch -mcpu=cortex-a53 conflicts with -march=armv8-a switch

As you've correctly identified, the CRC32 extension is optional for ARMv8-A implementations, thus -march=armv8-a does not enable support for it. However, the Cortex-A53 always provides the CRC32 extension, and -mcpu=cortex-a53 would enable support for it. This is the conflict GCC is warning about; -march=armv8-a turns off the CRC32 extension, but -mcpu=cortex-a53 would turn on the CRC32 extension.

Note that -mcpu=$foo is a shorthand for -mtune=$foo -march=`arch_of ($foo)` so you only need -mcpu=cortex-a53 on your command line to enable support for the CRC32 extensions.
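
As a rough illustration of the rule above, here is a toy C++ model of the conflict check. It is not GCC's implementation: the lookup tables and the names `features_of_march`, `arch_of_cpu`, and `switches_conflict` are hypothetical, invented for this sketch, and only the two architectures and the one CPU discussed in this report are listed.

```cpp
#include <map>
#include <set>
#include <string>

// Toy model only: a feature set is just a set of extension names.
using FeatureSet = std::set<std::string>;

// Optional extensions implied by an -march= value (hypothetical table).
FeatureSet features_of_march(const std::string& march) {
    static const std::map<std::string, FeatureSet> table = {
        {"armv8-a", FeatureSet{}},           // baseline: CRC32 is optional, so off
        {"armv8-a+crc", FeatureSet{"crc"}},  // baseline plus the CRC32 extension
    };
    return table.at(march);  // throws std::out_of_range for unknown names
}

// The architecture that -mcpu=<cpu> stands for (hypothetical table).
std::string arch_of_cpu(const std::string& cpu) {
    static const std::map<std::string, std::string> table = {
        {"cortex-a53", "armv8-a+crc"},  // per the comments in this report
    };
    return table.at(cpu);
}

// True when an explicit -march= disagrees with what -mcpu= implies.
bool switches_conflict(const std::string& march, const std::string& mcpu) {
    return features_of_march(march) != features_of_march(arch_of_cpu(mcpu));
}
```

In this model, `switches_conflict("armv8-a", "cortex-a53")` is true, which mirrors the warning in the report, while `switches_conflict("armv8-a+crc", "cortex-a53")` is false.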

  -mfpu=neon

As you might know, historically ARM floating-point and vector hardware units were implemented as coprocessors and were to some extent interchangeable. Thus, the ARM port of GCC decouples the choice of floating-point and vector unit (under the -mfpu= option) and architecture (under the -march= option).

Here you are asking the compiler to generate code for "neon". This option refers to the Advanced SIMD instructions as defined in ARMv7-A and implemented in Cortex-A8 and similar processors.

The other bit of knowledge you need is that the Cryptographic extensions use the Advanced SIMD registers, and are therefore considered by GCC to be something that should be enabled by an -mfpu option. Checking the list of valid options, you will want crypto-neon-fp-armv8 .
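
A minimal sketch of what using one of the crypto intrinsics might look like once the right -mfpu value is in place. The function name `aes_single_round` is hypothetical, and the build line in the comment simply reuses the options suggested in this comment; the intrinsic calls are guarded so the file still compiles when the compiler has not been told the Crypto extension is available.

```cpp
// Possible build line for a 32-bit ARM GCC (options taken from this comment):
//   g++ -mcpu=cortex-a53 -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -c aes_round.cc
#include <cstdint>

#if defined(__ARM_FEATURE_CRYPTO)
# include <arm_neon.h>
#endif

// Applies a single AESE step (XOR with the round key, then SubBytes and
// ShiftRows) to one 16-byte block. Returns false, leaving 'out' untouched,
// when the compiler did not predefine __ARM_FEATURE_CRYPTO.
bool aes_single_round(const std::uint8_t in[16],
                      const std::uint8_t key[16],
                      std::uint8_t out[16])
{
#if defined(__ARM_FEATURE_CRYPTO)
    const uint8x16_t data = vld1q_u8(in);       // load the block
    const uint8x16_t round_key = vld1q_u8(key); // load the round key
    vst1q_u8(out, vaeseq_u8(data, round_key));  // AESE, then store
    return true;
#else
    (void)in; (void)key; (void)out;
    return false;
#endif
}
```

A compile-time guard only says what the compiler may emit; whether the CPU the binary eventually runs on implements the instructions is a separate, run-time question.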

   -D__ARM_FEATURE_CRYPTO -D__ARM_FEATURE_CRC 

These macros are predefined by the compiler when it thinks you have support for these features. Defining them yourself is unlikely to work well, as you've discovered.
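
A small sketch of reading these macros rather than defining them. The names `kHaveCrypto`, `kHaveCrc32`, and `describe_arm_features` are hypothetical. Note that ACLE spells the CRC macro `__ARM_FEATURE_CRC32`, not `__ARM_FEATURE_CRC` as on the command line in the report.

```cpp
#include <string>

// Set by the compiler from its own target options; never pass these with -D.
#if defined(__ARM_FEATURE_CRYPTO)
constexpr bool kHaveCrypto = true;
#else
constexpr bool kHaveCrypto = false;
#endif

#if defined(__ARM_FEATURE_CRC32)
constexpr bool kHaveCrc32 = true;
#else
constexpr bool kHaveCrc32 = false;
#endif

// Summarises what the compiler believes the target supports,
// for example "crypto=no crc32=no" on a non-ARM build.
std::string describe_arm_features() {
    return std::string("crypto=") + (kHaveCrypto ? "yes" : "no") +
           " crc32=" + (kHaveCrc32 ? "yes" : "no");
}
```

Printing the result of `describe_arm_features()` from a test program is a quick way to confirm that a given set of -mcpu and -mfpu options had the intended effect.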
Comment 4 Jeffrey Walton 2016-07-28 10:10:57 UTC
Thanks, and sorry to waste your time with it.

Before I spin up another report that wastes a lot of time, does this look like another issue with me (likely), or an issue with GCC (unlikely):

internal compiler error: in expand_shift_1, at expmed.c:2318
   result = (r1 != r2);

Here's a quick copy/paste of the relevant pieces. I'll do the minimal working example if needed.


    // Executed with a sig_handler guard
    volatile bool result = true;
    ...

    const poly64_t a1={1}, b1={2};
    const poly64x2_t a2={1}, b2={2};
    const poly128_t r1 = vmull_p64(a1, b1);
    const poly128_t r2 = vmull_high_p64(a2, b2);

    result = (r1 != r2);
Comment 5 James Greenhalgh 2016-07-28 10:14:32 UTC
(In reply to Jeffrey Walton from comment #4)
> Thanks, and sorry to waste your time with it.
> 
> Before I spin up another report that wastes a lot of time, does this look like
> another issue with me (likely), or an issue with GCC (unlikely):
> 
> internal compiler error: in expand_shift_1, at expmed.c:2318
>    result = (r1 != r2);
> 
> Here's a quick copy/paste of the relevant pieces. I'll do the minimal
> working example if needed.
> 
> 
>     // Executed with a sig_handler guard
>     volatile bool result = true;
>     ...
> 
>     const poly64_t a1={1}, b1={2};
>     const poly64x2_t a2={1}, b2={2};
>     const poly128_t r1 = vmull_p64(a1, b1);
>     const poly128_t r2 = vmull_high_p64(a2, b2);
> 
>     result = (r1 != r2);

Internal compiler errors are almost never *your* fault; the compiler should gracefully handle most of what you throw at it. If you can build a minimal working example, that would be handy (though note that GCC 4.9 is getting old now, and the branch is due to close permanently in the very near future, so unless this reproduces on 5/6/trunk we're probably past the point where it would get looked at and fixed).
Comment 6 Jeffrey Walton 2016-07-28 10:33:25 UTC
(In reply to James Greenhalgh from comment #5)
> (In reply to Jeffrey Walton from comment #4)
> > Thanks, and sorry to waste your time with it.
> > 
> > Before I spin up another report that wastes a lot of time, does this look like
> > another issue with me (likely), or an issue with GCC (unlikely):
> > 
> > internal compiler error: in expand_shift_1, at expmed.c:2318
> >    result = (r1 != r2);
> > 
> > Here's a quick copy/paste of the relevant pieces. I'll do the minimal
> > working example if needed.
> > 
> > 
> >     // Executed with a sig_handler guard
> >     volatile bool result = true;
> >     ...
> > 
> >     const poly64_t a1={1}, b1={2};
> >     const poly64x2_t a2={1}, b2={2};
> >     const poly128_t r1 = vmull_p64(a1, b1);
> >     const poly128_t r2 = vmull_high_p64(a2, b2);
> > 
> >     result = (r1 != r2);
> 
> Internal compiler errors are almost never *your* fault; the compiler should
> gracefully handle most of what you throw at it. If you can build a minimal
> working example, that would be handy ...

OK, thanks. Done at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72738 .

> (though note that GCC 4.9 is getting old
> now, and the branch is due to close permanently in the very near future, so
> unless this reproduces on 5/6/trunk we're probably past the point where it
> would get looked at and fixed).

Yeah, it is a bit dated. Debian is still supplying 4.9 with Jessie stable, so it's fairly ubiquitous. It's on nearly every test machine I have, including IoT gadgets like BeagleBones, CubieTrucks, Raspberry Pis, Banana Pis, and HiKeys.