This is the mail archive of the
mailing list for the GCC project.
Re: AARCH64 vs SLOW_BYTE_ACCESS
- From: Andrew Pinski <pinskia at gmail dot com>
- To: "Richard Earnshaw (lists)" <Richard dot Earnshaw at arm dot com>
- Cc: GCC Mailing List <gcc at gcc dot gnu dot org>
- Date: Tue, 11 Jul 2017 08:06:32 -0700
- Subject: Re: AARCH64 vs SLOW_BYTE_ACCESS
- Authentication-results: sourceware.org; auth=none
- References: <CA+=Sn1=06c=9zRCAt-f3Xe4PjqftJJyivKmHc4F6CR_KKytShA@mail.gmail.com> <email@example.com>
On Tue, Jul 11, 2017 at 3:09 AM, Richard Earnshaw (lists) wrote:
> On 11/07/17 05:16, Andrew Pinski wrote:
>> I was looking into some bitfield code for aarch64 and was wondering
>> why SLOW_BYTE_ACCESS is set to 0. I can't seem to figure out why.
>> The header says:
>> Although there's no difference in instruction count or cycles,
>> in AArch64 we don't want to expand to a sub-word to a 64-bit access
>> if we don't have to, for power-saving reasons. */
>> But that does not make sense because with SLOW_BYTE_ACCESS set to 0, GCC
>> expands a sub-word access to a 64-bit access.
>> When I set SLOW_BYTE_ACCESS to 1, I get between a 38% and 208% speedup
>> for accesses to bitfields inside a loop on ThunderX CN88xx.
> What's the test case?
>> Should we change SLOW_BYTE_ACCESS (or maybe better yet get rid of it)?
> The documentation for SLOW_BYTE_ACCESS is just plain confusing, IMO.
> And your comment above seems to be contrary to the documentation as well.
Here is the testcase which shows the issue:
typedef unsigned long long u64;

void setting(s_t *a)
{
  a->a = 0x2AA;
  a->b = 0x2AA;
  a->c = 0x155;
  a->d = 0x2A;
  a->e = 0x2AAA;
  a->f = 0x2AAA;
}

void set(s_t *a, int b, int c, int d, int e, int f, int g)
{
  a->a = b;
  a->b = c;
  a->c = d;
  a->d = e;
  a->e = f;
  a->f = g;
}
--- CUT ---
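
The definition of s_t did not survive in the archived copy of the testcase. A minimal self-contained sketch follows, assuming a struct of adjacent u64 bitfields. The field widths (10, 10, 9, 6, 14, 14) are an assumption chosen only so that each constant stored by setting() fits its field; they are not taken from the original message.

```c
typedef unsigned long long u64;

/* ASSUMPTION: the widths below are hypothetical, picked so that every
   constant stored by setting() fits in its field.  The original
   definition of s_t is not preserved in the archive.  */
typedef struct
{
  u64 a : 10;
  u64 b : 10;
  u64 c : 9;
  u64 d : 6;
  u64 e : 14;
  u64 f : 14;
} s_t;

/* Store compile-time constants into every bitfield.  */
void setting(s_t *a)
{
  a->a = 0x2AA;
  a->b = 0x2AA;
  a->c = 0x155;
  a->d = 0x2A;
  a->e = 0x2AAA;
  a->f = 0x2AAA;
}

/* Store run-time values into every bitfield.  */
void set(s_t *a, int b, int c, int d, int e, int f, int g)
{
  a->a = b;
  a->b = c;
  a->c = d;
  a->d = e;
  a->e = f;
  a->f = g;
}
```

Compiling a file like this with -O2 -S for an AArch64 target and reading the assembly for the two functions is one way to see which access widths the compiler picked for the bitfield stores.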
If SLOW_BYTE_ACCESS is set to 0, we get many more instructions. See
the logic in bit_field_mode_iterator::next_mode (which calls
bit_field_mode_iterator::prefer_smaller_modes, which checks
SLOW_BYTE_ACCESS).
Note the only other place which checks SLOW_BYTE_ACCESS is dojump.c,
and I think that code might be dead due to expanding directly from SSA.
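
To make the "sub-word access" versus "64-bit access" distinction concrete, here is a hand-written sketch of the two ways a store to one 10-bit field can be carried out. It is only an illustration of the choice being discussed, not what GCC emits. The function names, the field position (bits 10..19) and the little-endian layout are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical field: 10 bits wide, starting at bit 10 of a 64-bit word.  */
#define FIELD_SHIFT 10
#define FIELD_MASK  0x3FFu

/* Wide variant: read-modify-write of the whole 64-bit word.  */
static void store_wide(unsigned char *p, uint64_t v)
{
  uint64_t w;
  memcpy(&w, p, sizeof w);
  w = (w & ~((uint64_t) FIELD_MASK << FIELD_SHIFT))
      | ((v & FIELD_MASK) << FIELD_SHIFT);
  memcpy(p, &w, sizeof w);
}

/* Narrow variant: read-modify-write of only the 32-bit half holding
   the field.  ASSUMPTION: little-endian, so the low half of the
   64-bit word lives at byte offset 0.  */
static void store_narrow(unsigned char *p, uint32_t v)
{
  uint32_t w;
  memcpy(&w, p, sizeof w);
  w = (w & ~((uint32_t) FIELD_MASK << FIELD_SHIFT))
      | ((v & FIELD_MASK) << FIELD_SHIFT);
  memcpy(p, &w, sizeof w);
}

/* Read the field back through a 64-bit load.  */
static uint64_t read_field(const unsigned char *p)
{
  uint64_t w;
  memcpy(&w, p, sizeof w);
  return (w >> FIELD_SHIFT) & FIELD_MASK;
}
```

Both variants leave the same bytes in memory on a little-endian target; they differ only in the width of the load and store used to get there, which is the choice SLOW_BYTE_ACCESS influences.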