This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: Aarch64 / simd / ld1r question
- From: Steve Ellcey <sellcey at cavium dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>, "gcc-help at gcc dot gnu dot org" <gcc-help at gcc dot gnu dot org>
- Date: Wed, 25 Apr 2018 08:43:24 -0700
- Subject: Re: Aarch64 / simd / ld1r question
- References: <5ADF5DC6.3000200@foss.arm.com>
- Reply-to: sellcey at cavium dot com
On Tue, 2018-04-24 at 17:39 +0100, Kyrill Tkachov wrote:
>
> Hmm, do you have any patches in your tree that affect this part of
> GCC?
> For me the code:
> __Float64x2_t foo1(_Float64 *x)
> {
> __Float64x2_t a = (__Float64x2_t) { *x, *x};
> return a;
> }
>
> generates with current trunk at -O2:
> foo1:
> ld1r {v0.2d}, [x0]
> ret
Interesting. If I have a pointer to a double and do the assignment, I do
get ld1r. If I have a global variable of type double and do the
assignment, I get ldr/dup. I guess that is because of the limited
addressing modes supported by ld1r. With a global double value, I
could do adrp/add to get the address of x into a register and then
do an ld1r, or I could do adrp/ldr to get the value of x into a register
and then use dup to duplicate it. GCC chose the latter over the former,
but both are 3 instructions.
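For reference, here is a minimal sketch of the two candidate sequences for the global case, together with a portable way to reproduce the three cases. The assembly in the comments is illustrative only and assumes the default small code model; it is not captured compiler output. The names f64x2, gx, gp and the splat_* functions are made up for this sketch, and the generic vector_size extension stands in for __Float64x2_t so that the file also builds on non-AArch64 hosts.

```c
/* Generic GCC/Clang vector type; a stand-in for __Float64x2_t. */
typedef double f64x2 __attribute__((vector_size(16)));

double gx = 1.35;   /* global value, like x in the attachment   */
double *gp = &gx;   /* global pointer, like p1 in the attachment */

/*
 * Case 1: the address is already in a register.
 * ld1r only accepts a plain base register (optionally post-indexed),
 * so this is the case where a single instruction can do the job:
 *     ld1r  {v0.2d}, [x0]
 */
f64x2 splat_from_ptr(const double *p)
{
    return (f64x2){ *p, *p };
}

/*
 * Case 2: global value. Two 3-instruction candidates (illustrative):
 *
 *   (a) form the address, then load-and-replicate
 *       adrp  x0, gx
 *       add   x0, x0, :lo12:gx
 *       ld1r  {v0.2d}, [x0]
 *
 *   (b) load the scalar, then duplicate it (what was observed)
 *       adrp  x0, gx
 *       ldr   d0, [x0, #:lo12:gx]
 *       dup   v0.2d, v0.d[0]
 */
f64x2 splat_from_global(void)
{
    return (f64x2){ gx, gx };
}

/*
 * Case 3: global pointer. The pointer itself has to be loaded first,
 * after which the address is in a register as in case 1.
 */
f64x2 splat_from_global_ptr(void)
{
    return (f64x2){ *gp, *gp };
}
```

Compiling this with an AArch64 GCC at -O2 -S and comparing the three function bodies should show which sequence the compiler picks for each case.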
Steve Ellcey
sellcey@cavium.com
#include <math.h>
_Float64 *p1;
_Float64 x = 1.35;
__Float64x2_t foo1(void)
{
  __Float64x2_t a = (__Float64x2_t) {x, x}; /* ldr/dup */
  return a;
}
__Float64x2_t foo2(_Float64 *p2)
{
  __Float64x2_t a = (__Float64x2_t) {*p2, *p2}; /* ld1r */
  return a;
}
__Float64x2_t foo3(void)
{
  __Float64x2_t a = (__Float64x2_t) {*p1, *p1}; /* ld1r */
  return a;
}