Created attachment 53040
reduced testcase

Output:
$ powerpc64le-unknown-linux-gnu-gcc -fexpensive-optimizations -flive-range-shrinkage testcase.c -static
$ qemu-ppc64le -- ./a.out
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

(gdb) p/x a
$5 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x b
$6 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x c
$7 = {0xffff00, 0x0, 0x0, 0x0}
(gdb) p/x d
$8 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x e
$9 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x u
$10 = {0x1 <repeats 32 times>}

$ powerpc64le-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-powerpc64le/bin/powerpc64le-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r13-771-20220526001630-g3dff965cae6-checking-yes-rtl-df-extra-powerpc64le/bin/../libexec/gcc/powerpc64le-unknown-linux-gnu/13.0.0/lto-wrapper
Target: powerpc64le-unknown-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --with-sysroot=/usr/powerpc64le-unknown-linux-gnu --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=powerpc64le-unknown-linux-gnu --with-ld=/usr/bin/powerpc64le-unknown-linux-gnu-ld --with-as=/usr/bin/powerpc64le-unknown-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r13-771-20220526001630-g3dff965cae6-checking-yes-rtl-df-extra-powerpc64le
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20220526 (experimental) (GCC)
This can also be reproduced with a native compiler, without a cross build.
This exposes a bug in the glibc strncpy implementation for power9.

In https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/powerpc/powerpc64/le/power9/strncpy.S

        lbz     r0,0(r4)
        stb     r0,0(r3)
        addi    r11,r3,1
        addi    r5,r5,-1
        vspltisb v18,0          /* Zeroes in v18 */
        ...
L(zero_padding_end):
        sldi    r10,r5,56       /* stxvl wants size in top 8 bits */
        stxvl   v18,r11,r10     /* Partial store */
        blr

The code at label "zero_padding_end" is supposed to store from v18, but stxvl interprets the 18 as a VSX register number rather than a VR number, so the store ends up using the wrong register, vs18 instead of v18.

The optimization options matter because some optimization happens to generate a sequence that modifies vs18, so its value is no longer zero as the code expects.
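For reference, the Power ISA places the vector registers v0-v31 in the upper half of the VSX register file (vs32-vs63), while vs0-vs31 overlap the floating-point registers. The following is only a minimal sketch of that numbering; the helper names are hypothetical and are not part of glibc or GCC.

```c
#include <assert.h>

/* The VSX register file has 64 entries; the 32 vector registers (VRs)
   alias its upper half. */
#define NUM_VRS  32
#define NUM_VSRS 64

/* Hypothetical helper: VSX register number that aliases vector register vN. */
unsigned vr_to_vsr(unsigned vr)
{
    assert(vr < NUM_VRS);
    return vr + 32;
}

/* Hypothetical helper: returns 1 if vsN overlaps a vector register,
   0 if it overlaps a floating-point register. */
int vsr_is_vr(unsigned vsr)
{
    assert(vsr < NUM_VSRS);
    return vsr >= 32;
}
```

Under this mapping v18 corresponds to vs50, so an stxvl meant to store the zeroed v18 would need to name VSX register 50, whereas a bare 18 selects vs18, which overlaps a floating-point register.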
Hi Zdenek,

Could you please double-check the strncpy implementation on your side, and file a glibc issue if it matches?

Here is a further reduced test case:

#define N 3
char a[N];
char c[N];

int
main (void)
{
  asm volatile("xxspltib 18, 0xf" : : :"vs18");
  __builtin_strncpy (c, a, N);
  if (c[0] || c[1])
    __builtin_abort ();
  return 0;
}
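The property the test case relies on is that strncpy fills the rest of the destination with zero bytes when the source is shorter than the count. A portable sketch of that check, without the inline asm, might look like the following. The helper name is hypothetical, and since nothing here forces vs18 to a non-zero value, it is not expected to reproduce the failure reliably; it only checks the padding contract.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if strncpy zero-fills all n bytes of the
   destination when the source is an empty string, 0 otherwise. */
int strncpy_zero_pads(size_t n)
{
    char src[1] = { 0 };
    char dst[64];

    assert(n <= sizeof dst);

    /* Pre-fill with a non-zero pattern so that missing padding is visible. */
    memset(dst, 0xff, sizeof dst);
    strncpy(dst, src, n);

    for (size_t i = 0; i < n; i++)
        if (dst[i] != 0)
            return 0;
    return 1;
}
```

With a correct strncpy, a call such as strncpy_zero_pads(3) returns 1, matching the N = 3 case in the test case above.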
(In reply to Kewen Lin from comment #3)
> Hi Zdenek,
> 
> Could you please double check the strncpy implementation on your side? and
> help to file one glibc issue if so.
> 
> One further reduced test case:
> 
> #define N 3
> char a[N];
> char c[N];
> 
> int
> main (void)
> {
>   asm volatile("xxspltib 18, 0xf" : : :"vs18");
>   __builtin_strncpy (c, a, N);
>   if (c[0] || c[1])
>     __builtin_abort ();
>   return 0;
> }

Hello Kewen, thank you for the simple testcase. I can confirm it fails for me, and I am using the problematic strncpy implementation:
...
   0x0000000010022f10 <+16>:    lbz     r0,0(r4)
   0x0000000010022f14 <+20>:    stb     r0,0(r3)
   0x0000000010022f18 <+24>:    addi    r11,r3,1
   0x0000000010022f1c <+28>:    addi    r5,r5,-1
   0x0000000010022f20 <+32>:    vspltisb v18,0
...
   0x000000001002319c <+668>:   rldicr  r10,r5,56,7
   0x00000000100231a0 <+672>:   stxvl   vs18,r11,r10
   0x00000000100231a4 <+676>:   blr

The code was added to the glibc tree only 18 months ago, which might explain why this wasn't triggered before.

I will open a glibc PR for this.

Many thanks,
Zdenek
> 
> I will open a glibc PR for this.
> 

Nice, thanks Zdenek! Changing the status ...