Bug 105744 - [11/12/13 Regression] wrong code with -fexpensive-optimizations -flive-range-shrinkage on powerpc64le-unknown-linux-gnu
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 13.0
Importance: P3 normal
Target Milestone: 11.4
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2022-05-26 18:08 UTC by Zdenek Sojka
Modified: 2022-05-30 09:36 UTC
CC List: 3 users

See Also:
Host: x86_64-pc-linux-gnu
Target: powerpc64le-unknown-linux-gnu
Build:
Known to work:
Known to fail: 11.3.1, 12.1.1, 13.0
Last reconfirmed: 2022-05-27 00:00:00


Attachments
reduced testcase (194 bytes, text/plain)
2022-05-26 18:08 UTC, Zdenek Sojka

Description Zdenek Sojka 2022-05-26 18:08:32 UTC
Created attachment 53040
reduced testcase

Output:
$ powerpc64le-unknown-linux-gnu-gcc -fexpensive-optimizations -flive-range-shrinkage testcase.c -static
$ qemu-ppc64le -- ./a.out 
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

(gdb) p/x a
$5 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x b
$6 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x c
$7 = {0xffff00, 0x0, 0x0, 0x0}
(gdb) p/x d
$8 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x e
$9 = {0x0, 0x0, 0x0, 0x0}
(gdb) p/x u
$10 = {0x1 <repeats 32 times>}


$ powerpc64le-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-powerpc64le/bin/powerpc64le-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r13-771-20220526001630-g3dff965cae6-checking-yes-rtl-df-extra-powerpc64le/bin/../libexec/gcc/powerpc64le-unknown-linux-gnu/13.0.0/lto-wrapper
Target: powerpc64le-unknown-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --with-sysroot=/usr/powerpc64le-unknown-linux-gnu --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=powerpc64le-unknown-linux-gnu --with-ld=/usr/bin/powerpc64le-unknown-linux-gnu-ld --with-as=/usr/bin/powerpc64le-unknown-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r13-771-20220526001630-g3dff965cae6-checking-yes-rtl-df-extra-powerpc64le
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20220526 (experimental) (GCC)
Comment 1 Kewen Lin 2022-05-27 07:12:41 UTC
This can also be reproduced natively, without a cross compiler.
Comment 2 Kewen Lin 2022-05-27 07:47:48 UTC
This exposes a bug in glibc's POWER9 strncpy implementation.

In https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/powerpc/powerpc64/le/power9/strncpy.S

	lbz	r0,0(r4)
	stb	r0,0(r3)
	addi	r11,r3,1
	addi	r5,r5,-1
	vspltisb v18,0		/* Zeroes in v18  */

...

L(zero_padding_end):
	sldi	r10,r5,56	/* stxvl wants size in top 8 bits  */
	stxvl	v18,r11,r10	/* Partial store  */
	blr

The code at label "zero_padding_end" is supposed to store v18, but stxvl takes a VSX register number, not a VR number, so the 18 selects vs18. v18 is actually vs50 (VR n aliases VSX register n+32), so the store ends up using the wrong register.

The optimization options matter only because they happen to make the compiler generate a sequence that modifies vs18, so vs18 no longer holds zero by the time strncpy runs.
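
To make the register mix-up concrete, here is a minimal C sketch of a toy register file. It is only an illustration under simplifying assumptions: the `model_*` helpers, `vsr_of_vr` and `padding_byte` are hypothetical names, and the model ignores everything about the real instructions except which register they touch.

```c
#include <stdint.h>
#include <string.h>

/* Toy model (an assumption, not real hardware semantics): 64 VSX
   registers of 16 bytes each.  The vector registers v0..v31 are the
   upper half of that file, so vN is the same register as vs(N+32).  */
static uint8_t vsr[64][16];

/* VSX register number aliased by vector register vN.  */
static int vsr_of_vr(int vr) { return vr + 32; }

/* Model of "vspltisb vN,imm": fill vector register vN with one byte.  */
static void model_vspltisb(int vr, uint8_t imm)
{
  memset(vsr[vsr_of_vr(vr)], imm, sizeof vsr[0]);
}

/* Model of "xxspltib N,imm": fill VSX register vsN with one byte.  */
static void model_xxspltib(int vs, uint8_t imm)
{
  memset(vsr[vs], imm, sizeof vsr[0]);
}

/* Model of "stxvl XS,dst,len": store the first len bytes (at most 16)
   of vsXS.  The register operand is a VSX register number.  */
static void model_stxvl(int vs, uint8_t *dst, size_t len)
{
  if (len > sizeof vsr[0])
    len = sizeof vsr[0];
  memcpy(dst, vsr[vs], len);
}

/* Replay the failing sequence and return the padding byte written
   when the final store names VSX register store_operand.  */
static uint8_t padding_byte(int store_operand)
{
  uint8_t dst[2] = { 0xaa, 0xaa };
  model_xxspltib(18, 0x0f);   /* caller left 0x0f in vs18 */
  model_vspltisb(18, 0);      /* strncpy zeroes v18, i.e. vs50 */
  model_stxvl(store_operand, dst, sizeof dst);
  return dst[0];
}
```

Under this model, `padding_byte(18)` returns 0x0f (the stale vs18 contents), while `padding_byte(vsr_of_vr(18))` returns 0, which is what the zero padding needs.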
Comment 3 Kewen Lin 2022-05-27 07:55:00 UTC
Hi Zdenek,

Could you please double-check the strncpy implementation on your side, and file a glibc issue if it matches?

One further reduced test case:

#define N 3
char a[N];
char c[N];

int
main (void)
{
  asm volatile("xxspltib 18, 0xf" : : :"vs18");
  __builtin_strncpy (c, a, N);
  if (c[0] || c[1])
    __builtin_abort ();
  return 0;
}
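
For reference, the test case relies on strncpy padding the destination with null bytes when the source is shorter than N. A portable sketch of that check, without the PowerPC-specific asm, might look like this (`padding_is_zero` is a hypothetical helper, and the 16-byte buffers are an arbitrary choice):

```c
#include <string.h>

/* Return 1 if strncpy zero-fills the first n bytes of the destination
   when copying an empty string, 0 if a nonzero byte is left behind,
   and -1 if n does not fit the local buffer.  */
static int padding_is_zero(size_t n)
{
  char src[16] = "";          /* empty source string */
  char dst[16];

  if (n > sizeof dst)
    return -1;

  /* Pre-fill the destination so stale bytes are detectable.  */
  memset(dst, 0x7f, sizeof dst);
  strncpy(dst, src, n);

  for (size_t i = 0; i < n; i++)
    if (dst[i] != 0)
      return 0;
  return 1;
}
```

This only exercises the contract; it does not dirty vs18 first, so on its own it would not necessarily hit the affected code path with a nonzero register.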
Comment 4 Zdenek Sojka 2022-05-27 08:09:05 UTC
(In reply to Kewen Lin from comment #3)
> Hi Zdenek,
> 
> Could you please double check the strncpy implementation on your side? and
> help to file one glibc issue if so.
> 
> One further reduced test case:
> 
> #define N 3
> char a[N];
> char c[N];
> 
> int
> main (void)
> {
>   asm volatile("xxspltib 18, 0xf" : : :"vs18");
>   __builtin_strncpy (c, a, N);
>   if (c[0] || c[1])
>     __builtin_abort ();
>   return 0;
> }

Hello Kewen,

thank you for the simple testcase. I can confirm it fails for me, and I am using the problematic strncpy implementation:

...
   0x0000000010022f10 <+16>:      lbz     r0,0(r4)
   0x0000000010022f14 <+20>:      stb     r0,0(r3)
   0x0000000010022f18 <+24>:      addi    r11,r3,1
   0x0000000010022f1c <+28>:      addi    r5,r5,-1
   0x0000000010022f20 <+32>:      vspltisb v18,0
...
   0x000000001002319c <+668>:     rldicr  r10,r5,56,7
   0x00000000100231a0 <+672>:     stxvl   vs18,r11,r10
   0x00000000100231a4 <+676>:     blr
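
Note that the disassembly shows `rldicr r10,r5,56,7` where the source has `sldi r10,r5,56`: sldi is an extended mnemonic for rldicr, so both are the same instruction. A minimal C sketch of that equivalence and of the length stxvl reads from the top 8 bits follows; the function names are hypothetical and the sketch models only the arithmetic.

```c
#include <stdint.h>

/* Rotate a 64-bit value left by n bits.  */
static uint64_t rotl64(uint64_t x, unsigned n)
{
  n &= 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Model of "rldicr ra,rs,sh,me": rotate left by sh, then keep only
   bits 0..me in IBM numbering (bit 0 is the most significant bit).
   Assumes me <= 63.  */
static uint64_t model_rldicr(uint64_t rs, unsigned sh, unsigned me)
{
  uint64_t mask = ~UINT64_C(0) << (63 - me);
  return rotl64(rs, sh) & mask;
}

/* Model of "sldi ra,rs,sh": plain left shift, assuming sh < 64.  */
static uint64_t model_sldi(uint64_t rs, unsigned sh)
{
  return rs << sh;
}

/* Length in bytes that stxvl takes from the top 8 bits of its
   length register.  */
static unsigned stxvl_length(uint64_t rb)
{
  return (unsigned) (rb >> 56);
}
```

With a remaining count of 2 in r5, both forms produce a value whose top byte is 2, which is the length the partial store uses.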

The code was added to the glibc tree only 18 months ago, which might explain why this wasn't triggered before.

I will open a glibc PR for this.

Many thanks,
Zdenek
Comment 5 Kewen Lin 2022-05-27 08:23:08 UTC
> 
> I will open a glibc PR for this.
> 

Nice, thanks Zdenek! Changing the status ...