Bug 98692 - Uninitialized values reported only with -Os
Summary: Uninitialized values reported only with -Os
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 10.2.1
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic
Depends on:
Blocks:
 
Reported: 2021-01-14 21:11 UTC by Nick Child
Modified: 2023-04-12 08:58 UTC (History)
8 users (show)

See Also:
Host: powerpc64le-linux-gnu
Target: powerpc64le-linux-gnu
Build: powerpc64le-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2021-01-15 00:00:00


Attachments
Preprocesssed File (7.36 KB, text/plain)
2021-01-14 21:11 UTC, Nick Child
Details
source (721 bytes, text/x-csrc)
2021-01-14 21:11 UTC, Nick Child
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nick Child 2021-01-14 21:11:15 UTC
Created attachment 49970 [details]
Preprocesssed File

Hello all,

This is my first go at something like this so I apologize for any cringes now.
Recently, I implemented memory checks into our CI using valgrind. Everything works fine on x86 with the same commands but gives issues when running on POWER. Specifically, only when using the `-Os` size optimizer flag. Things like `O2` and `O3` don't bring about any errors. While the code runs fine, these valgrind errors are a bit alarming. I have run into the issue using the following compiler versions and OS's:
    gcc version 9.3.1 20200408 (Red Hat 9.3.1-2)
    gcc version 10.2.1 20200723 (Red Hat 10.2.1-1)
    gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)

The errors revolve around an conditional jump on unitialized variables. I am thinking it is a stack allocation issue. I have tried to narrow down my issue to as simple a program as possible (attached). 
I then build using:
   $ gcc -Os issue.c -o issue
Compilation works and binary executes normally but valgrind complains with:
   $ valgrind ./issue

This could very well be a valgrind mistake, with too many things being optimized off of the executable for valgrind to accurately keep track of memory.
 
Here is the output from valgrind:

==3557285== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==3557285== Command: leak
==3557285==
==3557285== Conditional jump or move depends on uninitialised value(s)
==3557285==    at 0x416E9B8: __vfprintf_internal (vfprintf-internal.c:1332)
==3557285==    by 0x415E33B: printf@@GLIBC_2.17 (printf.c:33)
==3557285==    by 0x100005D3: main (in /home/jenkins/testPlace/test2/leak)
==3557285==
==3557285== Use of uninitialised value of size 8
==3557285==    at 0x40A14DC: strchrnul (vg_replace_strmem.c:1394)
==3557285==    by 0x416E9D3: __find_specmb (printf-parse.h:111)
==3557285==    by 0x416E9D3: __vfprintf_internal (vfprintf-internal.c:1365)
==3557285==    by 0x415E33B: printf@@GLIBC_2.17 (printf.c:33)
==3557285==    by 0x100005D3: main (in /home/jenkins/testPlace/test2/leak)
==3557285==
==3557285== Use of uninitialised value of size 8
==3557285==    at 0x40A14F0: strchrnul (vg_replace_strmem.c:1394)
==3557285==    by 0x416E9D3: __find_specmb (printf-parse.h:111)
==3557285==    by 0x416E9D3: __vfprintf_internal (vfprintf-internal.c:1365)
==3557285==    by 0x415E33B: printf@@GLIBC_2.17 (printf.c:33)
==3557285==    by 0x100005D3: main (in /home/jenkins/testPlace/test2/leak)
==3557285==
==3557285== Conditional jump or move depends on uninitialised value(s)
==3557285==    at 0x418BABC: _IO_file_xsputn@@GLIBC_2.17 (fileops.c:1204)
==3557285==    by 0x416EA33: __vfprintf_internal (vfprintf-internal.c:1373)
==3557285==    by 0x415E33B: printf@@GLIBC_2.17 (printf.c:33)
==3557285==    by 0x100005D3: main (in /home/jenkins/testPlace/test2/leak)
==3557285==
==3557285== Conditional jump or move depends on uninitialised value(s)
==3557285==    at 0x418BC94: _IO_new_file_xsputn (fileops.c:1253)
==3557285==    by 0x418BC94: _IO_file_xsputn@@GLIBC_2.17 (fileops.c:1197)
==3557285==    by 0x416EA33: __vfprintf_internal (vfprintf-internal.c:1373)
==3557285==    by 0x415E33B: printf@@GLIBC_2.17 (printf.c:33)
==3557285==    by 0x100005D3: main (in /home/jenkins/testPlace/test2/leak)
==3557285==
...
...
...
==3558101==
==3558101== HEAP SUMMARY:
==3558101==     in use at exit: 0 bytes in 0 blocks
==3558101==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==3558101==
==3558101== All heap blocks were freed -- no leaks are possible
==3558101==
==3558101== Use --track-origins=yes to see where uninitialised values come from
==3558101== For lists of detected and suppressed errors, rerun with: -s
==3558101== ERROR SUMMARY: 44 errors from 31 contexts (suppressed: 0 from 0)
Comment 1 Nick Child 2021-01-14 21:11:42 UTC
Created attachment 49971 [details]
source
Comment 2 Martin Liška 2021-01-15 08:58:01 UTC
Confirmed, working on that..
Comment 3 Martin Liška 2021-01-15 09:24:13 UTC
A bit reduced test-case:

$ cat pr9862.C
#define variables  (const char* []){ "PK", "KEK", "db"}

int ret, len;

void isVariable(char *var)
{
  for (int v = 0; v < 2; v++)
    if (__builtin_strncmp(var,variables[v], 2) == 0)
      ret = 0;
}

int main(int argc, char* argv[])
{
  __builtin_printf ("argv[0]=%s\n", argv[0]);
  isVariable(argv[0]);			
  len = __builtin_strlen (argv[0]);
  return 0;
}

$ g++ pr9862.C -g -Os && valgrind --expensive-definedness-checks=yes ./a.out
==54436== Memcheck, a memory error detector
==54436== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==54436== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==54436== Command: ./a.out
==54436== 
argv[0]=./a.out
==54436== Use of uninitialised value of size 8
==54436==    at 0x100005F8: main (pr9862.C:16)

Can please anybody from Power folks take a look?
Comment 4 Segher Boessenkool 2021-01-15 23:58:05 UTC
Are you sure that target is correct?!
Comment 5 Segher Boessenkool 2021-01-16 00:01:06 UTC
Have you tried a new valgrind?

Either this is (or was) a known problem in valgrind, or it is related to
one.  Cc:ing Aaron, he might know more (he wrote the GCC optimisations
that expose the problem).
Comment 6 Martin Liška 2021-01-18 08:05:11 UTC
Fixed target, it's ppc64le and I can reproduce it on gcc112.fsffrance.org
machine. It contains valgrind 3.15 which is pretty new.
Comment 7 acsawdey 2021-01-18 15:11:10 UTC
The inline expansion should be disabled by -Os, the patterns for cmpstr[n]si both have this:

  if (optimize_insn_for_size_p ())
    FAIL;
Comment 8 Nick Child 2021-01-19 15:03:54 UTC
Hello all,
Thanks for taking a look at this. In response to some questions:
I am using Valgrind 3.16.1 (latest release).

And I have encountered this issue on a Power 8 and Power 9 LE machine with multiple versions of RHEL and on Ubuntu.

Thanks again,
Nick
Comment 9 Will Schmidt 2021-02-08 21:29:23 UTC
(In reply to Segher Boessenkool from comment #5)
> Have you tried a new valgrind?
> 
> Either this is (or was) a known problem in valgrind, or it is related to
> one.  Cc:ing Aaron, he might know more (he wrote the GCC optimisations
> that expose the problem).


I've recreated against new (built out of upstream git) valgrind:
with --track-origins=yes


==37507== 
argv[0]=./a.out
==37507== Use of uninitialised value of size 8
==37507==    at 0x10000618: main (pr9862.C:16)
==37507==  Uninitialised value was created by a stack allocation
==37507==    at 0x100007D4: isVariable(char*) (pr9862.C:5)
Comment 10 Mark Wielaard 2021-02-08 21:40:20 UTC
(In reply to Will Schmidt from comment #9)
> (In reply to Segher Boessenkool from comment #5)
> > Have you tried a new valgrind?
> > 
> > Either this is (or was) a known problem in valgrind, or it is related to
> > one.  Cc:ing Aaron, he might know more (he wrote the GCC optimisations
> > that expose the problem).
> 
> 
> I've recreated against new (built out of upstream git) valgrind:
> with --track-origins=yes
> 
> 
> ==37507== 
> argv[0]=./a.out
> ==37507== Use of uninitialised value of size 8
> ==37507==    at 0x10000618: main (pr9862.C:16)
> ==37507==  Uninitialised value was created by a stack allocation
> ==37507==    at 0x100007D4: isVariable(char*) (pr9862.C:5)

Trying to get hold of a ppc64 setup. But could you try with --vgdb-error=0 and then (in another window) gdb ./a.out and target remote | vgdb and continue till you get the TRAP. Then disassamble so we can see the exact instruction that generates the use of uninitialised value?
Comment 11 Will Schmidt 2021-02-08 22:00:15 UTC
(In reply to Mark Wielaard from comment #10)
> (In reply to Will Schmidt from comment #9)
> > (In reply to Segher Boessenkool from comment #5)
> > > Have you tried a new valgrind?
> > > 
> > > Either this is (or was) a known problem in valgrind, or it is related to
> > > one.  Cc:ing Aaron, he might know more (he wrote the GCC optimisations
> > > that expose the problem).
> > 
> > 
> > I've recreated against new (built out of upstream git) valgrind:
> > with --track-origins=yes
> > 
> > 
> > ==37507== 
> > argv[0]=./a.out
> > ==37507== Use of uninitialised value of size 8
> > ==37507==    at 0x10000618: main (pr9862.C:16)
> > ==37507==  Uninitialised value was created by a stack allocation
> > ==37507==    at 0x100007D4: isVariable(char*) (pr9862.C:5)
> 
> Trying to get hold of a ppc64 setup. But could you try with --vgdb-error=0
> and then (in another window) gdb ./a.out and target remote | vgdb and
> continue till you get the TRAP. Then disassamble so we can see the exact
> instruction that generates the use of uninitialised value?

Yes.  so this traps on a ld instruction upon the return from the isVariable call.  As seen below here:


Window #1:
==79608== 
argv[0]=./a.out
==79608== Use of uninitialised value of size 8
==79608==    at 0x10000618: main (pr9862.C:16)
==79608==  Uninitialised value was created by a stack allocation
==79608==    at 0x100007D4: isVariable(char*) (pr9862.C:5)
==79608== 
==79608== (action on error) vgdb me ... 

Window #2:
(gdb) target remote | vgdb
Remote debugging using | vgdb
relaying data between gdb and process 79608
warning: remote target does not support file transfer, attempting to access files from local filesystem.
Reading symbols from /lib64/ld64.so.2...
(No debugging symbols found in /lib64/ld64.so.2)
0x0000000004001720 in ?? () from /lib64/ld64.so.2
(gdb) 
(gdb) set disassemble-next-line on 
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
main (argc=<optimized out>, argv=0x1fff00e8a8) at pr9862.C:16
16	  len = __builtin_strlen (argv[1]);
=> 0x0000000010000618 <main(int, char**)+56>:	08 00 7f e8	ld      r3,8(r31)
   0x000000001000061c <main(int, char**)+60>:	a5 ff ff 4b	bl      0x100005c0 <00000037.plt_call.strlen@@GLIBC_2.17>
   0x0000000010000620 <main(int, char**)+64>:	18 00 41 e8	ld      r2,24(r1)
   0x0000000010000624 <main(int, char**)+68>:	00 00 00 60	nop
   0x0000000010000628 <main(int, char**)+72>:	70 00 21 38	addi    r1,r1,112
   0x000000001000062c <main(int, char**)+76>:	50 81 62 90	stw     r3,-32432(r2)

(gdb) disas
Dump of assembler code for function main(int, char**):
   0x00000000100005e0 <+0>:	lis     r2,4098
   0x00000000100005e4 <+4>:	addi    r2,r2,32512
   0x00000000100005e8 <+8>:	mflr    r0
   0x00000000100005ec <+12>:	std     r31,-8(r1)
   0x00000000100005f0 <+16>:	addis   r3,r2,-2
   0x00000000100005f4 <+20>:	mr      r31,r4
   0x00000000100005f8 <+24>:	addi    r3,r3,-29882
   0x00000000100005fc <+28>:	std     r0,16(r1)
   0x0000000010000600 <+32>:	stdu    r1,-112(r1)
   0x0000000010000604 <+36>:	ld      r4,0(r4)
   0x0000000010000608 <+40>:	bl      0x10000580 <00000037.plt_call.printf@@GLIBC_2.17>
   0x000000001000060c <+44>:	ld      r2,24(r1)
   0x0000000010000610 <+48>:	ld      r3,0(r31)
   0x0000000010000614 <+52>:	bl      0x100007c4 <isVariable(char*)+8>
=> 0x0000000010000618 <+56>:	ld      r3,8(r31)
   0x000000001000061c <+60>:	bl      0x100005c0 <00000037.plt_call.strlen@@GLIBC_2.17>
   0x0000000010000620 <+64>:	ld      r2,24(r1)
   0x0000000010000624 <+68>:	nop
   0x0000000010000628 <+72>:	addi    r1,r1,112
   0x000000001000062c <+76>:	stw     r3,-32432(r2)
   0x0000000010000630 <+80>:	li      r3,0
   0x0000000010000634 <+84>:	b       0x1000098c <_restgpr0_31>
   0x0000000010000638 <+88>:	.long 0x0
   0x000000001000063c <+92>:	.long 0x1000900
   0x0000000010000640 <+96>:	.long 0x180
End of assembler dump.
(gdb)  




WIndow#1:
Comment 12 Mark Wielaard 2021-02-09 09:34:46 UTC
OK, so according to memcheck the load uses a pointer value that isn't initialized properly. And it thinks that value originated from a stack value in the isVariable call. Unfortunately my powerpc assembly is not strong enough to know how to read this precisely, what the calling conventions are and how the address is setup in isVariable.
Comment 13 Mark Wielaard 2021-02-09 09:44:12 UTC
I could replicate this with gcc 9.1.1 with the following source:

#define variables  (const char* []){ "PK", "KEK", "db"}

int ret, len;

void isVariable(char *var)
{
  for (int v = 0; v < 2; v++)
    if (__builtin_strncmp(var,variables[v], 2) == 0)
      ret = 0;
}

int main(int argc, char* argv[])
{
//  __builtin_printf ("argv[0]=%s\n", argv[0]);
  isVariable(argv[0]);
  len = __builtin_strlen (argv[0]);
  return 0;
}

compiled with gcc -g -Os and valgrind from git trunk with --vgdb=full --track-origins=yes:

==25741== Command: ./a.out
==25741== 
==25741== Use of uninitialised value of size 8
==25741==    at 0x10000504: main (pr9862.C:16)
==25741==  Uninitialised value was created by a stack allocation
==25741==    at 0x100006C4: isVariable(char*) (pr9862.C:6)
==25741== 

Disassambly of main and isVariable

Dump of assembler code for function main(int, char**):
   0x00000000100004e0 <+0>:	lis     r2,4098
   0x00000000100004e4 <+4>:	addi    r2,r2,32512
   0x00000000100004e8 <+8>:	mflr    r0
   0x00000000100004ec <+12>:	std     r31,-8(r1)
   0x00000000100004f0 <+16>:	ld      r3,0(r4)
   0x00000000100004f4 <+20>:	mr      r31,r4
   0x00000000100004f8 <+24>:	std     r0,16(r1)
   0x00000000100004fc <+28>:	stdu    r1,-48(r1)
   0x0000000010000500 <+32>:	bl      0x100006b4 <isVariable(char*)+8>
   0x0000000010000504 <+36>:	ld      r3,0(r31)
   0x0000000010000508 <+40>:	bl      0x100004a0 <00000023.plt_call.strlen@@GLIBC_2.17>
   0x000000001000050c <+44>:	ld      r2,24(r1)
   0x0000000010000510 <+48>:	nop
   0x0000000010000514 <+52>:	addi    r1,r1,48
   0x0000000010000518 <+56>:	stw     r3,-32452(r2)
   0x000000001000051c <+60>:	li      r3,0
   0x0000000010000520 <+64>:	b       0x1000086c <_restgpr0_31>
   0x0000000010000524 <+68>:	.long 0x0
   0x0000000010000528 <+72>:	.long 0x1000900
   0x000000001000052c <+76>:	.long 0x180
End of assembler dump.

Dump of assembler code for function isVariable(char*):
   0x00000000100006ac <+0>:	lis     r2,4098
   0x00000000100006b0 <+4>:	addi    r2,r2,32512
   0x00000000100006b4 <+8>:	mflr    r0
   0x00000000100006b8 <+12>:	addis   r9,r2,-2
   0x00000000100006bc <+16>:	addi    r9,r9,-30152
   0x00000000100006c0 <+20>:	bl      0x10000820 <_savegpr0_25>
   0x00000000100006c4 <+24>:	stdu    r1,-128(r1)
   0x00000000100006c8 <+28>:	ld      r29,0(r9)
   0x00000000100006cc <+32>:	ld      r28,8(r9)
   0x00000000100006d0 <+36>:	nop
   0x00000000100006d4 <+40>:	mr      r30,r3
   0x00000000100006d8 <+44>:	ld      r27,16(r9)
   0x00000000100006dc <+48>:	li      r31,0
   0x00000000100006e0 <+52>:	addi    r25,r2,-32456
   0x00000000100006e4 <+56>:	addi    r26,r1,32
   0x00000000100006e8 <+60>:	std     r29,32(r1)
   0x00000000100006ec <+64>:	std     r28,40(r1)
   0x00000000100006f0 <+68>:	rldicr  r9,r31,3,60
   0x00000000100006f4 <+72>:	li      r5,2
   0x00000000100006f8 <+76>:	std     r27,48(r1)
   0x00000000100006fc <+80>:	mr      r3,r30
   0x0000000010000700 <+84>:	ldx     r4,r26,r9
   0x0000000010000704 <+88>:	bl      0x100004c0 <00000023.plt_call.strncmp@@GLIBC_2.17>
   0x0000000010000708 <+92>:	ld      r2,24(r1)
   0x000000001000070c <+96>:	mr.     r9,r3
   0x0000000010000710 <+100>:	bne     0x10000718 <isVariable(char*)+108>
   0x0000000010000714 <+104>:	stw     r9,0(r25)
   0x0000000010000718 <+108>:	cmpldi  r31,1
   0x000000001000071c <+112>:	bne     0x10000728 <isVariable(char*)+124>
   0x0000000010000720 <+116>:	addi    r1,r1,128
   0x0000000010000724 <+120>:	b       0x10000844 <_restgpr0_25>
   0x0000000010000728 <+124>:	li      r31,1
   0x000000001000072c <+128>:	b       0x100006e8 <isVariable(char*)+60>
   0x0000000010000730 <+132>:	.long 0x0
   0x0000000010000734 <+136>:	.long 0x1000900
   0x0000000010000738 <+140>:	.long 0x780
End of assembler dump.
Comment 14 Will Schmidt 2021-02-09 21:04:52 UTC

Using gdb/vgdb to view the valgrind shadow register values, it looks like
the uninitialized values are being loaded in via the
_restgpr0_25 call being made at the end of isVariable().

Dump of assembler code for function isVariable(char*):
   0x00000000100006ac <+0>:	lis     r2,4098
   0x00000000100006b0 <+4>:	addi    r2,r2,32512
   0x00000000100006b4 <+8>:	mflr    r0
   0x00000000100006b8 <+12>:	addis   r9,r2,-2
   0x00000000100006bc <+16>:	addi    r9,r9,-30152

(here: r1 is 0x1fff00e440 ; shadow registers r24..r31 show zero).

   0x00000000100006c0 <+20>:	bl      0x10000820 <_savegpr0_25>

savegpr0_25 stores r25 thru r31 into the stack (offset of r1)

   0x00000000100006c4 <+24>:	stdu    r1,-128(r1)

And stack pointer updates itself.   r1 is now 0x1fff00e3c0


   0x00000000100006c8 <+28>:	ld      r29,0(r9)
   0x00000000100006cc <+32>:	ld      r28,8(r9)
   0x00000000100006d0 <+36>:	nop
   0x00000000100006d4 <+40>:	mr      r30,r3
   0x00000000100006d8 <+44>:	ld      r27,16(r9)
   0x00000000100006dc <+48>:	li      r31,0
   0x00000000100006e0 <+52>:	addi    r25,r2,-32456
   0x00000000100006e4 <+56>:	addi    r26,r1,32
   0x00000000100006e8 <+60>:	std     r29,32(r1)
   0x00000000100006ec <+64>:	std     r28,40(r1)
   0x00000000100006f0 <+68>:	rldicr  r9,r31,3,60
   0x00000000100006f4 <+72>:	li      r5,2
   0x00000000100006f8 <+76>:	std     r27,48(r1)
   0x00000000100006fc <+80>:	mr      r3,r30
   0x0000000010000700 <+84>:	ldx     r4,r26,r9
   0x0000000010000704 <+88>:	bl      0x100004c0 <00000023.plt_call.strncmp@@GLIBC_2.17>
   0x0000000010000708 <+92>:	ld      r2,24(r1)
   0x000000001000070c <+96>:	mr.     r9,r3
   0x0000000010000710 <+100>:	bne     0x10000718 <isVariable(char*)+108>
   0x0000000010000714 <+104>:	stw     r9,0(r25)
   0x0000000010000718 <+108>:	cmpldi  r31,1
   0x000000001000071c <+112>:	bne     0x10000728 <isVariable(char*)+124>

r1 still 0x1fff00e3c0

   0x0000000010000720 <+116>:	addi    r1,r1,128

now r1 is 0x1fff00e440

   0x0000000010000724 <+120>:	b       0x10000844 <_restgpr0_25>

and as we progress through restoring the regs, the valgrind shadow registers are indicating the values are uninitialized as they are restored off the stack (r1).

   0x10000964 <_restgpr0_25>:	ld      r25,-56(r1)

(gdb) p/x $r25s1
$29 = 0xffffffffffffffff

(gdb) p/x $r26s1        
$31 = 0x0
=> 0x10000968 <_restgpr0_26>:	ld      r26,-48(r1)
(gdb) p/x $r26s1
$32 = 0xffffffffffffffff

   0x1000096c <_restgpr0_27>:	ld      r27,-40(r1)
   0x10000970 <_restgpr0_28>:	ld      r28,-32(r1)
   0x10000974 <_restgpr0_29>:	ld      r0,16(r1)



The _restgpr* and _savegpr* functions are not referenced when the test is built at other optimization levels.  (I've looked at disassembly from -O0 .. -O4).

I do note that the _restgpr and _savegpr functions are called differently.  savegpr is called with bl while the restgpr is called via a direct branch; i can't immediately tell if this is by design or if it is in error.


    100007d0:	71 01 00 48 	bl      10000940 <_savegpr0_25>
    10000834:	30 01 00 48 	b       10000964 <_restgpr0_25>
Comment 15 Segher Boessenkool 2021-02-09 22:12:42 UTC
(In reply to Will Schmidt from comment #14)
> The _restgpr* and _savegpr* functions are not referenced when the test is
> built at other optimization levels.  (I've looked at disassembly from -O0 ..
> -O4).

Right, it is a size optimisation.

> I do note that the _restgpr and _savegpr functions are called differently. 
> savegpr is called with bl while the restgpr is called via a direct branch; i
> can't immediately tell if this is by design or if it is in error.

It is by design: these are special functions defined by the ABI, specifically
to save some code space.
Comment 16 Segher Boessenkool 2021-02-09 22:21:21 UTC
(In reply to Mark Wielaard from comment #13)
> ==25741== Use of uninitialised value of size 8
> ==25741==    at 0x10000504: main (pr9862.C:16)

r4 is argv here
>    0x00000000100004f0 <+16>:	ld      r3,0(r4)
r3 = argv[0];
>    0x00000000100004f4 <+20>:	mr      r31,r4
r31 = argv; // because we need it after the call, save it in a non-volatile reg
>    0x00000000100004f8 <+24>:	std     r0,16(r1)
>    0x00000000100004fc <+28>:	stdu    r1,-48(r1)
>    0x0000000010000500 <+32>:	bl      0x100006b4 <isVariable(char*)+8>
The call; after this we have to load argv[0] again, the called function might
have changed it.
>    0x0000000010000504 <+36>:	ld      r3,0(r31)
r3 = argv[0];

So it is funny that the exact same insn four insns earlier (in the program text)
worked fine, but this one fails.

The ABI says (taken from the ELFv1 ABI, the ELFv2 doc is not nice for copy/paste):


Here is a sample implementation of _savegpr0_N and _restgpr0_N.

  _savegpr0_14:  std  r14,-144(r1)
  _savegpr0_15:  std  r15,-136(r1)
  _savegpr0_16:  std  r16,-128(r1)
  _savegpr0_17:  std  r17,-120(r1)
  _savegpr0_18:  std  r18,-112(r1)
  _savegpr0_19:  std  r19,-104(r1)
  _savegpr0_20:  std  r20,-96(r1)
  _savegpr0_21:  std  r21,-88(r1)
  _savegpr0_22:  std  r22,-80(r1)
  _savegpr0_23:  std  r23,-72(r1)
  _savegpr0_24:  std  r24,-64(r1)
  _savegpr0_25:  std  r25,-56(r1)
  _savegpr0_26:  std  r26,-48(r1)
  _savegpr0_27:  std  r27,-40(r1)
  _savegpr0_28:  std  r28,-32(r1)
  _savegpr0_29:  std  r29,-24(r1)
  _savegpr0_30:  std  r30,-16(r1)
  _savegpr0_31:  std  r31,-8(r1)
                 std  r0, 16(r1)
                 blr


  _restgpr0_14:  ld   r14,-144(r1)
  _restgpr0_15:  ld   r15,-136(r1)
  _restgpr0_16:  ld   r16,-128(r1)
  _restgpr0_17:  ld   r17,-120(r1)
  _restgpr0_18:  ld   r18,-112(r1)
  _restgpr0_19:  ld   r19,-104(r1)
  _restgpr0_20:  ld   r20,-96(r1)
  _restgpr0_21:  ld   r21,-88(r1)
  _restgpr0_22:  ld   r22,-80(r1)
  _restgpr0_23:  ld   r23,-72(r1)
  _restgpr0_24:  ld   r24,-64(r1)
  _restgpr0_25:  ld   r25,-56(r1)
  _restgpr0_26:  ld   r26,-48(r1)
  _restgpr0_27:  ld   r27,-40(r1)
  _restgpr0_28:  ld   r28,-32(r1)
  _restgpr0_29:  ld   r0, 16(r1)
                 ld   r29,-24(r1)
                 mtlr r0
                 ld   r30,-16(r1)
                 ld   r31,-8(r1)
                 blr
  _restgpr0_30:  ld   r30,-16(r1)
  _restgpr0_31:  ld   r0, 16(r1)
                 ld   r31,-8(r1)
                 mtlr r0
                 blr

So this is one function with many entry points you could say.  Maybe that is
what confused valgrind?
Comment 17 Mark Wielaard 2021-02-10 09:38:18 UTC
Thanks for the step-by-step explanation of the assembly instructions and calling conventions.

(In reply to Segher Boessenkool from comment #16)
> (In reply to Mark Wielaard from comment #13)
> > ==25741== Use of uninitialised value of size 8
> > ==25741==    at 0x10000504: main (pr9862.C:16)
> 
> r4 is argv here
> >    0x00000000100004f0 <+16>:	ld      r3,0(r4)
> r3 = argv[0];
> >    0x00000000100004f4 <+20>:	mr      r31,r4
> r31 = argv; // because we need it after the call, save it in a non-volatile
> reg
> >    0x00000000100004f8 <+24>:	std     r0,16(r1)
> >    0x00000000100004fc <+28>:	stdu    r1,-48(r1)
> >    0x0000000010000500 <+32>:	bl      0x100006b4 <isVariable(char*)+8>
> The call; after this we have to load argv[0] again, the called function might
> have changed it.
> >    0x0000000010000504 <+36>:	ld      r3,0(r31)
> r3 = argv[0];
> 
> So it is funny that the exact same insn four insns earlier (in the program
> text)
> worked fine, but this one fails.

The different (according to valgrind) is that r4 has a defined value, while it believes r31 has an undefined value after the isVariable call.

> The ABI says (taken from the ELFv1 ABI, the ELFv2 doc is not nice for
> copy/paste):
> 
> 
> Here is a sample implementation of _savegpr0_N and _restgpr0_N.
> 
>   _savegpr0_14:  std  r14,-144(r1)
>   _savegpr0_15:  std  r15,-136(r1)
>   _savegpr0_16:  std  r16,-128(r1)
>   _savegpr0_17:  std  r17,-120(r1)
>   _savegpr0_18:  std  r18,-112(r1)
>   _savegpr0_19:  std  r19,-104(r1)
>   _savegpr0_20:  std  r20,-96(r1)
>   _savegpr0_21:  std  r21,-88(r1)
>   _savegpr0_22:  std  r22,-80(r1)
>   _savegpr0_23:  std  r23,-72(r1)
>   _savegpr0_24:  std  r24,-64(r1)
>   _savegpr0_25:  std  r25,-56(r1)
>   _savegpr0_26:  std  r26,-48(r1)
>   _savegpr0_27:  std  r27,-40(r1)
>   _savegpr0_28:  std  r28,-32(r1)
>   _savegpr0_29:  std  r29,-24(r1)
>   _savegpr0_30:  std  r30,-16(r1)
>   _savegpr0_31:  std  r31,-8(r1)
>                  std  r0, 16(r1)
>                  blr
>   _restgpr0_14:  ld   r14,-144(r1)
>   _restgpr0_15:  ld   r15,-136(r1)
>   _restgpr0_16:  ld   r16,-128(r1)
>   _restgpr0_17:  ld   r17,-120(r1)
>   _restgpr0_18:  ld   r18,-112(r1)
>   _restgpr0_19:  ld   r19,-104(r1)
>   _restgpr0_20:  ld   r20,-96(r1)
>   _restgpr0_21:  ld   r21,-88(r1)
>   _restgpr0_22:  ld   r22,-80(r1)
>   _restgpr0_23:  ld   r23,-72(r1)
>   _restgpr0_24:  ld   r24,-64(r1)
>   _restgpr0_25:  ld   r25,-56(r1)
>   _restgpr0_26:  ld   r26,-48(r1)
>   _restgpr0_27:  ld   r27,-40(r1)
>   _restgpr0_28:  ld   r28,-32(r1)
>   _restgpr0_29:  ld   r0, 16(r1)
>                  ld   r29,-24(r1)
>                  mtlr r0
>                  ld   r30,-16(r1)
>                  ld   r31,-8(r1)
>                  blr
>   _restgpr0_30:  ld   r30,-16(r1)
>   _restgpr0_31:  ld   r0, 16(r1)
>                  ld   r31,-8(r1)
>                  mtlr r0
>                  blr
> 
> So this is one function with many entry points you could say.  Maybe that is
> what confused valgrind?

So for some reason valgrind doesn't believe the stack value for -8(r1) is valid when r31 is restored.

What seems to confuse valgrind is:

   0x00000000100006c0 <+20>:	bl      0x10000820 <_savegpr0_25>
   0x00000000100006c4 <+24>:  	stdu    r1,-128(r1)
   [...]
   0x0000000010000720 <+116>:	addi    r1,r1,128
   0x0000000010000724 <+120>:	b       0x10000844 <_restgpr0_25>

It looks like it interprets those stack pointer moves as invalidating the values stored on the stack.
Comment 18 Mark Wielaard 2021-02-10 14:09:54 UTC
The current thinking (Julian did the thinking, I am just repeating) is that this is caused by the way the _savegpr and/or restgpr functions return using blr.

PPC has two special "lets zap the red zone" (the 288 bytes below the stack pointer) cases:

#  define VG_STACK_REDZONE_SZB    288  // number of addressable bytes below R1
                                       // from 64-bit PowerPC ELF ABI 
                                       // Supplement 1.7

   guest_ppc_zap_RZ_at_blr
      guest is ppc64-linux                ==> True
      guest is ppc32-linux                ==> False
      guest is other                      ==> inapplicable

   guest_ppc_zap_RZ_at_bl
      guest is ppc64-linux                ==> const True
      guest is ppc32-linux                ==> const False
      guest is other                      ==> inapplicable
   guest_stack_redzone_size
      guest is ppc32-linux                ==> 0
      guest is ppc64-linux                ==> 288
      guest is amd64-linux                ==> 128
      guest is other                      ==> inapplicable

      /* PPC and AMD64 GUESTS only: how many bytes below the 
         stack pointer are validly addressible? */
      Int guest_stack_redzone_size;

      /* PPC GUESTS only: should we zap the stack red zone at a 'blr'
         (function return) ? */
      Bool guest_ppc_zap_RZ_at_blr;

      /* PPC GUESTS only: should we zap the stack red zone at a 'bl'
         (function call) ?  Is supplied with the guest address of the
         target of the call since that may be significant.  If NULL,
         is assumed equivalent to a fn which always returns False. */
      Bool (*guest_ppc_zap_RZ_at_bl)(Addr);

#  if defined(VGP_ppc32_linux)
   vex_abiinfo.guest_ppc_zap_RZ_at_blr        = False;
   vex_abiinfo.guest_ppc_zap_RZ_at_bl         = NULL;
#  endif

#  if defined(VGP_ppc64be_linux)
   vex_abiinfo.guest_ppc_zap_RZ_at_blr        = True;
   vex_abiinfo.guest_ppc_zap_RZ_at_bl         = const_True;
   vex_abiinfo.host_ppc_calls_use_fndescrs    = True;
#  endif

#  if defined(VGP_ppc64le_linux)
   vex_abiinfo.guest_ppc_zap_RZ_at_blr        = True;
   vex_abiinfo.guest_ppc_zap_RZ_at_bl         = const_True;
   vex_abiinfo.host_ppc_calls_use_fndescrs    = False;
#  endif

What happens on a blr function return is that, based on the guest_ppc_zap_RZ_at_blr value, the redzone is marked as containing undefined values.

And indeed, with this patch:

diff --git a/coregrind/m_translate.c b/coregrind/m_translate.c
index 332202a91..6dd01afac 100644
--- a/coregrind/m_translate.c
+++ b/coregrind/m_translate.c
@@ -1709,7 +1709,7 @@ Bool VG_(translate) ( ThreadId tid,
 #  endif
 
 #  if defined(VGP_ppc64le_linux)
-   vex_abiinfo.guest_ppc_zap_RZ_at_blr        = True;
+   vex_abiinfo.guest_ppc_zap_RZ_at_blr        = False;
    vex_abiinfo.guest_ppc_zap_RZ_at_bl         = const_True;
    vex_abiinfo.host_ppc_calls_use_fndescrs    = False;
 #  endif

The warning goes away.

But is that the right thing to do always? It seems to mask issues where a function is using the red zone values set by another function.
Comment 19 jseward 2021-02-10 14:35:29 UTC
(In reply to Mark Wielaard from comment #18)

(expanding marginally on Mark's comment)

Currently the Memcheck ppc64be and ppc64le ports assume that the redzone
(the 288 bytes below SP) is volatile across both calls and returns, and it
enforces this behaviour by painting that area of memory as "undefined" for
both calls and returns.  But the optimisation discussed here appears to carry
live data across a return (that's what a "blr" is, right?)

So in effect the problem occurs because this optimisation changes the
ABI semantics that Memcheck has thus far assumed.  As Mark says, we can 
"fix" this just by disabling the zap-on-return instrumentation behaviour,
but that comes at the cost of completely losing the ability to detect 
genuinely incorrect uses of the redzone across a return.
Comment 20 Jakub Jelinek 2021-02-10 14:44:17 UTC
Can you disable it just for these magic entrypoints (either by name or by content)?
Comment 21 jseward 2021-02-10 14:51:37 UTC
(In reply to Jakub Jelinek from comment #20)
> Can you disable it just for these magic entrypoints (either by name or by
> content)?

In principle yes.  I prefer by-content rather than by-name, in the case
where all symbol names have disappeared or changed, etc.  However, this
would require having a reliable mechanism for detecting the entry points
by inspecting their content.  It would also require a certain amount of
plumbing work, basically making `VexAbiInfo::guest_ppc_zap_RZ_at_blr` be
a function rather than a Bool, in the same way that 
`VexAbiInfo::guest_ppc_zap_RZ_at_bl` already is.  It might be a day or two's
work to set up and test, once we had a reliable identify-by-content
routine available.
Comment 22 jseward 2021-02-10 14:53:54 UTC
Looking back at the above, it's now clearer what the problem is:

  # Park potentially live data in the red zone
  _savegpr0_14:  std  r14,-144(r1)
  _savegpr0_15:  std  r15,-136(r1)
  _savegpr0_16:  std  r16,-128(r1)
  _savegpr0_17:  std  r17,-120(r1)
  _savegpr0_18:  std  r18,-112(r1)
  _savegpr0_19:  std  r19,-104(r1)
  _savegpr0_20:  std  r20,-96(r1)
  _savegpr0_21:  std  r21,-88(r1)
  _savegpr0_22:  std  r22,-80(r1)
  _savegpr0_23:  std  r23,-72(r1)
  _savegpr0_24:  std  r24,-64(r1)
  _savegpr0_25:  std  r25,-56(r1)
  _savegpr0_26:  std  r26,-48(r1)
  _savegpr0_27:  std  r27,-40(r1)
  _savegpr0_28:  std  r28,-32(r1)
  _savegpr0_29:  std  r29,-24(r1)
  _savegpr0_30:  std  r30,-16(r1)
  _savegpr0_31:  std  r31,-8(r1)
                 std  r0, 16(r1)

  # And ka-zap!  Memcheck paints -288(r1) .. -1(r1) as undefined.
                 blr

  # So now they're all "unusable".
Comment 23 Segher Boessenkool 2021-02-10 16:31:55 UTC
savegpr/restgpr are special ABI-defined functions that do not have all the same ABI
calling conventions as normal functions.  They indeed write into the parent's frame
(red zone, in this case).

Maybe you should allow this always when a function has not established a new frame?
That always has to be done with a stdu 1,...(1) insn (in 64-bit; stwu in 32-bit, but
the 32-bit Linux ABI has no red zone anyway) so it probably isn't too hard to detect.
Only leaf functions will not establish a new frame normally (but that can happen
later in the function, esp. with shrink-wrapping).

Unstacking a frame is most other things that write to r1, often addi 1,1,... and
sometimes ld 1,0(1) (there probably are other cases too that I am forgetting here).
Maybe you should invalidate the red zone whenever r1 is changed, instead?
Comment 24 Segher Boessenkool 2021-02-10 16:34:32 UTC
I do see the problems for savegpr/restgpr with that suggestion, but maybe something
in that vein can be done.