Bug 107498

Summary: Wrong optimization leads to unaligned access when compiling OpenLDAP
Product: gcc Reporter: John Paul Adrian Glaubitz <glaubitz>
Component: middle-endAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED INVALID    
Severity: normal CC: ebotcazou, glaubitz, jrtc27, matorola, sjames, slyfox
Priority: P3 Keywords: wrong-code
Version: 12.2.0   
Target Milestone: ---   
See Also: https://bugs.openldap.org/show_bug.cgi?id=9916
Host: Target: sparc64-*-*
Build: Known to work:
Known to fail: Last reconfirmed: 2022-11-01 00:00:00
Attachments: Pre-processed source of mdb.c (gzip-compressed)
Pre-processed source of unmodified mdb.c (gzip-compressed)

Description John Paul Adrian Glaubitz 2022-11-01 16:54:24 UTC
Building OpenLDAP on sparc64-linux-gnu using gcc-12 results in a slapd binary that crashes with unaligned access ("Bus error").

After investigation, it turned out that this is not a bug in the OpenLDAP code but gcc-12 which tries to optimize a access to two 2-byte fields in a struct using a 4-byte store instruction. This fails because the 2-byte fields have only 2-byte alignment.

More details on the issue can be obtained from the corresponding OpenLDAP bug report, in particular comment #10 [1].

> [1] https://bugs.openldap.org/show_bug.cgi?id=9916#c10
Comment 1 Andrew Pinski 2022-11-01 16:57:35 UTC
Can you attach the preprocessed source for the file which is failing?
And give the exact commands you used to compile it?

Note this could still be a bug in the sources of openldap where the alignment of the pointer type is one thing but the user expects something different.
Comment 2 Jessica Clarke 2022-11-01 17:04:26 UTC
#define mp_lower	mp_pb.pb.pb_lower
#define mp_upper	mp_pb.pb.pb_upper
#define mp_pages	mp_pb.pb_pages
	union {
		struct {
			indx_t		pb_lower;		/**< lower bound of free space */
			indx_t		pb_upper;		/**< upper bound of free space */
		} pb;
		uint32_t	pb_pages;	/**< number of overflow pages */
	} mp_pb;

That's the code. GCC is perfectly ok to optimise that to do a 32-bit load. If it has an alignment fault that's OpenLDAP's problem, the uint32_t there means it must be 32-bit aligned if you don't want UB.
Comment 3 John Paul Adrian Glaubitz 2022-11-01 17:08:57 UTC
Created attachment 53814 [details]
Pre-processed source of mdb.c (gzip-compressed)

Source code file in OpenLDAP git tree is »libraries/liblmdb/mdb.c« and was compiled with:

$ /bin/sh ../../../libtool --tag=disable-shared --mode=compile cc -g -O2 -I../../../include        -I../../../include -I.. -I./.. -I./../../../libraries/liblmdb     -c ./../../../libraries/liblmdb/mdb.c

from the sub-directory »servers/slapd/back-mdb«.

Attaching the preprocessed source.
Comment 4 Andrew Pinski 2022-11-01 17:18:43 UTC
I am suspecting the sub-allocator mdb_page_alloc is not doing the alignment correctly.
Comment 5 Eric Botcazou 2022-11-01 20:23:23 UTC
Note that the fields are marked volatile in the source:

 union {
  struct {
   volatile indx_t pb_lower;
   volatile indx_t pb_upper;
  } pb;
  uint32_t pb_pages;
 } mp_pb;

But, yes, it looks like something went wrong earlier: mp is a pointer to MDB_page, which is a structure containing a union containing a uint32_t field, which gives it an alignment of 4 bytes, so mp must be a multiple of 4.  mp_pb is a union containing a uint32_t field so it is at an offset multiple of 4.  So &mp->mp_pb.pb must be a multiple of 4 and, assuming that it is contained in %l0, the instruction:

  0x00000100000c8aec <+300>:	st  %l3, [ %l0 + 0xc ]

would therefore be legal.

Can anyone print the value of mp in the debugger?
Comment 6 John Paul Adrian Glaubitz 2022-11-01 21:53:29 UTC
(In reply to Eric Botcazou from comment #5)
> Note that the fields are marked volatile in the source:
> 
>  union {
>   struct {
>    volatile indx_t pb_lower;
>    volatile indx_t pb_upper;
>   } pb;
>   uint32_t pb_pages;
>  } mp_pb;

My bad. I actually modified the source to see whether marking them "volatile" would alleviate the issue as suggested by Howard in the OpenLDAP bug report.

I'll attach the preprocessed source again, this time from the unmodified »mdb.c«.
Comment 7 John Paul Adrian Glaubitz 2022-11-01 21:55:46 UTC
Created attachment 53818 [details]
Pre-processed source of unmodified mdb.c (gzip-compressed)

Here's the preprocessed source for mdb.c without marking the variables as volatile.
Comment 8 John Paul Adrian Glaubitz 2022-11-12 10:14:50 UTC
(In reply to Eric Botcazou from comment #5)
> Can anyone print the value of mp in the debugger?

glaubitz@gcc202:~/openldap/tests$ gdb --args /home/glaubitz/openldap/servers/slapd/slapd -Ta -d 0 -f /home/glaubitz/openldap/tests/testrun/slapadd.conf -l ./testdata/test-ordered.ldif
GNU gdb (Debian 12.1-4) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "sparc64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/glaubitz/openldap/servers/slapd/slapd...
(gdb) r
Starting program: /home/glaubitz/openldap/servers/slapd/slapd -Ta -d 0 -f /home/glaubitz/openldap/tests/testrun/slapadd.conf -l ./testdata/test-ordered.ldif

Program received signal SIGILL, Illegal instruction.
0xfff8000100e44320 in ?? ()
(gdb) c
Continuing.

Program received signal SIGBUS, Bus error.
0x00000100000ceda4 in mdb_node_add (mc=0x100004327b8, indx=<optimized out>, key=0x7feffffe0a0, data=0x7feffffe090, pgno=0, flags=0) at ./../../../libraries/liblmdb/mdb.c:7366
7366            mp->mp_lower += sizeof(indx_t);
(gdb) p mp
$1 = (MDB_page *) 0x10000463caa
(gdb)
Comment 9 Eric Botcazou 2022-11-13 10:14:58 UTC
> Program received signal SIGBUS, Bus error.
> 0x00000100000ceda4 in mdb_node_add (mc=0x100004327b8, indx=<optimized out>,
> key=0x7feffffe0a0, data=0x7feffffe090, pgno=0, flags=0) at
> ./../../../libraries/liblmdb/mdb.c:7366
> 7366            mp->mp_lower += sizeof(indx_t);
> (gdb) p mp
> $1 = (MDB_page *) 0x10000463caa

Thanks.  So that's definitely *not* a compiler bug but a programming error as per the 6.5.3.2(4) clause of the ISO C standard:

"The unary * operator denotes indirection. If the operand points to a function,
the result is a function designator; if it points to an object, the result is
an lvalue designating the object. If the operand has type "pointer to type",
the result has type "type". If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.(106)"

(106)
Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.
Comment 10 John Paul Adrian Glaubitz 2022-11-13 10:33:52 UTC
(In reply to Eric Botcazou from comment #9)
> > Program received signal SIGBUS, Bus error.
> > 0x00000100000ceda4 in mdb_node_add (mc=0x100004327b8, indx=<optimized out>,
> > key=0x7feffffe0a0, data=0x7feffffe090, pgno=0, flags=0) at
> > ./../../../libraries/liblmdb/mdb.c:7366
> > 7366            mp->mp_lower += sizeof(indx_t);
> > (gdb) p mp
> > $1 = (MDB_page *) 0x10000463caa
> 
> Thanks.  So that's definitely *not* a compiler bug but a programming error
> as per the 6.5.3.2(4) clause of the ISO C standard:

FWIW, the issue does not show when compiling with clang.
Comment 11 Eric Botcazou 2022-11-13 10:56:43 UTC
> FWIW, the issue does not show when compiling with clang.

This probably means that GCC generates better/smaller code then.