Bug 36043 - gcc reads 8 bytes for a struct of size 6 which leads to sigsegv
gcc reads 8 bytes for a struct of size 6 which leads to sigsegv
Status: ASSIGNED
Product: gcc
Classification: Unclassified
Component: middle-end
4.3.0
: P3 normal
: ---
Assigned To: Alan Modra
: wrong-code
: 50499 (view as bug list)
Depends on:
Blocks: 37954
  Show dependency treegraph
 
Reported: 2008-04-25 09:28 UTC by bartoschek
Modified: 2015-03-19 13:44 UTC (History)
22 users (show)

See Also:
Host: x86_64-unknown-linux-gnu
Target: x86_64-unknown-linux-gnu, i?86-*-*
Build: x86_64-unknown-linux-gnu
Known to work:
Known to fail: 2.95.2, 4.1.2, 4.3.3, 4.4.0, 4.5.0
Last reconfirmed: 2008-04-25 14:55:39


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description bartoschek 2008-04-25 09:28:44 UTC
For the following programm gcc produces wrong code. To pass the struct colour by value gcc reads 8 bytes although the struct has a size of 6. This causes reads after allocated memory. In the example program the memory is mmap'ed and the last element passed. The 8 byte read crosses page boundaries and causes a segmentation violation. The code in question is:

        movq    -8(%rbp), %rax
        addq    $12282, %rax
        movq    (%rax), %rdi
        call    print_colour

The error also occurs if you malloc the memory in question but it is harder to provoke a segmentation violation.

The error occurs only in 64bit mode. It does not depend on the optimization level. However if you want to reproduce it with higher optimization, you have to move print_colour to a different compilation unit.

It could be reproduced on different gcc versions up to 4.3.0

#include <sys/mman.h>
#include <stdio.h>

struct colour
{
  unsigned short red;
  unsigned short green;
  unsigned short blue;
};

void print_colour(struct colour col)
{
  printf( "colour is %d %d %d\n",
          col.red, col.green, col.blue);
}

int main(void)
{
  struct colour * c = mmap(NULL, sizeof(struct colour) * 2048,
                           PROT_READ|PROT_WRITE,
                           MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
  if (c != MAP_FAILED) {
    c[2047].red = 115;
    c[2047].green = 122;
    c[2047].blue = 98;
    print_colour(c[2047]);
    munmap(c, sizeof(struct colour) * 2048);
  }
  return 0;
}

Here is the whole command line:

-bash-3.00$ LANG=C /lfs/user/bartosch/software/gcc/bin/gcc -v -save-temps -Wall -W -pedantic  gls.c
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.3.0/configure --prefix=/lfs/user/bartosch/software/gcc --enable-languages=c,c++ --with-mpfr=/lfs/user/bartosch/software/gcc
Thread model: posix
gcc version 4.3.0 (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-W' '-pedantic' '-mtune=generic'
 /lfs/user/bartosch/software/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/cc1 -E -quiet -v gls.c -mtune=generic -Wall -W -pedantic -fpch-preprocess -o gls.i
ignoring nonexistent directory "/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../x86_64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /lfs/user/bartosch/software/gcc/include
 /lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/include
 /lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/include-fixed
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-W' '-pedantic' '-mtune=generic'
 /lfs/user/bartosch/software/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/cc1 -fpreprocessed gls.i -quiet -dumpbase gls.c -mtune=generic -auxbase gls -Wall -W -pedantic -version -o gls.s
GNU C (GCC) version 4.3.0 (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.3.0, GMP version 4.2.1, MPFR version 2.3.0.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 261e5a68fb24a56eb3beaa6eb43384e4
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-W' '-pedantic' '-mtune=generic'
 as -V -Qy -o gls.o gls.s
GNU assembler version 2.15.92.0.2 (x86_64-redhat-linux) using BFD version 2.15.92.0.2 20040927
COMPILER_PATH=/lfs/user/bartosch/software/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/:/lfs/user/bartosch/software/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/:/lfs/user/bartosch/software/gcc/libexec/gcc/x86_64-unknown-linux-gnu/:/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/:/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/:/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-W' '-pedantic' '-mtune=generic'
 /lfs/user/bartosch/software/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o /lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/crtbegin.o -L/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0 -L/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../.. gls.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /lfs/user/bartosch/software/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/crtend.o /usr/lib/../lib64/crtn.o
-bash-3.00$ ./a.out
Speicherzugriffsfehler
-bash-3.00$
Comment 1 Richard Biener 2008-04-25 14:55:39 UTC
Hmm, didn't we fix this? ...

        movw    $115, (%rax)
        movw    $122, 2(%rax)
        movw    $98, 4(%rax)
        movq    (%rax), %rdi
        call    print_colour
Comment 2 Richard Biener 2008-04-25 14:59:26 UTC
Ahm, not exactly a dup of PR31309.

Shorter (non-runtime) testcase:

struct colour
{
  unsigned short red;
  unsigned short green;
  unsigned short blue;
};

void print_colour(struct colour col);

void foo(struct colour *c)
{
  print_colour(*c);
}
Comment 3 Richard Biener 2008-04-25 15:08:51 UTC
The problem is that struct colour is laid out like

    arg 0 <parm_decl 0x2b4f8506e2d0 c
        type <pointer_type 0x2b4f85160780 type <record_type 0x2b4f85160540 colour>
            unsigned DI
            size <integer_cst 0x2b4f85068b70 constant invariant 64>
            unit size <integer_cst 0x2b4f85068ba0 constant invariant 8>
            align 64 symtab 0 alias set -1 canonical type 0x2b4f85160780>
        used unsigned DI file t.c line 10 col 25 size <integer_cst 0x2b4f85068b70 64> unit size <integer_cst 0x2b4f85068ba0 8>
        align 64 context <function_decl 0x2b4f8515fea0 foo> initial <pointer_type 0x2b4f85160780>
        (reg/v/f:DI 58 [ c ]) arg-type <pointer_type 0x2b4f85160780>
        incoming-rtl (reg:DI 5 di [ c ])>>

which doesn't match its natural alignment nor size.
Comment 4 Michael Matz 2008-04-25 15:12:25 UTC
That's the layout of 'c', a pointer variable.  We don't see the layout
of the record_type here.
Comment 5 Richard Biener 2008-04-25 15:17:32 UTC
Forget that.  load_register_parameters converts

(mem/s:BLK (reg/v/f:DI 58 [ c ]) [0 S6 A16])

to

(mem/s:BLK (reg/v/f:DI 58 [ c ]) [0 S6 A16])

here:

          else if (partial == 0 || args[i].pass_on_stack)
            {
              rtx mem = validize_mem (args[i].value);
Comment 6 Richard Biener 2008-04-25 15:29:27 UTC
Errm.  change_address_1 simply builds a DImode mem (with alignment
info properly retained)

#0  0x00000000005e80c2 in adjust_address_1 (memref=0x2b0751d81520, 
    mode=DImode, offset=0, validate=0, adjust=1)
    at /space/rguenther/src/svn/gcc-4_3-branch/gcc/emit-rtl.c:1910
#1  0x00000000005e396c in operand_subword (op=0x2b0751d81520, offset=0, 
    validate_address=1, mode=BLKmode)
    at /space/rguenther/src/svn/gcc-4_3-branch/gcc/emit-rtl.c:1346
#2  0x00000000005e3a28 in operand_subword_force (op=0x2b0751d81520, offset=0, 
    mode=BLKmode)
    at /space/rguenther/src/svn/gcc-4_3-branch/gcc/emit-rtl.c:1374
#3  0x000000000060fd27 in move_block_to_reg (regno=5, x=0x2b0751d81520, 
    nregs=1, mode=BLKmode)
    at /space/rguenther/src/svn/gcc-4_3-branch/gcc/expr.c:1560
#4  0x000000000054f7b9 in load_register_parameters (args=0x7fff59711810, 
    num_actuals=1, call_fusage=0x7fff597119d0, flags=0, is_sibcall=0, 
    sibcall_failure=0x7fff597119ac)
    at /space/rguenther/src/svn/gcc-4_3-branch/gcc/calls.c:1681
#5  0x00000000005526b4 in expand_call (exp=0x2b075142d460, target=0x0, 
    ignore=1) at /space/rguenther/src/svn/gcc-4_3-branch/gcc/calls.c:2763

no idea if that is supposed to work (and thus *movdi_1_rex64 shouldn't match)
or if not ...
Comment 7 Richard Biener 2008-04-25 15:43:07 UTC
So, the problem is in move_block_to_reg that copies the argument piecewise
to regs like

  for (i = 0; i < nregs; i++)
    emit_move_insn (gen_rtx_REG (word_mode, regno + i),
                    operand_subword_force (x, i, mode));

not accounting for the fact that the last reg may be only partially filled
from the memory.  But the necessary information to deal with this is also
not available in this function currently.
Comment 8 Michael Matz 2008-04-25 16:15:51 UTC
FWIW, I think the error is in the caller of move_block_to_reg. 
move_block_to_reg can make use of a load_multiple instruction, which really
loads full regs.  I.e. it would be unreasonable to require changes in
move_block_to_reg to handle non-power-of-2 sizes.  Hence the caller
(load_register_parameters) needs to handle this.  I'm not sure if the
n_aligned_regs thingy could be misused for this, or if one simply should
opencode the special case of the last register being partial.
Comment 9 Richard Biener 2008-04-25 16:20:19 UTC
Index: calls.c
===================================================================
--- calls.c     (revision 134659)
+++ calls.c     (working copy)
@@ -2708,7 +2708,7 @@ expand_call (tree exp, rtx target, int i
         and whose alignment does not permit a direct copy into registers,
         make a group of pseudos that correspond to each register that we
         will later fill.  */
-      if (STRICT_ALIGNMENT)
+      if (1 || STRICT_ALIGNMENT)
        store_unaligned_arguments_into_pseudos (args, num_actuals);
 
       /* Now store any partially-in-registers parm.

also fixes this.
Comment 10 Eric Rannaud 2009-03-17 22:48:25 UTC
Witnessed in g++ 4.3.2

g++ (GCC) 4.3.2 20081105 (Red Hat 4.3.2-7)
Comment 11 Richard Biener 2009-03-18 09:11:49 UTC
Still broken on the trunk as well.
Comment 12 Johannes Schickel 2009-09-15 21:24:47 UTC
Happens for me too on Linux/AMD64 with:

gcc (Debian 4.3.4-2) 4.3.4

and

gcc (Debian 4.4.1-4) 4.4.1

It also happens with structs of sizes 3, 5 and 7 for me.

An easy (non-runtime) test case for different struct sizes can be made via:

struct test
{
  unsigned char size[TEST_SIZE];
};

void bar(struct test);

void foo(struct test *t)
{
  bar(*t);
}

Just define the struct size via "-DTEST_SIZE=size", where "size" is the desired size of the struct, on the gcc command line.

One can witness that for struct sizes of 1, 2, 4, 8 and everything above 8 it works just fine.

Here's the assembly output, generated via 4.3.4, for the working struct sizes:

struct size of 1:

	movq	-8(%rbp), %rax
	movzbl	(%rax), %edi
	call	bar

struct size of 2:

	movq	-8(%rbp), %rax
	movzwl	(%rax), %edi
	call	bar


struct size of 4:

	movq	-8(%rbp), %rax
	movl	(%rax), %edi
	call	bar


struct size of 8:

	movq	-8(%rbp), %rax
	movq	(%rax), %edi
	call	bar

For all sizes above 8, it does also generate correct assembly, that means it will only read the correct number of bytes. Here is an example with a struct size of 11:

	movq	-8(%rbp), %rdx
	movq	(%rdx), %rdi
	movzbl	8(%rdx), %ecx
	movzbl	9(%rdx), %eax
	salq	$8, %rax
	orq	%rax, %rcx
	movzbl	10(%rdx), %eax
	salq	$16, %rax
	movq	%rax, %rsi
	orq	%rcx, %rsi
	call	bar

As you can see it uses one "movq" to read eight bytes at once and then three "movzbl" to read the remaining three bytes. The assembly for this case was generated with -O0. It will look differently for other optimization levels, but is still using the same basic logic.

For struct sizes of 3, 5, 6 or 7 the compiler always generates:

	movq	-8(%rbp), %rax
	movq	(%rax), %rdi
	call	bar

for me, which is the reported incorrect behavior.
Comment 13 Andre Heider 2009-09-15 21:54:09 UTC
Still present on a recent GCC snapshot:

Johannes' Code compiled with "(Ubuntu 20090912-1ubuntu2) 4.5.0 20090912 (experimental) [trunk revision 151650]" and -O0 -DTEST_SIZE=5:

	subq	$16, %rsp
	movq	%rdi, -8(%rbp)
	movq	-8(%rbp), %rax
	movq	(%rax), %rdi
	call	bar
Comment 14 Matt Hargett 2010-01-03 20:30:10 UTC
Still happening on 4.5.0 20091228 (and gcc 4.4.1) on Ubuntu 9.10/amd64). I found that it goes away in 3 separate instances when turning on -finline-functions. The last 2 instances of memory corruption I am running into are functions in vtables, which can't be inlined (and therefore work around this bug) until de-virtualization is eliminated.

This seems like a pretty serious and common problem; is there any chance it will be fixed in 4.5?
Comment 15 Richard Biener 2010-01-03 22:32:12 UTC
The issue is in a very twisted piece of GCC where the chances to break sth are
bigger that to fix sth ;)

And it isn't a regression, so it's not on too many peoples radar (nor does
it seem to happen in practice and thus affect those who may be able to fix it).

But, let's CC some more people ;)
Comment 16 Richard Biener 2011-03-28 11:57:36 UTC
Also broken at least since GCC 2.95 on i?86 with

struct colour
{
  unsigned char red;
  unsigned char green;
  unsigned char blue;
};

void print_colour(struct colour col) __attribute__((regparm(3)));

void
foo(struct colour *c)
{
  print_colour(*c);
}
Comment 17 Richard Biener 2011-03-28 11:58:50 UTC
PR37954 looks like a dup for arm.
Comment 18 Peter Bergner 2012-02-02 15:07:34 UTC
*** Bug 50499 has been marked as a duplicate of this bug. ***
Comment 19 Peter Bergner 2012-02-02 15:46:09 UTC
From PR50499, we have this similar test case that fails on powerpc64-linux:

struct S50 { char i[9]; };
extern void bar (struct S50);
void
foo (struct S50 *p)
{
  bar (*p);
}

Gives (using -m64 -O1):

        ...
	ld 3,0(3)
	ld 4,8(9)
	bl bar
Comment 20 David Fries 2014-02-22 21:09:39 UTC
I just ran into this and verified that it exists in gcc svn trunk
gcc (GCC) 4.9.0 20140221 (experimental)
Comment 21 support 2014-02-23 00:02:55 UTC
__________________________________
Type your response ABOVE THIS LINE to reply

------------------------------------------------------------------------------------------------------------
Customer: tromey at gcc dot gnu.org
Subject: [Bug middle-end/36043] gcc reads 8 bytes for a struct of size 6 which leads to sigsegv
------------------------------------------------------------------------------------------------------------

Jamie Forrest | FEB 23, 2014 | 12:02AM UTC



------------------------------------------------------------------------------------------------------------

tromey | FEB 22, 2014 | 09:10PM UTC  | Original message 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36043

David Fries <david at fries dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david at fries dot net

--- Comment #20 from David Fries <david at fries dot net> ---
I just ran into this and verified that it exists in gcc svn trunk
gcc (GCC) 4.9.0 20140221 (experimental)


------------------------------------------------------------------------------------------------------------
This message was sent to gcc-bugzilla@gcc.gnu.org in reference to Case #: 4164.
------------------------------------------------------------------------------------------------------------




[[0958b287b2c175eb67793a138c0536bfc6a0b07d-206330137]]
Comment 22 Arthur O'Dwyer 2014-02-25 19:45:43 UTC
Here's another segfaulting test case, for x86-64. This fails with GCC 4.6.2 and GCC 4.8.1.
This GCC bug is really hard to work around, since it can strike anywhere without warning. Is it possible that one of the patches above (e.g., Richard Biener's comment #9) could be adopted into 4.8.x or 4.9.x?


#include <cstdio>
#include <stdint.h>
#include <sys/mman.h>

struct int24_t { uint8_t m_Internal[3]; };

void foo(int24_t a) { puts("2"); }

int main() {
    const int OS_PAGE_SIZE = 0x1000;
    const int COUNT = (OS_PAGE_SIZE) / sizeof(int24_t);
    printf("sizeof int24_t = %d\n", (int)sizeof(int24_t));

    void* arena1 = mmap(0, OS_PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int24_t* ptr = (int24_t*)mmap(0, OS_PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    munmap(arena1, OS_PAGE_SIZE);

    puts("1");
    foo(ptr[COUNT-1]);
    puts("3");
}
Comment 23 Richard Biener 2014-02-26 09:32:52 UTC
(In reply to Michael Matz from comment #8)
> FWIW, I think the error is in the caller of move_block_to_reg. 
> move_block_to_reg can make use of a load_multiple instruction, which really
> loads full regs.  I.e. it would be unreasonable to require changes in
> move_block_to_reg to handle non-power-of-2 sizes.  Hence the caller
> (load_register_parameters) needs to handle this.  I'm not sure if the
> n_aligned_regs thingy could be misused for this, or if one simply should
> opencode the special case of the last register being partial.

That would be sth like

Index: gcc/calls.c
===================================================================
--- gcc/calls.c (revision 208124)
+++ gcc/calls.c (working copy)
@@ -1984,7 +1984,26 @@ load_register_parameters (struct arg_dat
                    emit_move_insn (ri, x);
                }
              else
-               move_block_to_reg (REGNO (reg), mem, nregs, args[i].mode);
+               {
+                 if (size % UNITS_PER_WORD == 0
+                     || MEM_ALIGN (mem) % BITS_PER_WORD == 0)
+                   move_block_to_reg (REGNO (reg), mem, nregs, args[i].mode);
+                 else
+                   {
+                     if (nregs > 1)
+                       move_block_to_reg (REGNO (reg), mem,
+                                          nregs - 1, args[i].mode);
+                     rtx dest = gen_rtx_REG (word_mode,
+                                             REGNO (reg) + nregs - 1);
+                     rtx src = operand_subword_force (mem,
+                                                      nregs - 1, args[i].mode);
+                     rtx tem = extract_bit_field (src, size * BITS_PER_UNIT,
+                                                  0, 1, dest, word_mode,
+                                                  word_mode);
+                     if (tem != dest)
+                       convert_move (dest, tem, 1);
+                   }
+               }
            }
 
          /* When a parameter is a block, and perhaps in other cases, it is

it's similar to what store_unaligned_arguments_into_pseudos would end up
doing but only for the last register (so it's probably easier to dispatch
to that and handle !STRICT_ALIGNMENT targets there).

Anyway, the generated code is of course "horrible".

foo:
.LFB0:
        .cfi_startproc
        movq    %rdi, %rcx
        movzwl  (%rdi), %edx
        movzwl  2(%rdi), %edi
        salq    $16, %rdi
        movq    %rdi, %rax
        movzwl  4(%rcx), %edi
        orq     %rdx, %rax
        salq    $32, %rdi
        orq     %rax, %rdi
        jmp     print_colour

for some reason extract_bit_field doesn't consider using a 4-byte load
for the first part.  With AVX one could also use a masked load (and thus
implement the extv/insv pattern family?  not sure if it is valid to
reject non-byte boundary variants).  But if we end up using
extract_bit_field more and more it's worth optimizing it further to
avoid the above mess... (we end up using extract_split_bit_field).