[Bug middle-end/36043] gcc reads 8 bytes for a struct of size 6 which leads to sigsegv
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Feb 26 09:33:00 GMT 2014
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36043
--- Comment #23 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Michael Matz from comment #8)
> FWIW, I think the error is in the caller of move_block_to_reg.
> move_block_to_reg can make use of a load_multiple instruction, which really
> loads full regs. I.e. it would be unreasonable to require changes in
> move_block_to_reg to handle non-power-of-2 sizes. Hence the caller
> (load_register_parameters) needs to handle this. I'm not sure if the
> n_aligned_regs thingy could be misused for this, or if one simply should
> opencode the special case of the last register being partial.
That would be something like
Index: gcc/calls.c
===================================================================
--- gcc/calls.c (revision 208124)
+++ gcc/calls.c (working copy)
@@ -1984,7 +1984,26 @@ load_register_parameters (struct arg_dat
emit_move_insn (ri, x);
}
else
- move_block_to_reg (REGNO (reg), mem, nregs, args[i].mode);
+ {
+ if (size % UNITS_PER_WORD == 0
+ || MEM_ALIGN (mem) % BITS_PER_WORD == 0)
+ move_block_to_reg (REGNO (reg), mem, nregs, args[i].mode);
+ else
+ {
+ if (nregs > 1)
+ move_block_to_reg (REGNO (reg), mem,
+ nregs - 1, args[i].mode);
+ rtx dest = gen_rtx_REG (word_mode,
+ REGNO (reg) + nregs - 1);
+ rtx src = operand_subword_force (mem,
+ nregs - 1, args[i].mode);
+ rtx tem = extract_bit_field (src, size * BITS_PER_UNIT,
+ 0, 1, dest, word_mode,
+ word_mode);
+ if (tem != dest)
+ convert_move (dest, tem, 1);
+ }
+ }
}
/* When a parameter is a block, and perhaps in other cases, it is
It's similar to what store_unaligned_arguments_into_pseudos would end up
doing, but only for the last register (so it's probably easier to dispatch
to that and handle !STRICT_ALIGNMENT targets there).
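As a rough illustration of the idea behind the patch (not GCC code), the
following C sketch copies a block into word-sized "registers" and builds the
partial last word from exactly the remaining bytes, so nothing past the end
of the block is read. The function name load_block_to_words, the 64-bit word
size and the little-endian placement of the tail bytes are assumptions made
for the example. It also leaves out the patch's shortcut for word-aligned
memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: one "register" is a 64-bit word.  */
#define WORD_BYTES sizeof (uint64_t)

/* Hypothetical helper: copy a SIZE-byte block into NWORDS words without
   reading past src + size.  Full words are copied whole; a partial last
   word is assembled byte by byte and zero-extended.  */
static void
load_block_to_words (uint64_t *regs, size_t nwords,
                     const unsigned char *src, size_t size)
{
  size_t full = size / WORD_BYTES;   /* words that are completely covered */
  size_t rest = size % WORD_BYTES;   /* bytes left for the last word */

  assert (nwords == full + (rest != 0));

  /* Analogue of move_block_to_reg for the first nregs - 1 registers.  */
  for (size_t i = 0; i < full; i++)
    memcpy (&regs[i], src + i * WORD_BYTES, WORD_BYTES);

  /* Analogue of the extract_bit_field step: only REST bytes are read.
     The tail bytes are placed little-endian (an assumption).  */
  if (rest != 0)
    {
      uint64_t last = 0;
      for (size_t b = 0; b < rest; b++)
        last |= (uint64_t) src[full * WORD_BYTES + b] << (8 * b);
      regs[full] = last;
    }
}
```

For a 6-byte struct this reads 6 bytes and leaves the upper 16 bits of the
single register zero, which corresponds to the unsigned extraction in the
patch.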
Anyway, the generated code is of course "horrible".
foo:
.LFB0:
.cfi_startproc
movq %rdi, %rcx
movzwl (%rdi), %edx
movzwl 2(%rdi), %edi
salq $16, %rdi
movq %rdi, %rax
movzwl 4(%rcx), %edi
orq %rdx, %rax
salq $32, %rdi
orq %rax, %rdi
jmp print_colour
For some reason extract_bit_field doesn't consider using a 4-byte load
for the first part. With AVX one could also use a masked load (and thus
implement the extv/insv pattern family? I'm not sure whether it is valid to
reject non-byte-boundary variants). But if we end up using
extract_bit_field more and more, it's worth optimizing it further to
avoid the above mess (we end up using extract_split_bit_field).
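To make the comparison concrete, here is a small C sketch (again not GCC
code) of the two strategies for a 6-byte object. The first uses three 2-byte
loads combined with shifts and ors, like the listing above. The second uses
one 4-byte load plus one 2-byte load. The function names are made up for
the example, and both assume a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the listing above: three 2-byte loads, shifted and or-ed.
   Hypothetical name; assumes a little-endian host.  */
static uint64_t
load6_as_2_2_2 (const unsigned char *p)
{
  uint16_t a, b, c;
  memcpy (&a, p, sizeof a);
  memcpy (&b, p + 2, sizeof b);
  memcpy (&c, p + 4, sizeof c);
  return (uint64_t) a | ((uint64_t) b << 16) | ((uint64_t) c << 32);
}

/* The cheaper variant: one 4-byte load for the first part and one
   2-byte load for the rest.  Still reads exactly 6 bytes.  */
static uint64_t
load6_as_4_plus_2 (const unsigned char *p)
{
  uint32_t lo;
  uint16_t hi;
  memcpy (&lo, p, sizeof lo);
  memcpy (&hi, p + 4, sizeof hi);
  return (uint64_t) lo | ((uint64_t) hi << 32);
}
```

Both functions read exactly 6 bytes and produce the same value. The second
needs one load, one shift and one or fewer.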