This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] PR/14262 (ABI corner case confusion)
- From: Ulrich Weigand <weigand at i1 dot informatik dot uni-erlangen dot de>
- To: wilson at specifixinc dot com (Jim Wilson)
- Cc: weigand at i1 dot informatik dot uni-erlangen dot de (Ulrich Weigand), gcc-patches at gcc dot gnu dot org
- Date: Thu, 11 Mar 2004 15:06:30 +0100 (CET)
- Subject: Re: [PATCH] PR/14262 (ABI corner case confusion)
Jim Wilson wrote:
> This does not look right to me. If you look at gcc-3.3, before the
> BLOCK_REG_PADDING support went in, you can see that the code has always
> called move_block_to_reg here.
Yes, and it didn't work on 3.3 either.
> Changing this code to do something different may break other targets.
Well, seeing as the current behaviour is self-inconsistent in that
gcc generates code on the caller side that is incompatible with the
code it generates on the callee side, it would appear unlikely that
we can break targets by fixing this ...
> This also seems like an incomplete solution. What happens if you have a
> 12-byte structure? If this gets passed into a register-pair, then you
> still have the same problem, the last 4 bytes get loaded in the wrong
> part of the second register. Perhaps s390 does not have this problem,
> but other similar targets might.
As I understand it the default is 'usually pad upward, but pad short
arguments downward on big-endian machines' (comment before DEFAULT_
FUNCTION_ARG_PADDING in expr.h). This means that a 12-byte structure
would deliberately have different padding that a 4-byte one. (And
for 12-byte structures caller and callee would agree, b.t.w.)
> To solve this problem, you either need
> to define BLOCK_REG_PADDING, or we need a different fix.
We need BLOCK_REG_PADDING if the padding requirements differ from the
default. However, if BLOCK_REG_PADDING is not defined, gcc should
default at least to self-consistent behaviour (which it already does
for 'long' arguments, and it also does for short arguments if
STRICT_ALIGNMENT is defined, and it of course always does on little
endian machines). The only case that leads to self-inconsistent
behaviour is: short arguments on big-endian machines where STRICT_
ALIGNMENT is not defined.
> The real problem here seems to be that we have code to handle loading
> BLKmode parameters, but the code is conditional on STRICT_ALIGNMENT.
> This is store_unaligned_arguments_into_pseudos. Since most targets that
> pass args in regs also define STRICT_ALIGNMENT, we never noticed that
> the same code was also needed in the STRICT_ALIGNMENT case. If I ifdef
> out the STRICT_ALIGNMENT test, I get correct code.
Yes, I noticed that as well.
> Looking at this some more, I think there is another subtle issue here.
> The BLOCK_REG_PADDING code here is reading past the end of the BLKmode
> object, which could be a problem if the object is at the end of mapped
> memory. I don't think this BLOCK_REG_PADDING code is quite right, and I
> am not sure why it is needed here. Ah, I see, this code was added for
> rs6000 (powerpc), and this is another target that does not define
> STRICT_ALIGNMENT. This code is safe only if the STRICT_ALIGNMENT test
> is removed, in which case we can only reach here for a BLKmode value
> which has at least word_mode alignment, in which case reading an entire
> word can not read past the end of mapped memory.
Hmm, good point.
> The attached patch needs a bit of cleanup, and the comments need to be
> rewritten, but otherwise I think this is the right approach. Another
> benefit of this is that you get better RTL in the .rtl file with my patch.
Well, I guess for s390x this would also be OK, but I wonder whether
this was really the intention of store_unaligned_arguments_into_pseudos ...
> By the way, the alignment tests in
> store_unaligned_arguments_into_pseudos looks wrong to me too. We should
> not be loading BLKmode arguments into registers differently depending on
> their alignment.
But I thought that was the whole original point of
store_unaligned_arguments_into_pseudos: provide a second, generally slower,
way to load such arguments that the standard way cannot load (because the
standard way makes alignment assumptions that these arguments do not match).
You only want to go the 'slow path' if you have to -- thus the alignment
If we want store_unaligned_arguments_into_pseudos to handle a second
problem as well, we need to adjust the 'slow path' tests to take this
into account. (E.g. handle arguments that are either unaligned or
BLKmode shorter than word size; something like this.)
Dr. Ulrich Weigand