This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
- From: Richard Guenther <rguenther at suse dot de>
- To: Martin Jambor <mjambor at suse dot cz>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Eric Botcazou <ebotcazou at adacore dot com>
- Date: Tue, 24 Jan 2012 13:08:13 +0100 (CET)
- Subject: Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets
- References: <20120106165109.GA21457@virgil.arch.suse.de>
On Fri, 6 Jan 2012, Martin Jambor wrote:
> Hi,
>
> I'm trying to teach our expander how to deal with misaligned MEM_REFs
> on strict alignment targets. We currently generate code which leads
> to bus error signals due to misaligned accesses.
>
> I admit my motivation is not any target in particular but simply being
> able to produce misaligned MEM_REFs in SRA, currently we work-around
> that by producing COMPONENT_REFs which causes quite a few headaches.
> Nevertheless, I started by following Richi's advice and set out to fix
> the following two simple testcases on a strict-alignment platform, a
> sparc64 in the compile farm. If I understood him correctly, Richi
> claimed they have been failing "since forever:"
>
> ----- test case 1: -----
>
> extern void abort ();
>
> typedef unsigned int myint __attribute__((aligned(1)));
>
> /* even without the attributes we get bus error */
> unsigned int __attribute__((noinline, noclone))
> foo (myint *p)
> {
> return *p;
> }
>
> struct blah
> {
> char c;
> myint i;
> };
>
> struct blah g;
>
> #define cst 0xdeadbeef
>
> int
> main (int argc, char **argv)
> {
> int i;
> g.i = cst;
> i = foo (&g.i);
> if (i != cst)
> abort ();
> return 0;
> }
>
> ----- test case 2: -----
>
> extern void abort ();
>
> typedef unsigned int myint __attribute__((aligned(1)));
>
> void __attribute__((noinline, noclone))
> foo (myint *p, unsigned int i)
> {
> *p = i;
> }
>
> struct blah
> {
> char c;
> myint i;
> };
>
> struct blah g;
>
> #define cst 0xdeadbeef
>
> int
> main (int argc, char **argv)
> {
> foo (&g.i, cst);
> if (g.i != cst)
> abort ();
> return 0;
> }
>
> ------------------------
>
> I dug in expr.c and found two places which handle misaligned MEM_REfs
> loads and stores respectively but only if there is a special
> movmisalign_optab operation available for the given mode. My approach
> therefore was to append calls to extract_bit_field and store_bit_field
> which seem to be the part of expander capable of dealing with
> misaligned memory accesses. The patch is below, it fixes both
> testcases on sparc64--linux-gnu.
>
> Is this approach generally the right thing to do? And of course,
> since my knowledge of RTL and expander is very limited I expect that
> there will by quite many suggestions about its various particular
> aspects. I have run the c and c++ testsuite with the second hunk in
> place without any problems, the same test of the whole patch is under
> way right now but it takes quite a lot of time, therefore most
> probably I won't have the results today. Of course I plan to do a
> bootstrap and at least Fortran checking on this platform too but that
> is really going to take some time and I'd like to hear any comments
> before that.
>
> One more question: I'd like to be able to handle misaligned loads of
> stores of SSE vectors this way too but then of course I cannot use
> STRICT_ALIGNMENT as the guard but need a more elaborate predicate. I
> assume it must already exist, which one is it?
>
> Thanks a lot in advance for any feedback,
One issue that I am running into now is that we need to robustify/change
expand_assignment quite a bit. I have a patch for SRA that makes sure
to create properly aligned MEM_REFs but then we have, for example
MEM[p].m = ...
and in the handled_component_p path of expand_assignment we happily
expand MEM[p] via expand_normal ("for multiple use"). That of course
breaks, as doing so might go to the rvalue-producing movmisalign
path, miscompiling the store. Even if we handle this case specially
in expand_normal, the UNSPEC x86 for example might produce most certainly
isn't safe for "reuse". Similar for the path that would use
extract_bit_field (and would need to use store_bitfield).
Which means that, 1) we need to avoid to call expand_normal (tem)
in those cases, and we probably need to fall back to a stack
temporary for non-trivial cases?
Note that the SRA plus SSE-mode aggregate is probably a latent
pre-existing issue as get_object_or_type_alignment might
discover misalignment if we happen to know about an explicit
misalignment.
So, I am going to try to address this issue first.
Richard.