This is the mail archive of the
mailing list for the GCC project.
Re: [patch] Fix powerpc 64 alignment problem for lwa instruction
- From: Andrew Pinski <pinskia at physics dot uc dot edu>
- To: hm dot chang at apple dot com (Hui-May Chang)
- Cc: gcc-patches at gcc dot gnu dot org, pinskia at physics dot uc dot edu (Andrew Pinski)
- Date: Tue, 7 Nov 2006 17:13:47 -0500 (EST)
- Subject: Re: [patch] Fix powerpc 64 alignment problem for lwa instruction
> On Nov 7, 2006, at 12:56 PM, Andrew Pinski wrote:
> >> We, at Apple, found a ppc64 code generation problem where lwa_operand
> >> routine didn't check the alignment of a memory operand being 32 bits
> >> aligned or not.
> >> The following patch has been tested on ppc MacOS with "make all", "--
> >> enable-languages=c,c++,objc,obj-c++", and regression tested with a
> >> top-
> >> level "make check-gcc" with no regression.
> >> gcc/ChangeLog:
> >> * gcc/config/rs6000/predicates.md (lwa_operand): Check the
> >> alignment of
> >> a memory operand is 32 bits aligned or not.
> > This is the wrong fix, the memory alignment is not the issue here
> > but the offset
> > field has to be multiple of 4. This is according to the ISA
> > documents.
> > Can you give more information about what is going wrong? Because as
> > far as I can
> > tell we check the offset to make sure it is a multiple of 4.
> > || GET_CODE (XEXP (XEXP (inner, 0), 1)) != CONST_INT
> > || INTVAL (XEXP (XEXP (inner, 0), 1)) % 4 == 0));
> > Thanks,
> > Andrew Pinski
> For the following memory operand,
> (gdb) p debug_rtx(op)
> (mem/s/j:SI (lo_sum:DI (reg:DI 121)
> (const:DI (minus:DI (symbol_ref:DI ("mybox") [flags 0x382]
> <var_decl 0x41687e80 mybox>)
> (symbol_ref:DI ("<pic base>") [flags 0x180])))) [0
> mybox.left+0 S4 A16])
> GET_CODE (XEXP (inner, 0)) == LO_SUM
> GET_CODE (XEXP (XEXP (inner, 0), 1)) == CONST
> The alignment is A16, i.e., 16 bits.
What about rejecting all low_sum instead? The alignment is still the correct
check. Because we can have alignment of 16 and still have an offset which is
a multiple of 4.
Hmm, the other thing is that constraint m seems wrong for the lwa instruction,
I think it should be changed to Y like the load doubleword case.