This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Patch to fix gcc.c-torture/compile/20010102-1.c on IA64 HP-UX
- From: Luis Machado <luisgpm at linux dot vnet dot ibm dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: Andrew Pinski <pinskia at gmail dot com>, sje at cup dot hp dot com, Richard Henderson <rth at redhat dot com>, Peter Bergner <bergner at vnet dot ibm dot com>, gcc-patches at gcc dot gnu dot org, paolo dot carlini at oracle dot com
- Date: Thu, 30 Oct 2008 10:41:28 -0200
- Subject: Re: Patch to fix gcc.c-torture/compile/20010102-1.c on IA64 HP-UX
- References: <200809161651.m8GGpEM19079@lucas.cup.hp.com> <1221596846.17787.28.camel@hpsje.cup.hp.com> <48D0248E.5030505@redhat.com> <1221601143.17787.40.camel@hpsje.cup.hp.com> <48D2B834.7040701@redhat.com> <1221847674.4972.27.camel@hpsje.cup.hp.com> <48D953DC.6080200@redhat.com> <1222202669.19545.1.camel@hpsje.cup.hp.com> <1223062006.612.13.camel@gargoyle> <48E6BB7D.8000809@redhat.com> <de8d50360810031746j7137c6a4gda8d5b8f437cccd2@mail.gmail.com> <1224182072.8846.94.camel@gargoyle> <48F78BD3.5080907@redhat.com>
- Reply-to: luisgpm at linux dot vnet dot ibm dot com
On Thu, 2008-10-16 at 12:45 -0600, Jeff Law wrote:
> Luis Machado wrote:
> > On Fri, 2008-10-03 at 17:46 -0700, Andrew Pinski wrote:
> >
> >> On Fri, Oct 3, 2008 at 5:40 PM, Jeff Law <law@redhat.com> wrote:
> >>
> >>> I'd be hard pressed to see how this patch could cause any kind of
> >>> performance regression since it's merely propagating information to
> >>>
> >> the
> >>
> >>> backend that was getting lost in regrename and I don't see that the
> >>>
> >> ppc uses
> >>
> >>> REG_POINTER in its backend.
> >>>
> >> PPC does not use it but the way lwx is used is [base+index] which
> >> should really be done that way, otherwise it will have some stalls in
> >> some cases (on Power 6).
> >>
> >> Thanks,
> >> Andrew Pinski
> >>
> >
> > After further investigation, there is some light to this problem.
> >
> > There appears to be a bad interaction between REG_POINTER and RTL alias
> > analysis, as we see below.
> >
> > The epilogue for these two functions from vortex (Mem_GetAddr,
> > Mem_GetWord) are the following, for both revisions:
> >
> > Mem_GetAddr (revision 140615):
> >
> > .L3:
> > mr 3,29
> > mr 4,30
> > bl Ut_PrintErr
> > lwz 7,36(1)
> > lwz 26,8(1)
> > * lis 8,Test@ha
> > lwz 27,12(1)
> > lwz 28,16(1)
> > mtlr 7
> > mr 0,3
> > lwz 29,20(1)
> > lwz 30,24(1)
> > mr 3,0
> > lwz 31,28(1)
> > * stw 0,Test@l(8)
> > addi 1,1,32
> > blr
> >
> > Mem_GetAddr (revision 140616):
> >
> > .L3:
> > mr 3,29
> > mr 4,30
> > bl Ut_PrintErr
> > * lis 8,Test@ha
> > mr 0,3
> > * stw 0,Test@l(8)
> > mr 3,0
> > lwz 7,36(1)
> > lwz 26,8(1)
> > lwz 27,12(1)
> > lwz 28,16(1)
> > mtlr 7
> > lwz 29,20(1)
> > lwz 30,24(1)
> > lwz 31,28(1)
> > addi 1,1,32
> > blr
> >
> >
> > Mem_GetWord (revisions 140615 and 140616):
> >
> > .L8:
> > lwz 5,0(31)
> > li 0,1
> > cmpwi 6,5,0
> > bne- 6,.L9
> > lwz 10,36(1)
> > lwz 26,8(1)
> > * lis 11,Test@ha
> > mr 3,0
> > lwz 27,12(1)
> > lwz 28,16(1)
> > mtlr 10
> > * stw 0,Test@l(11)
> > lwz 29,20(1)
> > lwz 30,24(1)
> > lwz 31,28(1)
> > addi 1,1,32
> > blr
> >
> > Clearly, the code generation for Mem_GetWord is unaffected by the
> > changes from revision 140616.
> >
> > The problem happens with Mem_GetAddr/revision 140616. We can see that
> > "lis 8,Test@ha" and "stw 0,Test@l(8)" are very close to each other,
> > leading to possible stalls in the pipeline, which doesn't happen with
> > the code generated by revision 140615.
> >
> > The positioning of "lis" and "stw" we see in revision 140616 is due to a
> > number of dependencies created wrt the following lwz instructions, so
> > the "stw" needs to execute before that, avoiding the possibility to move
> > "stw" further down.
> >
> > So we're left with the question of why these dependencies are being
> > created.
> >
> > Digging further, revision 140616 enabled some registers to store a value
> > meaning its contents are, in fact, a pointer, and we access this data
> > with REG_POINTER(...).
> >
> > Looking at this specific code inside alias.c:find_base_term:
> >
> > if (REG_P (tmp1) && REG_POINTER (tmp1))
> > return find_base_term (tmp1);
> >
> > if (REG_P (tmp2) && REG_POINTER (tmp2))
> > return find_base_term (tmp2);
> >
> > Since up until revision 140615 we didn't store the specific information
> > mentioned above, we just flew by these conditions and returned either
> > tmp1 or tmp2, not going into the recursive call.
> >
> > With revision 140616, we actually have the pointer-in-reg information,
> > and thus we go into the recursive call to either "find_base_term (tmp1)"
> > or "find_base_term (tmp2)".
> >
> > This should work as expected, as it really does for Mem_GetWord, but for
> > Mem_GetAddr, the recursive call to "find_base_term (tmp1)" returns NULL.
> >
> > This leads us back to "init_alias_analysis", where we actually fill up
> > the reg_base_value vector that is used, inside find_base_term, to return
> > the base term we want.
> >
> > Keeping it short, the 8th entry in that vector, that should've contained
> > the correct base term, is blank due to the 8th register being defined
> > and set before the "stw" instruction in the epilogue. This specific
> > situation led to all the others, including the performance degradation.
> >
> > This also relates to the -fno-strict-aliasing flag, which sets aliases
> > to 0 and we need to do extra work to sort out the conflicting pieces.
> > Without -fno-strict-aliasing, the code is generated correctly.
> >
> > So, it seems the RTL alias analysis framework in GCC is not designed to
> > handle REG_POINTER" on hard registers. I would like to suggest that we
> > return to Peter Bergner's proposed solution.
> >
> Then it's the RTL alias analysis that needs to be fixed -- Steve's
> solution to regrename is the correct solution.
>
> Jeff
Considering that the current RTL analysis code was not thought to handle
this scenario, wouldn't a proper fix to the RTL code together with the
REG_POINTER solution be appropriate?
Steve?
Regards,
Luis