This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Patch to fix gcc.c-torture/compile/20010102-1.c on IA64 HP-UX


On Thu, 2008-10-16 at 12:45 -0600, Jeff Law wrote:
> Luis Machado wrote:
> > On Fri, 2008-10-03 at 17:46 -0700, Andrew Pinski wrote: 
> >   
> >> On Fri, Oct 3, 2008 at 5:40 PM, Jeff Law <law@redhat.com> wrote:
> >>     
> >>> I'd be hard pressed to see how this patch could cause any kind of
> >>> performance regression since it's merely propagating information to
> >>>       
> >> the
> >>     
> >>> backend that was getting lost in regrename and I don't see that the
> >>>       
> >> ppc uses
> >>     
> >>> REG_POINTER in its backend.
> >>>       
> >> PPC does not use it but the way lwx is used is [base+index] which
> >> should really be done that way, otherwise it will have some stalls in
> >> some cases (on Power 6).
> >>
> >> Thanks,
> >> Andrew Pinski
> >>     
> >
> > After further investigation, there is some light to this problem.
> >
> > There appears to be a bad interaction between REG_POINTER and RTL alias
> > analysis, as we see below.
> >
> > The epilogue for these two functions from vortex (Mem_GetAddr,
> > Mem_GetWord) are the following, for both revisions:
> >
> > Mem_GetAddr (revision 140615):
> >
> > .L3:
> >         mr 3,29
> >         mr 4,30
> >         bl Ut_PrintErr
> >         lwz 7,36(1)
> >         lwz 26,8(1)
> > *        lis 8,Test@ha
> >         lwz 27,12(1)
> >         lwz 28,16(1)
> >         mtlr 7
> >         mr 0,3
> >         lwz 29,20(1)
> >         lwz 30,24(1)
> >         mr 3,0
> >         lwz 31,28(1)
> > *        stw 0,Test@l(8)
> >         addi 1,1,32
> >         blr
> >
> > Mem_GetAddr (revision 140616):
> >
> > .L3:
> >         mr 3,29
> >         mr 4,30
> >         bl Ut_PrintErr
> > *        lis 8,Test@ha
> >         mr 0,3
> > *        stw 0,Test@l(8)
> >         mr 3,0
> >         lwz 7,36(1)
> >         lwz 26,8(1)
> >         lwz 27,12(1)
> >         lwz 28,16(1)
> >         mtlr 7
> >         lwz 29,20(1)
> >         lwz 30,24(1)
> >         lwz 31,28(1)
> >         addi 1,1,32
> >         blr
> >
> >
> > Mem_GetWord (revisions 140615 and 140616):
> >
> > .L8:
> >         lwz 5,0(31)
> >         li 0,1
> >         cmpwi 6,5,0
> >         bne- 6,.L9
> >         lwz 10,36(1)
> >         lwz 26,8(1)
> > *        lis 11,Test@ha
> >         mr 3,0
> >         lwz 27,12(1)
> >         lwz 28,16(1)
> >         mtlr 10
> > *        stw 0,Test@l(11)
> >         lwz 29,20(1)
> >         lwz 30,24(1)
> >         lwz 31,28(1)
> >         addi 1,1,32
> >         blr
> >
> > Clearly, the code generation for Mem_GetWord is unaffected by the
> > changes from revision 140616.
> >
> > The problem happens with Mem_GetAddr/revision 140616. We can see that
> > "lis 8,Test@ha" and "stw 0,Test@l(8)" are very close to each other,
> > leading to possible stalls in the pipeline, which doesn't happen with
> > the code generated by revision 140615. 
> >
> > The positioning of "lis" and "stw" we see in revision 140616 is due to a
> > number of dependencies created wrt the following lwz instructions, so
> > the "stw" needs to execute before that, avoiding the possibility to move
> > "stw" further down.
> >
> > So we're left with the question of why these dependencies are being
> > created.
> >
> > Digging further, revision 140616 enabled some registers to store a value
> > meaning its contents are, in fact, a pointer, and we access this data
> > with REG_POINTER(...).
> >
> > Looking at this specific code inside alias.c:find_base_term:
> >
> >         if (REG_P (tmp1) && REG_POINTER (tmp1))
> >           return find_base_term (tmp1);
> >
> >         if (REG_P (tmp2) && REG_POINTER (tmp2))
> >           return find_base_term (tmp2);
> >
> > Since up until revision 140615 we didn't store the specific information
> > mentioned above, we just flew by these conditions and returned either
> > tmp1 or tmp2, not going into the recursive call.
> >
> > With revision 140616, we actually have the pointer-in-reg information,
> > and thus we go into the recursive call to either "find_base_term (tmp1)"
> > or "find_base_term (tmp2)".
> >
> > This should work as expected, as it really does for Mem_GetWord, but for
> > Mem_GetAddr, the recursive call to "find_base_term (tmp1)" returns NULL.
> >
> > This leads us back to "init_alias_analysis", where we actually fill up
> > the reg_base_value vector that is used, inside find_base_term, to return
> > the base term we want.
> >
> > Keeping it short, the 8th entry in that vector, that should've contained
> > the correct base term, is blank due to the 8th register being defined
> > and set before the "stw" instruction in the epilogue. This specific
> > situation led to all the others, including the performance degradation.
> >
> > This also relates to the -fno-strict-aliasing flag, which sets aliases
> > to 0 and we need to do extra work to sort out the conflicting pieces.
> > Without -fno-strict-aliasing, the code is generated correctly.
> >
> > So, it seems the RTL alias analysis framework in GCC is not designed to
> > handle REG_POINTER" on hard registers. I would like to suggest that we
> > return to Peter Bergner's proposed solution.
> >   
> Then it's the RTL alias analysis that needs to be fixed   -- Steve's 
> solution to regrename is the correct solution. 
> 
> Jeff

Considering that the current RTL analysis code was not thought to handle
this scenario, wouldn't a proper fix to the RTL code together with the
REG_POINTER solution be appropriate?

Steve?

Regards,
Luis


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]