This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Patch to fix gcc.c-torture/compile/20010102-1.c on IA64 HP-UX


Luis Machado wrote:
On Fri, 2008-10-03 at 17:46 -0700, Andrew Pinski wrote:
On Fri, Oct 3, 2008 at 5:40 PM, Jeff Law <law@redhat.com> wrote:
I'd be hard pressed to see how this patch could cause any kind of
performance regression since it's merely propagating information to
the
backend that was getting lost in regrename and I don't see that the
ppc uses
REG_POINTER in its backend.
PPC does not use it but the way lwx is used is [base+index] which
should really be done that way, otherwise it will have some stalls in
some cases (on Power 6).

Thanks,
Andrew Pinski

After further investigation, there is some light to this problem.


There appears to be a bad interaction between REG_POINTER and RTL alias
analysis, as we see below.

The epilogue for these two functions from vortex (Mem_GetAddr,
Mem_GetWord) are the following, for both revisions:

Mem_GetAddr (revision 140615):

.L3:
        mr 3,29
        mr 4,30
        bl Ut_PrintErr
        lwz 7,36(1)
        lwz 26,8(1)
*        lis 8,Test@ha
        lwz 27,12(1)
        lwz 28,16(1)
        mtlr 7
        mr 0,3
        lwz 29,20(1)
        lwz 30,24(1)
        mr 3,0
        lwz 31,28(1)
*        stw 0,Test@l(8)
        addi 1,1,32
        blr

Mem_GetAddr (revision 140616):

.L3:
        mr 3,29
        mr 4,30
        bl Ut_PrintErr
*        lis 8,Test@ha
        mr 0,3
*        stw 0,Test@l(8)
        mr 3,0
        lwz 7,36(1)
        lwz 26,8(1)
        lwz 27,12(1)
        lwz 28,16(1)
        mtlr 7
        lwz 29,20(1)
        lwz 30,24(1)
        lwz 31,28(1)
        addi 1,1,32
        blr


Mem_GetWord (revisions 140615 and 140616):


.L8:
        lwz 5,0(31)
        li 0,1
        cmpwi 6,5,0
        bne- 6,.L9
        lwz 10,36(1)
        lwz 26,8(1)
*        lis 11,Test@ha
        mr 3,0
        lwz 27,12(1)
        lwz 28,16(1)
        mtlr 10
*        stw 0,Test@l(11)
        lwz 29,20(1)
        lwz 30,24(1)
        lwz 31,28(1)
        addi 1,1,32
        blr

Clearly, the code generation for Mem_GetWord is unaffected by the
changes from revision 140616.

The problem happens with Mem_GetAddr/revision 140616. We can see that
"lis 8,Test@ha" and "stw 0,Test@l(8)" are very close to each other,
leading to possible stalls in the pipeline, which doesn't happen with
the code generated by revision 140615.


The positioning of "lis" and "stw" we see in revision 140616 is due to a
number of dependencies created wrt the following lwz instructions, so
the "stw" needs to execute before that, avoiding the possibility to move
"stw" further down.

So we're left with the question of why these dependencies are being
created.

Digging further, revision 140616 enabled some registers to store a value
meaning its contents are, in fact, a pointer, and we access this data
with REG_POINTER(...).

Looking at this specific code inside alias.c:find_base_term:

        if (REG_P (tmp1) && REG_POINTER (tmp1))
          return find_base_term (tmp1);

        if (REG_P (tmp2) && REG_POINTER (tmp2))
          return find_base_term (tmp2);

Since up until revision 140615 we didn't store the specific information
mentioned above, we just flew by these conditions and returned either
tmp1 or tmp2, not going into the recursive call.

With revision 140616, we actually have the pointer-in-reg information,
and thus we go into the recursive call to either "find_base_term (tmp1)"
or "find_base_term (tmp2)".

This should work as expected, as it really does for Mem_GetWord, but for
Mem_GetAddr, the recursive call to "find_base_term (tmp1)" returns NULL.

This leads us back to "init_alias_analysis", where we actually fill up
the reg_base_value vector that is used, inside find_base_term, to return
the base term we want.

Keeping it short, the 8th entry in that vector, that should've contained
the correct base term, is blank due to the 8th register being defined
and set before the "stw" instruction in the epilogue. This specific
situation led to all the others, including the performance degradation.

This also relates to the -fno-strict-aliasing flag, which sets aliases
to 0 and we need to do extra work to sort out the conflicting pieces.
Without -fno-strict-aliasing, the code is generated correctly.

So, it seems the RTL alias analysis framework in GCC is not designed to
handle REG_POINTER" on hard registers. I would like to suggest that we
return to Peter Bergner's proposed solution.
Then it's the RTL alias analysis that needs to be fixed -- Steve's solution to regrename is the correct solution.

Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]