This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug middle-end/52285] [4.7/4.8 Regression] libgcrypt _gcry_burn_stack slowdown


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52285

--- Comment #12 from Steven Bosscher <steven at gcc dot gnu.org> 2012-11-13 23:37:52 UTC ---
Created attachment 28678
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28678
Gross hack

(In reply to comment #11)
> If loops are still around at LRA time, perhaps LRA should consider putting
> it before loop if register pressure is low, or LIM could just have extra
> code for this

Unfortunately, loops are destroyed _just_ before LRA, at the end of IRA.
IRA has its own loop tree but that is destroyed before LRA, too.


> I'm not saying it must be LIM, I'm
> just looking for suggestions where to perform this.

LIM may be too early. I've experimented with the attached patch (based off
some other patch for invariant addresses that was bit-rotting on a shelf)
and I had to resort to some crude hacks to make loop-invariant even just
consider moving the bare frame_pointer_rtx, like manually setting the cost
to something high because set_src_cost(frame_pointer_rtx)==0.  The result
is this code:

foo:
        leaq    -72(%rsp), %rcx
        leaq    -8(%rsp), %rdx     // A Pyrrhic victory...
        .p2align 4,,10
        .p2align 3
.L5:
        movq    %rcx, %rax
        .p2align 4,,10
        .p2align 3
.L3:
        movb    $0, (%rax)
        addq    $1, %rax
        cmpq    %rdx, %rax
        jne     .L3
        subl    $64, %edi
        testl   %edi, %edi
        jg      .L5
        rep ret
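
For readers who prefer C to assembly, the loop nest above can be read
roughly as the sketch below. This is only an illustration of the control
flow in the listing, not the libgcrypt source and not compiler output.
The name burn_sketch, the caller-provided buf (standing in for the
64-byte stack slot between -72(%rsp) and -8(%rsp)), and the returned pass
count are assumptions added so the sketch can be compiled and checked on
its own.

```c
/* Hypothetical C rendering of the generated loop nest shown above.
   `buf` stands in for the 64-byte stack slot; the real code addresses
   it relative to %rsp.  `passes` does not exist in the assembly and is
   only here so the behaviour can be observed. */
int burn_sketch(char *buf, int bytes)
{
    char *start = buf;        /* leaq -72(%rsp), %rcx : computed once, before the loops */
    char *end = buf + 64;     /* leaq -8(%rsp), %rdx  : computed once, before the loops */
    int passes = 0;

    do {                      /* .L5 */
        char *p = start;      /* movq %rcx, %rax : still runs on every outer pass */
        do {                  /* .L3 */
            *p = 0;           /* movb $0, (%rax) */
            p++;              /* addq $1, %rax   */
        } while (p != end);   /* cmpq %rdx, %rax ; jne .L3 */
        bytes -= 64;          /* subl $64, %edi  */
        passes++;
    } while (bytes > 0);      /* testl %edi, %edi ; jg .L5 */

    return passes;
}
```

Both address computations sit outside the loops, yet the inner loop still
clears the same 64 bytes one byte at a time on every outer pass.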


Need to think about this a bit more, perhaps postreload-gcse can be used
for this instead of LIM...

