This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
On 13-10-19 4:30 PM, Jan Hubicka wrote:
> > Jan,
> > Does this seem reasonable to you?
>
> Oops, sorry, I missed your email. (I was travelling and I am finishing a paper now.)
>
> > Thanks,
> > Igor
> >
> > -----Original Message-----
> > From: Zamyatin, Igor
> > Sent: Tuesday, October 15, 2013 3:48 PM
> > To: Jan Hubicka
> > Subject: RE: Honnor ix86_accumulate_outgoing_args again
> >
> > Jan,
> >
> > Now we have the following prologue in, say, the phi0 routine in equake:
> >
> >   0x804aa90  push    %ebp
> >   0x804aa91  mov     %esp,%ebp
> >   0x804aa93  sub     $0x18,%esp
> >   0x804aa96  vmovsd  0x80ef7a8,%xmm0
> >   0x804aa9e  vmovsd  0x8(%ebp),%xmm1
> >   0x804aaa3  vcomisd %xmm1,%xmm0   <-- we see a big stall somewhere here or 1-2 instructions above
> >
> > While earlier it was:
> >
> >   0x804abd0  sub     $0x2c,%esp
> >   0x804abd3  vmovsd  0x30(%esp),%xmm1
> >   0x804abd9  vmovsd  0x80efcc8,%xmm0
> >   0x804abe1  vcomisd %xmm1,%xmm0
>
> Thanks for the analysis! It is a different benchmark than for Bulldozer, but apparently the same case. Again, we used to eliminate the frame pointer here, but IRS now doesn't.
>
> Do you see the same regression with -fno-omit-frame-pointer -maccumulate-outgoing-args?
>
> I suppose this is a conflict between the push instruction, which is handled by the stack engine, and the initialization of EBP, which isn't. That would explain why Bulldozer doesn't seem to care about this particular benchmark (its stack engine seems to have quite a different design).
>
> This is a bit of a sad situation: accumulate-outgoing-args is expensive code-size-wise, and it seems we don't really need esp with -mno-accumulate-outgoing-args. The non-accumulation code path was mistakenly disabled for too long ;(
>
> Vladimir, how much effort do you think it will be to fix the frame pointer elimination here?
My guess is a week. The problem is that I am busy and having some trouble with two small projects right now, which I'd like to include in gcc-4.9. But I think this can still be fixed in stage2, as it is a PR.
> I can imagine it is quite a tricky case. If so, I would suggest adding m_CORE_ALL to X86_TUNE_ACCUMULATE_OUTGOING_ARGS with a comment explaining the problem and mentioning the regression on equake on Core and mgrid on Bulldozer, and opening an enhancement request for this...
>
> I also wonder: if direct ESP use and push/pop instructions are causing such noticeable issues, can't we "shrink wrap" this into the red zone in 64-bit compilation? It seems that even with -maccumulate-outgoing-args, pushing the frame allocation as late as possible in the function would be a good idea, so that it is not close to the push/pop/call/ret.
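
For anyone who wants to compare the prologues discussed in this thread, a small test case along the following lines may help. It is only a sketch: `phi_like`, `scale`, and `threshold` are hypothetical stand-ins rather than the equake source, and whether a function this small shows the same frame-pointer behaviour or stall is an assumption that has to be checked on the compiler and CPU in question.

```c
/* Hypothetical stand-in for a routine shaped like equake's phi0:
   one double argument taken from the stack (32-bit ABI), one comparison
   against a global, and one outgoing call with stack arguments.

   Suggested comparison (file name repro.c is an assumption):
     gcc -m32 -O2 -mavx -S -mno-accumulate-outgoing-args repro.c -o no-acc.s
     gcc -m32 -O2 -mavx -S -maccumulate-outgoing-args    repro.c -o acc.s
     gcc -m32 -O2 -mavx -S -fno-omit-frame-pointer \
         -maccumulate-outgoing-args                      repro.c -o fp-acc.s
   Then compare the prologue of phi_like in the three .s files. */

double threshold = 0.6;   /* illustrative value only */

/* noinline keeps the call in place, so outgoing arguments are really
   passed and the argument-accumulation setting has something to act on. */
__attribute__((noinline))
static double scale(double x, double y)
{
    return x * y;
}

__attribute__((noinline))
double phi_like(double t)
{
    if (t <= threshold)        /* compare the argument with a global */
        return scale(t, 0.5);  /* outgoing call with two double arguments */
    return 1.0;
}
```

The third command corresponds to the -fno-omit-frame-pointer -maccumulate-outgoing-args combination asked about above; if the EBP setup alone is responsible for the stall, its prologue should look like the slow variant even with accumulated outgoing arguments.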