This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] [i386] Recompute the frame layout less often
On 05/19/17 05:17, Daniel Santos wrote:
> On 05/18/2017 08:37 AM, Bernd Edlinger wrote:
>> On 05/17/17 04:01, Daniel Santos wrote:
>>>> - if (ignore_outlined && cfun->machine->call_ms2sysv
>>>> - && in_hard_reg_set_p (stub_managed_regs, DImode, regno))
>>>> - return false;
>>>> + if (ignore_outlined && cfun->machine->call_ms2sysv)
>>>> + {
>>>> + /* Registers who's save & restore will be managed by stubs
>>>> called from
>>>> + pro/epilogue. */
>>>> + HARD_REG_SET stub_managed_regs;
>>>> + xlogue_layout::compute_stub_managed_regs (stub_managed_regs);
>>>>
>>>> + if (in_hard_reg_set_p (stub_managed_regs, DImode, regno))
>>>> + return false;
>>>> + }
>>>> +
>>>> if (crtl->drap_reg
>>>> && regno == REGNO (crtl->drap_reg)
>>>> && !cfun->machine->no_drap_save_restore)
>>> This makes no sense. The entire purpose of stub_managed_regs is to
>>> cache the result of xlogue_layout::compute_stub_managed_regs() and this
>>> would unnecessarily repeat that calculation for each time
>>> ix86_save_reg() is called. Since
>>> xlogue_layout::compute_stub_managed_regs() calls ix86_save_reg many
>>> times, this makes it even worse.Which registers are being saved
>>> out-of-line and inline MUST be known at the time the stack layout is
>>> determined. So stub_managed_regsshould either be left a TU static or
>>> just moved to struct machine_function.
>>>
>>> As an aside, I've noticed that xlogue_layout::compute_stub_managed_regs
>>> is calling ix86_save_reg (which isn't trivial) more often than it really
>>> has to, so I've refactored it.
>>>
>> Well, meanwhile I think the stub_managed_regs contain zero information
>> and need not be saved at all, because it can easily be reconstructed
>> from m->call_ms2sysv_extra_regs.
>>
>> See the attached new version. Daniel does it work for you?
>
> No, I'm not at all comfortable with you making so many seemingly
> unnecessary changes to this code. (Although I wish I got this much
> feedback during my RFCs! :) I can accept the changes to
> is/count_stub_managed_reg (with some caveats), but I do not at all see
> the rationale for changing m_stub_names to a static and adding another
> dimension for the object instance -- from an OO standpoint, that's just
> bad design. Can you please share your rationale for that?
>
Hmm, sorry about that ...
I just thought it would be nice to avoid the const-cast here.
This moved the m_stub_names from all 4 instances to one static
array s_stub_names. But looking at it again, I think the extra
dimension is not even necessary, because all instances share the
same data, so removing that extra dimension again will be fine.
> Incidentally, half of the space in that array is wasted and can be
> trimmed since a given instance of xlogue_layout either supports hard
> frame pointers or doesn't, I just never got around to splitting that
> apart. (The first three enum xlogue_stub values are for without HFP and
> the last three for with.) Also, if we wanted to further reduce the
> memory footprint of xlogue_layout objects, the offset field of struct
> reginfo could be changed to int, and if we really wanted to go nuts then
> 16 bits would work for both of its fields.
>
> So for is/count_stub_managed_reg(s), you are obviously much more
> familiar with gcc, its passes and the i386 backend, but my knowledge
> level makes me very uncomfortable having the result of
> xlogue_layout::is_stub_managed_reg() determined in a way that has the
> (apparent) potential to differ from from the state at the time the last
> call to ix86_compute_frame_layout() was made; I just don't understand
I fund it technically difficult to add a HARD_REG_SET to
struct machine_function, and tried to avoid the extra overhead of
calling ix86_save_reg so often, which you also felt uncomfortable with.
So, if you look at compute_stub_managed_regs I first saw that the
first loop can never terminate thru the "return 0", because the
registers in x86_64_ms_sysv_extra_clobbered_registers are guaranteed
to be clobbered. Then I saw that the bits in stub_managed_regs
are always added in the same sequence, thus the result depends only
on the number call_ms2sysv_extra_regs and hfp so everything is already
available in struct machine_function.
Thanks
Bernd.