This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)
- From: Bernd Edlinger <bernd dot edlinger at hotmail dot de>
- To: Matthew Fortune <Matthew dot Fortune at imgtec dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Nick Clifton <nickc at redhat dot com>, "ebotcazou at libertysurf dot fr" <ebotcazou at libertysurf dot fr>, "rdsandiford at googlemail dot com" <rdsandiford at googlemail dot com>
- Date: Fri, 29 Jan 2016 17:54:16 +0000
- Subject: Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)
- Authentication-results: sourceware.org; auth=none
- Authentication-results: imgtec.com; dkim=none (message not signed) header.d=none;imgtec.com; dmarc=none action=none header.from=hotmail.de;
- References: <HE1PR07MB09054A3F1F796882634605CEE4C50 at HE1PR07MB0905 dot eurprd07 dot prod dot outlook dot com> <6D39441BF12EF246A7ABCE6654B023536A70509A at LEMAIL01 dot le dot imgtec dot org> <HE1PR07MB09058BE6154DB5AA7AA7395BE4C70 at HE1PR07MB0905 dot eurprd07 dot prod dot outlook dot com> <6D39441BF12EF246A7ABCE6654B023536A705EB0 at LEMAIL01 dot le dot imgtec dot org> <87y4bcavl5 dot fsf_-_ at googlemail dot com> <HE1PR07MB09059E183F56E4457D600825E4D90 at HE1PR07MB0905 dot eurprd07 dot prod dot outlook dot com> <87io2djqm9 dot fsf at googlemail dot com>
- Spamdiagnosticmetadata: NSPM
- Spamdiagnosticoutput: 1:23
On 28.01.2016 23:17, Richard Sandiford wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> On 26.01.2016 22:18, Richard Sandiford wrote:
>>> [cc-ing Eric as RTL maintainer]
>>>
>>> Matthew Fortune <Matthew.Fortune@imgtec.com> writes:
>>>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>>>>> Matthew Fortune <Matthew.Fortune@imgtec.com> writes:
>>>>>> Has the patch been tested beyond just building GCC? I can do a
>>>>>> test run for you if you don't have things set up to do one yourself.
>>>>>
>>>>> I built a cross-gcc with all languages and a cross-glibc, but I have
>>>>> not set up an emulation environment, so if you could give it a test
>>>>> that would be highly welcome.
>>>>
>>>> mipsel-linux-gnu test results are the same before and after this patch.
>>>>
>>>> Please go ahead and commit.
>>>
>>> I still object to this. And it feels like the patch was posted
>>> as though it was a new one in order to avoid answering the objections
>>> that were raised when it was last posted:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02218.html
>>>
>>
>> Richard, I am really sorry when you feel now like I did not take your
>> objections seriously. Let me first explain what happened from my point
>> of view:
>>
>> When I posted this response to your objections here:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00235.html
>>
>> I was waiting for your response, but nothing happened, so I kind of
>> forgot about the issue. In the meantime Ubuntu and Debian began to
>> roll out GCC-6 and they got stuck at the same issue, but instead of
>> waiting for pr69012 to be eventually resolved they created a duplicate
>> pr69129, and then last Thursday Nick applied my initial patch without
>> my intervention, probably because of the pressure they put on us.
>
> Ah, I'd missed that, sorry. It's not obvious from the email thread
> that the patch was actually approved. I just see a message from Matthias
> saying that it worked for him and then a message from Nick saying that
> he'd applied it.
>
> Ah well. I guess at some point I have to get over the fact that I'm no
> longer the MIPS maintainer :-)
>
>> That changed significantly how I looked at the issue after that point,
>> as there was no actual regression anymore, the revised patch still made
>> sense, but for another reason. When you look at the 20+ targets in the
>> gcc tree you'll see that almost all of them have a frame-layout
>> computation function and all except mips have a shortcut
>> "if (reload_completed) return;" in that function. And OTOH mips has
>> one of the most complicated frame layout functions of all targets.
>>
>> For all of these reasons I posted a new patch which tries to resolve
>> differences between mips and other targets inital_elimination_offset
>> functions.
>
> OK. But the point still stands that the patch is only useful because
> we're now calling mips_compute_frame_info in cases where we wouldn't
> previously, because of the rtx_addr_can_trap_p changes.
>
Yes of course, but it cannot hurt to have all targets behave
identical in such a central point. Even if at some point also the
implementation in rtx_addr_can_trap_p will eventually be improved in
one way or the other.
>> I still think that it is not OK for the mips target to do the frame
>> layout already in mips_frame_pointer_required because the frame layout
>> will change several times, until reload is completed, and that function
>> is only called right in the beginning.
>
> I don't think it's any better or worse than doing the frame layout in
> INITIAL_ELIMINATION_OFFSET (which is common practice and pretty much
> required). They're both part of the initial setup phase --
> targetm.frame_layout_required determines frame_pointer_needed, which is
> a vital input to the code that decides which eliminations to make.
>
Hmm... That can also be a difference between LRA and traditional reload.
Reload calls targetm.frame_pointer_required in update_eliminables, and
can apparently handle 0=>1 and 1=>0 transitions here.
While LRA calls frame_pointer_required only once in
ira_setup_eliminable_regset and can only handle 0=>1 transitions of
frame_pointer_needed in setup_can_eliminate, but not based on
targetm.frame_pointer_required but
targetm.can_eliminate(FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM).
At least it I read the source correctly, I could be wrong of course.
> And this is an example of us doing the kind of caching that I was
> suggesting. Code that wants to know whether the frame pointer is needed
> for the current function should use frame_pointer_needed. Only the code
> that sets up frame_pointer_needed should call frame_pointer_required
> directly.
>
But frame_pointer_needed adds at least some value to the raw
targetm.frame_pointer_required, while a cached result of
initial_elimination_offset would not, just conserve the current
interface exactly as it is.
>> And I think that it is not really well-designed to have a frame layout
>> function F(x,y)->z unless you assume F to be a trivial O(1) function
>> that has no significant computation overhead. When you assume F to be
>> an expensive O(N) or higher complexity you would design the interface
>> like compute_frame_layout_now() and
>> get_cached_initial_elimination_offset(x,y)->z.
>>
>> Also note, that with the current design, of initial_elimination_offset
>> the frame layout is already computed several times, because reload has
>> to call the function with different parameters, and there is no way how
>> to know when the cached value can be used and when not.
>
> Agreed -- the current interface is pretty poor. But that's even more
> reason not to add uses of it after LRA/reload. Caching would be both
> simpler and more efficient.
>
>> I do also think that having to cache the initial elimination offset
>> values in the back-end that are already cached in the target does not
>> improve anything. Then I would prefer to have a set of new target
>> callbacks that makes it easy to access the cached target values.
>
> I don't agree. Asking each backend to cache a particular value in
> its frame_info structure and then providing a hook to query that
> cached value means adding code to every target. And calling an
> indirect function is going to much less efficient than accessing
> a structure field.
>
> It seems better to have one piece of code (or maybe two, until
> reload goes away) that caches the information that the target-independent
> code needs.
>
It is for sure a matter of personal taste, nothing more.
And yes, it is do-able, and not even really complicated, but it would
not improve anything, just conserve the interface as it is (or as it was
before I spoiled it :).
I would rather like to improve something at this point. For instance
I think it would be good to split the initial_elimination_offset
function in an optional compute_initial_frame_layout function, doing
only the O(N) computation and leave the actual O(1) querying of the
result in the initial_elimination_offset function. If the target does
not implement the compute_initial_frame_layout the default just does
nothing, and the initial_elimination_offset function does not have
to change, but if the target implements the optional
compute_initial_frame_layout, it can rely on that function to be called
immediately before the initial_elimination_offset, which will then be
only an O(1) function. That would at least improve a tiny bit over the
current state of the initial_elimination_offset function.
Tanks
Bernd.
> I think that applies to other frame-related information too. At the
> moment there's no simple way for the postreload passes to tell which
> registers are available for use and which have to be left untouched,
> so we get complicated tests like:
>
> for (i = nregs - 1; i >= 0; --i)
> if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
> || fixed_regs[new_reg + i]
> || global_regs[new_reg + i]
> /* Can't use regs which aren't saved by the prologue. */
> || (! df_regs_ever_live_p (new_reg + i)
> && ! call_used_regs[new_reg + i])
> #ifdef LEAF_REGISTERS
> /* We can't use a non-leaf register if we're in a
> leaf function. */
> || (crtl->is_leaf
> && !LEAF_REGISTERS[new_reg + i])
> #endif
> || ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i))
> return false;
>
> This is an example of relying (to some extent) on querying the target
> code every time we want to know something, but surely it would be
> better to have LRA/reload create a single "usable registers" HARD_REG_SET.
> (I have an old patch series for that but never found time to clean it up
> and submit it.)
>
>> If you want to propose a patch for that I will not object, but I doubt
>> it will be suitable for Stage 4.
>
> Fair enough. Guess I just have to learn to live with it.
>
> Thanks,
> Richard
>