This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [MIPS][LS2][4/5] Scheduling and tuning
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Maxim Kuvyrkov <maxim at codesourcery dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Zhang Le <r0bertz at gentoo dot org>, Eric Fisher <joefoxreal at gmail dot com>
- Date: Sat, 14 Jun 2008 09:20:01 +0100
- Subject: Re: [MIPS][LS2][4/5] Scheduling and tuning
- References: <4835A9B4.9000709@codesourcery.com> <4835B40C.5080308@codesourcery.com> <877idi636v.fsf@firetop.home> <4851284A.60106@codesourcery.com> <48515794.7050007@codesourcery.com> <87od668r5p.fsf@firetop.home> <4852B98D.2070502@codesourcery.com>
Maxim Kuvyrkov <maxim@codesourcery.com> writes:
> Richard Sandiford wrote:
>> FWIW, this is PR 35802. 'Fraid I still haven't had chance to look at it,
>> what with IRA, recog-related stuff and reviews.
>
> ...
>
>> We have two choices:
>>
>> - always use pseudo registers, rather than introducing uses of $3
>> from the outset
>>
>> - force the destination of tls_get_tp_<mode> to be $3 only.
>
> I don't see how first approach can be any better than second. We will
> allocate register $3 for all those pseudos in the end.

The first approach is better if it works because some passes can only
optimise things that can have pseudo destinations. E.g. after the patch,
gcse won't be able to optimise these patterns.

The ICE currently only occurs when a pass has made such a replacement,
presumably because it thought that the change was an improvement.
I imagine that the patch I posted prevents something that we thought
was an optimisation in your testcase.

That's the big drawback of the second approach. These instructions
are emulated on the vast majority of processors, so we're losing
the ability to optimise very expensive instructions.

But like I say, I'm not convinced the first approach is going to avoid
the ICE in all cases. We have a PR against a release branch, so in the
first instance, I think we need something that is safe over something
that leads to better optimisation.

A third alternative is to use ugly workarounds like:

  (define_insn "tls_get_tp_<mode>"
    [(set (match_operand:P 0 "register_operand" "=v,???d")
          (unspec:P [(const_int 0)]
                    UNSPEC_TLS_GET_TP))]
    "HAVE_AS_TLS && !TARGET_MIPS16"

and make the second alternative use a sequence like:

    move   $1,$3
    rdhwr  $3,$29
    move   %0,$3
    move   $3,$1

which is _usually_ going to be a win in cases where it avoids a further
rdhwr instruction at runtime. But it would lose otherwise.

This too should be safe, either on its own or in combination with
the first approach.
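
To make the register shuffle in that four-instruction sequence concrete,
here is a toy Python model of it. This is only a sketch, not GCC code:
the thread-pointer value, the register contents and the choice of $5 as
the register allocated for %0 are all made up for illustration, and it
assumes (as the sequence itself does) that $1 is free to be clobbered.

```python
# Toy model of the workaround sequence; registers are a dict of number -> value.
THREAD_POINTER = 0x70001000  # hypothetical value that rdhwr $29 would return


def run_second_alternative(regs, dest):
    """Emulate: move $1,$3; rdhwr $3,$29; move %0,$3; move $3,$1."""
    regs = dict(regs)          # work on a copy so the caller's state is untouched
    regs[1] = regs[3]          # move  $1,$3   save the live value of $3
    regs[3] = THREAD_POINTER   # rdhwr $3,$29  thread pointer lands in $3
    regs[dest] = regs[3]       # move  %0,$3   copy it to the allocated register
    regs[3] = regs[1]          # move  $3,$1   restore the original $3
    return regs


before = {1: 0, 3: 0xDEAD, 5: 0}
after = run_second_alternative(before, dest=5)
print(hex(after[5]), hex(after[3]), hex(after[1]))
```

In this model %0 ends up holding the thread pointer and $3 is unchanged
across the sequence, at the cost of three extra moves and a clobbered $1.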

But longer-term, I think we need to do something less hacky that
still allows the optimisations. I'm just not sure what yet. ;)

>> The second is probably the most conservative approach, since explicit
>> uses of $3 can occur through normal calls. The patch below does this.
>>
>> I admit I haven't verified any of this yet, so sorry if I'm off-ball.
>> But does the patch fix things?
>
> Looks like it does. I didn't run full regression testsuite, but glibc
> now builds.

Great, thanks. I'll think a bit more about this before installing anything.
You needn't wait before applying your patches.

Richard