This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 2/6] Andes nds32: machine description of nds32 porting (2).
- From: Chung-Ju Wu <jasonwucj at gmail dot com>
- To: gcc-patches <gcc-patches at gcc dot gnu dot org>, rdsandiford at googlemail dot com
- Date: Sun, 6 Oct 2013 21:24:40 +0800
- Subject: Re: [PATCH 2/6] Andes nds32: machine description of nds32 porting (2).
2013/10/6 Richard Sandiford <rdsandiford@googlemail.com>:
> Chung-Ju Wu <jasonwucj@gmail.com> writes:
>> On 10/2/13 1:31 AM, Richard Sandiford wrote:
>>> Chung-Ju Wu <jasonwucj@gmail.com> writes:
>>>> + /* Use $r15, if the value is NOT in the range of Is20,
>>>> + we must output "sethi + ori" directly since
>>>> + we may already passed the split stage. */
>>>> + return "sethi\t%0, hi20(%1)\;ori\t%0, %0, lo12(%1)";
>>>> + case 17:
>>>> + return "#";
>>>
>>> I don't really understand the comment for case 16. Returning "#"
>>> (like for case 17) forces a split even at the output stage.
>>>
>>> In this case it might not be worth forcing a split though, so I don't
>>> see any need to change the code. I think the comment should be changed
>>> to give a different reason though.
>>>
>>
>> Sorry for the misleading comment.
>>
>> For case 17, we were trying to split large constant into two individual
>> rtx patterns into "sethi" + "addi" so that we can have chance to match
>> "addi" pattern with 16-bit instruction.
>>
>> But case 16 is different.
>> This case is only produced at prologue/epilogue phase, using a temporary
>> register $r15 to hold a large constant for adjusting stack pointer.
>> Since prologue/epilogue is after split1/split2 phase, we can only
>> output "sethi" + "ori" directly.
>> (The "addi" instruction with $r15 is a 32-bit instruction.)
>
> But this code is in the output template of the define_insn. That code
> is only executed during final, after all passes have been run. If the
> template returns "#", final will split the instruction itself, which is
> possible even at that late stage. "#" doesn't have any effect on the
> passes themselves.
>
> (FWIW, there's also a split3 pass that runs after prologue/epilogue
> generation but before sched2.)
>
> However, ISTR there is/was a rule that prologue instructions shouldn't
> be split, since they'd lose their RTX_FRAME_RELATED_P bit or something.
> Maybe you hit an ICE because of that?
>
Ah... yes, you are right. In nds32_force_addi_stack_int(),
I move a large constant into a temporary register for the stack pointer adjustment:
+  /* $r15 is going to be temporary register to hold the value. */
+  tmp_reg = gen_rtx_REG (SImode, TA_REGNUM);
+
+  /* Create one more instruction to move value
+     into the temporary register. */
+  value_move_insn = emit_move_insn (tmp_reg, GEN_INT (full_value));
+
+  /* At prologue, we need to tell GCC that this is frame related insn,
+     so that we can consider this instruction to output debug information.
+     If full_value is NEGATIVE, it means this function
+     is invoked by expand_prologue. */
+  if (full_value < 0)
+    RTX_FRAME_RELATED_P (value_move_insn) = 1;
+
+  /* Create new 'add' rtx. */
+  sp_adjust_insn = gen_addsi3 (stack_pointer_rtx,
+                               stack_pointer_rtx,
+                               tmp_reg);
+  /* Emit rtx into insn list and receive its transformed insn rtx. */
+  sp_adjust_insn = emit_insn (sp_adjust_insn);
+
+  /* At prologue, we need to tell GCC that this is frame related insn,
+     so that we can consider this instruction to output debug information.
+     If full_value is NEGATIVE, it means this function
+     is invoked by expand_prologue. */
+  if (full_value < 0)
+    RTX_FRAME_RELATED_P (sp_adjust_insn) = 1;
If there is a rule against splitting instructions marked RTX_FRAME_RELATED_P,
I think that is why I hit an ICE (unrecognized insn) for
'value_move_insn'.
So my comment for case 16 is incorrect.
Thanks for clarifying it.
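For reference, here is a small standalone C model of what the
"sethi + ori" pair in case 16 computes. It is only a sketch and not
code from the patch: the helper names (hi20, lo12, sethi_ori, fits_is20)
are made up for illustration, and it assumes that hi20()/lo12() select
the upper 20 and lower 12 bits of a 32-bit value and that Is20 means a
signed 20-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: Is20 is a signed 20-bit immediate, -2^19 .. 2^19 - 1.  */
static int
fits_is20 (int32_t value)
{
  return value >= -(1 << 19) && value < (1 << 19);
}

/* Upper 20 bits of a 32-bit value (what hi20() is assumed to select).  */
static uint32_t
hi20 (uint32_t value)
{
  return value >> 12;
}

/* Lower 12 bits of a 32-bit value (what lo12() is assumed to select).  */
static uint32_t
lo12 (uint32_t value)
{
  return value & 0xfffu;
}

/* Model of "sethi rt, hi20(v)" followed by "ori rt, rt, lo12(v)".  */
static uint32_t
sethi_ori (uint32_t value)
{
  uint32_t rt = hi20 (value) << 12;   /* sethi: low 12 bits are zero */
  assert ((rt & 0xfffu) == 0);
  rt |= lo12 (value);                 /* ori: fill in the low 12 bits */
  return rt;
}
```

With these definitions, sethi_ori (v) returns v for any 32-bit v, so the
pair is only needed when fits_is20 (v) is false and a single move
immediate cannot hold the constant.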
> Another way to handle this would be to have the movsi expander split
> large constant moves. When can_create_pseudo_p (), the intermediate
> results can be stored in new registers, otherwise they should reuse
> operands[0]. Two advantages to doing it that way are that high parts
> can be shared before RA, and that calls to emit_move_insn from the
> prologue code will split the move automatically. I think many ports
> do it that way (including MIPS FWIW).
>
Do you mean that I should split large constants myself in the movsi
expander (or starting from movsi) for both case 16 and case 17?
Thanks for the suggestion. I'll try to implement it. :)
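To check my understanding of the suggestion, here is a toy standalone
C model (not GCC code, and not the eventual nds32 implementation).
All names in it (toy_insn, toy_split_const_move, toy_run, the register
numbering) are hypothetical. The can_create_pseudo flag stands in for
can_create_pseudo_p (): when it is set, the high part goes into a fresh
register; otherwise the destination is reused for the intermediate
result, as it would have to be for $r15 in the prologue.

```c
#include <assert.h>
#include <stdint.h>

/* One modelled instruction: dest = op (src, imm).  */
enum toy_op { TOY_SETHI, TOY_ORI };

struct toy_insn
{
  enum toy_op op;
  int dest;
  int src;        /* unused (-1) for TOY_SETHI */
  uint32_t imm;
};

/* Hypothetical numbering: registers 32 and above act as pseudos.  */
#define TOY_FIRST_PSEUDO 32
static int toy_next_pseudo = TOY_FIRST_PSEUDO;

/* Split "dest = value" into a high-part set and a low-part OR.
   Returns the number of instructions written to OUT.  */
static int
toy_split_const_move (struct toy_insn out[2], int dest, uint32_t value,
                      int can_create_pseudo)
{
  assert (dest >= 0);
  /* Before RA a fresh register keeps the high part shareable;
     afterwards the destination itself must hold it.  */
  int high = can_create_pseudo ? toy_next_pseudo++ : dest;

  out[0] = (struct toy_insn) { TOY_SETHI, high, -1, value >> 12 };
  out[1] = (struct toy_insn) { TOY_ORI, dest, high, value & 0xfffu };
  return 2;
}

/* Execute modelled instructions on a small register file.  */
static void
toy_run (const struct toy_insn *insns, int n, uint32_t *regs)
{
  for (int i = 0; i < n; i++)
    if (insns[i].op == TOY_SETHI)
      regs[insns[i].dest] = insns[i].imm << 12;
    else
      regs[insns[i].dest] = regs[insns[i].src] | insns[i].imm;
}
```

In this model, a call with can_create_pseudo == 0 and dest == 15
produces two instructions that both write register 15, which is the
shape I would expect emit_move_insn to give the prologue code once
movsi does the split itself.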
Best regards,
jasonwucj