This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] PR target/70155: Use SSE for TImode load/store
- From: Ilya Enkovich <enkovich dot gnu at gmail dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>
- Date: Tue, 26 Apr 2016 19:20:30 +0300
- Subject: Re: [PATCH] PR target/70155: Use SSE for TImode load/store
- Authentication-results: sourceware.org; auth=none
- References: <20160425125145 dot GA5326 at intel dot com> <CAFULd4ZFMuMR0UVtxYL6teuO-CABSJv6QbqXhkU5DX5F_aGdYQ at mail dot gmail dot com> <CAMe9rOrzOZYGSwq9g9aj3xSjSHnGuBJh_rmFTai0V7eUPkaSTw at mail dot gmail dot com> <CAFULd4Z54Gv2GQcjz=qfmd0PSaON44zz-h7SWBvKLhTzgOOzKg at mail dot gmail dot com> <CAMe9rOqMxQruLKDakrO6s=gcUmg-QQJdABEUbOzu6=cKgMAOog at mail dot gmail dot com> <CAMbmDYZLf4FFcRZpxwUdy5C7LqTU08Xcy2ycAVb-_rzW4O_n5g at mail dot gmail dot com> <CAMe9rOps5ODg-2wNQAPJXo=dW1ewSjKzGuy7SjYYbybK5W0yVA at mail dot gmail dot com> <CAMbmDYbja8eX2+d_swDBB2-BpHKYhLH0u9Bd_8Qp6BOQ76NWmw at mail dot gmail dot com> <CAMe9rOrsDmLKVJf50mooU6fQn2OakruFHpSb3dubfNAEHYuCcg at mail dot gmail dot com> <CAMbmDYYBWiivahcD62FkT0TtzPorrc891ygsdw8Yd4VsHTb8Eg at mail dot gmail dot com> <CAMe9rOojZ=xRTybL-bTs+J+Ha5gkfwHTDMKa_3_N75gdtJikAw at mail dot gmail dot com> <CAMbmDYbhD_F1YbgA4sPn9H7NVycmfEE3_jL9UrpzwQp0xhNcfw at mail dot gmail dot com> <CAMe9rOonURjJKzjBMy=tAzzWtew0O=cLvVyCR1T-0G_6uO78cQ at mail dot gmail dot com> <CAMbmDYZ=XXhDkOeGCK36hWbsz9wHNLDJAd_F9KGqOkoNzhmR5g at mail dot gmail dot com> <CAMe9rOp6w1f5MpNFe9SoNnAsohqS2dAtFsQuBNjpPKC2x0hqJg at mail dot gmail dot com>
2016-04-26 19:12 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
> On Tue, Apr 26, 2016 at 9:07 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>> 2016-04-26 18:39 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>> On Tue, Apr 26, 2016 at 8:21 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>>> 2016-04-26 18:12 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>>>> On Tue, Apr 26, 2016 at 8:05 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>>>>> 2016-04-26 17:55 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>>>>>> On Tue, Apr 26, 2016 at 7:15 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>>>>>>> 2016-04-26 17:07 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>>>>>>>> On Mon, Apr 25, 2016 at 9:13 AM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>>>>>>>>>> 2016-04-25 18:27 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>> Ilya, can you take a look?
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> H.J.
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Algorithmic part of the patch looks OK to me except the following piece of code.
>>>>>>>>>>
>>>>>>>>>> +/* Check REF's chain to add new insns into a queue
>>>>>>>>>> + and find registers requiring conversion. */
>>>>>>>>>>
>>>>>>>>>> Comment is wrong because you don't have any conversions required for
>>>>>>>>>> your candidates.
>>>>>>>>>
>>>>>>>>> I will fix it.
>>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> +void
>>>>>>>>>> +scalar_chain_64::analyze_register_chain (bitmap candidates, df_ref ref)
>>>>>>>>>> +{
>>>>>>>>>> + df_link *chain;
>>>>>>>>>> +
>>>>>>>>>> + gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref))
>>>>>>>>>> + || bitmap_bit_p (candidates, DF_REF_INSN_UID (ref)));
>>>>>>>>>> + add_to_queue (DF_REF_INSN_UID (ref));
>>>>>>>>>> +
>>>>>>>>>> + for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
>>>>>>>>>> + {
>>>>>>>>>> + unsigned uid = DF_REF_INSN_UID (chain->ref);
>>>>>>>>>> +
>>>>>>>>>> + if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref)))
>>>>>>>>>> + continue;
>>>>>>>>>> +
>>>>>>>>>> + if (!DF_REF_REG_MEM_P (chain->ref))
>>>>>>>>>> + continue;
>>>>>>>>>>
>>>>>>>>>> I believe here you wrongly jump to the next ref intead of actually adding it
>>>>>>>>>> to a queue. You may just use
>>>>>>>>>>
>>>>>>>>>> gcc_assert (!DF_REF_REG_MEM_P (chain->ref));
>>>>>>>>>>
>>>>>>>>>> because you should'n have a candidate used in address operand.
>>>>>>>>>
>>>>>>>>> I will update.
>>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> + if (bitmap_bit_p (insns, uid))
>>>>>>>>>> + continue;
>>>>>>>>>> +
>>>>>>>>>> + if (bitmap_bit_p (candidates, uid))
>>>>>>>>>> + add_to_queue (uid);
>>>>>>>>>>
>>>>>>>>>> Probably gcc_assert (bitmap_bit_p (candidates, uid)) since no uses and defs
>>>>>>>>>> out of candidates list are allowed?
>>>>>>>>>
>>>>>>>>> That would be wrong since there are
>>>>>>>>>
>>>>>>>>> while (!bitmap_empty_p (queue))
>>>>>>>>> {
>>>>>>>>> insn_uid = bitmap_first_set_bit (queue);
>>>>>>>>> bitmap_clear_bit (queue, insn_uid);
>>>>>>>>> bitmap_clear_bit (candidates, insn_uid);
>>>>>>>>> add_insn (candidates, insn_uid);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> An instruction is a candidate and the bit is cleared when
>>>>>>>>> analyze_register_chain is called.
>>>>>>>>
>>>>>>>> You clear candidates bit but the first thing you do in add_insn is set
>>>>>>>> insns bit.
>>>>>>>> Thus you should hit:
>>>>>>>>
>>>>>>>> if (bitmap_bit_p (insns, uid))
>>>>>>>> continue;
>>>>>>>>
>>>>>>>> For handled candidates.
>>>>>>>>
>>>>>>>> Probably it would be more clear if we keep this clear/set pair
>>>>>>>> together? E.g. move
>>>>>>>> bitmap_clear_bit (candidates, insn_uid) to scalar_chain::add_insn.
>>>>>>>>
>>>>>>>
>>>>>>> After we started processing candidates, we only use candidates
>>>>>>> to check if an instruction is a candidate, not to check if an
>>>>>>> instruction is NOT a candidate.
>>>>>>
>>>>>> I don't see how it's related to what I said. My point is that
>>>>>> when you analyze added insn you shouldn't reach insns which are both
>>>>>> not in candidates and not in current scalar_chain_64. That's why I
>>>>>> think you miss an assert in scalar_chain_64::analyze_register_chain.
>>>>>
>>>>> Since all candidates will be processed by
>>>>>
>>>>> while (!bitmap_empty_p (queue))
>>>>> {
>>>>> insn_uid = bitmap_first_set_bit (queue);
>>>>> bitmap_clear_bit (queue, insn_uid);
>>>>> bitmap_clear_bit (candidates, insn_uid);
>>>>> add_insn (candidates, insn_uid);
>>>>> }
>>>>
>>>> This process only queue, not all candidates. analyze_register_chain
>>>> fills queue bitmap to build a chain.
>>>>
>>>>>
>>>>> I will change to
>>>>>
>>>>> /* Check REF's chain to add new insns into a queue. */
>>>>>
>>>>> void
>>>>> timode_scalar_chain::analyze_register_chain (bitmap candidates,
>>>>> df_ref ref)
>>>>> {
>>>>> gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref))
>>>>> || bitmap_bit_p (candidates, DF_REF_INSN_UID (ref)));
>>>>> add_to_queue (DF_REF_INSN_UID (ref));
>>>>> }
>>>>>
>>>>
>>>> The purpose of analyze_register_chain is to collect all register defs
>>>> or uses into chain's queue. You can't just remove a loop over ref's
>>>> chain then.
>>>
>>> That is true for DImode conversion. For TImode conversion,
>>> there are no register conversions required:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01394.html
>>>
>>> All it does it to call add_to_queue.
>>
>> No conversion means no mark_dual_mode_def calls in analyze_register_chain
>> but it still may call add_to_queue multiple times in its loop. E.g.
>>
>> 1: r1 = 1
>> ...
>> 5: r1 = 2
>> ..
>> 10: r2 = r1
>>
>> When you add #10 into a chain you should add both #1 and #5 into a queue.
>> That's what analyze_register_chain is supposed to do.
>
> Since both i1 and i5 are candidates, they are added to queue.
By who? They will just go into another chain.
Thanks,
Ilya
>
> --
> H.J.