This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, libcpp]: Use asm flag outputs in search_line_sse42 main loop
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 30 Jun 2015 08:46:27 +0200
- Subject: Re: [PATCH, libcpp]: Use asm flag outputs in search_line_sse42 main loop
- Authentication-results: sourceware.org; auth=none
- References: <CAFULd4bHyQQvSbocbs2ReW+7YonKq4ocJj_AmCwNQx9QT07V7w at mail dot gmail dot com> <559232FD dot 6020105 at redhat dot com>
On Tue, Jun 30, 2015 at 8:11 AM, Richard Henderson <rth@redhat.com> wrote:
> On 06/29/2015 08:07 PM, Uros Bizjak wrote:
>>
>> Index: lex.c
>> ===================================================================
>> --- lex.c (revision 225138)
>> +++ lex.c (working copy)
>> @@ -450,15 +450,30 @@ search_line_sse42 (const uchar *s, const uchar *en
>> s = (const uchar *)((si + 16) & -16);
>> }
>>
>> - /* Main loop, processing 16 bytes at a time. By doing the whole loop
>> - in inline assembly, we can make proper use of the flags set. */
>> - __asm ( "sub $16, %1\n"
>> - " .balign 16\n"
>> + /* Main loop, processing 16 bytes at a time. */
>> +#ifdef __GCC_ASM_FLAG_OUTPUTS__
>> + while (1)
>> + {
>> + char f;
>> + __asm ("%vpcmpestri\t$0, %2, %3"
>> + : "=c"(index), "=@ccc"(f)
>> + : "m"(*s), "x"(search), "a"(4), "d"(16));
>> + if (f)
>> + break;
>> +
>> + s += 16;
>> + }
>
>
> This change looks good. Modulo keeping a comment mentioning why we can't
> use the builtin.
OK, I'll say something like "By using inline assembly instead of the
builtin, we can use the result, as well as the flags set." here,
similar to the moved comment.
>> +#else
>> + s -= 16;
>> + /* By doing the whole loop in inline assembly,
>> + we can make proper use of the flags set. */
>> + __asm ( ".balign 16\n"
>> "0: add $16, %1\n"
>> - " %vpcmpestri $0, (%1), %2\n"
>> + " %vpcmpestri\t$0, (%1), %2\n"
>> " jnc 0b"
>> : "=&c"(index), "+r"(s)
>> : "x"(search), "a"(4), "d"(16));
>> +#endif
>
>
> I do wonder about keeping this bit around. Surely we only really care about
> the performance of search_line after a full bootstrap, at which point we've
> got the new path.
According to [1], " ... the instructions for building libgccjit
recommend --disable-bootstrap, ...". IMO, we could leave this part for
now, it is not that much of a maintenance burden.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66593#c0
Thanks,
Uros.