[PATCH] Re: Rewrite i386 string operation expansion

Jan Hubicka jh@suse.cz
Thu Nov 30 01:17:00 GMT 2006


> On 11/29/06, Richard Guenther <richard.guenther@gmail.com> wrote:
> 
> >> > Also crashes for -march=pentium, -march=k6 and -march=nocona.
> >>
> >> Attached patch fixes this failure. Bootstrapped on i686-pc-linux-gnu.
> >>
> >> 2006-11-29  Uros Bizjak  <ubizjak@gmail.com>
> >>
> >>         config/i386/i386.c (decide_alg): For TARGET_INLINE_ALL_STRINGOPS
> >>         initialize alg to algs->size[0].alg.
> >>
> >> OK for mainline?
> >
> >I think using rep_prefix_1_byte in this case is better.  If we don't
> >know the size we should use a small sequence, not whatever would be
> >used for small sizes.
> 
> Actually, this patch just avoids the second assert in the loop, when
> following is defined (pentium4_costs):
> 
>  {{libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}},
>   {libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}}},
>  {{libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}},
>   {libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}}}
> 
> As you can see, for structures > 256 we proceed to a libcall. As the
> default alg is also a libcall, we crash.
> 
> If there is a better alternative, we still use it.

Actually, I would probably go for rep_prefix_4_byte/rep_prefix_8_byte
when not optimizing for size, and for rep_prefix_1_byte otherwise.  The
full-sized REP instruction is significantly faster in most cases, and
it is what the kernel folks do (and they are about the main users of
full inlining here).
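
Roughly, the fallback I have in mind could look like the sketch below.
This is only an illustration, not the actual change to decide_alg: the
helper name fallback_alg and its two boolean parameters are made up for
this mail, the enum is a cut-down stand-in that lists only the
algorithms named in this thread, and picking the 8-byte variant on
64-bit targets is my assumption about how the 4-byte/8-byte choice
would be made.

```c
#include <stdbool.h>

/* Cut-down stand-in for the algorithm enum; only the names that
   appear in this thread are listed (assumption, not the real type).  */
enum stringop_alg
{
  libcall,
  rep_prefix_1_byte,
  rep_prefix_4_byte,
  rep_prefix_8_byte
};

/* Hypothetical helper: pick the algorithm to fall back to when the
   size is unknown and a libcall is not allowed because all string
   operations must be inlined.  */
static enum stringop_alg
fallback_alg (bool optimize_size, bool target_64bit)
{
  /* When optimizing for size, take the shortest sequence.  */
  if (optimize_size)
    return rep_prefix_1_byte;

  /* Otherwise use the full-sized REP instruction.  Choosing the
     8-byte variant on 64-bit targets is an assumption.  */
  return target_64bit ? rep_prefix_8_byte : rep_prefix_4_byte;
}
```

With this, a 32-bit compile without -Os would end up with
rep_prefix_4_byte instead of hitting the assert on the libcall default.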

Thanks, and my apologies for this oversight!
Honza


