This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Re: Rewrite i386 string operation expansion


> On 11/29/06, Richard Guenther <richard.guenther@gmail.com> wrote:
> 
> >> > Also crashes for -march=pentium, -march=k6 and -march=nocona.
> >>
> >> Attached patch fixes this failure. Bootstrapped on i686-pc-linux-gnu.
> >>
> >> 2006-11-29  Uros Bizjak  <ubizjak@gmail.com>
> >>
> >>         config/i386/i386.c (decide_alg): For TARGET_INLINE_ALL_STRINGOPS
> >>         initialize alg to algs->size[0].alg.
> >>
> >> OK for mainline?
> >
> >I think using rep_prefix_1_byte in this case is better.  If we don't know
> >the size we should use a small sequence, not whatever would be used for
> >small sizes.
> 
> Actually, this patch just avoids the second assert in the loop, when
> the following is defined (pentium4_costs):
> 
>  {{libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}},
>   {libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}}},
>  {{libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}},
>   {libcall, {{256, rep_prefix_4_byte}, {-1, libcall}}}}
> 
> As you can see, for structures > 256 we proceed to a libcall. As the
> default alg is also a libcall, we crash.
> 
> If there is a better alternative, we still use it.
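
To make the failure mode in the quoted table concrete, here is a minimal sketch of how one such entry can be read. It is a toy model, not the code in config/i386/i386.c: the struct layout is simplified, and first_inline_alg and the no_stringop "nothing found" return value are hypothetical names used only for illustration. The entry itself is copied from the pentium4_costs excerpt above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the quoted cost-table entries (assumption: this is
   not GCC's actual definition, only the shape implied by the excerpt).  */
enum stringop_alg { no_stringop, libcall, rep_prefix_1_byte, rep_prefix_4_byte };

struct stringop_strategy
{
  int max;                 /* largest size this row covers; -1 means no limit */
  enum stringop_alg alg;   /* algorithm to use for sizes up to max */
};

struct stringop_algs
{
  enum stringop_alg unknown_size;     /* default when the size is not known */
  struct stringop_strategy size[2];
};

/* One pentium4_costs entry, as quoted in the message.  */
static const struct stringop_algs pentium4_entry =
  { libcall, { { 256, rep_prefix_4_byte }, { -1, libcall } } };

/* Hypothetical helper: return the algorithm of the first row that covers
   EXPECTED_SIZE and is not a libcall.  Returns no_stringop when every
   applicable row is a libcall, i.e. the table offers nothing to inline.  */
static enum stringop_alg
first_inline_alg (const struct stringop_algs *algs, int expected_size)
{
  assert (algs != NULL);
  for (size_t i = 0; i < 2; i++)
    {
      int max = algs->size[i].max;
      if ((max == -1 || expected_size <= max)
          && algs->size[i].alg != libcall)
        return algs->size[i].alg;
    }
  return no_stringop;
}
```

In this model, first_inline_alg (&pentium4_entry, 100) yields rep_prefix_4_byte, while a size above 256 yields no_stringop. Since unknown_size is also libcall, forced inlining has no inline algorithm to fall back on, which matches the situation described in the quote.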

Actually I would probably go for rep_prefix_4_byte/rep_prefix_8_byte
when not optimizing for size, and for rep_prefix_1_byte otherwise.  The
full-sized REP instruction is usually significantly faster, and it is
what the kernel folks do (and they are pretty much the main users of
full inlining here).
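
A minimal sketch of that fallback, under stated assumptions: inline_fallback_alg is a hypothetical helper, not a function in i386.c, and picking the 8-byte variant for 64-bit targets is my reading of "rep_prefix_4_byte/rep_prefix_8_byte" rather than something the message spells out.

```c
#include <stdbool.h>

/* Only the three algorithms named in the suggestion above.  */
enum stringop_alg { rep_prefix_1_byte, rep_prefix_4_byte, rep_prefix_8_byte };

/* Hypothetical helper: choose the inline algorithm to fall back on when
   inlining is forced and the cost table offers only a libcall.
   Assumption: the 8-byte REP form is used on 64-bit targets and the
   4-byte form otherwise.  */
static enum stringop_alg
inline_fallback_alg (bool optimize_size, bool target_64bit)
{
  if (optimize_size)
    return rep_prefix_1_byte;   /* shortest sequence */
  return target_64bit ? rep_prefix_8_byte : rep_prefix_4_byte;
}
```

The point of the split is that the byte-wise REP form needs the least setup code, while the word-sized forms move more data per iteration and so are preferred when speed matters.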

Thanks, and my apologies for this oversight!
Honza

