This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][i386] Fix PR83008
> On December 8, 2017 4:51:16 PM GMT+01:00, Jan Hubicka <hubicka@ucw.cz> wrote:
> >>
> >> This restores the vec_construct cost dependence on the vector element
> >> count. Honza removed this (accidentially?) during the rework.
> >>
> >> Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for
> >trunk?
> >
> >Hmm, the false parameter to ix86_vec_cost is supposed to do that. It
> >uses:
> >
> >if (!parallel)
> >
> >return cost * GET_MODE_NUNITS (mode);
> >
> >
> >Why it doesn't work?
>
> I see. Not exactly the same. I guess we need to see why the costs favor a 16 element vector in it plus store over 16 scalar stores then...
Isn't it what you proposed as profitable for Martin Jambor's copy by pieces
issue? I think the reason is that stores are modeled as having latency (or
cost/2) of 4. Construction is modelled as simple sse op (latency 1)*number of
parts.
So in this simplified model we cummulate latency of 16 stores as 16*4,
while construction plus one store as 16*1+4.
Honza
>
> See the PR.
>
> >
> >Honza
> >>
> >> Thanks,
> >> Richard.
> >>
> >> 2017-12-08 Richard Biener <rguenther@suse.de>
> >>
> >> PR target/83008
> >> * config/i386/i386.c (ix86_builtin_vectorization_cost): Restore
> >> vec_construct dependence on vector element count.
> >>
> >> Index: gcc/config/i386/i386.c
> >> ===================================================================
> >> --- gcc/config/i386/i386.c (revision 255499)
> >> +++ gcc/config/i386/i386.c (working copy)
> >> @@ -44879,7 +44879,8 @@ ix86_builtin_vectorization_cost (enum ve
> >> ix86_cost->sse_op, true);
> >>
> >> case vec_construct:
> >> - return ix86_vec_cost (mode, ix86_cost->sse_op, false);
> >> + return (ix86_vec_cost (mode, ix86_cost->sse_op, false)
> >> + * (TYPE_VECTOR_SUBPARTS (vectype) - 1));
> >>
> >> default:
> >> gcc_unreachable ();