This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFA] Zen tuning part 9: Add support for scatter/gather in vectorizer costmodel


> > Those instructions seem similarly expensive in the Intel implementation.
> > http://users.atw.hu/instlatx64/GenuineIntel0050654_SkylakeXeon9_InstLatX64.txt
> > lists latencies ranging from 18 to 32 cycles.
> > 
> > Of course it may also be the case that the utility is measuring gathers incorrectly.
> > According to Agner's table Skylake has optimized gathers; they used to be
> > 12 to 34 uops on Haswell and are now 4 to 5.
> > > 
> > > > > Note the major source of imprecision in the cost model
> > > > > is from vec_perm because we lack the information of the
> > > > > permutation mask which means we can't distinguish between
> > > > > cross-lane and intra-lane permutes.
> > > > 
> > > > Besides that we lack information about what operation we do (addition
> > > > or division?) which may be useful to pass down, especially because we do
> > > > have relevant information handy in the x86_cost tables.  So I am thinking
> > > > of adding extra parameter to the hook telling the operation.
> > > 
> > > Not sure.  The costs are all supposed to be relative to scalar cost
> > > and I fear we get nearer to a GIGO syndrome when adding more information
> > > here ;)
> > 
> > Yep, however there is setup cost (like loads/stores) which comes into play
> > as well.  I will see how far I can get by making x86 costs more "realistic".
> 
> I think it should always count the cost of n scalar loads plus
> an overhead depending on the microarchitecture.  As you say we're
> not getting rid of any memory latencies (in the worst case).  From
> Agner I read Skylake optimized gathers down to the actual memory
> access cost, the overhead is basically well hidden.
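
For illustration only, the model described above (n scalar loads plus a
microarchitecture-dependent overhead) could be sketched as follows.  This
is not code from the patch or from GCC: the struct, its field names and
the function name are hypothetical, and any numbers plugged in are
placeholders rather than measurements.

```c
/* Hypothetical per-microarchitecture tuning parameters.  These names do
   not come from GCC's x86 cost tables; they are assumptions made for
   this sketch only.  Costs are in the same relative units as a scalar
   operation.  */
struct gather_cost_params
{
  int scalar_load_cost;  /* cost of one scalar load */
  int gather_overhead;   /* fixed extra cost of issuing the gather */
};

/* Cost of gathering NELTS elements: one scalar load per element plus a
   fixed overhead.  An overhead of 0 models the case where the gather
   costs no more than the memory accesses themselves.  */
int
gather_cost (const struct gather_cost_params *p, int nelts)
{
  return nelts * p->scalar_load_cost + p->gather_overhead;
}
```

With a placeholder scalar load cost of 4 and an overhead of 2, an
8-element gather would be costed at 8 * 4 + 2 = 34 under this sketch.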

Where did you find that? It does not quite seem to match the instruction latency table
above.

Honza
> 
> Richard.
> 
> -- 
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

