This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PR91598] Improve autoprefetcher heuristic in haifa-sched.c


Hi Maxim,
 
 > It appears that cores with autoprefetcher hardware prefer loads and stores bundled together, not interspersed with other instructions to occupy the rest of CPU units.
  
 I don't believe it is as simple as that - modern cores have multiple prefetchers but
 won't prefer bundling loads and stores in large blocks. That would result in terrible
 performance due to dispatch and issue stalls. Also the increased register pressure
 could cause extra spilling. If we group loads and stores, we'd definitely need to
 limit them to say 4 or so at most, and then interleave ALU operations.
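
 To make the grouping limit above concrete, here is a toy sketch in C. It is
 not code from haifa-sched.c and it ignores dependencies, latencies and
 register pressure entirely. The names MAX_MEM_GROUP and interleave_order are
 hypothetical, and the cap of 4 is just the "4 or so" figure from the
 paragraph above. It only shows the shape of an issue order in which memory
 operations ('M') are capped per group and an ALU operation ('A') is placed
 between groups.

```c
#include <stddef.h>

/* Hypothetical cap on back-to-back memory operations ("4 or so"). */
#define MAX_MEM_GROUP 4

/* Toy model only: write an issue order for n_mem memory ops ('M') and
   n_alu ALU ops ('A') into out, which must hold n_mem + n_alu + 1 bytes.
   Each pass emits up to MAX_MEM_GROUP memory ops followed by one ALU op
   if any remain.  If the ALU ops run out first, the remaining memory ops
   end up adjacent, so the cap is best effort.  Returns the string length. */
size_t interleave_order(size_t n_mem, size_t n_alu, char *out)
{
    size_t pos = 0;

    while (n_mem > 0 || n_alu > 0) {
        /* Emit one group of memory operations, capped at MAX_MEM_GROUP. */
        size_t group = n_mem < MAX_MEM_GROUP ? n_mem : MAX_MEM_GROUP;
        for (size_t i = 0; i < group; i++)
            out[pos++] = 'M';
        n_mem -= group;

        /* Break up the memory stream with one ALU operation, if available. */
        if (n_alu > 0) {
            out[pos++] = 'A';
            n_alu--;
        }
    }

    out[pos] = '\0';
    return pos;
}
```

 With 6 memory ops and 3 ALU ops this produces the order MMMMAMMAA. A real
 heuristic would also have to decide how many ALU operations to place between
 groups, which this sketch does not attempt.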
 
 > Autoprefetching heuristic is enabled only for cores that support it, and isn't active by default.
  
 It's enabled on most cores, including the default (generic). So we do have to be
 careful that this doesn't regress any other benchmarks or do worse on modern
 cores.
 
 Cheers,
 Wilco
  
     
