This is the mail archive of the
mailing list for the GCC project.
Re: [EXT] Re: GCC missing -flto optimizations? SPEC lbm benchmark
On Fri, 2019-02-15 at 17:48 +0800, Jun Ma wrote:
> ICC is doing much more than GCC in ipo, especially memory layout
> optimizations. See https://software.intel.com/en-us/node/522667.
> ICC is more aggressive in array transposition/structure splitting
> /field reordering. However, these optimizations have been removed
> from GCC long time ago.
> As for case lbm_r, IIRC a loop with memory access which stride is 20 is
> most time-consuming. ICC will optimize the array(maybe structure?)
> and vectorize the loop under ipo.
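
For readers following along, the stride-20 access pattern and the kind of
layout change described above can be sketched roughly as below. This is not
the lbm source: N_ENTRIES, sum_interleaved and sum_transposed are hypothetical
names, and the value 20 is simply the stride quoted above. The first loop
touches one double out of every 20, which is hard to vectorize profitably; the
second reads the same values from a transposed array with unit stride.

```c
#include <stddef.h>

/* Assumed number of values stored per cell (the stride mentioned above). */
enum { N_ENTRIES = 20 };

/* Interleaved layout: value k of cell i lives at grid[i * N_ENTRIES + k],
 * so consecutive iterations are N_ENTRIES doubles apart in memory. */
double sum_interleaved(const double *grid, size_t n_cells, size_t k)
{
    double s = 0.0;
    for (size_t i = 0; i < n_cells; i++)
        s += grid[i * N_ENTRIES + k];
    return s;
}

/* Transposed layout: value k of cell i lives at grid_t[k * n_cells + i],
 * so consecutive iterations read adjacent doubles (unit stride). */
double sum_transposed(const double *grid_t, size_t n_cells, size_t k)
{
    double s = 0.0;
    for (size_t i = 0; i < n_cells; i++)
        s += grid_t[k * n_cells + i];
    return s;
}
```

Both functions return the same sum for the same data; only the memory layout
differs. Whether a compiler performs such a transformation automatically
depends on what it can prove about every use of the array.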
Interesting. I tried using '-qno-opt-mem-layout-trans' on ICC
along with '-Ofast -ipo' and that had no effect on the speed. I also
tried '-no-vec' and that had no effect either. The only thing that
slowed down ICC was '-ip-no-inlining' or '-fno-inline'. I see that
'-Ofast -ipo' resulted in everything (except libc functions) getting
inlined into the main program when using ICC. GCC did not do that, but
if I forced it to by using the always_inline attribute, GCC could
inline everything into main the way ICC does. But that did not speed
up the GCC executable.
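
For reference, forcing inlining with the always_inline attribute looks
roughly like the sketch below. The functions relax and relax_n are made-up
stand-ins, not functions from the benchmark; the only point is the attribute
syntax that GCC accepts.

```c
/* Hypothetical helper standing in for one of the benchmark's functions.
 * always_inline asks GCC to inline every direct call to it, regardless of
 * the usual inlining heuristics. */
static inline __attribute__((always_inline)) double
relax(double f, double feq, double omega)
{
    return f - omega * (f - feq);
}

/* Hypothetical caller: each call to relax() below is expanded in place. */
double relax_n(double f, double feq, double omega, int steps)
{
    for (int i = 0; i < steps; i++)
        f = relax(f, feq, omega);
    return f;
}
```

For example, relax_n(1.0, 0.5, 0.5, 3) walks 1.0 -> 0.75 -> 0.625 -> 0.5625.
The attribute changes where the code ends up, not what it computes, which is
consistent with inlining alone not making the GCC executable faster.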