[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

lili.cui at intel dot com gcc-bugzilla@gcc.gnu.org
Tue Mar 29 06:48:30 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271

--- Comment #7 from cuilili <lili.cui at intel dot com> ---
Created attachment 52706
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52706&action=edit
Add a heuristic to eliminate redundant loads and stores in the inline pass.

Hi Richard,

Could you take a look? This is my first time adding code to the middle end, so
I would appreciate any advice. Thank you!

I added an INLINE_HINT_eliminate_load_and_store hint to the inline pass. When
the callee accesses memory that is local to the caller and passed in as a
parameter, and the access size is greater than the target threshold, the hint
is enabled. With the hint set, the inlining_insns_auto bound is enlarged. The
target hook is currently enabled only for x86.
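
As a rough illustration of the pattern the hint is aimed at (this is not taken
from the patch or from 538.imagick_r; the function names and the array size are
hypothetical), consider a caller that fills a local array and hands it to a
callee by address. Without inlining, the caller has to store the array to its
stack frame and the callee has to load it back. Once the callee is inlined, the
compiler can see both sides and may be able to drop those stores and loads.

```c
#include <stddef.h>

enum { N = 8 };  /* hypothetical size, chosen only for the illustration */

/* Callee: reads the caller's local buffer through a pointer parameter.
   Out of line, every w[i] is a load from the caller's stack frame.  */
static double
sum_weights (const double *w, size_t n)
{
  double s = 0.0;
  for (size_t i = 0; i < n; i++)
    s += w[i];
  return s;
}

/* Caller: fills a local array and passes it by address.  If sum_weights
   is not inlined, these stores must really reach memory; if it is
   inlined, the values can potentially stay in registers.  */
double
scaled_total (double scale)
{
  double weights[N];
  for (size_t i = 0; i < N; i++)
    weights[i] = scale * (double) (i + 1);
  return sum_weights (weights, N);
}
```

A callee this small would normally be inlined anyway; the case the hint
addresses is the same shape with a callee that is too large for the default
inlining_insns_auto bound.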

With the patch applied:
Icelake server: 538.imagick_r gets a 15.18% improvement for multicopy and a
40.78% improvement for single copy, with no measurable changes for other
benchmarks.

Cascade Lake: 538.imagick_r gets a 12.4% improvement for multicopy, with code
size increased by 0.4% and no measurable changes for other benchmarks.

Znver3 server: 538.imagick_r gets a 9.6% improvement for multicopy, with code
size increased by 0.5% and no measurable changes for other benchmarks.

