[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.
ubizjak at gmail dot com
gcc-bugzilla@gcc.gnu.org
Wed May 4 13:44:00 GMT 2016
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873
Uroš Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
--- Comment #22 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 38412
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38412&action=edit
Proposed patch
This patch moves all TARGET_SSE_PARTIAL_REG_DEPENDENCY FP conversion splitters
to a later split pass. The patch also substantially cleans up these and related
patterns.
The post-reload conversion splitters work as follows:
- process FP conversions for TARGET_USE_VECTOR_FP_CONVERTS in an early
post-reload splitter. This pass will rewrite FP conversions to vector insns and
is thus incompatible with the next two passes. AMDFAM10 processors depend on
this transformation.
- process FP conversions for TARGET_SPLIT_MEM_OPND_FOR_FP_CONVERTS in a
peephole2 pass. This will transform mem->reg insns to reg->reg insns, and these
insns can then be processed by the next pass. Some Intel processors depend on
this transformation.
- process FP conversions for TARGET_SSE_PARTIAL_REG_DEPENDENCY in a late
post-reload splitter, when allocated registers are stable. AMD and Intel
processors depend on this pass, so it is part of generic tuning.
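
For illustration, here is a minimal sketch of the kind of source code these
splitters act on. It is not taken from the patch or from 482.sphinx3; the
function names sum_as_double and widen_and_sum are hypothetical. On x86 with
scalar SSE math, the int->double and float->double conversions below are
typically emitted as cvtsi2sd and cvtss2sd, which write only the low part of
the destination XMM register and so carry a dependency on its previous
contents. The exact instructions depend on compiler version and tuning flags.

```c
/* Hypothetical example, not part of the patch.
   int -> double: typically cvtsi2sd on x86 with SSE2.  With
   TARGET_SSE_PARTIAL_REG_DEPENDENCY tuning, the compiler may clear the
   destination register first to break the false dependency.  */
double
sum_as_double (const int *v, int n)
{
  double acc = 0.0;
  for (int i = 0; i < n; i++)
    acc += (double) v[i];	/* conversion from a memory operand */
  return acc;
}

/* Hypothetical example, not part of the patch.
   float -> double: typically cvtss2sd.  With
   TARGET_SPLIT_MEM_OPND_FOR_FP_CONVERTS tuning, the memory operand may be
   loaded into a register first, giving a reg->reg conversion.  */
double
widen_and_sum (const float *v, int n)
{
  double acc = 0.0;
  for (int i = 0; i < n; i++)
    acc += (double) v[i];
  return acc;
}
```

Compiling such a file with -O2 -m32 -msse2 -mfpmath=sse -S and comparing the
output for different -mtune= values is one way to see which of the three
transformations was applied.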