This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers
- From: "wschmidt at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 12 Aug 2016 17:30:10 +0000
- Subject: [Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers
- References: <bug-74585-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585
Bill Schmidt <wschmidt at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |rtl-optimization
            Summary|SRA forces parameters to    |powerpc64: Very poor code
                   |memory causing awful code   |generation for homogeneous
                   |generation                  |vector aggregates passed in
                   |                            |registers
--- Comment #7 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
One issue is in the expansion logic, specifically in assign_parm_setup_block in
function.c:

  /* If a BLKmode arrives in registers, copy it to a stack slot. ... */

The preceding logic handles single-register parameters being placed in a pseudo
without a stack slot, but all aggregates are forced to the stack no matter
what, so that's where

  (mem/c:BLK (plus:DI (reg/f:DI 150 virtual-stack-vars)
          (const_int 128 [0x80])) [0 A128])

comes from initially. I suspect this generally has to be this way, with
subsequent optimization expected to use the registers where possible and to
eliminate the stores once they go dead. The question is whether the optimizer
is smart enough to recognize that the BIT_FIELD_REFs correspond to vector
extract operations that can be performed on the incoming pseudo. I'll look
further.
Revising title/component accordingly...
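
For readers without the original test case at hand, here is a minimal sketch
of the kind of source that exercises this path. It is an illustration only,
not the reporter's test case: the names v4sf, struct hva, hva_lane and hva_sum
are hypothetical, and the code assumes the GCC vector extension
(__attribute__((vector_size))). Each member access on the by-value aggregate
is the sort of operation one would hope to see as a vector extract on the
incoming register rather than a load from a stack copy.

```c
#include <assert.h>

/* GCC/Clang vector extension: four 32-bit floats in one 16-byte vector. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical homogeneous vector aggregate: every member has the same
   vector type, so an ABI may choose to pass it in vector registers. */
struct hva {
    v4sf a;
    v4sf b;
};

/* Read one lane of one member of an aggregate passed by value.
   Ideally this needs no stack copy of the whole aggregate. */
float hva_lane(struct hva h, int member, int lane)
{
    assert(member >= 0 && member < 2);
    assert(lane >= 0 && lane < 4);
    v4sf v = (member == 0) ? h.a : h.b;
    return v[lane];
}

/* Element-wise sum of the two members; ideally a single vector add
   on the incoming registers. */
v4sf hva_sum(struct hva h)
{
    return h.a + h.b;
}
```

Compiling something like this with -O2 -S on the target of interest and
checking for stores of the incoming vector registers to the stack frame is one
way to see whether the copy described above survives optimization.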