This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/82399] New: [openacc, nvptx] Optimize complex reduction
- From: "vries at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 02 Oct 2017 11:53:17 +0000
- Subject: [Bug target/82399] New: [openacc, nvptx] Optimize complex reduction
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82399
Bug ID: 82399
Summary: [openacc, nvptx] Optimize complex reduction
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---
Currently reduction updates are implemented like this:
...
/* Emit a sequence to update a reduction accumulator at *PTR with the
   value held in VAR using operator OP.  Return the updated value.
   TODO: optimize for atomic ops and independent complex ops.  */
static tree
nvptx_reduction_update (location_t loc, gimple_stmt_iterator *gsi,
                        tree ptr, tree var, tree_code op)
{
  tree type = TREE_TYPE (var);
  tree size = TYPE_SIZE (type);
  if (size == TYPE_SIZE (unsigned_type_node)
      || size == TYPE_SIZE (long_long_unsigned_type_node))
    return nvptx_lockless_update (loc, gsi, ptr, var, op);
  else
    return nvptx_lockfull_update (loc, gsi, ptr, var, op);
}
...
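This means that, for instance, a complex long long addition ends up in
nvptx_lockfull_update: the type is twice the size of long long, so its
TYPE_SIZE matches neither of the two sizes tested above.
The real and imaginary parts of the addition are independent, so instead we
could call nvptx_lockless_update twice, once per part (as the TODO implies).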
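
To illustrate the idea outside the compiler, here is a minimal host-side
sketch in C11. It is not the nvptx implementation: complex_acc,
lockless_add_u64 and complex_reduction_add are hypothetical names, and the
compare-and-swap loop only stands in for the kind of sequence
nvptx_lockless_update is assumed to emit. Each 64-bit half of the
accumulator gets its own lockless update, so no lock is needed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a complex long long accumulator, stored as
   two independently addressable 64-bit halves.  */
typedef struct
{
  _Atomic uint64_t re;
  _Atomic uint64_t im;
} complex_acc;

/* Lockless update of a single 64-bit component: retry a compare-and-swap
   until no other thread has modified *PTR in between.  Unsigned
   arithmetic is used so that wraparound is well defined.  Returns the
   updated value.  */
static uint64_t
lockless_add_u64 (_Atomic uint64_t *ptr, uint64_t var)
{
  uint64_t expected = atomic_load (ptr);
  uint64_t desired;

  do
    desired = expected + var;
  while (!atomic_compare_exchange_weak (ptr, &expected, desired));

  return desired;
}

/* Complex addition as two independent lockless updates, one for the
   real part and one for the imaginary part.  */
static void
complex_reduction_add (complex_acc *acc, long long re, long long im)
{
  lockless_add_u64 (&acc->re, (uint64_t) re);
  lockless_add_u64 (&acc->im, (uint64_t) im);
}
```

Note that the pair of updates is not atomic as a whole: another thread may
briefly observe a new real part together with an old imaginary part. For an
addition reduction, where only the final accumulated value is read, that
should not matter.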