This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/82399] New: [openacc, nvptx] Optimize complex reduction


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82399

            Bug ID: 82399
           Summary: [openacc, nvptx] Optimize complex reduction
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Currently reduction updates are implemented like this:
...
/* Emit a sequence to update a reduction accumulator at *PTR with the
   value held in VAR using operator OP.  Return the updated value.

   TODO: optimize for atomic ops and independent complex ops.  */

static tree
nvptx_reduction_update (location_t loc, gimple_stmt_iterator *gsi,
                        tree ptr, tree var, tree_code op)
{
  tree type = TREE_TYPE (var);
  tree size = TYPE_SIZE (type);

  if (size == TYPE_SIZE (unsigned_type_node)
      || size == TYPE_SIZE (long_long_unsigned_type_node))
    return nvptx_lockless_update (loc, gsi, ptr, var, op);
  else
    return nvptx_lockfull_update (loc, gsi, ptr, var, op);
}
...

This means that, for instance, a complex long long addition takes the
nvptx_lockfull_update path, because the size of the complex type matches
neither unsigned nor unsigned long long.

The real and imaginary parts of the addition are independent, so instead we
could call nvptx_lockless_update twice, once per part (as the TODO implies).
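
To illustrate the idea outside the compiler, here is a minimal host-side
sketch in C11, not the nvptx backend code. It assumes the accumulator can be
treated as two separately addressable 64-bit halves; the names
complex_ll_acc, lockless_add and complex_add_update are hypothetical and do
not exist in GCC. Each half is updated with its own compare-and-swap retry
loop, so no lock is taken:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a complex long long accumulator: the real
   and imaginary parts are stored as two independent 64-bit atomics.  */
struct complex_ll_acc
{
  _Atomic long long re;
  _Atomic long long im;
};

/* Lockless add: retry a compare-and-swap until no other thread has
   modified *PTR between our read and our write.  Return the updated
   value.  */
static long long
lockless_add (_Atomic long long *ptr, long long var)
{
  long long expected = atomic_load (ptr);
  long long desired;

  do
    /* On CAS failure EXPECTED is refreshed with the current value,
       so DESIRED is recomputed from up-to-date data.  */
    desired = expected + var;
  while (!atomic_compare_exchange_weak (ptr, &expected, desired));

  return desired;
}

/* Complex addition as two independent lockless updates, one per part.  */
static void
complex_add_update (struct complex_ll_acc *acc,
                    long long var_re, long long var_im)
{
  assert (acc != NULL);
  lockless_add (&acc->re, var_re);
  lockless_add (&acc->im, var_im);
}
```

Note that this split only works for operators such as addition, where each
part of the result depends only on the matching parts of the operands; a
complex multiplication mixes both parts and cannot be split this way.
Another thread may also briefly observe one part updated before the other,
which should be harmless for a reduction where only the final value is read.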
