This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/85341] New: [nvptx] Implement atomic load


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85341

            Bug ID: 85341
           Summary: [nvptx] Implement atomic load
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

[ Follow-up PR of PR84041 - "[nvptx] Hang in for-3.c" ]

At the moment, the nvptx port does not define an atomic load insn. Consequently,
it goes through the fallback path in expand_atomic_load and ends up generating a
regular load insn combined with a membar.sys memory barrier.
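
For illustration, here is a rough C-level approximation of what that fallback
amounts to. This is only a sketch: the helper name fallback_atomic_load_i32 is
hypothetical, the real work happens at RTL level inside GCC, and the assumption
that a sequentially consistent load also gets a barrier before the load is based
on my reading of the generic expander, not on anything nvptx-specific.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC: approximates a "plain load plus
   memory barrier" expansion of an atomic load using the fence builtin.  */
static inline int32_t
fallback_atomic_load_i32 (const volatile int32_t *ptr, int memorder)
{
  int32_t value;

  /* Assumption: for seq_cst, a barrier is also emitted before the load.  */
  if (memorder == __ATOMIC_SEQ_CST)
    __atomic_thread_fence (__ATOMIC_SEQ_CST);

  /* The regular (non-atomic) load insn.  */
  value = *ptr;

  /* The barrier after the load; on nvptx this currently becomes membar.sys.  */
  __atomic_thread_fence (memorder);

  return value;
}
```

The point is that the barrier is independent of where ptr points, so the widest
scope has to be used.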

[ Context:

The __atomic_load_n builtin is defined as:
...
Built-in Function: type __atomic_load_n (type *ptr, int memorder)

    This built-in function implements an atomic load operation. It returns the
contents of *ptr.

    The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST,
__ATOMIC_ACQUIRE, and __ATOMIC_CONSUME.
...

The atomic_load insn pattern is described like this (with a local fix applied
for https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00517.html ):
...
‘atomic_loadmode’

    This pattern implements an atomic load operation with memory model
semantics. Operand 1 is the memory address being loaded from. Operand 0 is the
result of the load. Operand 2 is the memory model to be used for the load
operation.

    If not present, the __atomic_load built-in function will resort to a normal
load with memory barriers. 
...
]

If we defined an atomic_load insn pattern, we might be able to use the pointer
operand to deduce a reduced scope (GPU-wide or .cta) for the memory barrier.

Say we define memory spaces __global and __shared; then we could use a GPU-level
barrier (membar.gl in PTX) for __global and membar.cta for __shared.
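
A minimal sketch of that selection, assuming the port could classify the pointer
operand into one of three spaces. The enum and the function membar_for_space are
hypothetical names made up for this example; nothing like them exists in the
nvptx port today. Note that PTX spells the GPU-wide membar level ".gl".

```c
/* Hypothetical classification of the memory operand's address space.  */
enum mem_space_sketch
{
  SPACE_GENERIC,  /* Unknown or unannotated pointer.  */
  SPACE_GLOBAL,   /* Would correspond to a __global annotation.  */
  SPACE_SHARED    /* Would correspond to a __shared annotation.  */
};

/* Hypothetical helper: pick the narrowest barrier that still covers every
   thread able to observe memory in the given space.  */
static const char *
membar_for_space (enum mem_space_sketch space)
{
  switch (space)
    {
    case SPACE_SHARED:
      return "membar.cta";  /* Shared memory is only visible within a CTA.  */
    case SPACE_GLOBAL:
      return "membar.gl";   /* Global memory is visible across the GPU.  */
    default:
      return "membar.sys";  /* Unknown space: keep the current behaviour.  */
    }
}
```

Unannotated pointers fall through to membar.sys, so code without annotations
would keep the current behaviour.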

Of course, we'd have to annotate libgomp/config/nvptx with the appropriate
memory spaces; otherwise we would keep generating the same code there.
