This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] nvptx per-warp compiler-defined stacks (-msoft-stack)


On 05/24/16 17:29, Alexander Monakov wrote:
On Fri, 20 May 2016, Nathan Sidwell wrote:
ah,  that's much more understandable,  thanks.  Presumably this doesn't
support worker-single mode (in OpenACC parlance, I don't know what the OpenMP
version of that is?)

I don't see why you have concerns.

Because I want to understand it.

It seems like we should reject the combination of -msoft-stack -fopenacc?

Possibly; the doc text makes it explicit that the option is exposed only for
the purpose of testing the compiler, anyway.

It is always best to prevent the user doing something you don't recommend, rather than presume they'll be sensible.

+  else if (need_frameptr || cfun->machine->has_varadic ||
cfun->calls_alloca)
+    {
+      /* Maintain 64-bit stack alignment.  */

This block needs a more descriptive comment -- it appears to be doing a great
deal more than maintaining 64-bit stack alignment!

The comment is just for the line that follows, not the whole block.

It's the only comment in the block and right at the start.   Fix it.

+      int keep_align = BIGGEST_ALIGNMENT / BITS_PER_UNIT;
+      sz = ROUND_UP (sz, keep_align);
+      int bits = POINTER_SIZE;
+      fprintf (file, "\t.reg.u%d %%frame;\n", bits);
+      fprintf (file, "\t.reg.u32 %%fstmp0;\n");
+      fprintf (file, "\t.reg.u%d %%fstmp1;\n", bits);
+      fprintf (file, "\t.reg.u%d %%fstmp2;\n", bits);

Some of these register names appear to be long lived -- and referenced in
other functions.  It would be better to give those more descriptive names, or
even give them hard-regs.

That's just %fstmp2 (pointer into __nvptx_stacks) and %fstmp1 (previous stack
pointer that we need to restore). I can rename them to %ssloc and %ssold
(better names welcome), but I don't see a value in making them hard-regs --
there's no interface with the middle-end that would be interested in those.

Making them hard regs would elide the need to keep naming them in multiple places. Sure, you'll have to use some #define symbol to index the array, but then you'll get a build time error if there's a mismatch. You can also put a comment on that #define saying what it is in an obvious place.

The lack of comments means it's not clear to me what ssold is -- is it the caller's value of the slot?
%sspslot for the address of the __soft_stack array element.
%sspprev for the caller'svalue (if that is indeed what you mean by ssold)

You should  certainly  do so for those that are already hard regs (%frame &
%stack)

Sorry, do what? They are already hard regs, and have descriptive names.

Use the hard reg name array, not duplicated string constants.

-- is it more feasible to augment init_frame to initialize them?

I don't think so. The whole block could be moved to a separate function though.

That would be acceptable too.

+  bool using_softstack; /* Need to restore __nvptx_stacks[tid.y].  */

Comment should describe what the attribute is, not what it implies.  In this
case I think it's /*  Current function has   a soft stack frame.  */
>
> Yes; note it's false when current function is leaf, so the description should
> be more like "Current function has a soft stack frame that needs restoring".

Ok. so the name doesn't match the semantics. I suggest renaming it to 'restore_softstack', with a similarly adjusted comment.


Is this alloca related (UNSPEC_ALLOCA) or restore related (invoked in
restore_stack_block), or stack setting (as insn name suggests).  Things seem
inconsistently named.  Comments would be good.

OK, I'll add some in a respin. This is related to stack setting. I can add a
new UNSPEC for that (UNSPEC_SET_SOFTSTACK).

Thanks -- although integers are bounded in our computery world, it's not like there's a shortage here!

Only restore_stack_block seems defined, and not save_stack_block.  Why the
asymmetry?

Because no special handling is needed on saving. We need to update the pointer
in shared memory when stack pointer is moved (thus, in allocate_stack and
restore_stack_block), but when the stack pointer is copied to a save area we
don't need to do anything special.

Thanks -- lack of comments here really impedes understanding.

Are changes needed in nvptx's  crt0.s (or an auxilary file in libc or libgcc)
to make this work for testing?

Yes.

Please include that chunk in the next revision of this patch. But see the commit I just made. I intend to move the malloc stuff to newlib next, but that shouldn't affect the changes needed for this.


I saw some changes due to tests passing when they otherwise wouldn't (e.g.
changing an auto variable with an atomic op). There are also changes in either
direction due to PTX translation bugs in the CUDA driver.

It'd be good to know what that delta is.

nathan


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]