This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Fwd: Re: [cxx-mem-model] compare_exchange implementation



grr, reposting due to thunderbird apparently sneaking some HTML in and it being rejected... sigh.


On 10/19/2011 11:34 AM, Richard Henderson wrote:
!   weak = expand_expr (CALL_EXPR_ARG (exp, 3), NULL_RTX, ptr_mode,
! 		      EXPAND_NORMAL);
!
!   if (weak != const0_rtx&&  weak != const1_rtx)
!     {
!       error ("strong/weak parameter must be 0 or 1 for %<__atomic_compare_exchange%>");
!       return NULL_RTX;
!     }
Really? Are you even allowed to do that, like unto the variable memory model parameters?

well, they are explicitly 'compare_exchange_strong' and 'compare_exchange_weak' calls... so yes, they have 'hardcoded' one or the other :-)
we could alternatively do 2 separate builtins, but I didn't see the point. But Im ambivalent.
+       /* Emit the compare_and_swap.  */
         create_output_operand (&ops[0], target, QImode);
         create_output_operand (&ops[1], mem, mode);
!       create_output_operand (&ops[2], current_val, mode);
!       create_convert_operand_to (&ops[3], expected_val, mode, true);
!       create_convert_operand_to (&ops[4], desired, mode, true);
!       create_integer_operand (&ops[5], success);
!       expand_insn (icode, 6, ops);
The mem operand must use create_fixed_operand, as with all of the other atomic builtins.
I strongly suggest swapping op 1 and op2, so that all of the "real" outputs come first.
oh, right, my error on the output operand... I will shuffle.

I suggested on IRC that you examine the mode of the boolean output operand, and not
hard-code QImode.  Most targets will produce a word-sized boolean output that need not
be zero-extended.

I clearly misunderstood what you said then, I thought you said use QImode :-P. So, look at target and use that mode? target may not be set sometimes tho right? what do you use then? QImode?
!       emit_cmp_and_jump_insns (ops[0].value, const0_rtx, NE, const0_rtx,
! 			       QImode, 1, true_label);
!
!       /* if not successful copy expected_val into *expected and issue the
! 	 failure fence.  */
!       emit_move_insn (gen_rtx_MEM (mode, expected), current_val);
!       expand_builtin_mem_thread_fence (failure);
!
!       /* If no success fence is required, we're done. Otherwise jump around the
!          code for TRUE and emitthe fence.  */
!       if (true_label != done_label)
!         {
! 	  emit_jump_insn (gen_jump (done_label));
!
! 	  emit_label (true_label);
! 	  if (success != MEMMODEL_RELAXED&&  success != MEMMODEL_RELEASE)
! 	    expand_builtin_mem_thread_fence (success);
! 	}
Really?  We can do better than this.  In particular, by leaving it up to the target.
In the x86 case, cmpxchg is a full barrier.  Always.  No need for any followup barriers.

This is the tricky part. Actually, it is wrong but for a different reason. I don't think the CAS needs a model. If I understand my previous conversation with Lawrence, we need to know what mode the native CAS is. If its seq-cst, then we don't need to do anything else. If it is relaxed though, then additional fences are required. Worst case, if the native CAS is relaxed, the fences need to be issued as such to implement


compare_exchange (T* ptr, T* old, new, success, failure)

   memory_fence (success)         // release modes
   val = *old;
   res = relaxed_CAS (ptr, val, new)
   if (success)
      {
        memory_fence (success);   // acquire modes
        return true;
      }
   *old = res;
   memory_fence (failure)
   return false;


so leaving it up to the target for the CAS isnt enough.. we may need to issue one or more of these fences if the native CAS mode is weaker than the success and/or failure mode.


Or perhaps I misunderstood... Lawrence?

if instead it maps to

   val = *old;
   memory_fence (success)         // release modes
   res = relaxed_CAS (ptr, val, new)
   if (success_of_CAS)
      {
        memory_fence (success);   // acquire modes
        return true;
      }
   memory_fence (failure)
   *old = res;
   return false;

then the CAS could take care of issuing all the required fences without a problem. I'd be very happy with that :-)

The ll/sc targets are going to generate code that looks like

	start-barrier
  restart:
	ll	curval, mem
	mov	t, 0
	cmp	curval, oldval
	bne	fail
	mov	t, newval
	sc	t, mem
	beq	t, restart	// strong-version only
  fail:
	end-barrier

   // outputs: t = 1/0 for success/failure
   //          curval = the current value of the memory, to be stored back into expected

where the position of fail: before or after the end-barrier depends on the relative
strength of the memory models.  No one is going to want to insert an extra jump
around a barrier, especially for a not-expected failure path.

We should encode the two memory model parameters and the weak parameter into a single
CONST_INT operand.  Probably with some macros in optabs.h to extract those parameters
in the targets.


So if the CAS can handle it all, why does it matter if the pattern has a single "compressed" parameter for the 3 values, or whether it simply explicitly lists all three?


So you think its better not to have a seperate CAS_weak and CAS_strong pattern, but to unite them with a parameter? thats certainly easy enough...

Andrew





Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]