[PATCH v2] generate EH info for volatile asm statements (PR93981)

J.W. Jagersma jwjagersma@gmail.com
Fri Mar 13 16:40:09 GMT 2020


On 2020-03-13 14:31, Richard Sandiford wrote:
> "J.W. Jagersma" <jwjagersma@gmail.com> writes:
>> On 2020-03-12 10:59, Richard Sandiford wrote:
>>> The other case I mentioned was equivalent to:
>>>
>>>      int
>>>      test_mem2 ()
>>>      {
>>>        int i = 2;
>>>        try
>>>          {
>>>            asm volatile ("ud2; mov%z0 $1, %0" : "=m" (i));
>>>          }
>>>        catch (const illegal_opcode&)
>>>          {
>>>            if (i == 2) return 0;
>>>          }
>>>        return i;
>>>      }
>>>
>>> Is this test expected to pass?
>>
>> Good point.  Yes, this should pass, and I thought it did, but it seems
>> I was mistaken.  To fix that would require transforming "=m" into "+m"
>> as Segher suggested.
>>
>>> However, these "mem" tests are testing gimple register types, so they'll
>>> still be SSA names outside of the asm.  It would also be good to test what
>>> happens for non-register types, e.g. something like:
>>>
>>>      int
>>>      test_mem3 ()
>>>      {
>>>        int i[5] = { 1, 2, 3, 4, 5 };
>>>        try
>>>          {
>>>            asm volatile ("ud2; <insert asm here>" : "=m" (i));
>>>          }
>>>        catch (const illegal_opcode&)
>>>          {
>>>            if (i[0] == 1 && ...) return 0;
>>>          }
>>>        return ...;
>>>      }
>>>
>>> and similarly for ud2 after the store.
>>
>> I think I see what you mean.  Would such a test not also cover what the
>> current test_mem() function does?  If so then that could be removed.
> 
> I think it's better to have tests for both the is_gimple_reg_type case
> and the !is_gimple_reg_type case, so keeping test_mem sounds better.

Especially because the initial 'int i = 2;' is currently eliminated
for gimple register types that end up in memory, I see.

>> Also in my current patch I used: (tree-eh.c:2104)
>>
>>     if (tree_could_throw_p (opval)
>>         || !is_gimple_reg_type (TREE_TYPE (opval))
>>         || !is_gimple_reg (get_base_address (opval)))
>>
>> to determine the difference between a register and memory type.  Could
>> there potentially be a case where that identifies an operand as gimple
>> register type, but ends up compiled as a memory operand to an asm?
> 
> The constraints can support both registers and memory, e.g. via "rm"
> or "g".  I'm not sure off-hand what we do with those for gimple.

In gimplify_asm_expr (gimplify.c:6281), I see that:

    if (!allows_reg && allows_mem)
      mark_addressable (TREE_VALUE (link));

So TREE_ADDRESSABLE I think is a good predicate to distinguish between
pure memory constraints and everything else.  I have changed the above
to:

       if (!allows_reg && allows_mem)
-	mark_addressable (TREE_VALUE (link));
+	{
+	  mark_addressable (TREE_VALUE (link));
+	  /* If non-call exceptions are enabled, mark each memory output
+	     as inout, so that previous assignments are not eliminated.  */
+	  if (cfun->can_throw_non_call_exceptions)
+	    is_inout = true;
+	}

So that memory operands are converted to in+out.  Then in
lower_eh_constructs_2 we can use TREE_ADDRESSABLE to find non-memory
operands:

+		if (tree_could_throw_p (opval)
+		    || TREE_ADDRESSABLE (opval))
+		  continue;

(tree_could_throw_p may be superfluous here)
Then I have a new test case that forces an "=rm" into memory, and then
confirm that it behaves the same as "=r":

+int
+rm ()
+{
+  int i = 2;
+  try
+    {
+      asm volatile ("mov%z0 $1, %0; ud2" : "=rm" (i) :
+                    : "eax", "ebx", "ecx", "edx", "edi", "esi"
+#ifdef __x86_64__
+                    , "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"
+#endif
+                    );
+    }
+  catch (const illegal_opcode&)
+    {
+      if (i == 2) return 0;
+    }
+  return i;
+}

>>> It wasn't clear from my message above, but: I was mostly worried about
>>> requiring the asm to treat memory operands in a certain way when the
>>> exception is thrown.  IMO it would be better to say that the values of
>>> memory operands are undefined on the exception edge.
>>
>> I'm not sure about the semantic difference between undefined and
>> unspecified.  But gcc should not write to any memory after a throw,
>> because that write operation itself may have side effects.  Likewise
>> asm memory operands should not be redirected to a temporary for the
>> same reason, and also because gcc can't possibly know which parts of
>> an (offsettable) memory operand are written to.
> 
> This might not be what you mean, but for:
> 
> int
> f1 (void)
> {
>   int x = 1;
>   asm volatile ("": "=m" (x));
>   return x;
> }
> 
> struct s { int i[5]; };
> struct s
> f2 (void)
> {
>   struct s x = { 1, 2, 3, 4, 5 };
>   asm volatile ("": "=m" (x));
>   return x;
> }
> 
> we do delete "x = 1" for f1.   I think that's the expected behaviour.
> We don't yet delete the initialisation in f2, but I think in principle
> we could.
> 
> So the kind of contract I was advocating was:
> 
> - the compiler can't make any assumptions about what the asm does
>   or doesn't do to output operands when an exception is raised
> 
> - the source code can't make any assumption about the values bound
>   to output operands when an exception is raised
> 
> Thanks,
> Richard
> 

Right.  The compiler technically can't make any assumptions, but I
think it can at least provide consistent behavior.  The user can read
the asm, so they should be able to know exactly what happened when an
exception is thrown.


More information about the Gcc-patches mailing list