[PATCH 2/6] RISC-V Port: gcc
Richard Henderson
rth@redhat.com
Tue Jan 31 18:35:00 GMT 2017
On 01/30/2017 05:10 PM, Andrew Waterman wrote:
>>> +(define_expand "clear_cache"
>>> + [(match_operand 0 "pmode_register_operand")
>>> + (match_operand 1 "pmode_register_operand")]
>>> + ""
>>> + "
>>> +{
>>> + emit_insn (gen_fence_i ());
>>> + DONE;
>>> +}")
>>
>>
>> Do you need a FENCE before the FENCE.I?
>
> It's actually not clear to me what the semantics of clear_cache are
> for multiprocessors. Can you shed some light?
>
> If thread A does modifies code then sets a flag, then thread B reads
> the flag and executes a FENCE.I, then thread A needs a FENCE before
> setting the flag and thread B needs a fence before the FENCE.I. But,
> is it not the software's responsibility to insert both fences, rather
> than assuming one of the fences is folded into clear_cache?
Your introduction of "flag" confuses the issue.
Having re-read the description in section 2.7, I see that FENCE.I is
thread-local and is all that is required for a single thread to sync its own I
and D caches. I think perhaps I'd mis-read or mis-remembered before.
Practically speaking, I'm not sure we have put any real thought about what
needs to happen for threads using on-stack trampolines. Certainly no other gcc
port attempts to broadcast the need for an icache flush to other cpus.
So just leave as-is -- __builtin_clear_cache works properly for the local thread.
>>> +(define_insn "call_value_multiple_internal"
>>> + [(set (match_operand 0 "register_operand" "")
>>> + (call (mem:SI (match_operand 1 "call_insn_operand" "l,S"))
>>> + (match_operand 2 "" "")))
>>> + (set (match_operand 3 "register_operand" "")
>>> + (call (mem:SI (match_dup 1))
>>> + (match_dup 2)))
>>
>>
>> Any reason for this? Your return value registers are sequential. The
>> normal thing to do is just use e.g. (reg:TI 10).
>
> I think we'd need different patterns for mixed int/FP struct returns
> (which use a0 and fa0) if we took that approach.
Ah. Other targets get away with using a PARALLEL.
>From sparc.c, function_arg_record_value:
data.ret = gen_rtx_PARALLEL (mode, rtvec_alloc (data.stack + nregs));
and sparc.md:
(define_insn "*call_value_symbolic_sp64"
[(set (match_operand 0 "" "")
(call (mem:DI (match_operand:DI 1 "symbolic_operand" "s"))
(match_operand 2 "" "")))
(clobber (reg:DI O7_REG))]
So you wind up with things like
(set (parallel [
(reg:DI o0)
(reg:DF fp0)
(reg:DI o1)
(reg:DF fp1)
])
(call (mem:DI (symbol_ref:DI "function"))
(const_int 0)))
We do the same for x86_64 -- in function_value_64:
ret = construct_container (mode, orig_mode, valtype, 1,
X86_64_REGPARM_MAX, X86_64_SSE_REGPARM_MAX,
x86_64_int_return_registers, 0);
r~
More information about the Gcc-patches
mailing list