-fprofile-update=atomic vs. 32-bit architectures
Sebastian Huber
sebastian.huber@embedded-brains.de
Tue Dec 6 13:11:54 GMT 2022
On 05/12/2022 08:44, Richard Biener wrote:
> On Mon, Dec 5, 2022 at 8:26 AM Sebastian Huber
> <sebastian.huber@embedded-brains.de> wrote:
>> On 08/11/2022 11:25, Richard Biener wrote:
>>>> It would be great to have a code example for the construction of the "if
>>>> (f()) f();".
>>> I think for the function above we need to emit __atomic_fetch_add_8,
>>> not the emulated form because we cannot insert the required control
>>> flow (if (f()) f()) on an edge. The __atomic_fetch_add_8 should then be
>>> lowered after the instrumentation took place.
>> Would it help to change the
>>
>> if (__atomic_add_fetch_4 ((unsigned int *) &val, 1, __ATOMIC_RELAXED)
>> == 0)
>> __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1,
>> __ATOMIC_RELAXED);
>>
>> into
>>
>> unsigned int v = __atomic_add_fetch_4 ((unsigned int *) &val, 1,
>> __ATOMIC_RELAXED);
>> v = (unsigned int)(v == 0);
>> __atomic_fetch_add_4 (((unsigned int *) &val) + 1, 1,
>> __ATOMIC_RELAXED);
> that's supposed to add 'v' instead of 1? Possibly use uint32_t here
> (aka uint32_type_node).
>
>> to get rid of an inserted control flow?
> That for sure wouldn't require any changes to how the profile
> instrumentation works,
> so yes it would be simpler.
Yes, this seems to work. After a bit of trial and error I ended up with
something in gimple_gen_edge_profiler() like this (endian support is
missing):
  else if (flag_profile_update == PROFILE_UPDATE_SPLIT_ATOMIC)
    {
      /* Atomically increment the low 32-bit word and get its new value.  */
      tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno);
      tree f = builtin_decl_explicit (BUILT_IN_ATOMIC_ADD_FETCH_4);
      gcall *stmt1 = gimple_build_call (f, 3, addr, one,
                                        build_int_cst (integer_type_node,
                                                       MEMMODEL_RELAXED));
      tree low = create_tmp_var (uint32_type_node);
      gimple_call_set_lhs (stmt1, low);

      /* The carry into the high word is 1 if the low word wrapped to zero,
         otherwise 0.  */
      tree is_zero = create_tmp_var (boolean_type_node);
      gassign *stmt2 = gimple_build_assign (is_zero, EQ_EXPR, low,
                                            build_zero_cst (uint32_type_node));
      tree high_inc = create_tmp_var (uint32_type_node);
      gassign *stmt3 = gimple_build_assign (high_inc, COND_EXPR, is_zero,
                                            build_one_cst (uint32_type_node),
                                            build_zero_cst (uint32_type_node));

      /* Address of the high 32-bit word (little-endian layout only).  */
      tree addr_high = create_tmp_var (TREE_TYPE (addr));
      gassign *stmt4 = gimple_build_assign (addr_high, addr);
      gassign *stmt5 = gimple_build_assign (addr_high, POINTER_PLUS_EXPR,
                                            addr_high,
                                            build_int_cst (size_type_node, 4));

      /* Atomically add the carry to the high word; no control flow needed.  */
      gcall *stmt6 = gimple_build_call (f, 3, addr_high, high_inc,
                                        build_int_cst (integer_type_node,
                                                       MEMMODEL_RELAXED));
      gsi_insert_on_edge (e, stmt1);
      gsi_insert_on_edge (e, stmt2);
      gsi_insert_on_edge (e, stmt3);
      gsi_insert_on_edge (e, stmt4);
      gsi_insert_on_edge (e, stmt5);
      gsi_insert_on_edge (e, stmt6);
    }
It can probably be simplified. The generated code:
    .type f, @function
f:
    lui a4,%hi(__gcov0.f)
    li a3,1
    addi a4,a4,%lo(__gcov0.f)
    amoadd.w a5,a3,0(a4)
    lui a4,%hi(__gcov0.f+4)
    addi a5,a5,1
    seqz a5,a5
    addi a4,a4,%lo(__gcov0.f+4)
    amoadd.w zero,a5,0(a4)
    li a0,3
    ret
looks good for this code:
int f(void)
{
  return 3;
}
The loading of the high address could probably be optimized from

    lui a4,%hi(__gcov0.f+4)
    addi a4,a4,%lo(__gcov0.f+4)

to

    addi a4,a4,4

I wasn't able to figure out how to do this.
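
For reference, the branch-free update that the statements above build can be sketched in plain C roughly as follows. This is only an illustration under assumptions: the type split_counter, the LOW/HIGH index constants, and the two helper functions are hypothetical names that do not exist in GCC or libgcov, and the endian selection shown is one possible way to fill in the missing endian support, not what a final patch would necessarily do.

```c
#include <stdint.h>

/* Hypothetical type: a 64-bit counter stored as two 32-bit words.  */
typedef struct { uint32_t word[2]; } split_counter;

/* Pick the word order so that the pair overlays a native 64-bit value
   (assumption: the GCC/Clang predefined __BYTE_ORDER__ macro is available).  */
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
enum { LOW = 1, HIGH = 0 };
#else
enum { LOW = 0, HIGH = 1 };
#endif

/* Increment the counter with two relaxed 32-bit atomic additions.
   The second addition always executes and adds 0 or 1, so no branch
   is required.  */
static void
split_counter_inc (split_counter *c)
{
  uint32_t low = __atomic_add_fetch (&c->word[LOW], 1, __ATOMIC_RELAXED);
  uint32_t carry = (uint32_t) (low == 0); /* 1 only if the low word wrapped */
  __atomic_fetch_add (&c->word[HIGH], carry, __ATOMIC_RELAXED);
}

/* Combine both words.  The two loads are not one atomic 64-bit read, so
   a concurrent reader may see the low word wrapped before the carry has
   reached the high word.  */
static uint64_t
split_counter_read (const split_counter *c)
{
  uint64_t high = __atomic_load_n (&c->word[HIGH], __ATOMIC_RELAXED);
  uint64_t low = __atomic_load_n (&c->word[LOW], __ATOMIC_RELAXED);
  return (high << 32) | low;
}
```

Each individual increment is never lost, because both additions are atomic; only the combined 64-bit value can be transiently inconsistent between the two operations, which should not matter for counters that are read after the instrumented code has finished.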
--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax: +49-89-18 94 741 - 08
Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/