Bug 68622 - initialization of atomic objects emits unnecessary fences
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Reported: 2015-11-30 16:19 UTC by Martin Sebor
Modified: 2016-08-05 06:20 UTC
Last reconfirmed: 2016-08-05 00:00:00


Description Martin Sebor 2015-11-30 16:19:07 UTC
GCC emits unnecessary fences for expressions involving objects of atomic types that are not (yet) shared across threads.  For example, in the two functions below, the objects are not shared with other threads, so the assignments to the atomic variables do not require any fences.  Such assignments are commonplace when objects containing atomic variables are being initialized (as in the second function).

In comparison, Clang emits no fences for the functions below.

$ cat t.c && /build/gcc-trunk/gcc/xgcc -B /build/gcc-trunk/gcc -O2 -S -Wall -Wextra -o/dev/tty t.c
int foo (void)
{
    _Atomic int i;
    i = 0;
    return i;
}

struct S {
    _Atomic int i;
    int n;
    char data[];
};

extern void* malloc (__SIZE_TYPE__);

struct S* bar (int n)
{
    struct S *s = malloc (sizeof *s + n);
    s->i = 0;
    return s;
}
	.file	"t.c"
	.machine power8
	.abiversion 2
	.section	".toc","aw"
	.section	".text"
	.align 2
	.p2align 4,,15
	.globl foo
	.type	foo, @function
foo:
	sync
	li 9,0
	stw 9,-16(1)
	sync
	lwz 3,-16(1)
	cmpw 7,3,3
	bne- 7,$+4
	isync
	extsw 3,3
	blr
	.long 0
	.byte 0,0,0,0,0,0,0,0
	.size	foo,.-foo
	.align 2
	.p2align 4,,15
	.globl bar
	.type	bar, @function
bar:
0:	addis 2,12,.TOC.-0b@ha
	addi 2,2,.TOC.-0b@l
	.localentry	bar,.-bar
	mflr 0
	addi 3,3,8
	std 0,16(1)
	stdu 1,-32(1)
	bl malloc
	nop
	sync
	li 10,0
	addi 1,1,32
	stw 10,0(3)
	ld 0,16(1)
	mtlr 0
	blr
	.long 0
	.byte 0,0,0,1,128,0,0,0
	.size	bar,.-bar
	.ident	"GCC: (GNU) 6.0.0 20151125 (experimental)"
	.section	.note.GNU-stack,"",@progbits
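
As a source-level workaround for the initialization case, C11 provides atomic_init, which initializes an atomic object without being an atomic operation itself and is meant for objects that are not yet shared.  The following is only a minimal sketch: bar_init is a hypothetical variant of bar from the test case above, the NULL check is an addition, and the code GCC generates for it has not been checked here.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct S {
    _Atomic int i;
    int n;
    char data[];
};

/* Hypothetical variant of bar() from the report (the name is an
   assumption made for this sketch).  */
struct S* bar_init (int n)
{
    struct S *s = malloc (sizeof *s + n);
    if (!s)
        return NULL;

    /* atomic_init is not an atomic operation and requests no ordering;
       it is only appropriate while no other thread can access *s.  */
    atomic_init (&s->i, 0);
    return s;
}
```

This only helps code that can be edited by hand; the report is about plain assignments such as s->i = 0, which remain sequentially consistent stores.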
Comment 1 Andrew Pinski 2016-08-05 06:08:34 UTC
To do this, GCC needs a pass that changes __atomic_store operations on objects not yet shared across threads into relaxed stores.

Confirmed.  I wonder how clang does it.
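
For illustration, the effect of relaxing the accesses can be written by hand at the source level.  This is only a sketch: foo_relaxed is a hypothetical name, and a real pass would first have to prove that the object has not escaped to another thread.

```c
#include <stdatomic.h>

/* Hypothetical hand-written counterpart of foo() from the report.  */
int foo_relaxed (void)
{
    _Atomic int i;

    /* A plain `i = 0` is a sequentially consistent store.  The explicit
       relaxed order asks for atomicity only, with no inter-thread
       ordering.  */
    atomic_store_explicit (&i, 0, memory_order_relaxed);

    /* Likewise, a plain read of `i` is a sequentially consistent load.  */
    return atomic_load_explicit (&i, memory_order_relaxed);
}
```

Since i never leaves the function, no other thread can observe a difference between the two versions.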
Comment 3 Andrew Pinski 2016-08-05 06:20:21 UTC
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html

This means someone should implement this optimization for GCC.