Bug 66391 - suboptimal code for assignment of SImode struct with bitfields
Summary: suboptimal code for assignment of SImode struct with bitfields
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 5.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: bitfield
Reported: 2015-06-03 06:56 UTC by Paolo Bonzini
Modified: 2023-07-19 04:07 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-06-03 00:00:00


Description Paolo Bonzini 2015-06-03 06:56:43 UTC
The suboptimal code is caused by early SRA splitting the assignment from elem into separate per-field assignments.

struct x {
        unsigned a : 6;
        unsigned b : 26;
};      

int f(struct x *x, unsigned a, unsigned b)
{       
        struct x elem = { .a = a, .b = b };
        int i;
        
        for (i = 0; i < 512; i++)
                x[i] = elem;
}       

Generated code:

.LFB0:
	.cfi_startproc
	leaq	2048(%rdi), %rcx
	andl	$63, %esi
	sall	$6, %edx
	.p2align 4,,10
	.p2align 3
.L2:
	movzbl	(%rdi), %eax
	addq	$4, %rdi
	andl	$-64, %eax
	orl	%esi, %eax
	movb	%al, -4(%rdi)
	movl	-4(%rdi), %eax
	andl	$63, %eax
	orl	%edx, %eax
	movl	%eax, -4(%rdi)
	cmpq	%rcx, %rdi
	jne	.L2
	rep ret
	.cfi_endproc
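
For comparison, here is a rough sketch of what the loop could reduce to if the struct were treated as a single SImode value: build the 32-bit word once outside the loop and store it 512 times. The names fill_struct and fill_packed are made up for illustration, and the packing assumes the x86-64 bitfield layout implied by the assembly above (a in bits 0-5, b in bits 6-31); other ABIs may lay the fields out differently.

```c
#include <stdint.h>
#include <string.h>

struct x {
        unsigned a : 6;
        unsigned b : 26;
};

/* ASSUMPTION: the struct occupies exactly one 32-bit word. */
_Static_assert(sizeof(struct x) == sizeof(uint32_t),
               "struct x is expected to be a single 32-bit word");

/* Form from the report: whole-struct assignment inside the loop. */
void fill_struct(struct x *x, unsigned a, unsigned b)
{
        struct x elem = { .a = a, .b = b };
        int i;

        for (i = 0; i < 512; i++)
                x[i] = elem;
}

/* Hand-packed form: combine both fields once, then do one 32-bit
   store per element with no read-modify-write.
   ASSUMPTION: a is in bits 0-5 and b in bits 6-31 (x86-64 layout). */
void fill_packed(struct x *x, unsigned a, unsigned b)
{
        uint32_t word = (a & 63u) | ((uint32_t)b << 6);
        int i;

        for (i = 0; i < 512; i++)
                memcpy(&x[i], &word, sizeof word);
}
```

Under those assumptions both functions should leave the array with identical contents, so the loop body only needs a single movl of a loop-invariant register.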
Comment 1 Andrew Pinski 2021-08-19 00:47:24 UTC
So at -O2 we get decent code from GCC 9+ due to store merging which "undoes" what SRA did.

But at -O3 the loop gets split into two.
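
One way to picture "split into two" is the hand-written approximation below, where each per-field assignment produced by SRA ends up in its own loop. This is only a sketch of the shape, not compiler output; fill_whole and fill_split are hypothetical names, and which pass performs the split is not stated here.

```c
struct x {
        unsigned a : 6;
        unsigned b : 26;
};

/* Original form: one loop, whole-struct assignment. */
void fill_whole(struct x *x, unsigned a, unsigned b)
{
        struct x elem = { .a = a, .b = b };
        int i;

        for (i = 0; i < 512; i++)
                x[i] = elem;
}

/* Approximation of the split form: one loop per bitfield, so the
   array is traversed twice and each store is a read-modify-write. */
void fill_split(struct x *x, unsigned a, unsigned b)
{
        int i;

        for (i = 0; i < 512; i++)
                x[i].a = a;
        for (i = 0; i < 512; i++)
                x[i].b = b;
}
```

Both functions write the same final values; the split form just does so in two passes over the array, which is why store merging no longer gets a chance to recombine the two field stores.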