Missed optimization wrt. constructor clobbers?

Avi Kivity avi@scylladb.com
Wed Dec 7 11:02:00 GMT 2016


On 12/07/2016 12:47 AM, Marc Glisse wrote:
> On Tue, 6 Dec 2016, Avi Kivity wrote:
>
>> Consider the following code
>>
>>
>> === begin code ===
>>
>> #include <experimental/optional>
>>
>> using namespace std::experimental;
>>
>> struct array_of_optional {
>>  optional<int> v[100];
>> };
>>
>> array_of_optional
>> f(const array_of_optional& a) {
>>  return a;
>> }
>>
>> === end code ===
>>
>>
>> Compiling with -O3 (6.2.1), I get:
>>
>>
>> 0000000000000000 <f(array_of_optional const&)>:
>>   0:    48 8d 8f 20 03 00 00     lea    0x320(%rdi),%rcx
>>   7:    48 89 f8                 mov    %rdi,%rax
>>   a:    48 89 fa                 mov    %rdi,%rdx
>>   d:    0f 1f 00                 nopl   (%rax)
>>  10:    c6 42 04 00              movb   $0x0,0x4(%rdx)
>>  14:    80 7e 04 00              cmpb   $0x0,0x4(%rsi)
>>  18:    74 0a                    je     24 <f(array_of_optional const&)+0x24>
>>  1a:    44 8b 06                 mov    (%rsi),%r8d
>>  1d:    c6 42 04 01              movb   $0x1,0x4(%rdx)
>>  21:    44 89 02                 mov    %r8d,(%rdx)
>>  24:    48 83 c2 08              add    $0x8,%rdx
>>  28:    48 83 c6 08              add    $0x8,%rsi
>>  2c:    48 39 ca                 cmp    %rcx,%rdx
>>  2f:    75 df                    jne    10 <f(array_of_optional const&)+0x10>
>>  31:    f3 c3                    repz retq
>
> For high-level optimizations, I find it better to look at the file 
> created by compiling with -fdump-tree-optimized.
>

I guess you have to read a few of them to get a feel for it.

>> However, because we're constructing into the return value, we're 
>> under no obligation to leave the memory untouched, so this can be 
>> optimized into a memcpy, which can be significantly faster if the 
>> optionals are randomly engaged; but gcc doesn't notice that.
>
> Feel free to file an enhancement PR in gcc's bugzilla. The easiest is 
> probably to handle it in libstdc++ in the copy constructor, under some 
> conditions (trivially copy constructible and not too large). But some 
> tools might complain about the read from uninitialized memory, even if 
> it is safe.

I think this is too fragile.  For example, optional<optional<int>> would 
not benefit from the optimization.
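
To illustrate the fragility, here is a minimal sketch. It assumes the library 
gates its fast path on a type trait, and that the wrapper has a user-provided 
copy constructor. The names my_optional and can_use_fast_copy, and the 16-byte 
size limit, are hypothetical and not taken from libstdc++.

```cpp
#include <type_traits>

// Hypothetical optional-like wrapper. The user-provided copy constructor
// makes the type non-trivially copy constructible, even for T = int.
template <typename T>
struct my_optional {
    T    value;
    bool engaged;

    my_optional() : value(), engaged(false) {}

    my_optional(const my_optional& other) : value(), engaged(false) {
        if (other.engaged) {
            value = other.value;
            engaged = true;
        }
    }
};

// Hypothetical gate a library might put in front of a memcpy-style copy:
// the payload must be trivially copy constructible and small.
template <typename T>
constexpr bool can_use_fast_copy() {
    return std::is_trivially_copy_constructible<T>::value && sizeof(T) <= 16;
}

// An int payload passes the gate, so my_optional<int> could be special-cased.
static_assert(can_use_fast_copy<int>(), "int payload qualifies");

// The payload of my_optional<my_optional<int>> is my_optional<int>, which
// fails the gate, so the nested type falls back to the branchy copy.
static_assert(!can_use_fast_copy<my_optional<int>>(), "nested payload does not qualify");
```

Under these assumptions the check does not compose: wrapping the type once 
more silently loses the fast path.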

>
> Optimizers could turn
>
> out.engaged=0
> if(in.engaged)
>   out.engaged=1
>
> into out.engaged=in.engaged
>
> but the condition would still be there, and I don't see the optimizers 
> introducing the extra reads/writes, seems unlikely to be added.
>

That's a pity, because the extra writes would make it much faster.

The optimizers do feel free to write to padding holes, no? Clobbered 
memory could be treated as a padding hole.
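
To make the comparison concrete, here is a minimal sketch of the two copy 
strategies. opt_int is a hypothetical stand-in for optional<int> (a 4-byte 
payload followed by a 1-byte flag and padding), not the libstdc++ type, and 
all function names are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for optional<int>.
struct opt_int {
    int  value;
    bool engaged;
};
static_assert(std::is_trivially_copyable<opt_int>::value,
              "memcpy is only valid for trivially copyable types");

constexpr std::size_t N = 100;

// Mirrors the generated loop: one data-dependent branch per element,
// and the payload is written only when the source is engaged.
void copy_branchy(opt_int* out, const opt_int* in) {
    for (std::size_t i = 0; i < N; ++i) {
        out[i].engaged = false;
        if (in[i].engaged) {
            out[i].value = in[i].value;
            out[i].engaged = true;
        }
    }
}

// The copy I would like to get: no branches, but it also reads and
// writes the payload (and padding) of disengaged elements.
void copy_bytes(opt_int* out, const opt_int* in) {
    std::memcpy(out, in, N * sizeof(opt_int));
}

// Compares only what a user of the optional can observe: the flag,
// and the payload when the flag is set.
bool same_observable_state(const opt_int* a, const opt_int* b) {
    for (std::size_t i = 0; i < N; ++i) {
        if (a[i].engaged != b[i].engaged) return false;
        if (a[i].engaged && a[i].value != b[i].value) return false;
    }
    return true;
}
```

Both functions leave the destination in the same observable state.  They 
differ only in the bytes of disengaged payloads and padding, which is exactly 
the memory the destination's clobber says nobody can depend on.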


