86013 – std::vector::shrink_to_fit() could sometimes use realloc()

Bug 86013 - std::vector::shrink_to_fit() could sometimes use realloc()

Summary: std::vector::shrink_to_fit() could sometimes use realloc()

Status:	RESOLVED INVALID

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	libstdc++ (show other bugs)
Version:	8.1.1

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2018-05-31 09:48 UTC by Jan Kratochvil
Modified:	2020-01-17 14:05 UTC (History)
CC List:	2 users (show)

See Also:	69879
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:

Attachments
incredibleshrinkingvector.C (1.16 KB, text/plain) 2018-06-05 10:05 UTC, Jan Kratochvil	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Jan Kratochvil 2018-05-31 09:48:05 UTC

std::vector::shrink_to_fit() when reducing the size it still calls new()+copy.
It could use realloc() when the objects are trivially copyable resulting in no copy during size reduction.

Maybe it could even always call realloc() for size reduction of any type of objects and just assert the returned pointer did not change.

Comment 1 Jonathan Wakely 2018-05-31 10:48:52 UTC

(In reply to Jan Kratochvil from comment #0)
> std::vector::shrink_to_fit() when reducing the size it still calls
> new()+copy.
> It could use realloc() when the objects are trivially copyable resulting in
> no copy during size reduction.

In general it can't. It's only valid to do that when we know that the allocator obtained memory from malloc, which we can't know for the default new_allocator (users could have replaced operator new to get memory from somewhere else). It would be OK for malloc_allocator only.

The only reliable way to get realloc-like behaviour is by extending the allocator API to cover it.
 
> Maybe it could even always call realloc() for size reduction of any type of
> objects and just assert the returned pointer did not change.

That suggestion terrifies me.

Comment 2 Jan Kratochvil 2018-05-31 13:08:04 UTC

You apparently know better all the pitfalls, I just got shocked that a squeezing shrink_to_fit() does a copy.

Comment 3 Richard Biener 2018-06-01 08:06:47 UTC

I wonder if technically shrink_to_fit() is allowed to elide the shrinking?  And
yes, I'm equally shocked about the need to copy!

Comment 4 Jonathan Wakely 2018-06-01 08:24:50 UTC

shrink_to_fit is a non-binding request, the implementation is allowed to completely ignore it. But the expected semantics (at least, expected by those who designed the feature) is that it reallocates new storage, copies the elements, and deallocates the old storage. If you don't want that, don't call shrink_to_fit.

I don't see how this qualifies as a missed-optimization when doing it is impossible in the general case.

Comment 5 Marc Glisse 2018-06-01 09:02:55 UTC

(In reply to Jan Kratochvil from comment #0)
> Maybe it could even always call realloc() for size reduction of any type of
> objects and just assert the returned pointer did not change.

I can't find anywhere a guarantee that realloc doesn't move stuff when the new size is smaller than the old.

(In reply to Richard Biener from comment #3)
> I wonder if technically shrink_to_fit() is allowed to elide the shrinking? 
> And yes, I'm equally shocked about the need to copy!

What would be the point of shrink_to_fit() otherwise? It was created as a nicer alternative for people who were copying to a new temporary vector and swapping them.

Comment 6 Jan Kratochvil 2018-06-02 19:18:45 UTC

(In reply to Marc Glisse from comment #5)
> I can't find anywhere a guarantee that realloc doesn't move stuff when the
> new size is smaller than the old.

In practice it does not.


> What would be the point of shrink_to_fit() otherwise? It was created as a
> nicer alternative for people who were copying to a new temporary vector and
> swapping them.

It should be implemented in the most optimal way. ISO C++ standard does not talk about any copying:
https://stackoverflow.com/questions/2664051/why-is-shrink-to-fit-non-binding/2664094

Comment 7 Jonathan Wakely 2018-06-03 19:53:09 UTC

The standard specifically says that if reallocation occurs then iterators and points are invalidated, which is how it talks about copying to new storage.

You are making up an interpretation for shrink_to_fit which is novel, not supported by the standard, and not intended by the designers of the feature.

Comment 8 Jan Kratochvil 2018-06-05 10:05:14 UTC

Created attachment 44236 [details]
incredibleshrinkingvector.C

What's wrong with this implementation? The permitted operator new symbols interposition was suggested by Pavel Labath at: https://reviews.llvm.org/D47492#1122080

Comment 9 Jonathan Wakely 2018-06-06 13:40:37 UTC

You're jumping through arcane hoops just to provide a property which is not guaranteed by the function and was never intended to be supported. If you don't want shrink_to_fit to reallocate then stop calling it.

Comment 10 Jonathan Wakely 2018-06-06 13:48:57 UTC

is_trivially_move_constructible is the wrong trait, trivially_copyable is necessary to determine if memcpy/memmove (and so presumably realloc) are safe.

Does your &n!=&alias##n check still work if operator new is replaced in a different translation unit, but the default one is the only one in scope in the current TU?

Comment 11 Jan Kratochvil 2018-06-06 14:33:37 UTC

(In reply to Jonathan Wakely from comment #10)
> Does your &n!=&alias##n check still work if operator new is replaced in a
> different translation unit, but the default one is the only one in scope in
> the current TU?

alias##n is expected to be a visibility-hidden symbol of libstdc++ so there will be no ld.so relocation.
n will be relocated by ld.so as R_X86_64_GLOB_DAT to the real function wherever it comes from in the final process. It will not point to the PLT slot of libstdc++.

ISO C++ does not specify the replacement specifics, they say only "displaces the default version defined by the C ++ standard library".
Here they talk about them:
  http://en.cppreference.com/w/cpp/memory/new/operator_new#Global_replacements

I expect replacing an allocator needs to be done globally for a process as one may pass allocated objects ownership between program/libraries. That is typically from the main program or by LD_PRELOAD. Although cppreference talks about visibility so I guess they think replacing 'operator new' only in one ELF image (program / shared library) with -fvisibility=hidden is also valid. They talk about undefined behavior if the replacement is specific only to some TU which I also find obvious.

Comment 12 Jonathan Wakely 2018-06-06 14:45:57 UTC

That page says "a user-provided non-member function with the same signature defined anywhere in the program, in any source file, replaces the default version. Its declaration does not need to be visible."

There's no bug here, I don't think this can be done sensibly.

Comment 13 Jan Kratochvil 2018-06-06 14:59:40 UTC

Why the example code should not work with "a user-provided non-member function with the same signature defined anywhere in the program, in any source file, replaces the default version. Its declaration does not need to be visible."?

I would do the real libstdc++ sources patch as that would comply with these requirements but I wanted to know first if there are some technical reasons why it gets rejected, so far I do not see any.