This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libstdc++/84170] New: std::find_if performance issues
- From: "barannikov88 at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 01 Feb 2018 17:20:57 +0000
- Subject: [Bug libstdc++/84170] New: std::find_if performance issues
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84170
Bug ID: 84170
Summary: std::find_if performance issues
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: barannikov88 at gmail dot com
Target Milestone: ---
Hi,
There are two internal implementations of std::find_if in bits/stl_algo.h. One
is for the input iterator case, the other is for the random access iterator
case. They are almost identical, except that the second is somewhat
"optimized".
> /// This is an overload used by find algos for the Input Iterator case.
> template<typename _InputIterator, typename _Predicate>
> inline _InputIterator
> __find_if(_InputIterator __first, _InputIterator __last,
> _Predicate __pred, input_iterator_tag)
> {
> while (__first != __last && !__pred(__first))
> ++__first;
> return __first;
> }
>
> /// This is an overload used by find algos for the RAI case.
> template<typename _RandomAccessIterator, typename _Predicate>
> _RandomAccessIterator
> __find_if(_RandomAccessIterator __first, _RandomAccessIterator __last,
> _Predicate __pred, random_access_iterator_tag)
> {
> typename iterator_traits<_RandomAccessIterator>::difference_type
> __trip_count = (__last - __first) >> 2;
>
> for (; __trip_count > 0; --__trip_count)
> {
> if (__pred(__first))
> return __first;
> ++__first;
>
> if (__pred(__first))
> return __first;
> ++__first;
>
> if (__pred(__first))
> return __first;
> ++__first;
>
> if (__pred(__first))
> return __first;
> ++__first;
> }
>
> switch (__last - __first)
> {
> case 3:
> if (__pred(__first))
> return __first;
> ++__first;
> case 2:
> if (__pred(__first))
> return __first;
> ++__first;
> case 1:
> if (__pred(__first))
> return __first;
> ++__first;
> case 0:
> default:
> return __last;
> }
> }
Manual unrolling increases the code size and reduces the chances of this
function to be inlined. Inlining of this function is critical because it can
contain indirect calls to the predicate function, and inlining can change this
calls to direct improving performance.
I would suggest to revert manual unrolling and let the compiler decide which
optimization steps should be taken.