22634 – [DR 539] partial_sum is too constrained

Bug 22634 - [DR 539] partial_sum is too constrained

Summary: [DR 539] partial_sum is too constrained

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	libstdc++ (show other bugs)
Version:	4.0.1

Importance:	P2 minor
Target Milestone:	4.5.0
Assignee:	Paolo Carlini

URL:
Keywords:

Duplicates (1):	42943 (view as bug list)
Depends on:
Blocks:

Reported:	2005-07-23 20:07 UTC by Marc Schoolderman
Modified:	2010-02-04 18:36 UTC (History)
CC List:	5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2005-12-23 00:07:35

Attachments
See the original post (813 bytes, text/plain) 2005-07-23 20:08 UTC, Marc Schoolderman	Details
Workaround iterator (535 bytes, application/octet-stream) 2005-07-25 09:03 UTC, Chris Jefferson	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Marc Schoolderman 2005-07-23 20:07:47 UTC

Say, we're running std::partial_sum, where the output type is 'wider' than
the input. E.g,

    char in_array[4] = { 96, 96, 96, 96 };
    int  out_array[4];

    partial_sum(in_array, in_array+4, out_array);

This "should" be:

    int out_array[4] = { 96, 96+96, 96+96+96, 96+96+96+96 };

This is a problem since partial_sum obviously uses an accumulator to hold
the intermediate result, but it uses the input iterator's value_type to
determine the type of the accumulator. So the result in the above case will
be an array with the values 96, -64, 32, -128 (with 8bit signed char).

More involved cases, where the input value_type and output_value type are
incompatible, won't even compile.

I have taken the partial_sum from bits/stl_numeric.h (distributed with DJGPP's
version of g++ 4.01) and produced a work-around solution for this which I
think works correctly and is portable. Note that I reformatted the code and
removed the "gxx" bits, so the attached code is intended as a concept only.

Summary; the solution uses template argument type deduction to determine the
type of accumulator to be used.

Comment 1 Marc Schoolderman 2005-07-23 20:08:37 UTC

Created attachment 9338 [details]
See the original post

There is an issue here that this routine assumes that the result of the
operation is CopyConstructible and Assignable. However, this is the same
requirement placed on T in std::accumulate(), and the original routine even
assumed Assignability of the input iterator's value_type, which in fact _IS
NOT_ guaranteed!. This proposal does not assume this (since it doesn't need
it anymore), by also replacing the first assignment by a copy construction.

Comment 2 Wolfgang Bangerth 2005-07-23 21:15:01 UTC

No, the standard says that the result for an iterator 'i' in the output  
range is  
  ((...(*first + *(first + 1)) + ...) + *(first + (i - result)))  
So arithmetic is done with the data type of the input range. That may be 
undesirable on occasion, but that's what the standard says. 
  
W.

Comment 3 Marc Schoolderman 2005-07-23 23:20:53 UTC

(In reply to comment #2)
>   ((...(*first + *(first + 1)) + ...) + *(first + (i - result)))  
> So arithmetic is done with the data type of the input range. That may be 
> undesirable on occasion, but that's what the standard says. 

The standard is too vague on this routine to say such things, which rather is
the problem :)

About the only thing we know is that this routine takes Input Iterators. 
Those have the requirements that '*i' is "convertible to T". The standard
doesn't specify what an operator+ (or binary_op) that is used should look
like. It especially doesn't say it takes arguments of value_type and outputs
a result of value_type.

It does explicitly say that the value_type of an input iterator is not
required to be assignable (see 24.1.1/3), which anyway means the present
implementation isn't strictly conforming since it uses 'value = value + *first'.

My proposal fixes this (quite trivially), while introducing any runtime
inefficiencies.

If the changed behaviour with expanding types is deemed too dangerous
because it introduces a change, the routine can be modified so it produces
identical results as the current partial_sum, by adding explicit type
conversions back to the input iterators value_type. That will at least
solve the problem of assigning to an Input Iterator's value_type.

I will not reopen this bug, but I suggest giving this some more thought.

~Marc.

Comment 4 Wolfgang Bangerth 2005-07-24 00:14:29 UTC

Well, let's have the libstdc++ people comment then :-) 
W.

Comment 5 Marc Schoolderman 2005-07-24 02:31:07 UTC

To clarify a bit; the comments dealing with the lines that read;

  ValueType value ( *first );   // copy construct!

Should be ignored. I wrote them out of fear for situations where 
"T obj(initializer);" works but "T obj = initializer;" doesn't, but this has to
do with implicit conversion, not assignability. The existing 
"ValueType value = *first" is preferable for these two lines.

Comment 6 Gabriel Dos Reis 2005-07-24 03:21:57 UTC

Subject: Re:  partial_sum is too constrained

"bangerth at dealii dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| No, the standard says that the result for an iterator 'i' in the output  
| range is  
|   ((...(*first + *(first + 1)) + ...) + *(first + (i - result)))  
| So arithmetic is done with the data type of the input range. 

However, there is a snag.  In the example submitted by the reporter,
"first" is of type "char*".  So any arithmetic happening, starts by
*promoting* char to int before computation is done.
Consequently, it appears legitimate to expect that intermediate result
being used throughout the accumulation, because it is the natural type
of the result.  

|            What    |Removed                     |Added
| ----------------------------------------------------------------------------
|              Status|UNCONFIRMED                 |RESOLVED
|          Resolution|                            |INVALID

I don't think it is invalid PR.

We may, however,  eithed do The Right Thing; or submit a DR; or do both.

-- Gaby

Comment 7 Gabriel Dos Reis 2005-07-24 03:27:25 UTC

Subject: Re:  partial_sum is too constrained

"squell at alumina dot nl" <gcc-bugzilla@gcc.gnu.org> writes:

| (In reply to comment #2)
| >   ((...(*first + *(first + 1)) + ...) + *(first + (i - result)))  
| > So arithmetic is done with the data type of the input range. That may be 
| > undesirable on occasion, but that's what the standard says. 
| 
| The standard is too vague on this routine to say such things, which rather is
| the problem :)

Indeed.

| About the only thing we know is that this routine takes Input Iterators. 
| Those have the requirements that '*i' is "convertible to T". The standard
| doesn't specify what an operator+ (or binary_op) that is used should look
| like. It especially doesn't say it takes arguments of value_type and outputs
| a result of value_type.
| 
| It does explicitly say that the value_type of an input iterator is not
| required to be assignable (see 24.1.1/3), which anyway means the present
| implementation isn't strictly conforming since it uses 'value = value + *first'.

Yes, the standard requirements for iterators exhibit inconsistencies
at many places; for example an InputIterator is not required (by the
Standard) to be copy-constructible; consequently it cannot be used as
function parameter type.  Which means that things like copy(),
accumulate() and friends are not supposed to work.  Yes, obviously
that is a bug in the standard.  We've found a few of them recently.

BAck to the issue of this PR; I think there is way too vagueness in
the standard and the "obvious" thing to do is to use the natural type
of the intermediate results.  However, we may actually want to submit
a DR (I'll do it).  I would suggest this issue be suspended in the
meantime.

-- Gaby

Comment 8 Paolo Carlini 2005-07-24 09:06:21 UTC

Ok, let's suspend it for now: in the meanwhile I checked that at least another
implementation behaves exactly like GCC, another good reason to wait for feedback
from the committee before taking any action.

Gaby, maybe adjacent_difference should also be in the DR?

Comment 9 Gabriel Dos Reis 2005-07-24 11:35:40 UTC

Subject: Re:  partial_sum is too constrained

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| Gaby, maybe adjacent_difference should also be in the DR?

yes, you're right.

-- Gaby

Comment 10 Marc Schoolderman 2005-07-24 16:42:33 UTC

(In reply to comment #7)

> Yes, the standard requirements for iterators exhibit inconsistencies
> at many places; for example an InputIterator is not required (by the
> Standard) to be copy-constructible; consequently it cannot be used as
> function parameter type.  

Yes, I noticed the standard isn't explicit about that, but isn't
that requirement implied in the semantics for '*r++'?

I read somewhere the LWG believed the requirement to be implicit.
(found: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#241)

> BAck to the issue of this PR; I think there is way too vagueness in
> the standard and the "obvious" thing to do is to use the natural type
> of the intermediate results.  However, we may actually want to submit
> a DR (I'll do it).  I would suggest this issue be suspended in the
> meantime.

I agree the paragraph should probably make it explicit what type is being
used. I think there's another subtle error in the Effects clause, since 
dereferencing an input iterator invalidates previous copies, and order of
evaluation is unspecified.

I noticed this also happens in unique_copy. It works on Input Iterators but
its semantics are specified in terms of '*i == *(i-1)' or 'pred(*i, *(i-1))'.

Perhaps this should be part of the DR, or a seperate one?

Comment 11 Gabriel Dos Reis 2005-07-24 17:18:14 UTC

Subject: Re:  partial_sum is too constrained

"squell at alumina dot nl" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Additional Comments From squell at alumina dot nl  2005-07-24 16:42 -------
| (In reply to comment #7)
| 
| > Yes, the standard requirements for iterators exhibit inconsistencies
| > at many places; for example an InputIterator is not required (by the
| > Standard) to be copy-constructible; consequently it cannot be used as
| > function parameter type.  
| 
| Yes, I noticed the standard isn't explicit about that,

Oh, the standard is quite explicit about that.  Look up the
requirement table input iterators and compare it with other iterators.

| but isn't
| that requirement implied in the semantics for '*r++'?

No.

| I read somewhere the LWG believed the requirement to be implicit.
| (found: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#241)

No, that is issue is different.  I'm talking about the
copy-construction of the iterator itself; you're talking about the
copy-construction of the value_type.  Not the same thing.

| > BAck to the issue of this PR; I think there is way too vagueness in
| > the standard and the "obvious" thing to do is to use the natural type
| > of the intermediate results.  However, we may actually want to submit
| > a DR (I'll do it).  I would suggest this issue be suspended in the
| > meantime.
| 
| I agree the paragraph should probably make it explicit what type is being
| used. I think there's another subtle error in the Effects clause, since 
| dereferencing an input iterator invalidates previous copies, and order of
| evaluation is unspecified.
| 
| I noticed this also happens in unique_copy. It works on Input Iterators but
| its semantics are specified in terms of '*i == *(i-1)' or 'pred(*i, *(i-1))'.
| 
| Perhaps this should be part of the DR, or a seperate one?

It should probably be a separate one.  Thanks!

-- Gaby

Comment 12 Chris Jefferson 2005-07-25 09:03:00 UTC

Created attachment 9358 [details]
Workaround iterator

Comment 13 Chris Jefferson 2005-07-25 09:03:22 UTC

I'm not personally 100% sure that this should be "fixed", I've used partial_sum where I've assumed this 
behaviour, and adding the things in the "output type" would have broken...

I've attached the work-around I personally use to this kind of problem, which is wrapping the iterator in 
another iterator which changes the value_type. Feel free to use this code if you like.

Comment 14 Gabriel Dos Reis 2005-07-25 09:51:13 UTC

Subject: Re:  partial_sum is too constrained

"chris at bubblescope dot net" <gcc-bugzilla@gcc.gnu.org> writes:

| I'm not personally 100% sure that this should be "fixed", I've used
| partial_sum where I've assumed this behaviour,

I have codes that naturally fits partial_sum(), but will break if
ported to STL with the current implementation -- and yes, I do have
valid reasons to move the codes to use STL.  And I think a fix is
needed to match the language semantics.  I don't believe in tower of
wrappers where the specification is clearly not precise enough and
arguably needs adjustments.  '2' + '4', has type int.  Period.

-- Gaby

Comment 15 Marc Schoolderman 2005-07-25 14:01:44 UTC

(In reply to comment #13)
> I've attached the work-around I personally use to this kind of problem, which is
> wrapping the iterator in another iterator which changes the value_type. 

I tried this approach as well :) I called it a 'cast_iterator'. However,
writing a full iterator class is involved, especially if you don't want to 
restrict the iterator class to 'input_iterator_tag'.

Second, I felt it was 'writing to the implementation'. I know that libstdc++
uses the 'value_type' to determine the type, but this doesn't mean other
implementations use it! In the end, it is a non-solution.

If you've want the non-expanding behaviour, this will always be possible by
simply using a function object. For example;

   std::partial_sum(begin, end, std::plus<char>())

This is *guaranteed* not to expand.

~Marc.

Comment 16 Marc Schoolderman 2005-07-25 14:04:02 UTC

Mentally fix the typographical errors in that last post. :)

Comment 17 Marc Schoolderman 2006-02-04 12:45:31 UTC

Out of curiosity, I was checking the LWG website; I couldn't find these issues (but then, I don't have inside access). I'm more than willing write a DR for both points mentioned, but I'd hate to duplicate any effort. Comments?

Comment 18 Paolo Carlini 2006-02-04 20:53:39 UTC

(In reply to comment #17)
> Out of curiosity, I was checking the LWG website; I couldn't find these issues
> (but then, I don't have inside access). I'm more than willing write a DR for
> both points mentioned, but I'd hate to duplicate any effort. Comments?

Hi. As far as I know, nobody is actively working on the text of an actual DR and your help would be really appreciated. Thanks in advance!

Comment 19 Paolo Carlini 2009-12-11 17:02:36 UTC

This is now [Ready]:

  http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#539

and we are already *almost* doing the right thing, besides a std::move call in std::adjacent_difference, which I'm going to add...

Comment 20 paolo@gcc.gnu.org 2009-12-11 17:55:02 UTC

Subject: Bug 22634

Author: paolo
Date: Fri Dec 11 17:54:37 2009
New Revision: 155173

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155173
Log:
2009-12-11  Paolo Carlini  <paolo.carlini@oracle.com>

	PR libstdc++/22634, DR 539 [Ready]
	* include/bits/stl_numeric.h (adjacent_difference): Use std::move
	at the end of the loop body, per the Ready resolution.
	* include/std/numeric: Do not include unnecessarily <cstddef>.
	* doc/xml/manual/intro.xml: Add an entry for DR 539.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/doc/xml/manual/intro.xml
    trunk/libstdc++-v3/include/bits/stl_numeric.h
    trunk/libstdc++-v3/include/std/numeric

Comment 21 Paolo Carlini 2009-12-11 17:56:31 UTC

Done.

Comment 22 Paolo Carlini 2010-02-04 18:36:34 UTC

*** Bug 42943 has been marked as a duplicate of this bug. ***