Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[iterator.operations] std::distance may be missing a precondition #6822

Closed
Eisenwave opened this issue Feb 25, 2024 · 10 comments
Closed

[iterator.operations] std::distance may be missing a precondition #6822

Eisenwave opened this issue Feb 25, 2024 · 10 comments

Comments

@Eisenwave
Copy link
Contributor

Preconditions: last is reachable from first, or InputIterator meets the Cpp17RandomAccessIterator requirements and first is reachable from last.

https://eel.is/c++draft/iterator.operations#lib:distance

How come there is no precondition that the distance can be represented using the result type? There are a number of ways that it can happen:

  • For InputIterator, if the step count exceeds the maximum value of difference_type.
  • For pointers, if std::ptrdiff_t cannot represent the pointer difference (last - first).
  • For RandomAccessIterators in general, if the precondition fails that an iterator difference can be represented.

How can the effect be (last - first) for random access iterators, but this isn't always well-defined, and it isn't included in the Preconditions paragraph for this function?

Is this a LWG defect, an editorial issue, or a non-issue and I'm just missing something obvious?

@Eisenwave Eisenwave changed the title [itertaor.operations] std::distance may be missing a precondition [iterator.operations] std::distance may be missing a precondition Feb 25, 2024
@Eisenwave
Copy link
Contributor Author

Another thing that strike me as odd is that max_size() for containers doesn't have preconditions.

If

  • max_size() is defined as distance(begin(), end()), and
  • distance(begin(), end()) is defined in terms of (end() - begin()), but
  • this difference is possibly undefined behavior

... doesn't that imply that max_size() can also be undefined behavior? Isn't it missing a Preconditions paragraph then?

@jwakely
Copy link
Member

jwakely commented Feb 25, 2024

There's nothing editorial here. Please file an LWG issue for std::distance. I don't see a problem for max_size though. Any precondition for that seems to be a requirement on the implementation: don't be stupid. It would be silly for an implementation to define the container's iterator's difference type to be unable to represent the largest container. Or in other words, max_size should be defined so that max_size is always valid and has no precondition. Allowing a container to have a maximum size that can't be returned by max_size makes no sense.

I don't know why the core language allows ptrdiff_t to be unable to represent the size of the largest possible array, that just seems to complicate things unnecessarily. In sensible implementations the maximum array size is limited so that ptrdiff_t can always hold the number of elements.

@Eisenwave
Copy link
Contributor Author

There's nothing editorial here. Please file an LWG issue for std::distance.

Alright, thanks, I've done that. I wanted to make sure this is non-editorial because I don't 100% understand the conventions behind Preconditions yet; there would have been a chance that it's all OK as is.

It would be silly for an implementation to define the container's iterator's difference type to be unable to represent the largest container.

I mean sure, it would be stupid. I was just curious whether it's technically permitted in which case max_size() may be UB, even if an implementation would never do that. Normally I don't care about malicious implementations, but here it makes or breaks whether a max_size() precondition is needed, so it has impact on wording.

I don't know why the core language allows ptrdiff_t to be unable to represent the size of the largest possible array, that just seems to complicate things unnecessarily.

Presumably, it would mean that on an 8-bit architecture with 8-bit pointers, you can have 200-byte arrays. The lack of pointer difference guarantees doubles the maximum object size.

@Eisenwave
Copy link
Contributor Author

Eisenwave commented Feb 25, 2024

Any precondition for that seems to be a requirement on the implementation: don't be stupid.

If you think that, then it may be better to rephrase it along the lines of:

-Returns: distance(begin(), end()) for the largest possible container.
+Returns: The size of the largest possible container,
+which shall be low enough to ensure that distance(begin(), end()) is well-defined.

If you really think this is already implied by the lack of preconditions, this would be purely editorial. However, it doesn't feel very editorial to me.

@jwakely
Copy link
Member

jwakely commented Feb 25, 2024

It would be silly for an implementation to define the container's iterator's difference type to be unable to represent the largest container.

I mean sure, it would be stupid. I was just curious whether it's technically permitted in which case max_size() may be UB, even if an implementation would never do that.

The implementation isn't actually required to construct the largest possible container and then call distance(begin(), end()). It can just return the value. And if the value can't be returned, the implementation is broken. We don't need a precondition saying "this function is not undefined", and that wouldn't be a precondition on the user anyway.

Normally I don't care about malicious implementations, but here it makes or breaks whether a max_size() precondition is needed, so it has impact on wording.

There cannot be a precondition for max_size. How would you ever verify that precondition to be sure it was safe to call it? You would have to check that the difference type could represent the return value of the function you want to call, a value which you can't know without calling it.

Presumably, it would mean that on an 8-bit architecture with 8-bit pointers, you can have 200-byte arrays. The lack of pointer difference guarantees doubles the maximum object size.

You can have that anyway, use a 16-bit types for ptrdiff_t (int must be at least 16 bits anyway).

@Eisenwave
Copy link
Contributor Author

You can have that anyway, use a 16-bit types for ptrdiff_t (int must be at least 16 bits anyway).

Yeah fair point; perhaps a more relevant example would be ptrdiff_t on 32-bit architectures. You may want object sizes of more than 2GiB but not require a 33-bit ptrdiff_t. Or similarly, it's plausible that you wouldn't want such multi-precision pointer differences on 16-bit architectures.

Is GCC actually limited to 32 KiB on 16-bit targets and 2 GiB on 32-bit targets, or does it leverage this additional freedom? I'd find it surprising if no C++ compiler ever made use of this.

@Eisenwave
Copy link
Contributor Author

@Eisenwave
Copy link
Contributor Author

There cannot be a precondition for max_size. How would you ever verify that precondition to be sure it was safe to call it? You would have to check that the difference type could represent the return value of the function you want to call, a value which you can't know without calling it.

You can check whether difference_type is at least one bit wider than the container's size_type or iterator. This would have false negatives though.

@frederick-vs-ja
Copy link
Contributor

Any precondition for that seems to be a requirement on the implementation: don't be stupid.

If you think that, then it may be better to rephrase it along the lines of:

-Returns: distance(begin(), end()) for the largest possible container.
+Returns: The size of the largest possible container,
+which shall be low enough to ensure that distance(begin(), end()) is well-defined.

If you really think this is already implied by the lack of preconditions, this would be purely editorial. However, it doesn't feel very editorial to me.

It seems that we shouldn't specify c.size() and c.max_size() in term of std::distance. IMO we should just say that c.size() returns "the number of elements in the container" and c.max_size() returns "c.size() for the largest possible container".

Perhaps distance(begin(), end()) and the number of elements should be mentioned before, and the equality should only be required when distance(begin(), end()) is well-defined.

@Eisenwave
Copy link
Contributor Author

No editorial work required (but normative work needed https://cplusplus.github.io/LWG/issue4055), I consider this done. max_size() is a separate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants