Give std::optional Range Support

Document number: P3168R1
Date: 2024-04-11
Authors: Marco Foco <marco.foco@gmail.com>, Darius Neațu <dariusn@adobe.com>, Barry Revzin <barry.revzin@gmail.com>, David Sankel <dsankel@adobe.com>
Audience: Library Evolution

Abstract

In P1255R12: A view of 0 or 1 elements: views::maybe, Steve Downey explores the benefits of an optional type that models the range concept and concludes that a new optional type with range support, std::views::maybe, should be introduced into the standard library. This paper explores the design alternative where, instead of introducing a new type, std::optional is made into a range. Observing that this improves usage examples and is dramatically more straightforward, we recommend this approach over that of P1255R12.

// A person's attributes (e.g., eye color). All attributes are optional.
class Person {                        
    /* ... */                               
public:                               
    optional<string> eye_color() const;   
};                                    
    
vector<Person> people = ...;         
P1255R12This Proposal
// Compute eye colors of 'people'.
vector<string> eye_colors = people
  | views::transform(&Person::eye_color)
  | views::transform(views::nullable)
  | views::join
  | ranges::to<set>()
  | ranges::to<vector>();
// Compute eye colors of 'people'.
vector<string> eye_colors = people
  | views::transform(&Person::eye_color)
  // no extra wrapping necessary
  | views::join
  | ranges::to<set>()
  | ranges::to<vector>();

Changelog

Introduction

std::optional<T>, a class template that "may or may not store a value of type T in its storage space" was introduced by Fernando Cacciola and Andrezej Krzemieński's A proposal to add a utility class to represent optional objects (N3793). This feature was incorporated into C++17 and has since seen frequent use in data members, function parameters, and function return types.

Ranges were introduced into C++20 with Eric Niebler et al.'s The One Ranges Proposal (P0896R4) as a set of standard library concepts and algorithms that simplify algorithm composition and usage.

As usage experience of both std::optional and ranges has grown, it became apparent that treating std::optional objects as ranges is quite useful, especially in algorithmic contexts. Steve Downey in A view of 0 or 1 elements: views::maybe (P1255R12) showcases many examples where algorithm readability improves when std::optional objects are treated as ranges.

Downey proposed two facilities that integrate std::optional and ranges:

  1. views::nullable. A range adapter producing a view over std::optional objects and, more generally, any dereferencable object with a bool conversion.
  2. views::maybe. A data type with the same semantics as std::optional with the some key interface differences. We will discuss these differences a little later.

Downey suggests std::maybe_view be used instead of std::optional for return types and other situations where "the value will have operations applied if present, and ignored otherwise." Furthermore, Downey claims "std::optional is filling too many roles" and is not "directly safe", unlike std::maybe_view.

While we support views::nullable, we find the addition of the std::maybe_view contrary to the principles of simplicity and genericity. Instead, we propose making std::optional a range.

This paper highlights the principles behind our design alternative, outlines the drawbacks of adding a new std::optional-like type, surveys existing practice, and demonstrates how our proposal simplifies all of P1255R12's examples.

Principles

Two design principles directly apply to the rangification of std::optional: genericity and simplicity.

Alex Stepanov, the progenitor of much of the C++ standard library, repeatedly emphasized the importance of genericity for software reuse. In Fundamentals of Generic Programming he and James C. Dehnert wrote:

Generic programming recognizes that dramatic productivity improvements must come from reuse without modification, as with the successful libraries. Breadth of use, however, must come from the separation of underlying data types, data structures, and algorithms, allowing users to combine components of each sort from either the library or their own code. Accomplishing this requires more than just simple, abstract interfaces – it requires that a wide variety of components share the same interface so that they can be substituted for one another. It is vital that we go beyond the old library model of reusing identical interfaces with pre-determined types, to one which identifies the minimal requirements on interfaces and allows reuse by similar interfaces which meet those requirements but may differ quite widely otherwise. Sharing similar interfaces across a wide variety of components requires careful identification and abstraction of the patterns of use in many programs, as well as development of techniques for effectively mapping one interface to another.

Simplicity, the second applicable principle, makes C++ easier to learn and use, increasing its applicability to different problem domains. The direction group in Direction for ISO C++ (P2000R4) sees "C++ in danger of losing coherency due to proposals based on differing and sometimes mutually contradictory design philosophies and differing stylistic tastes." and David Sankel highlights the impact on training costs in C++ Should Be C++.

With genericity and simplicity in mind, let us now investigate the proposed std::maybe_view<T>. In essence, it "may or may not store a value of type T in its storage space". The wording used here is notably identical to that of the proposal that introduced optional[1]. This duplication alone is cause for concern due to the complexity of having multiple standardized types representing the same thing.

Indeed, the interfaces of the two types have a substantial amount in common. In the table below, let V be some type, Opt be either optional<V> or maybe_view<V>, o and o2 be objects of type Opt, and v be an object of type V:

std::optional<V> both Proposed std::maybe_view<V>
Opt(nullopt) Opt()
Opt(v)
Opt(in_place, v)
o = nullopt; o = v;
o.emplace(v);
o.reset();
o.swap(o2);
o == o2 and o <=> o2
o == v and o <=> v
*o and o->m o.transform(f) o.begin() and o.end()
o.has_value() and bool(o) o.and_then(f) o.size()
o.value_or(v) o.or_else(f) o.data()
std::hash<Opt>{}(o)

Nevertheless, Downey argues that the interface differences make this new type worthwhile. Consider each of these differences:

  1. std::maybe_view lacks a dereference operator and boolean conversion. Dereference and boolean conversion are the standard interface to optional-like objects and originate in nullable C pointers. Generic algorithms that today output both a vector<optional<T>> and a vector<shared_ptr<T>> using optional-like interfaces, will not work with a vector<maybe_view<T>>. This interface difference goes against genericity.

  2. std::maybe_view contains some, but not all, accessors. std::maybe_view provides the continuation functions transform, and_then, and or_else but not value_or. What's the motivation for this omission? A GitHub search[2] results in 111k uses of value_or. If we add future accessors to std::optional, what will be the basis for deciding whether they should be added to std::maybe_view?

  3. std::maybe_view's template parameter supports references. This is fixing a defect of std::optional that is already being fixed with Steve Downey and Peter Sommerlad's std::optional<T&> paper.

  4. std::maybe_view satisfies the range concept. This interface difference allows std::maybe_view to be used in more algorithms than std::optional, which is great. However, if this is the only viable interface difference, we should add range support to std::optional to avoid confusing users with two otherwise identical types. This interface difference goes against simplicity.

The standard providing two almost identical types will result in choice overload, which is mentally draining. Downey suggests using std::maybe_view when "the value will have operations applied if present, and ignored otherwise." For a return type, how could one know if this will be the case for a caller? For an argument type, this may be true for one implementation, but what if it changes? Users should not have to spend mental energy considering questions like these.

Survey of existing practice

Since its standardization, std::optional's usage has increased in modern C++ codebases. Iterating over this standard type is a recurring topic as shown in forum discussions such as Iterating over std::optional and begin and end iterators for std::optional?.

Some in the C++ community have already experienced this idea. owi_optional is an open-source implementation. The following code snippet shows an example using syntax similar to our proposal:

#include "owi/optional.hpp"

owi::optional<int> empty;
for (int i : empty) {   // iterating over empty optional
    std::cout << i;
}

owi::optional<int> opt{ 123 };
for (int i : opt) { // iterating over non-empty optional
    std::cout << i;
}

Several other programming languages with std::optional-like types allow direct iteration. Rust's Option type and Scala's Option are notable examples. The following code illustrates a typical Rust usage example:

let yep = Some(42);
let nope = None;
// chain() already calls into_iter(), so we don't have to do so
let nums: Vec<i32> = (0..4).chain(yep).chain(4..8).collect();
assert_eq!(nums, [0, 1, 2, 3, 42, 4, 5, 6, 7]);
let nums: Vec<i32> = (0..4).chain(nope).chain(4..8).collect();
assert_eq!(nums, [0, 1, 2, 3, 4, 5, 6, 7]);

Proposal

We propose a simple design alternative to P1255R12: make std::optional satisfy the ranges::range concept where iterating over a std::optional object will iterate over its 0 or 1 elements.

To view or not to view

As with the proposed std::maybe_view, std::optional should be a view. As it only ever has at most one element, it would satisfy the semantic requirements.

The question is how to opt std::optional into being a view. There are two ways to do this:

  1. Change std::optional<T> to inherit from std::ranges::view_interface<std::optional<T>>, or
  2. Specialize std::ranges::enable_view<std::optional<T>> to true.

The problem with inheriting from ranges::view_interface, separate from any questions of ABI and the like, are that it would bring in more functions into std::optional's public interface: empty(), data(), size(), front(), back(), and operator[](). While empty() is potentially sensible (if unnecessary given that both has_value() and operator bool exist), the rest are actively undesirable. Note that all the accessor objects in std::ranges will still work (e.g. std::ranges::size, std::ranges::data, etc.) simply from providing begin() and end().

So we propose instead to specialize ranges::enable_view (similar to std::string_view and std::span).

std::optional<T> is not a borrowed range, although std::optional<T&> (when added) would be.

Choice of type for iterator

optional<T> is a contiguous, sized range, so the simplest choice of iterator is T*. The original revision of this paper simply chose T* (and T const*) for the iterator types.

However, there are some benefits to having a distinct iterator type: it prevents some misuse and some potential confusion from code of the form:

void takes_pointer(T const*);
    
void f(optional<T> const& o) {
    // what does this mean?
    T const* p = o.begin();  
    
    // do we want this to work?
    takes_pointer(o.begin());
    
    // or this?
    if (o.begin()) { /* ... */ }
}

Users might misinterpret p to always point to the storage for the T object in the disengaged case and think they might be able to placement-new on top of it (even though begin() should ideally return nullptr for the disengaged case). The call to takes_pointer is not really something we intend on supporting. Likewise the conversion to bool.

The goal of begin() and end() is simply to provide the range interface for optional, so they should be the minimal implementation to provide that functionality without unintentionally introducing other functionality.

As such, a better choice (as discussed in Tokyo) would be to make the iterator and const_iterator types implementation-defined, where the implementations would hopefully provide some class type to catch many such misuses.

Additional iterator types

Typically, containers provide a large assortment of iterators. In addition to the iterator you get from begin()/end(), there is also areverse_iterator (from rbegin()/rend()), a const_iterator (from cbegin()/cend()), and a const_reverse_iterator (from crbegin()/crend()). The original revision of this paper provided the full set of iterator algorithms.

However, we think this is unnecessary for the same reason it is unnecessary to provide data() and size().

ranges::cbegin() and ranges::cend() will already work and correctly provide const_iterator, so there is no benefit to providing it directly. Generic code should already be using those function objects rather than attempting to call member cbegin() and cend() directly.

Similarly, ranges::rbegin() and ranges::rend() will already work and correctly provide a reverse_iterator. Indeed, for all containers in the standard library, member rbegin and rend already do the same thing that ranges::rbegin and ranges::rend would do anyway.

Interestingly, in this particular case, optional could do better. Since our size is at most 1, reversing an optional is a no-op: rbegin() could simply be implemented to return begin() as an optimization. However, given that our iterator is contiguous, the added overhead of std::reverse_iterator will get optimized out anyway.

Given that neither cbegin nor rbegin add value, crbegin certainly doesn't either.

We therefore propose to provide the minimal interface to opt into ranges: simply begin() and end(). All the functionality is there for the user if they want to use the ranges:: objects.

Wording

Note that this wording is based on adopting the currently proposed resolution for LWG 4015.

[optional.syn]

Add the specialization of ranges::enable_view and format_kind to [optional.syn] right after the class optional declaration:

namespace std {
  // [optional.optional], class template optional
  template<class T>
    class optional;                                     // partially freestanding
  
+ template<class T>
+   constexpr bool ranges::enable_view<optional<T>> = true;   
+ templat<class T>
+   constexpr auto format_kind<optional<T>> = range_format::disabled;

  template<class T>
    concept is-derived-from-optional = requires(const T& t) {  // exposition only
        []<class U>(const optional<U>&){ }(t);
    };
  // ...
}

[optional.optional.general]

Add:

namespace std {
  template<class T>
  class optional {
  public:
    using value_type             = T;
+   using iterator               = implementation-defined; // see [optional.iterators]
+   using const_iterator         = implementation-defined; // see [optional.iterators]

    // ...

    // [optional.swap], swap
    constexpr void swap(optional&) noexcept(see below);

    
+   // [optional.iterators], iterator support
+   constexpr iterator begin() noexcept;
+   constexpr const_iterator begin() const noexcept;
+   constexpr iterator end() noexcept;
+   constexpr const_iterator end() const noexcept;    
    
    // [optional.observe], observers
    constexpr const T* operator->() const noexcept;
    // ...
  };
}

[optional.iterators]

Add a new clause [optional.iterators] between [optional.swap] and [optional.observe]:

Iterator support [optional.iterators]
using iterator       = implementation-defined;
using const_iterator = implementation-defined;

The type models contiguous_iterator ([iterator.concept.contiguous]), meets the Cpp17RandomAccessIterator requirements ([random.access.iterators]), and meets the requirements for constexpr iterators ([iterator.requirements.general]), with value type T. The reference type is T& for iterator and const T& for const_iterator.

All requirements on container iterators ([container.reqmts]) apply to optional::iterator and optional::const_iterator as well.

constexpr iterator begin() noexcept;
constexpr const_iterator begin() const noexcept;

Returns: If has_val, an iterator referring to val. Otherwise, unspecified.

constexpr iterator end() noexcept;
constexpr const_iterator end() const noexcept;

Returns: begin() + has_val.

[version.syn]

Add the following macro definition to [version.syn], header <version> synopsis, with the value selected by the editor to reflect the date of adoption of this paper:

    #define __cpp_lib_optional 202110L // also in <optional>
+   #define __cpp_lib_optional_range_support 20XXXXL // freestanding, also in <optional>
    #define __cpp_lib_out_ptr 202106L // also in <memory>

LEWG feedback

P3168R0 was reviewed during the LEWG sessions at Tokyo 2024. It was decided to proceed with our proposed solution, albeit with a few changes.

Tokyo 2024 P3168R0 related polls:

  1. We want to make optional satisfy the range concept.

SF F N A SA
11 8 2 3 0
  1. Modify P3168R0 (Give std::optional Range Support) by opting out of formatter range support for optional and changing the return type of iterator functions from T* to iterator and family, and then send the revised paper to LWG for C++26 classified as B2, to be confirmed with a Library Evolution electronic poll.

SF F N A SA
11 3 1 0 2

Implementation experience

We tested these std::optional modifications against two multi-million line internal Adobe repositories and detected no build or test suite regressions.

Appendix: Comparison P1255R12 vs P3168R0

This section compares examples from P1255R12 to equivalents using our own proposal.

Optional access

P1255R12This Proposal
for (auto&& opt : views::nullable(get_optional())) {
    // a few dozen lines...
    use(opt); // opt is Safe
}
for (auto&& opt : get_optional()) {
    // a few dozen lines...
    use(opt); // opt is Safe
}

Optional assignment

P1255R12This Proposal
auto opt_int = get_optional();
for (auto&& i : views::nullable(std::ref(opt_int))) {
    // a few dozen lines...
    i = 123;
}
auto opt_int = get_optional();
for (auto&& i : opt_int) {
    // a few dozen lines...
    i = 123;
}

Range chain example (1)

vector<int> v{2, 3, 4, 5, 6, 7, 8, 9, 1};
auto test = [](int i) -> optional<int> {
    switch(i) { 
        case 1: 
        case 3: 
        case 7:
        case 9:
            return i; 
        default: 
            return {};
    } 
};  
P1255R12This Proposal
auto&& r = v 
    | views::transform(test) 
    | views::transform(views::nullable) 
    | views::join | 
    | views::transform(
        [](int i) { cout << i; return i; }
    ); 
auto&& r = v 
    | views::transform(test)
    | views::filter([](auto x) { return bool(x); })
    | views::transform([](auto x){ return *x; }) 
    | views::transform(
        [](int i) { cout << i; return i; }
    ); 
for (auto&& i : r) {
    std::cout << i << "\n";
}

Range chain example (2)

unordered_set<int> set{1, 3, 7, 9};
auto flt = [=](int i) -> optional<int> {
    if (set.contains(i))
        return i;
    else 
        return {}; 
}; 
P1255R12This Proposal
for(auto i : views::iota{1,10} | views::transform(flt)) { 
    for (auto j : views::nullable(i)) { 
        for (auto k : views::iota(0, j))
            cout <<'\a'; 
        cout << '\n';
    }
}
for(auto i : views::iota{1,10} | views::transform(flt)) { 
    for (auto j : i) { // no need to transform
        for (auto k: views::iota(0, j))
            cout <<'\a'; 
        cout << '\n';
    }
}

yield_if

Eric Niebler’s Pythagorian Triples triple example:

P1255R12This Proposal
auto yield_if = []<class T>(bool b, T x) {
    return b ? 
        maybe_view<T>{move(x)} :
        maybe_view<T>{}; 
};
auto yield_if = []<class T>(bool b, T x) {
    return b ?
        optional<T>{move(x)} :
        nullopt; 
};

References


  1. See the introduction of A proposal to add a utility class to represent optional objects ↩︎

  2. See GitHub search of '".value_or" path:*.cpp' ↩︎