Document number: P2643R1
Date: 2023-05-18
Reply to: Gonzalo Brito Gadeschi <gonzalob _at_ nvidia.com>
Authors: Gonzalo Brito Gadeschi, Olivier Giroux, Thomas Rodgers
Audience: Concurrency

Improving C++ concurrency features

Revisions

r0 - (Kona Drafts)

r1 - (Varna Draft)

Introduction

P1135R6 introduced serval new concurrency primitives to the C++20 concurrency library:

Though each element included was long coming, and had much implementation experience behind it, fresh user feedback tells us that some improvements could still be made:

  1. Return last observed value from atomic::wait; this value is lost otherwise.
  2. Add timed versions of atomic::wait and other concurrency primitves like barrier and latch, to make it easier to implement concurrency primitives on top of atomic that expose timed waiting facilities themselves, e.g., like <semaphore>.
  3. Avoid spurious polling in atomic::wait by accepting a predicate.

This proposal proposes extensions to address these shortcomings. This branch demonstrates its implementability in libstdc++.

Design

The design of the features above is mostly orthogonal, and this section explores them independently.

Return last observed value on success

Desing: ::wait APIs change the return type signature from returning voidT wait(...);

Example 0:

Before After
atomic<int> a(42);
a.wait(42);
auto o = a.load();
assert(o != 42); // MAY FAIL!
atomic<int> a(42);
auto o = a.wait(42);

assert(o != 42); // OK!

The atomic<T>::wait method guarantees that the thread is unblocked only if the value changed.

Before this paper, the new atomic<T> value is not returned to the caller. This has the following two shortcomings:

After this paper, the value returned by wait is returned to the caller, eliminating the need for the subsequent load.

This is an ABI breaking change that could be resolved by picking a different to-be-determined name for these APIs.

Fallible and timed versions of wait APIs

The design of the fallible timed versions of wait APIs adds three new APIs. For atomic these are

template <class T> optional<T> atomic<T>::try_wait(T, memory_order); template <class T, class Rep, class Period> optional<T> atomic<T>::try_wait_for(T, duration<Rep, Period> const&, memory_order); template <class T, class Clock, class Duration> optional<T> atomic<T>::try_wait_until(T, time_point<Clock, Duration> const&, memory_order);

They are non-blocking, i.e., they eventually return to the caller in a finite-set of steps, even if the value did not change. This enables the application to “do something else” before attempting to wait again.

On failure, i.e., if the value did not change, they return nullopt and the operation has no effects (it does not synchronize). On success, they return an optional<T> containing the last observed value, which is guaranteed to be different from the one the call site waited on.

The untimed try_wait overload attempts to wait for a finite unspecified duration. The implementation may pick a different duration every time, enabling it decide for how long the operation should attempt to wait based on dynamic information, like the load of the system.

Since <chrono> and <optional> are not freestanding, these APIs will not be available in freestanding implementations. We could attempt to improve this by:

Example 1: the atomic variable t tracks how many tasks need to still be processed, and the application prints every second - if it is still waiting - how many tasks remain:

Before After
atomic<int> t;
int old = 1;
auto b = clock::now();
while (old != 0) {
old = t.load();
if (auto e = clock::now(); (e - b) > 1s) {
cout << old;
b = e;
}
}
atomic<int> t;
auto old = 1;

while (old != 0) {
auto o = t.try_wait_for(old, 1s);
old = o.has_value()? o.value() : old;
cout << old;
}


Before this proposal, applications need to re-implement atomic<T>::wait logic, and like in this example, may accidentally call atomic<T>::load in a loop without any back-off.

After this proposal, try_wait_for expresses the application’s intent to wait for a certain duration of time, enabling the implementation to wait efficiently.

For barrier and latch they look like:

template <class CF> bool barrier<CF>::try_wait(arrival_token& tok) const; template <class CF, class Rep, class Period> bool barrier<CF>::try_wait_for(arrival_token& tok, duration<Rep, Period> const& rel_time) const; template <class CF, class Clock, class Duration> bool barrier<CF>::try_wait_until(arrival_token& tok, time_point<Clock, Duration> const& abs_time) const; // Available since C++20 // bool latch::try_wait() const noexcept; template <class Rep, class Period> bool latch::try_wait_for(duration<Rep, Period> const& rel_time) const; template <class Clock, class Duration> bool latch::try_wait_until(time_point<Clock, Duration> const& abs_time) const;

Example 2: using a barrier which tracks the completion of task_count tasks:

std::barrier b(task_count); auto n = process_thread_local_tasks(); auto t = b.arrive(n); while (!b.try_wait_for(t, 1ms)) { auto s = steal_and_process_tasks(); b.arrive(s); }

Predicate versions

There is no wording for the predicate versions proposed in this revision of this paper.

When the program is waiting for a condition different from “not equal to”, there is an added re-try loop around the wait operation in the program. This loop causes each call to wait to be performed as if it were the first call to wait, oblivious to the fact that the program has already been waiting for some time. This leads to re-executing the short-term polling strategy.

Taking a predicate instead of a value allows us to push the program-defined condition inside of atomic::wait, delete the outer loop, and allows the implementation to track time spent.

At least two implementations currently implement atomic::wait in terms of a wait taking a predicate.

The authors see two APIs that are both equvialent in terms of implementation complexity and usability, but have an opposite API depending on how the semantics of the predicate are defined.

Both options have this signature T atomic<T>::wait_with_predicate(Predicate&& p) and the predicate takes a T and returns a bool, but the predicate returns:

Feedback needed: We should pick an option before working on wording.

Example 3: same example as Example 1 using Option 1 above (true if we synchronize):

Before After
atomic<int> t;
int old = 1;
auto b = clock::now();
while (old != 0) {
old = t.load();
auto e = clock::now();
if ((e - b) > 1s) {
cout << old;
b = e;
}
}
atomic<int> t;
int old = t.load();
auto p = [&](int v) {
old = v;
return v == 0;
};

while(!t.try_wait_with_predicate_for(p, 1s).has_value()) {
cout << old;
}

In Example 1, the “After this paper” implementation has the issue that try_wait_for(old, 1s) returns as soon as old changes. That is, while the goal of the application is to wait for the value becoming zero, and it could wait up to 1s for that, if after 0.1s the value changes from 10 to 9 this API will return.

The “After this paper” implementation in Example 2 fixes that using the fallibe timed _with_predicate version. The predicate is called every time the value changes, but it doesn’t “succeed” until the value is 0.

Wording

Return last observed value from atomic::wait

To [atomics.ref.generic.general]:

namespace std {
  template<class T> struct atomic_ref {  // [atomics.ref.generic.general]
    voidT wait(T, memory_order = memory_order::seq_cst) const noexcept;
  };
}

UNRESOLVED QUESTION: all atomic_ref types are missing volatile wait overloads. Should that be fixed as part of this work?

To [atomics.ref.ops]:

voidT wait(T old, memory_order order = memory_order::seq_cst) const noexcept;

To [atomics.ref.int]:

namespace std {
  template<> struct atomic_ref<integral> {
    voidT wait(integral, memory_order = memory_order::seq_cst) const noexcept;
  };
}

To [atomics.ref.float]:

namespace std {
  template<> struct atomic_ref<floating-point> {
    voidT wait(floating-point, memory_order = memory_order::seq_cst) const noexcept;
  };
}

To [atomics.ref.pointer]:

namespace std {
  template<class T> struct atomic_ref<T*> {
    voidT* wait(T*, memory_order = memory_order::seq_cst) const noexcept;
  };
}

To [atomics.types.generic.general]:

namespace std {
  template<class T> struct atomic {
    voidT wait(T, memory_order = memory_order::seq_cst) const volatile noexcept;
    voidT wait(T, memory_order = memory_order::seq_cst) const noexcept;
  };
}

To [atomics.types.operations]:

voidT wait(T old, memory_order order = memory_order::seq_cst) const volatile noexcept;
voidT wait(T old, memory_order order = memory_order::seq_cst) const noexcept;

To [atomics.types.int]:

namespace std {
  template<> struct atomic<integral> {
    voidT wait(integral, memory_order = memory_order::seq_cst) const volatile noexcept;
    voidT wait(integral, memory_order = memory_order::seq_cst) const noexcept;
  };
}

To [atomics.types.float]:

namespace std {
  template<> struct atomic<floating-point> {
    voidT wait(floating-point, memory_order = memory_order::seq_cst) const volatile noexcept;
    voidT wait(floating-point, memory_order = memory_order::seq_cst) const noexcept;
  };
}

To [atomics.types.pointer]:

namespace std {
  template<class T> struct atomic<T*> {
    voidT* wait(T*, memory_order = memory_order::seq_cst) const volatile noexcept;
    voidT* wait(T*, memory_order = memory_order::seq_cst) const noexcept;
  };
}

To [util.smartptr.atomic.shared]:

namespace std {
  template<class T> struct atomic<shared_ptr<T>> {
    voidshared_ptr<T> wait(shared_ptr<T> old, memory_order = memory_order::seq_cst) const noexcept;
  };
}

and

voidshared_ptr<T> wait(shared_ptr<T> old, memory_order order = memory_order::seq_cst) const noexcept;

To [util.smartptr.atomic.weak]:

namespace std {
  template<class T> struct atomic<weak_ptr<T>> {
    voidweak_ptr<T> wait(weak_ptr<T> old, memory_order = memory_order::seq_cst) const noexcept;
  };
}
voidweak_ptr<T> wait(weak_ptr<T> old, memory_order order = memory_order::seq_cst) const noexcept;

No changes to [atomics.nonmembers] are needed.

No changes to [atomic.flag]'s wait APIs are needed.

Fallible and timed-versions of ::wait APIs

To [atomics.ref.generic.general]:

namespace std {
  template<class T> struct atomic_ref {  // [atomics.ref.generic.general]
    optional<T> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<T> try_wait_for(
      T, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<T> try_wait_until(
      T, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
  };
}

To [atomics.ref.ops]:

optional<T> try_wait(memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<T> try_wait_for(T old, 
    chrono::duration<Rep, Period> const& rel_time,
    memory_order order = memory_order::seq_cst
) const noexcept;
template <class Clock, class Duration>
optional<T> try_wait_until(T old, 
    chrono::time_point<Clock, Duration> const& abs_time,
    memory_order order = memory_order::seq_cst
) const noexcept;

To [atomics.ref.int]:

namespace std {
  template<> struct atomic_ref<integral> {
    optional<integral> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<integral> try_wait_for(
      integral, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<integral> try_wait_until(
      integral, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
  };
}

To [atomics.ref.float]:

namespace std {
  template<> struct atomic_ref<floating-point> {
    optional<floating-point> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<floating-point> try_wait_for(
      floating-point, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<floating-point> try_wait_until(
      floating-point, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
  };
}

To [atomics.ref.pointer]:

namespace std {
  template<class T> struct atomic_ref<T*> {
    optional<T*> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<T*> try_wait_for(
      T*, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<T*> try_wait_until(
      T*, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
  };
}

To [atomics.types.generic.general]:

namespace std {
  template<class T> struct atomic {
    optional<T> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<T> try_wait_for(
      integral, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
     optional<T> try_wait_for(
      integral, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
    template <class Clock, class Duration>
    optional<T> try_wait_until(
      integral, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<T> try_wait_until(
      integral, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
     
  };
}

To [atomics.types.operations]:

optional<T> try_wait(memory_order = memory_order::seq_cst) const noexcept;
template <class Rep, class Period>
optional<T> try_wait_for(T old, 
                         chrono::duration<Rep, Period> const& rel_time,
                         memory_order order = memory_order::seq_cst
                        ) const noexcept;
template <class Rep, class Period>
optional<T> try_wait_for(T old, 
                         chrono::duration<Rep, Period> const& rel_time,
                         memory_order order = memory_order::seq_cst
                        ) const volatile noexcept;
template <class Clock, class Duration>
optional<T> try_wait_until(T old, 
                           chrono::time_point<Clock, Duration> const& abs_time,
                           memory_order order = memory_order::seq_cst
                          ) const noexcept;
template <class Clock, class Duration>
optional<T> try_wait_until(T old, 
                           chrono::time_point<Clock, Duration> const& abs_time,
                           memory_order order = memory_order::seq_cst
                          ) const volatile noexcept;

EDITORIAL: analogous to atomic_ref. Intentionally left out from the current revision of this paper.

To [atomics.types.int]:

namespace std {
  template<> struct atomic<integral> {
    optional<integral> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<integral> try_wait_for(
      integral, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Rep, class Period>
    optional<integral> try_wait_for(
      integral, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
    template <class Clock, class Duration>
    optional<integral> try_wait_until(
      integral, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<integral> try_wait_until(
      integral, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
  };
}

To [atomics.types.float]:

namespace std {
  template<> struct atomic<floating-point> {
    optional<floating-point> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<floating-point> try_wait_for(
      floating-point, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Rep, class Period>
    optional<floating-point> try_wait_for(
      floating-point, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
    template <class Clock, class Duration>
    optional<floating-point> try_wait_until(
      floating-point, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<floating-point> try_wait_until(
      floating-point, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
  };
}

To [atomics.types.pointer]:

namespace std {
  template<class T> struct atomic<T*> {
    optional<T*> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<T*> try_wait_for(
      T*, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Rep, class Period>
    optional<T*> try_wait_for(
      T*, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
    template <class Clock, class Duration>
    optional<T*> try_wait_until(
      T*, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<T*> try_wait_until(
      T*, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
  };
}

To [util.smartptr.atomic.shared]:

namespace std {
  template<class T> struct atomic<shared_ptr<T>> {
    optional<shared_ptr<T>> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<shared_ptr<T>> try_wait_for(
      shared_ptr<T>l, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<shared_ptr<T>> try_wait_until(
      shared_ptr<T>, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
  };
}

EDITORIAL: analogous to the try_wait_for/_until APIS of atomic_ref, with shared_ptr/weak_ptr tweaks. Intentionally left out of the current revision of this paper.

To [util.smartptr.atomic.weak]:

namespace std {
  template<class T> struct atomic<weak_ptr<T>> {
    optional<weak_ptr<T>> try_wait(memory_order = memory_order::seq_cst) const noexcept;
    template <class Rep, class Period>
    optional<weak_ptr<T>> try_wait_for(
      weak_ptr<T>l, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
    template <class Clock, class Duration>
    optional<weak_ptr<T>> try_wait_until(
      weak_ptr<T>, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
  };
}

EDITORIAL: analogous to the try_wait_for/_until APIS of atomic_ref, with shared_ptr/weak_ptr tweaks. Intentionally left out of the current revision of this paper.

EDITORIAL: No changes to [atomics.nonmembers] are needed.

To [atomic.flag]:

namespace std {
  struct atomic_flag {
    bool try_wait(memory_order = memory_order::seq_cst) const noexcept;
    bool try_wait(memory_order = memory_order::seq_cst) const noexcept volatile;
    template <class Rep, class Period>
    bool try_wait_for(
      bool, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
    ) const noexcept;
    template <class Rep, class Period>
    bool try_wait_for(
      bool, chrono::duration<Rep, Period> const& rel_time, 
      memory_order = memory_order::seq_cst
    ) const volatile noexcept;
    template <class Clock, class Duration>
    bool try_wait_until(
      bool, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const noexcept;
     template <class Clock, class Duration>
     bool try_wait_until(
      bool, chrono::time_point<Clock, Duration> const& abs_time, 
      memory_order = memory_order::seq_cst
     ) const volatile noexcept;
  };
}
bool atomic_flag_try_wait(const atomic_flag* object, bool old) noexcept;
bool atomic_flag_try_wait(const volatile atomic_flag* object, bool old) noexcept;
bool atomic_flag_try_wait_explicit(const atomic_flag* object, bool old) noexcept;
bool atomic_flag_try_wait_explicit(const volatile atomic_flag* object, bool old) noexcept;
bool atomic_flag::try_wait(bool old, memory_order order = memory_order::seq_cst) const noexcept;
bool atomic_flag::try_wait(bool old, memory_order order = memory_order::seq_cst) const volatile noexcept;

bool atomic_flag_try_wait_for(const atomic_flag* object, bool old, chrono::duration<Rep, Period> const& rel_time) noexcept;
bool atomic_flag_try_wait_for(const volatile atomic_flag* object, bool old, chrono::duration<Rep, Period> const& rel_time) noexcept;
bool atomic_flag_try_wait_for_explicit(const atomic_flag* object, bool old, chrono::duration<Rep, Period> const& rel_time, memory_order order = memory_order::seq_cst) noexcept;
bool atomic_flag_try_wait_for_explicit(const volatile atomic_flag* object, bool old, chrono::duration<Rep, Period> const& rel_time, memory_order order = memory_order::seq_cst) noexcept;
bool atomic_flag::try_wait_for(bool old, chrono::duration<Rep, Period> const& rel_time, memory_order order = memory_order::seq_cst) const noexcept;
bool atomic_flag::try_wait_for(bool old, chrono::duration<Rep, Period> const& rel_time, memory_order order = memory_order::seq_cst) const volatile noexcept;

atomic_flag_try_wait_until(const atomic_flag* object, bool old, chrono::time_point<Clock, Duration> const& abs_time) noexcept;
bool atomic_flag_try_wait_until(const volatile atomic_flag* object, bool old, chrono::time_point<Clock, Duration> const& abs_time) noexcept;
bool atomic_flag_try_wait_until_explicit(const atomic_flag* object, bool old, chrono::time_point<Clock, Duration> const& abs_time, memory_order order = memory_order::seq_cst) noexcept;
bool atomic_flag_try_wait_until_explicit(const volatile atomic_flag* object, bool old, chrono::time_point<Clock, Duration> const& abs_time, memory_order order = memory_order::seq_cst) noexcept;
bool atomic_flag::try_wait_until(bool old, chrono::time_point<Clock, Duration> const& abs_time, memory_order order = memory_order::seq_cst) const noexcept;
bool atomic_flag::try_wait_until(bool old, chrono::time_point<Clock, Duration> const& abs_time, memory_order order = memory_order::seq_cst) const volatile noexcept;

For atomic_flag_try_wait, atomic_flag_try_wait_for, and atomic_flag_try_wait_until, let order be memory_order::seq_cst. Let flag be object for the non-member functions, and this for the member functions.

To [atomics.wait]:

  1. [Note 2: The following functions are atomic waiting operations:
    2.1 atomic<T>::wait, atomic<T>::wait_for, atomic::<T>::wait_until,
    2.2 atomic_flag::wait, atomic_flag::wait_for, atomic_flag::wait_until,
    2.3 atomic_wait and, atomic_wait_explicit, atomic_wait_for, atomic_wait_for_explicit, atomic_wait_until, atomic_wait_until_explicit,
    2.4 atomic_flag_wait and, atomic_flag_wait_explicit, atomic_flag_wait_for, atomic_flag_wait_for_explicit, atomic_flag_wait_until, atomic_flag_wait_until_explicit,and
    2.5 atomic_ref<T>::wait, atomic_ref<T>::wait_for, atomic_ref<T>::wait_until.
    end note]

UNRESOLVED QUESTION: do we need to change something else for the non-member versions of try_wait_for and try_wait_until operations?

To [thread.barrier]:

namespace std {
  template <class Completion Function>
  class barrier {
  
  public:
    bool try_wait(arrival_token& tok) const;
    template <class Rep, class Period>
    bool try_wait_for(arrival_token& tok, chrono::duration<Rep, Period> const& rel_time) const;
    template <class Clock, class Duration>
    bool try_wait_until(arrival_token& tok, chrono::time_point<Clock, Duration> const& abs_time) const;
  };
}

UNRESOLVED QUESTION: arrival_token& vs arrival_token const& ?

Note: try_wait_for/try_wait_until do not take ownership of the arrival_token because if they fail, the user still needs the token to be able to call them again.

bool try_wait(arrival_token& tok) const;
template <class Rep, class Period>
bool try_wait_for(arrival_token& tok, chrono::duration<Rep, Period> const& rel_time) const;
template <class Clock, class Duration>
bool try_wait_until(arrival_token& tok, chrono::time_point<Clock, Duration> const& abs_time) const;

To thread.latch:

namespace std {
  class latch {
  public:
    bool try_wait() const;
    template <class Rep, class Period>
    bool try_wait_for(chrono::duration<Rep, Period> const& rel_time) const;
    template <class Clock, class Duration>
    bool try_wait_until(chrono::time_point<Clock, Duration> const& abs_time) const;
  };
}
bool try_wait() const;
template <class Rep, class Period>
bool try_wait_for(chrono::duration<Rep, Period> const& rel_time) const;
template <class Clock, class Duration>
bool try_wait_until(chrono::time_point<Clock, Duration> const& abs_time) const;

EDITORIAL: semantics intentionally omitted from the current revision of this paper but analogous to barrier.