P1144R8
std::is_trivially_relocatable

Published Proposal,

Issue Tracking:
Inline In Spec
Author:
Audience:
LEWG, EWG
Project:
ISO/IEC JTC1/SC22/WG21 14882: Programming Language — C++
Draft Revision:
49
Current Source:
github.com/Quuxplusone/draft/blob/gh-pages/d1144-object-relocation.bs
Current:
rawgit.com/Quuxplusone/draft/gh-pages/d1144-object-relocation.html

Abstract

We define a new verb, "relocate," equivalent to a move and a destroy. For many C++ types, this is implementable "trivially," as a single memcpy. We propose a standard trait so library writers can detect such types. We propose a compiler rule that propagates trivial relocatability among Rule-of-Zero types. Finally, we propose a standard syntax for a user-defined type (e.g. boost::shared_ptr) to warrant that it is trivially relocatable. P1144R3 was presented at C++Now 2019 as a 90-minute talk titled "Trivially Relocatable."

1. Changelog

2. Introduction and motivation

Given an object type T and memory addresses src and dst, the phrase "relocate a T from src to dst" means "move-construct dst from src, and then immediately destroy the object at src."

Any type which is both move-constructible and destructible is relocatable. A type is nothrow relocatable if its relocation operation is noexcept, and a type is trivially relocatable if its relocation operation is trivial (which, just like trivial move-construction and trivial copy-construction, means "tantamount to memcpy").

In practice, almost all relocatable types are trivially relocatable: std::unique_ptr<int>, std::vector<int>, std::string (on libc++), std::list<int> (on MSVC). Examples of non-trivially relocatable types include std::list<int> (on libstdc++ and libc++), std::string (on libstdc++ and MSVC), std::set (everywhere). See Appendix C: Examples of non-trivially relocatable class types.

P1144 allows the library programmer to warrant to the compiler that a resource-management type is trivially relocatable. Explicit warrants are rarely needed because the compiler can infer trivial relocatability for Rule-of-Zero types. See § 4.5 Trivially relocatable type [basic.types.general].

The most important thing about P1144 relocation is that it is backward compatible and does not break either API or ABI. My intention is simply to legalize the well-understood tricks that many industry codebases ([BSL], [Folly], [Deque]) are already doing in practice. P1144 is not intended to change the behavior of any existing source code (except to speed it up), nor does it require major work from standard library vendors.

2.1. Optimizations enabled by trivial relocatability

The following optimizations are possible according to P1144R8’s notion of trivial relocatability. Here’s who does these optimizations in practice:

vector realloc type erasure fixed-cap move vector erase vector insert rotate swap
Arthur’s libc++ std::is_trivially_relocatable_v<T> std::vector no N/A std::vector
(Godbolt)
std::vector
(Godbolt)
yes, uses swap_ranges
(Godbolt)
std::swap_ranges
(Godbolt)
[BSL] bslmf::IsBitwiseMoveable<T> bsl::vector no ? bsl::vector bsl::vector ArrayPrimitives_Imp::rotate, unused by bsl::rotate no
[Folly] folly::IsRelocatable<T> fbvector no proposed for small_vector fbvector fbvector N/A N/A
[Qt] QTypeInfo<T>::isRelocatable QList QVariant QVarLengthArray QList QList q_rotate N/A

2.1.1. Contiguous reallocation

Trivial relocation can be used anywhere we do the moral equivalent of realloc, such as in vector<R>::reserve.

[Bench] (presented at C++Now 2018) shows a 3x speedup on vector<unique_ptr<int>>::resize. This Reddit thread demonstrates a similar 3x speedup using the online tool Quick-Bench.

2.1.2. Moving in-place/SBO type-erased types like any and function

Trivial relocation can be used to de-duplicate the code generated by type-erasing wrappers like any, function, and move_only_function. For these types, a move of the wrapper object is implemented in terms of a relocation of the contained object. (See for example libc++'s std::any.) In general, the relocate operation must have a different instantiation for each different contained type C, leading to code bloat. But every trivially relocatable C of a given size can share the same instantiation.

2.1.3. Moving fixed-capacity containers like static_vector and small_vector

The move-constructor of fixed_capacity_vector<R,N> can be implemented naïvely as an element-by-element move (leaving the source vector’s elements in their moved-from state), or efficiently as an element-by-element relocate (leaving the source vector empty).

For a detailed analysis of this case, see [FixedCapacityVector].

boost::container::static_vector<R,N> currently implements the naïve element-by-element-move strategy, but after LEWG feedback, [P0843R5] static_vector does permit the faster relocation strategy.

2.1.4. Contiguous insert and erase

Trivial relocation can be used anywhere we shift a contiguous range left or right, such as in vector::erase (i.e., destroy the erased elements and "close the window" by memmoving left) and vector::insert (i.e., "make a window" by memmoving right and then construct the new elements in place). [Folly]'s fbvector does this optimization; see fbvector::make_window. Bloomberg’s [BSL] also does this optimization.

But § 5.2 Confusing interactions with std::pmr determines whether this optimization will be possible for certain types.

2.1.5. Swap

Given a reliable way of detecting trivial relocatability, we can optimize any routine that uses the moral equivalent of std::swap, such as std::rotate, std::partition, or std::sort.

Optimizing std::swap produces massive code-size improvements for all swap-based algorithms, including std::sort and std::rotate. See @19:56–21:22 in my C++Now 2019 talk, and see this Godbolt. This Quick-Bench shows a 25% speedup in std::rotate when it is allowed to use bitwise swap on a Rule-of-Zero type.

But § 5.2 Confusing interactions with std::pmr determines whether this optimization will be possible for certain types.

2.2. Assertions, not assumptions

Some data structures might reasonably assert the trivial relocatability of their elements, just as they sometimes assert the stronger property of trivial copyability today. This might help them guarantee reasonable performance, or guarantee exception-safety.

2.3. Eliminate manual warrants, improve safety

Many real-world codebases contain templates which require trivial relocatability of their template parameters, but cannot today verify trivial relocatability. For example, [Folly] requires the programmer to warrant the trivial relocatability of any type stored in a folly::fbvector:

class Widget {
    std::vector<int> lst_;
};

folly::fbvector<Widget> vec;  // FAILS AT COMPILE TIME for lack of warrant

This merely encourages the programmer to add the warrant and continue. An incorrect warrant will be discovered only at runtime, via undefined behavior. See Allocated memory contains pointer to self, [FollyIssue889], and (most importantly) @27:26–31:47 in my C++Now 2019 talk.

class Gadget {
    std::list<int> lst_;
};
// sigh, add the warrant on autopilot
template<> struct folly::IsRelocatable<Gadget> : std::true_type {};

folly::fbvector<Gadget> vec;  // CRASHES AT RUNTIME due to fraudulent warrant

If this proposal is adopted, then Folly can start using static_assert(std::is_trivially_relocatable_v<T>) in the implementation of fbvector, and the programmer can stop writing explicit warrants. Finally, the programmer can start writing assertions of correctness, which aids maintainability and can even find real bugs. Example:

class Widget {
    std::vector<int> lst_;
};
static_assert(std::is_trivially_relocatable_v<Widget>);  // correctly SUCCEEDS

class Gadget {
    std::list<int> lst_;
};
static_assert(std::is_trivially_relocatable_v<Gadget>);  // correctly ERRORS

The improvement in user experience and safety in real-world codebases ([Folly], [BSL], [Qt], etc.) is the most important benefit to be gained by this proposal.

3. Design goals

For the former, longer, contents of this section, see P1144R7 §3.

Briefly: We want to support all five of the following use-cases (Godbolt).

static_assert(std::is_trivially_relocatable_v<std::unique_ptr<int>>); // #1

struct RuleOfZero { std::unique_ptr<int> p_; };
static_assert(std::is_trivially_relocatable_v<RuleOfZero>); // #2

struct [[trivially_relocatable]] RuleOf3 {
    RuleOf3(RuleOf3&&);
    RuleOf3& operator=(RuleOf3&&);
    ~RuleOf3();
};
static_assert(std::is_trivially_relocatable_v<RuleOf3>); // #3

struct [[trivially_relocatable]] Wrap0 {
    boost::movelib::unique_ptr<int> p_;
    static_assert(!std::is_trivially_relocatable_v<decltype(p_)>);
        // it’s not annotated, but we know it’s actually trivial
};
static_assert(std::is_trivially_relocatable_v<Wrap0>); // #4

struct [[trivially_relocatable]] Wrap3 {
    Wrap3(Wrap3&&);
    Wrap3& operator=(Wrap3&&);
    ~Wrap3();
    int i_;
    boost::interprocess::offset_ptr<int> p_ = &i_;
    static_assert(!std::is_trivially_relocatable_v<decltype(p_)>);
        // it’s not trivial, but we preserve our invariant correctly
};
static_assert(std::is_trivially_relocatable_v<Wrap3>); // #5

4. Proposed wording for C++2b

The wording in this section is relative to the current working draft ([N4944]).

Note: There is no difficulty in changing the attribute syntax to a contextual-keyword syntax; the only downsides are aesthetic. We can defer that decision to the last minute, according to CWG’s feedback on the final wording.

4.1. Nothrow bidirectional iterator [algorithms.requirements]

Modify [algorithms.requirements] as follows:

  • If an algorithm’s template parameter is named ForwardIterator, ForwardIterator1, ForwardIterator2, or NoThrowForwardIterator, the template argument shall meet the Cpp17ForwardIterator requirements ([forward.iterators]) if it is required to be a mutable iterator, or model forward_iterator ([iterator.concept.forward]) otherwise.

  • If an algorithm’s template parameter is named NoThrowForwardIterator, the template argument is also required to have the property that no exceptions are thrown from increment, assignment, or comparison of, or indirection through, valid iterators.

  • If an algorithm’s template parameter is named BidirectionalIterator, BidirectionalIterator1, or BidirectionalIterator2, or NoThrowBidirectionalIterator, the template argument shall meet the Cpp17BidirectionalIterator requirements ([bidirectional.iterators]) if it is required to be a mutable iterator, or model bidirectional_iterator ([iterator.concept.bidir]) otherwise.

  • If an algorithm’s template parameter is named NoThrowBidirectionalIterator, the template argument is also required to have the property that no exceptions are thrown from increment, decrement, assignment, or comparison of, or indirection through, valid iterators.

4.2. Relocation operation [defns.relocation]

Add a new section in [intro.defs]:

relocation operation

the homogeneous binary operation performed by std::relocate_at, consisting of a move construction immediately followed by a destruction of the source object

4.3. relocate_at and relocate [specialized.relocate]

Add a new section after [specialized.destroy]:

template<class T>
T *relocate_at(T* source, T* dest);

Mandates: T is a complete non-array object type.

Effects: Equivalent to:

struct guard { T *t; ~guard() { destroy_at(t); } } g(source);
return ::new (voidify(*dest)) T(std::move(*source));

except that if T is trivially relocatable [basic.types], side effects associated with the relocation of the value of *source might not happen.

template<class T>
[[nodiscard]] remove_cv_t<T> relocate(T* source);

Mandates: T is a complete non-array object type.

Effects: Equivalent to:

remove_cv_t<T> t = std::move(source);
destroy_at(source);
return t;

except that if T is trivially relocatable [basic.types], side effects associated with the relocation of the object’s value might not happen.

Note: These functions have both been implemented in my libc++ fork; for relocate, see godbolt.org/z/cqPP4oeE9 and [StdRelocateIsCute]. My implementation of their "as-if-by-memcpy" codepaths relies on Clang’s __builtin_memmove; vendors can use any vendor-specific means to implement them. The wording also deliberately permits a low-quality implementation with no such codepath at all. See @45:23–48:39 in my C++Now 2019 talk.

4.4. uninitialized_relocate, uninitialized_relocate_n, uninitialized_relocate_backward [uninitialized.relocate]

Add a new section after [uninitialized.move]:

template<class InputIterator, class NoThrowForwardIterator>
NoThrowForwardIterator uninitialized_relocate(InputIterator first, InputIterator last,
                                              NoThrowForwardIterator result);

Effects: Equivalent to:

try {
  for (; first != last; ++result, (void)++first) {
    ::new (voidify(*result))
      typename iterator_traits<NoThrowForwardIterator>::value_type(std::move(*first));
    destroy_at(addressof(*first));
  }
  return result;
} catch (...) {
  destroy(++first, last);
  throw;
}

except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.

Remarks: If an exception is thrown, all objects in both the source and destination ranges are destroyed.

template<class InputIterator, class Size, class NoThrowForwardIterator>
  pair<InputIterator, NoThrowForwardIterator>
    uninitialized_relocate_n(InputIterator first, Size n, NoThrowForwardIterator result);

Effects: Equivalent to:

try {
  for (; n > 0; ++result, (void)++first, --n) {
    ::new (voidify(*result))
      typename iterator_traits<NoThrowForwardIterator>::value_type(std::move(*first));
    destroy_at(addressof(*first));
  }
  return {first, result};
} catch (...) {
  destroy_n(++first, --n);
  throw;
}

except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.

Remarks: If an exception is thrown, all objects in both the source and destination ranges are destroyed.

template<class BidirectionalIterator, class NoThrowBidirectionalIterator>
  NoThrowBidirectionalIterator
    uninitialized_relocate_backward(BidirectionalIterator first, BidirectionalIterator last,
                                    NoThrowBidirectionalIterator result);

Effects: Equivalent to:

try {
  for (; last != first; ) {
    --last;
    --result;
    ::new (voidify(*result))
      typename iterator_traits<NoThrowBidirectionalIterator>::value_type(std::move(*last));
    destroy_at(addressof(*last));
  }
  return result;
} catch (...) {
  destroy(first, ++last);
  throw;
}

except that if the iterators' common value type is trivially relocatable, side effects associated with the relocation of the object’s value might not happen.

Remarks: If an exception is thrown, all objects in both the source and destination ranges are destroyed.

Note: The Remarks allude to blanket wording in [specialized.algorithms.general]/2.

4.5. Trivially relocatable type [basic.types.general]

Add a new section in [basic.types.general]:

An object type T is a trivially relocatable type if it is:

  • a trivially copyable type, or

  • an array of trivially relocatable type, or

  • a (possibly cv-qualified) class type declared with a trivially_relocatable attribute with value true [dcl.attr.trivreloc], or

  • a (possibly cv-qualified) class type which:

    • has no user-provided move constructors or move assignment operators,

    • has no user-provided copy constructors or copy assignment operators,

    • has no user-provided destructors,

    • has no virtual member functions,

    • has no virtual base classes,

    • all of whose members are either of reference type or of trivially relocatable type, and

    • all of whose base classes are of trivially relocatable type.

[Note: For a trivially relocatable type, the relocation operation (as performed by, for example, std::swap_ranges or std::vector::reserve) is tantamount to a simple copy of the underlying bytes. —end note]

[Note: It is intended that most standard library types be trivially relocatable types. —end note]

Note: Polymorphic types are disallowed from "natural" trivial relocatability. See Appendix C, example 5. Volatile members are not disallowed. See [Subobjects].

4.6. [[trivially_relocatable]] attribute [dcl.attr.trivreloc]

Add a new section after [dcl.attr.nouniqueaddr]:

The attribute-token trivially_relocatable specifies that a class type’s relocation operation has no visible side-effects other than a copy of the underlying bytes, as if by the library function std::memcpy. It may be applied to the definition of a class. It shall appear at most once in each attribute-list. An attribute-argument-clause may be present and, if present, shall have the form

( constant-expression )
The constant-expression shall be an integral constant expression of type bool. If no attribute-argument-clause is present, it has the same effect as an attribute-argument-clause of (true).

If any definition of a class type has a trivially_relocatable attribute with value V, then each definition of the same class type shall have a trivially_relocatable attribute with value V. No diagnostic is required if definitions in different translation units have mismatched trivially_relocatable attributes.

If a class type is declared with the trivially_relocatable attribute, and the program relies on observable side-effects of its relocation other than a copy of the underlying bytes, the behavior is undefined.

4.7. Type trait is_trivially_relocatable [meta.unary.prop]

Add a new entry to Table 47 in [meta.unary.prop]:

TemplateConditionPreconditions
template<class T> struct is_trivially_relocatable; T is a trivially relocatable type. remove_all_extents_t<T> shall be a complete type or cv void.

4.8. relocatable concept [concept.relocatable]

Add a new section after [concept.copyconstructible]:

template<class T>
  concept relocatable = move_constructible<T>;

If T is an object type, then let rv be an rvalue of type T, lv an lvalue of type T equal to rv, and u2 a distinct object of type T equal to rv. T models relocatable only if

  • After the definition T u = rv;, u is equal to u2.

  • T(rv) is equal to u2.

  • If the expression u2 = rv is well-formed, then the expression has the same semantics as u2.~T(); ::new ((void*)std::addressof(u2)) T(rv);

  • If the definition T u = lv; is well-formed, then after the definition u is equal to u2.

  • If the expression T(lv) is well-formed, then the expression’s result is equal to u2.

  • If the expression u2 = lv is well-formed, then the expression has the same semantics as u2.~T(); ::new ((void*)std::addressof(u2)) T(lv);

Note: We intend that a type may be relocatable regardless of whether it is copy-constructible; but, if it is copy-constructible then copy-and-destroy must have the same semantics as move-and-destroy. We intend that a type may be relocatable regardless of whether it is assignable; but, if it is assignable then assignment must have the same semantics as destroy-and-copy or destroy-and-move. The semantic requirements on assignment help us optimize vector::insert and vector::erase. pmr::forward_list<int> satisfies relocatable, but it models relocatable only when all relevant objects have equal allocators.

5. Rationale and alternatives

5.1. Attribute [[maybe_trivially_relocatable]]

My Clang implementation, currently available on Godbolt, supports both [[clang::trivially_relocatable]] and another attribute called [[clang::maybe_trivially_relocatable]], which John McCall requested that I explore.

In Issaquah (February 2023), [P2786R0] suggested a very similar design to [[clang::maybe_trivially_relocatable]]. EWGI took a three-way straw poll on [[trivially_relocatable]] versus [[maybe_trivially_relocatable]], with an inconclusive 7–5–6 vote (the author of P1144 voting "For" and the three representatives of P2786 presumably voting "Against").

For discussion of [[maybe_trivially_relocatable]] and why I don’t think it’s the right thing to standardize, see

5.2. Confusing interactions with std::pmr

Note: This section was added in P1144R5, revised in P1144R7, and again in P1144R8. Here I assume libc++, where std::string is trivially relocatable. Feel free to substitute your favorite trivially relocatable container type; I’m sticking with string for these examples because it is short and easy to spell.

P1144 proposes a "trivial relocatability" very similar to "trivial copyability": if the user provides any special member, we assume that the user wants control over this type’s value semantics, and we won’t implicitly infer triviality. This includes the assignment operator.

struct A {
    A(A&&) = default;
    A& operator=(A&&);
    ~A() = default;
};
static_assert(!std::is_trivially_relocatable_v<A>);
  // P2786 would say the opposite here

P1144 treats move as an optimization of copy; thus we assume that it always ought to be safe to replace "copy and destroy" with "move and destroy" (and thus with "relocate"). This is informed by Arthur’s experience standardizing "implicit move" in P1155 (C++20) and P2266 (C++23). "Implicit move" affects pmr types because their copy differs from their move. Example:

struct ByValueSink {
    ByValueSink(std::pmr::vector<int> v) : v_(std::move(v)) {}
    std::pmr::vector<int> v_;
};

ByValueSink f(std::pmr::memory_resource *mr) {
    auto v = std::pmr::vector<int>(mr);
    return v;
      // C++17, Clang 12-: Copies (changes allocator)
      // C++20, GCC, Clang 13+: Moves (preserves allocator)
}

We didn’t care about this when fiddling with implicit move; P1144 implicitly believes we shouldn’t care about it now. (This position might be debatable, but it’s definitely part of the worldview that led to P1144’s specific proposed semantics, so it’s good to have this background in mind.)

P1144 wants to be able to optimize not just vector reserve, but also vector insert and erase (see table §2.1). Today, vector insert and erase use move-assignment, not relocation. (In fact, [vector.modifiers]/5 specifies the exact number of times "the assignment operator of T is called." We might want to fix that, but that’s out of scope for this paper.) We can optimize vector insert to use (trivial) relocation for element types that model relocatable ([concept.relocatable]).

Some types invariably model relocatable; that is, their assignment is naturally tantamount to construct-and-destroy. But for pmr types and polymorphic types, assignment is not tantamount to construct-and-destroy, except under certain conditions. For pmr types, the condition is "both objects have the same allocator." For polymorphic types, the condition is "both objects have the same dynamic type."

std::vector<Derived> dv = { ~~~ };
dv.erase(dv.begin());
  // Every object in the vector certainly has the same dynamic type.

std::pmr::vector<std::pmr::string> pv = { ~~~ };
pv.erase(pv.begin());
  // Every string in the vector certainly has the same allocator.

Derived da[5] = { ~~~ };
std::rotate(da, da+2, da+5);
  // Every object in the contiguous range certainly has the same dynamic type.

std::pmr::string pa[5] = { ~~~ };
std::rotate(pa, pa+2, pa+5);
  // Objects in the contiguous range might have different allocators?

P1144 aims to permit efficient insertion and erasure in std::vector. Example:

std::vector<std::string> vs(501);
vs.erase(vs.begin());

This erase requires us to shift down 500 std::string objects, which can be done either by 500 calls to string::operator= followed by one call to ~string, or by one call to ~string followed by a memmove (as seen for example in [Folly] and BSL; see table §2.1). We want to permit BSL’s implementation. That’s why since P1144R5, the definition of concept relocatable in §4.8 places semantic requirements on T's assignment operators (if they exist) as well as on T's constructors and destructors.

If pmr::string is considered trivially relocatable, this will trickle down into all Rule-of-Zero types with a pmr::string member. (Godbolt.)

static_assert(std::is_trivially_relocatable_v<std::pmr::string>);
  // Before: not true, of course
  // After: suppose this is true
struct A {
  std::pmr::string ps;
};
static_assert(std::is_trivially_relocatable_v<A>);
  // After: then this will also be true, by the Rule of Zero
std::vector<A> v;
v.push_back({std::pmr::string("one", &mr1)});
v.push_back({std::pmr::string("two", &mr2)});
v.erase(v.begin());
  // Before: Well-defined behavior, acts as-if by assignment
  // After: Do we somehow make this a precondition violation?
assert(v[0].ps.get_allocator().resource() == &mr1);
  // Before: Succeeds
  // After: Might fail (and on a quality implementation, *will* fail)

If we (1) advertise pmr::string as is_trivially_relocatable; (2) propagate trivial relocatability in the core language as both P1144 and P2786 propose to do; and (3) optimize vector insert and erase for trivially relocatable types; then we inevitably arrive here. Arthur’s solution is to impose preconditions on user-defined types used within certain parts of the STL: Just as their destructors are forbidden to (dynamically) throw exceptions, and their copy constructors are forbidden to (dynamically) make things that aren’t copies, their assignment operators ought to be forbidden to (dynamically) violate concept relocatable's semantic requirements.

What new wording is needed to achieve this?

5.3. Non-issue: Overlapping base-class subobjects

Note: This section was added in P1144R7 and resolved in P1144R8.

In the following snippet, struct Dangerous is trivially copyable, so we might think that std::swap(ra, rb) should be allowed to swap the bytes of ra with the bytes of rb. A naïve approach is to swap sizeof(Dangerous) == 8 bytes, but that’s unsafe, because the tail padding of Dangerous is allowed to contain the value of Derived::k. If std::swap is to do "trivial swapping," it must account for potentially overlapping subobjects.

struct Dangerous {
    Dangerous(int i, int j) : i(i), j(j) {}
    int i;
    short j;
};
static_assert(std::is_standard_layout_v<Dangerous>);
static_assert(std::is_trivially_copyable_v<Dangerous>);
static_assert(std::is_trivially_relocatable_v<Dangerous>);

struct Derived : Dangerous {
    Derived(int i, int j, int k) : Dangerous(i, j), k(k) {}
    short k;
};
static_assert(!std::is_standard_layout_v<Derived>);
static_assert(std::is_trivially_copyable_v<Derived>);
static_assert(std::is_trivially_relocatable_v<Derived>);

int main() {
    Derived a = Derived(1,2,3);
    Derived b = Derived(4,5,6);
    Dangerous &ra = a, &rb = b;
    std::swap(ra, rb);  // must swap only 6 bytes,
                        // not sizeof(Dangerous) = 8 bytes
    assert(a.k == 3 && b.k == 6);
}

Arthur’s libc++ fork implements this solution (P1144R7’s "possible solution #4"): std::swap is allowed to bytewise-swap trivially copyable and/or trivially relocatable types only when they are definitely not potentially overlapping subobjects. In practice, this means:

6. Acknowledgements

Thanks to Pablo Halpern for [N4158], to which this paper bears a striking resemblance —including the meaning assigned to the word "trivial," and the library-algorithm approach to avoiding the problems with "lame duck objects" discussed in the final section of [N1377]. See discussion of N4034 at Rapperswil (June 2014) and discussion of N4158 at Urbana (November 2014).

Significantly different approaches to this problem have previously appeared in Rodrigo Castro Campos’s [N2754], Denis Bider’s [P0023R0] (introducing a core-language "relocation" operator), and Niall Douglas’s [P1029R3] (treating trivial relocatability as an aspect of move-construction in isolation, rather than an aspect of the class type as a whole). A less different approach is taken in Mungo Gill & Alisdair Meredith’s [P2786R0].

Thanks to Elias Kosunen, Niall Douglas, John Bandela, and Nicolas Lesser for their feedback on early drafts of P1144R0.

Thanks to Nicolas Lesser and John McCall for their review comments on the Clang implementation [D50119].

Many thanks to Matt Godbolt for allowing me to install my Clang fork on Compiler Explorer (godbolt.org). See also [Announcing].

Thanks to Howard Hinnant for appearing with me on [CppChat], and to Jon Kalb and Phil Nash for hosting us.

Thanks to Marc Glisse for his work integrating a "trivially relocatable" trait into GNU libstdc++ and for answering my questions on GCC bug 87106.

Thanks to Jens Maurer for his feedback on P1144R3 at Kona 2019, and to Corentin Jabot for championing P1144R4 at Prague 2020.

Thanks to Dana Jansens for her contributions re overlapping and non-standard-layout types (see [Subspace]), to Alisdair Meredith for our extensive discussions during the February 2023 drafting of [P2786R0], to Giuseppe D’Angelo and Thiago Maceira for contributing the Qt entries in table §2.1, and to Giuseppe D’Angelo for extensive review comments and discussion.

Thanks to Mungo Gill and Alisdair Meredith for [P2786R0] and [P2814R0].

Appendix A: Straw polls

Polls taken at EWGI at Issaquah on 2023-02-10

Arthur O’Dwyer presented [P1144R6]. Alisdair Meredith presented [P2786R0] (which proposed a [[maybe_trivially_relocatable]]-style facility, and expressed it as a contextual keyword instead of an attribute). EWGI took the following straw polls (as well as polls on attribute syntax and on both papers' readiness for EWG).

SF F N A SA
The problem presented in P1144/P2786 is worth solving. 10 8 0 0 0
The problem being introduced in P1144/P2786 should be solved in a more general way instead of as proposed. 3 0 5 6 4
The annotation should "trust the user" as in P1144R6’s [[trivially_relocatable]] ("sharp knife"), instead of diagnosing as in P1144R6’s [[clang::maybe_trivially_relocatable]] and P2786R0’s trivially_relocatable ("dull knife"). Three-way poll. 7 5 6

Polls taken at EWGI at Prague on 2020-02-13

Corentin Jabot championed P1144R4. EWGI discussed P1144R4 and Niall Douglas’s [P1029R3] consecutively, then took the following straw polls (as well as a poll on the attribute syntax).

SF F N A SA
We believe that P1029 and P1144 are sufficiently different that they should be advanced separately. 7 3 2 0 0
EWGI is ok to have the spelling as an attribute with an expression argument. 3 5 1 1 0
EWGI thinks the author should explore P1144 as a customizable type trait. 0 0 0 9 2
Forward P1144 to EWG. 1 3 4 1 0

For polls taken September–November 2018, see [P1144R6].

Appendix B: Sample code

See [P1144R6]'s Appendix B for reference implementations of relocate, relocate_at, and P1144R6’s version of the uninitialized_relocate library algorithm, plus a conditionally trivially relocatable std::optional<T>.

I have implemented the entire Standard Library using the [[clang::trivially_relocatable]] attribute; you can find the source code on my GitHub and explore the resulting codegen on Godbolt Compiler Explorer.

Appendix C: Examples of non-trivially relocatable class types

Class contains pointer to self

This fictional short_string illustrates a mechanism that can apply to any small-buffer-optimized class. libc++'s std::function uses this mechanism and is thus not trivially relocatable.

However, different mechanisms for small-buffer optimization exist. libc++'s std::any achieves small-buffer optimization on a 24-byte buffer, without (necessarily) sacrificing trivial relocatability.

struct short_string {
    char *data_ = buffer_;
    size_t size_ = 0;
    char buffer_[8] = {};

    const char *data() const { return data_; }

    short_string() = default;
    short_string(const char *s) : size_(strlen(s)) {
        if (size_ < sizeof buffer_)
            strcpy(buffer_, s);
        else
            data_ = strdup(s);
    }
    short_string(short_string&& s) {
        memcpy(this, &s, sizeof(*this));
        if (s.data_ == s.buffer_)
            data_ = buffer_;
        else
            s.data_ = nullptr;
    }
    ~short_string() {
        if (data_ != buffer_)
            free(data_);
    }
};

Allocated memory contains pointer to self

std::list needs somewhere to store its "past-the-end" node, commonly referred to as the "sentinel node," whose prev pointer points to the list’s last node. If the sentinel node is allocated on the heap, then std::list can be trivially relocatable; but if the sentinel node is placed within the list object itself (as happens on libc++ and libstdc++), then relocating the list object requires fixing up the list’s last node’s next pointer so that it points to the new sentinel node inside the destination list object. This fixup of an arbitrary heap object cannot be simulated by memcpy.

Traditional implementations of std::set and std::map also store a "past-the-end" node inside themselves and thus also fall into this category.

struct node {
    node *prev_ = nullptr;
    node *next_ = nullptr;
};
struct list {
    node n_;
    iterator begin() { return iterator(n_.next_); }
    iterator end() { return iterator(&n_); }
    list(list&& l) {
        if (l.n_.next_) l.n_.next_->prev_ = &n_;  // fixup
        if (l.n_.prev_) l.n_.prev_->next_ = &n_;  // fixup
        n_ = l.n_;
        l.n_ = node{};
    }
    // ...
};

Class invariant depends on this

The offset_ptr provided by [Boost.Interprocess] is an example of this category.

struct offset_ptr {
    uintptr_t value_;

    uintptr_t here() const { return uintptr_t(this); }
    uintptr_t distance_to(void *p) const { return uintptr_t(p) - here(); }
    void *get() const { return (void*)(here() + value_); }

    offset_ptr() : value_(distance_to(nullptr)) {}
    offset_ptr(void *p) : value_(distance_to(p)) {}
    offset_ptr(const offset_ptr& rhs) : value_(distance_to(rhs.get())) {}
    offset_ptr& operator=(const offset_ptr& rhs) {
        value_ = distance_to(rhs.get());
        return *this;
    }
    ~offset_ptr() = default;
};

Program invariant depends on this

In the following snippet, struct Widget is relocatable, but not trivially relocatable, because the relocation operation of destroying a Widget at point A and constructing a new Widget at point B has behavior that is observably different from a simple memcpy.

std::set<void *> registry;

struct registered_object {
    registered_object() { registry.insert(this); }
    registered_object(registered_object&&) = default;
    registered_object(const registered_object&) = default;
    registered_object& operator=(registered_object&&) = default;
    registered_object& operator=(const registered_object&) = default;
    ~registered_object() { registry.erase(this); }
};

struct Widget : registered_object {};

Polymorphic downcast effectively relies on offset-into-self

Thanks to David Stone for this example. In the following snippet, struct Base is relocatable, but not trivially relocatable, because its copy constructor and assignment operator do not copy the entire state of the right-hand object. (Notice that pf is initialized with f, not with a copy of o.pf.)

struct Base {
    static int f(Base*) { return 21; }
    int (*pf)(Base*);
    Base(int (*pf)(Base*) = f) : pf(pf) {}
    Base(const Base& o) : pf(f) {}
    Base& operator=(const Base&) { return *this; }
};
struct Derived : Base {
    static int f(Base *self) { return ((Derived*)self)->x; }
    Derived() : Base(f) {}
    Derived(const Derived&) = default;
    Derived& operator=(const Derived& o) { x = o.x; return *this; }
    int x = 42;
};

int main() {
    Base&& d = Derived();
    Base&& b = Base();
    std::swap(b, d);
    printf("%d\n", b.pf(&b));
}

The above snippet is isomorphic to a classically polymorphic hierarchy with virtual methods. Here is the same snippet using virtual:

struct Base {
    virtual int f() { return 21; }
};
struct Derived : Base {
    int f() override { return x; }
    int x = 42;
};

int main() {
    Base&& b = Base();
    Base&& d = Derived();
    std::swap(b, d);
    printf("%d\n", b.f());
}

This is why (since P1144R5) the compiler will not consider types with virtual methods to be "naturally" trivially relocatable.

Appendix D: Implementation experience

A prototype Clang/libc++ implementation is at

Side-by-side case studies of [[trivially_relocatable]], [[maybe_trivially_relocatable]], and [[trivially_relocatable(bool)]] are given in [P1144R6].

As of November 2018, libstdc++ performs the vector::resize optimization for any type which has manually specialized std::__is_bitwise_relocatable. (See this Godbolt.) Manual specialization is also the approach used by [Qt], [Folly], and [BSL]. As of May 2023, the only libstdc++ library type for which __is_bitwise_relocatable has been specialized is deque; see [Deque].

Clang trunk provides a compiler builtin type trait __is_trivially_relocatable(T) (see [D114732]), which is largely the same as the trait proposed in this paper. (There are slight differences; e.g., Clang reports polymorphic types, reference types, and incomplete types as trivially relocatable, whereas P1144 does not. I’m not aware that any of Clang’s differences were intentional.) Clang trunk has no equivalent of the [[trivially_relocatable]] attribute, so __is_trivially_relocatable(T) is true only for trivially copyable types and types marked with the [[clang::trivial_abi]] attribute. As of May 2023, Clang trunk has no conception of a type which is non-trivial for purposes of calls and yet is trivially relocatable. But Clang’s current status is compatible with P1144 (modulo the few unintentional differences in __is_trivially_relocatable mentioned above).

Appendix E: Open questions

In P1144R8, types with vptrs are never naturally trivially relocatable. Arthur claims that types with vptrs should be used only with inheritance, which means they never do value semantics (they slice instead), which means relocating one is always a bug, just like assigning or copying one. But [P2786R0] proposes to make polymorphic types naturally trivially relocatable. Is there any use-case for relocating polymorphic types? Is P1144 being too conservative here?

In P1144R8, std::vector<std::pmr::vector<int>>::erase(pos) can use trivial relocation to "close up the window" in the vector; this is also done in [Qt], [Folly], and [BSL]. The difference between closing-the-window-by-move-assignment and closing-the-window-by-memmove is detectable in a conforming C++ program. Therefore [P2786R0] proposes that relocation should not be allowed here. Is there any implementation experience of P2786R0’s conservative position here? ([BSL]'s bsl::vector follows P1144R8’s liberal position already.)

In P1144R8, std::rotate(a, a+2, a+8) on an array std::pmr::vector<int> a[10] can use trivial relocation to swap the elements. The difference between swap-by-move-assignment and swap-by-trivial-relocation is not detectable in a conforming C++ program, because swapping pmr::vectors with unequal allocators is UB. [P2786R0] conservatively proposes that relocation should not be allowed here, anyway. Is there any usage experience in [Qt], [BSL], [Folly], or elsewhere that bears on this question? (Arthur’s libc++ fork does relocation here, but doesn’t really count as usage experience since he’s the only one using it.)

If std::pmr::vector<int> is considered trivially relocatable, then so would be a Rule-of-Zero type Widget with a pmr::vector data member. This means that we optimize std::rotate(Widget*, ...) (Godbolt), which is a huge performance win but again is observably different behavior from the old way — and in this particular case the old behavior is not undefined! Still, it would be sad to give up this massive performance improvement for pmr types. Can we modify the preconditions of {pmr::vector, swap, swap_ranges, rotate} to make this optimization conforming?

Index

Terms defined by this specification

References

Normative References

[N4944]
Thomas Köppe. Working Draft, Standard for Programming Language C++. 22 March 2023. URL: https://wg21.link/n4944

Informative References

[Announcing]
Arthur O'Dwyer. Announcing "trivially relocatable". July 2018. URL: https://quuxplusone.github.io/blog/2018/07/18/announcing-trivially-relocatable/
[Bench]
Arthur O'Dwyer. Benchmark code from "The Best Type Traits C++ Doesn't Have". April 2018. URL: https://github.com/Quuxplusone/from-scratch/blob/095b246d/cppnow2018/benchmark-relocatable.cc
[Boost.Interprocess]
Ion Gaztañaga. Mapping Address Independent Pointer: offset_ptr. 2005. URL: https://www.boost.org/doc/libs/1_67_0/doc/html/interprocess/offset_ptr.html
[BSL]
Bloomberg. bslmf::IsBitwiseMoveable: bitwise moveable trait metafunction. 2013–2022. URL: https://github.com/bloomberg/bde/blob/962f7aa/groups/bsl/bslmf/bslmf_isbitwisemoveable.h#L8-L48
[CppChat]
Howard Hinnant; Arthur O'Dwyer. cpp.chat episode 40: It works but it's undefined behavior. August 2018. URL: https://www.youtube.com/watch?v=8u5Qi4FgTP8
[D114732]
Devin Jeanpierre. [clang] Mark trivial_abi types as trivially relocatable. November 2021. URL: https://reviews.llvm.org/D114732
[D50119]
Arthur O'Dwyer; Nicolas Lesser; John McCall. Compiler support for P1144R0 __is_trivially_relocatable(T). July 2018. URL: https://reviews.llvm.org/D50119
[Deque]
Marc Glisse. Improve relocation ... (__is_trivially_relocatable): Specialize for deque. November 2018. URL: https://github.com/gcc-mirror/gcc/commit/a9b9381580de611126c9888c1a6c12a77d9b682e
[FixedCapacityVector]
Arthur O'Dwyer. P1144 case study: Moving a `fixed_capacity_vector`. URL: https://quuxplusone.github.io/blog/2019/02/22/p1144-fixed-capacity-vector/
[Folly]
Facebook. Folly documentation on "Object Relocation". URL: https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md#object-relocation
[FollyIssue889]
Arthur O'Dwyer. Traits.h marks std::list as trivially relocatable, but in fact it is not. URL: https://github.com/facebook/folly/issues/889
[N1377]
Howard Hinnant; Peter Dimov; Dave Abrahams. N1377: A Proposal to Add Move Semantics Support to the C++ Language. September 2002. URL: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1377.htm
[N2754]
Rodrigo Castro Campos. N2754: TriviallyDestructibleAfterMove and TriviallyReallocatable (rev 3). September 2008. URL: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2754.html
[N4158]
Pablo Halpern. N4158: Destructive Move (rev 1). October 2014. URL: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4158.pdf
[P0023R0]
Denis Bider. P0023R0: Relocator: Efficiently Moving Objects. April 2016. URL: http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0023r0.pdf
[P0843R5]
Gonzalo Brito Gadeschi. static_vector. July 2022. URL: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p0843r5.html
[P1029R3]
Niall Douglas. P1029R3: move = bitcopies. January 2020. URL: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1029r3.pdf
[P1144R6]
Arthur O'Dwyer. Object relocation in terms of move plus destroy. June 2022. URL: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1144r6.html
[P2786R0]
Mungo Gill; Alisdair Meredith. Trivial relocatability options. February 2023. URL: https://wg21.link/p2786r0
[P2814R0]
Mungo Gill; Alisdair Meredith. Trivial relocatability — comparing P2786 with P1144. May 2023. URL: https://wg21.link/p2814r0
[Qt]
Qt Base. February 2023. URL: https://github.com/qt/qtbase/
[StdRelocateIsCute]
Arthur O'Dwyer. std::relocate's implementation is cute. May 2022. URL: https://quuxplusone.github.io/blog/2022/05/18/std-relocate/
[Subobjects]
Arthur O'Dwyer. When is a trivially copyable object not trivially copyable?. July 2018. URL: https://quuxplusone.github.io/blog/2018/07/13/trivially-copyable-corner-cases/
[Subspace]
Dana Jansens. Trivially Relocatable Types in C++/Subspace. January 2023. URL: https://danakj.github.io/2023/01/15/trivially-relocatable.html

Issues Index

What new wording is needed to achieve this?
In P1144R8, types with vptrs are never naturally trivially relocatable. Arthur claims that types with vptrs should be used only with inheritance, which means they never do value semantics (they slice instead), which means relocating one is always a bug, just like assigning or copying one. But [P2786R0] proposes to make polymorphic types naturally trivially relocatable. Is there any use-case for relocating polymorphic types? Is P1144 being too conservative here?
In P1144R8, std::vector<std::pmr::vector<int>>::erase(pos) can use trivial relocation to "close up the window" in the vector; this is also done in [Qt], [Folly], and [BSL]. The difference between closing-the-window-by-move-assignment and closing-the-window-by-memmove is detectable in a conforming C++ program. Therefore [P2786R0] proposes that relocation should not be allowed here. Is there any implementation experience of P2786R0’s conservative position here? ([BSL]'s bsl::vector follows P1144R8’s liberal position already.)
In P1144R8, std::rotate(a, a+2, a+8) on an array std::pmr::vector<int> a[10] can use trivial relocation to swap the elements. The difference between swap-by-move-assignment and swap-by-trivial-relocation is not detectable in a conforming C++ program, because swapping pmr::vectors with unequal allocators is UB. [P2786R0] conservatively proposes that relocation should not be allowed here, anyway. Is there any usage experience in [Qt], [BSL], [Folly], or elsewhere that bears on this question? (Arthur’s libc++ fork does relocation here, but doesn’t really count as usage experience since he’s the only one using it.)
If std::pmr::vector<int> is considered trivially relocatable, then so would be a Rule-of-Zero type Widget with a pmr::vector data member. This means that we optimize std::rotate(Widget*, ...) (Godbolt), which is a huge performance win but again is observably different behavior from the old way — and in this particular case the old behavior is not undefined! Still, it would be sad to give up this massive performance improvement for pmr types. Can we modify the preconditions of {pmr::vector, swap, swap_ranges, rotate} to make this optimization conforming?