<charconv> and char_traits) to the C++ freestanding library.
The current definition of the freestanding implementation is not very useful. Here is the current high level definition from WG21's [intro.compliance]:
Two kinds of implementations are defined: a hosted implementation and a freestanding implementation. For a hosted implementation, this document defines the set of available libraries. A freestanding implementation is one in which execution may take place without the benefit of an operating system, and has an implementation-defined set of libraries that includes certain language-support libraries ([compliance]).
Similar wording is present in 5.1.2.1 "Freestanding Environment" in WG14 N2454.
In a freestanding environment (in which C program execution may take place without any benefit of an operating system)[...]
The main people served by the current C++ freestanding definition are people writing their own hosted C++ standard library to sit atop the compiler author's freestanding implementation (i.e. the STLport use case). The C++ freestanding portions contain most of the functions and types known to the compiler that can't easily be authored in a cross-compiler manner.
The current set of freestanding libraries provides too little to kernel, micro-controller, and GPU programmers.
Why should a systems programmer need to rewrite std::from_chars or memcpy()?
I propose we provide the (nearly) maximal subset of the library that does not require an OS or space overhead. In order to continue supporting the "layered" C++ standard library users, we will continue to provide the (nearly) minimal subset of the library needed to support all the language features, even if these features have space overhead. Language features requiring space overhead or OS support will remain intact.
The C and C++ standard libraries have many generally useful facilities that systems programmers could benefit from. By requiring those functions to be present in freestanding implementations, we make it possible to make higher level programs both easier to write, and portable. Currently, programs that would like to be portable are required to either rely on implementation defined extensions, or provide look-alike implementations.
A freestanding C implementation is required to provide the entirety of the following headers:
<float.h>
<iso646.h>
<limits.h>
<stdalign.h>
<stdarg.h>
<stdbool.h>
<stddef.h>
<stdint.h>
<stdnoreturn.h>
Some additional features are required if the implementation defines the __STDC_IEC_60559_BFP__ (binary floating point) macro or the __STDC_IEC_60559_DFP__ (decimal floating point) macro.
This includes <fenv.h>, <math.h>, and parts of <stdlib.h>. Such implementations indirectly require locale support, as the <stdlib.h> numeric conversion functions are implemented in terms of isspace.
The entire core language is required.
This includes _Thread_local, which requires operating system interaction on multi-threaded systems.
A freestanding C++ implementation is required to provide the entirety of the following headers:
<cstddef>
<cfloat>
<climits>
<limits>
<version>
<cstdint>
<new>
<typeinfo>
<source_location>
<exception>
<initializer_list>
<compare>
<coroutine>
<cstdarg>
<concepts>
<type_traits>
<bit>
Almost all of <atomic> is required (C does not require <stdatomic.h> in freestanding implementations).
<cstdlib> must provide abort, atexit, at_quick_exit, exit, and quick_exit.
The entire core language is required. For C++, this is much more onerous than for C, as the C++ core language includes exceptions, RTTI, thread-safe static initialization, and heap allocations.
The in-flight paper P2013 removes the requirement to provide the allocating forms of ::operator new.
This requirement often meant that the underlying C implementation of a freestanding C++ library needed to have malloc and free implementations.
The in-flight paper P1642 adds many C++ specific facilities, but it also adds _Exit.
The specification for quick_exit specifically calls out _Exit, so its omission from the current freestanding requirements is a specification bug.
A freestanding C++ implementation is mostly a superset of a freestanding C implementation, even in the "C" parts of C++. This means that a freestanding C++ implementation cannot generally be built on top of a minimal freestanding C implementation. Either the C++ implementation must provide some of the C parts, or the C++ implementation will require a C implementation that provides more than the minimum.
The current scope of this proposal is limited to the freestanding standard library available to micro-controller, kernel, and GPU development.
This paper is currently concerned with the divisions of headers and library functions as they were in C++17. "Standard Library Modules" (P0581) discusses how the library will be split up in a post-modules world. This paper may influence the direction of P0581, but this paper won't make any modules recommendations.
In the C standard library, a new editorial strategy will be used to mark facilities as freestanding.
Prose in the standard will declare various facilities as freestanding library facilities.
Only the primary definition will be declared this way, so we won't be duplicating this prose multiple times for the same facility (e.g. NULL, size_t, wchar_t, etc...).
Prior to this paper, the required contents of the C freestanding library were called out by header, and (conditionally) by clause in the case of <stdlib.h> numeric conversion functions in 7.22.1.
This editorial strategy is cumbersome for partially required headers.
In the C++ standard library, the editorial strategy described in WG21 P1642 will be used to annotate which facilities are required in freestanding implementations.
C freestanding libraries would be required to provide more facilities than they are currently required to provide. Implementations likely already provide many of these functions due to user demand.
In theory, providing additional headers could silently break customer code that was already providing those headers. Those uses were undefined behavior according to WG14 N2454, 7.1.2 Standard Headers#4.
If a file with the same name as one of the above < and > delimited sequences, not provided as part of the implementation, is placed in any of the standard places that are searched for included source files, the behavior is undefined.
A C program could be using its own definition of, say, memcpy, so long as it does not include string.h.
Implementations that are worried about such cases will need to take care to use macro definitions for most functions that forward to reserved identifier functions, so as to avoid multiple definitions.
C++ standard library headers will likely need to add preprocessor feature toggles to portions of headers that would emit warnings or errors in freestanding mode. The timeliness (compile time vs. link time) of errors remains a quality-of-implementation detail.
A minimal freestanding C17 standard library will not be sufficient to provide the C portions of the C++ standard library.
std::char_traits and many of the function specializations in <algorithm> are implemented in terms of non-freestanding C functions.
In practice, most C libraries are not minimal freestanding C17 libraries.
The optimized versions of the <cstring> and <cwchar> functions will often be the same for both hosted and freestanding environments.
The main way in which an implementation of (for example) memcpy could differ between hosted and freestanding is that some freestanding implementations (e.g. kernel implementations) would not want memcpy to use vector / floating point registers.
My expectation is that no new C++ freestanding library will be authored as a result of this paper. Instead hosted libraries will be stripped down through some feature toggle mechanism to become freestanding.
Even more so than users of a hosted implementation, kernel, micro-controller, and GPU programmers do not want to pay for what they don't use. As a consequence, I am not adding features that require global data storage, even if that storage is immutable.
Note that the following concerns are not revolving around execution time performance. These are generally concerns about space overhead and correctness.
This proposal doesn't remove problematic features from the language, but it does make it so that the bulk of the freestanding standard library doesn't require those features. Users that disable the problematic features (as is existing practice) will still have portable portions of the standard library at their disposal.
Note that we cannot just take the list of C++ constexpr functions and make those functions the freestanding subset. We also can't do the reverse, and make everything freestanding constexpr or conditionally noexcept. memcpy cannot currently be made constexpr because it must convert from cv void* to unsigned char[]. Several floating point functions could be made constexpr, but would not be permitted in freestanding. constexpr also allows allocations, which freestanding avoids.
We also cannot just take the list of everything that is conditionally noexcept and make those functions freestanding. The "Lakos Rule"[Meredith11] prohibits most standard library functions from being conditionally noexcept, unless they have a wide contract.
Regardless, if a function or class is constexpr or noexcept, and it doesn't involve floating point, then that function or class is a strong candidate to be put into freestanding mode.
In the future, it may make sense to allow all constexpr functions into freestanding, so long as they are used in a constexpr context and not invoked at runtime.
In C++, to_chars, from_chars, and abs are overloaded on floating point and integral types.
This paper is making the integral overloads required in freestanding implementations.
It would be undesirable, though, for the behavior of a library or program to silently change when porting it from a freestanding implementation to a hosted implementation.
That could easily happen with this overload set if a user called abs(0.5).
If the floating point overloads were merely omitted, then abs(0.5) would call one of the integral overloads on a freestanding implementation.
To avoid this trap, the floating point overloads will be marked as // freestanding deleted.
Freestanding implementations can either =delete the function, or provide an implementation of the function that meets the hosted requirements.
This will cause accidental uses of these functions to fail to compile, as =delete functions participate in overload resolution.
Note that split overload set problems already exist in the C++ standard. A translation unit that includes <cinttypes> and calls abs(0.5) may end up resolving the overload to abs(intmax_t).
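The effect of a deleted overload can be sketched in isolation. The following is a minimal illustration, not the proposed library text: the namespaces fs_sketch and single_overload_sketch, and the abs_is_callable probe, are hypothetical names invented here, and C++17 is assumed for std::void_t. It shows that a call with a floating point argument selects the deleted overload and is therefore rejected, while a namespace exposing only a single integral overload silently converts the argument.

```cpp
#include <type_traits>
#include <utility>

namespace fs_sketch {

// Hypothetical stand-ins for the integral overloads required in freestanding.
constexpr int abs(int j) { return j < 0 ? -j : j; }
constexpr long abs(long j) { return j < 0 ? -j : j; }
constexpr long long abs(long long j) { return j < 0 ? -j : j; }

// "freestanding deleted": still declared, so overload resolution finds them,
// but selecting one of them makes the call ill-formed.
float abs(float) = delete;
double abs(double) = delete;
long double abs(long double) = delete;

// Probe (hypothetical helper): true when fs_sketch::abs(T) is a valid call.
template <class T, class = void>
struct abs_is_callable : std::false_type {};

template <class T>
struct abs_is_callable<T, std::void_t<decltype(fs_sketch::abs(std::declval<T>()))>>
    : std::true_type {};

}  // namespace fs_sketch

namespace single_overload_sketch {
// Only one integral overload is visible, similar to the abs(intmax_t) case.
constexpr long long abs(long long j) { return j < 0 ? -j : j; }
}  // namespace single_overload_sketch

// Integral calls work; floating point calls pick the deleted overload.
static_assert(fs_sketch::abs_is_callable<int>::value, "integral overload is usable");
static_assert(!fs_sketch::abs_is_callable<double>::value, "deleted overload rejects double");

// Without a deleted overload, the double argument is silently truncated to 0.
static_assert(single_overload_sketch::abs(-0.5) == 0, "silent conversion to integer");
```

In ordinary code, writing fs_sketch::abs(0.5) directly would be a hard compile error, which is the diagnostic this section is after.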
Exceptions either require external jump tables or extra bookkeeping instructions. This consumes program storage space.
In the Itanium ABI, throwing an exception requires a heap allocation. In the Microsoft ABI, re-throwing an exception will consume surprisingly large amounts of stack space (2,100 bytes for a re-throw in 32-bit environments, 9,700 bytes in a 64-bit environment). Program storage space, heap space, and stack space are typically scarce resources in micro-controller development.
In environments with threads, exception handling requires the use of thread-local storage.
The heap is a big set of global state. In addition, C++ heap exhaustion is typically expressed via exception. Some micro-controller systems don't have a heap. In kernel environments, there is typically a heap, but there isn't a reasonable choice of which heap to use as the default. In the Windows kernel, the two best candidates for a default heap are the paged pool (plentiful available memory, but unsafe to use in many contexts), and the non-paged pool (safe to use, but limited capacity). The C++ implementation in the Windows kernel forces users to implement their own global operator new to make this decision.
P2013 allows freestanding C++ implementations to omit the global allocating ::operator new implementations by default.
Many micro-controller systems don't have floating point hardware. Software emulated floating point can drag in large runtimes that are difficult to optimize away.
Most operating systems speed up system calls by not saving and restoring floating point state. That means that kernel uses of floating point operations require extra care to avoid corrupting user state.
In C, the dynamic floating-point environment has thread storage duration. This drags in the same set of problems that thread-local storage has.
Functions that rely on global or thread-local state are not being added to the freestanding library.
Examples are the locale aware functions, the C random number functions, and functions relying on errno.
The musl, newlib, and uclibc-ng C libraries are all marketed towards embedded use cases, and are all frequently used in embedded environments. All of the C facilities that this paper adds to the freestanding requirements are already present in musl, newlib, and uclibc-ng. This includes memccpy.
The Linux kernel uses a custom C library, though that library is more minimal, and in non-standard locations. The Linux kernel has implementations of bsearch (in <linux/bsearch.h>) and all of the <string.h> functions except for memccpy, though the <string.h> functions are in <linux/string.h>. The <wchar.h>, <inttypes.h>, and most of the <stdlib.h> functions are not present.
The Microsoft Windows kernel also has a C implementation that is distinct from the one that ships with Microsoft Visual Studio. That C implementation contains all of the new freestanding requirements with the exception of imaxabs, imaxdiv, llabs, and lldiv.
On the C++ front, I have successfully tested Visual Studio's char_traits implementation with a C++14 era set of libc++ tests, all in the Windows kernel. The integral <charconv> functions have not been tested, but I do not foresee any issues there.
<cstdlib>
- size_t
- div_t
- ldiv_t
- lldiv_t
- NULL
- bsearch
- qsort
- abs(int)
- abs(long int)
- abs(long long int)
- labs
- llabs
- div
- ldiv
- lldiv
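As a rough illustration of why these <cstdlib> facilities suit freestanding use, the sketch below sorts and searches a caller-provided buffer with qsort and bsearch. The namespace sketch and the helpers compare_ints and sort_and_contains are hypothetical names introduced for this example only. Nothing here needs a heap, errno, or locale data.

```cpp
#include <cstddef>
#include <cstdlib>

namespace sketch {

// Three-way comparison callback in the form qsort and bsearch expect.
inline int compare_ints(const void* lhs, const void* rhs) {
    const int a = *static_cast<const int*>(lhs);
    const int b = *static_cast<const int*>(rhs);
    return (a > b) - (a < b);  // avoids overflow that a - b could cause
}

// Sorts the caller's buffer in place, then reports whether key is present.
inline bool sort_and_contains(int* values, std::size_t count, int key) {
    std::qsort(values, count, sizeof(int), compare_ints);
    return std::bsearch(&key, values, count, sizeof(int), compare_ints) != nullptr;
}

}  // namespace sketch
```

The buffer is owned by the caller, so the same code works whether the storage is static, on the stack, or carved out of a kernel pool.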
All the error #defines in <cerrno>, but not errno.
The errc enum from <system_error>.
Portions of <charconv>.
- to_chars_result
- from_chars_result
- to_chars (integral)
- from_chars (integral)
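The integral <charconv> functions report failure through the returned result object instead of errno or exceptions, which is what makes them a good fit here. A minimal sketch of their use follows. The namespace sketch and the helpers parse_int and format_hex are hypothetical names for this example. Only the std::to_chars and std::from_chars calls are standard.

```cpp
#include <charconv>
#include <cstddef>
#include <system_error>

namespace sketch {

// Parses a decimal integer occupying all of [first, last).
// Returns true only if every character was consumed and the value fits.
inline bool parse_int(const char* first, const char* last, int& out) {
    const std::from_chars_result r = std::from_chars(first, last, out, 10);
    return r.ec == std::errc() && r.ptr == last;
}

// Writes value in hexadecimal into [first, last).
// Returns the number of characters written, or 0 if the buffer is too small.
inline std::size_t format_hex(char* first, char* last, unsigned value) {
    const std::to_chars_result r = std::to_chars(first, last, value, 16);
    return r.ec == std::errc() ? static_cast<std::size_t>(r.ptr - first) : 0;
}

}  // namespace sketch
```

Neither helper null-terminates or allocates. The caller supplies the buffer and interprets the returned length.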
The char_traits class from <string>.
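To illustrate what char_traits offers without the rest of <string>, here is a small sketch that compares and searches character sequences using only its static member functions. The namespace sketch and the helpers equal_cstr and index_of are hypothetical names for this example. No std::string object is constructed, so nothing allocates.

```cpp
#include <cstddef>
#include <string>  // only char_traits is used from this header

namespace sketch {

// Compares two null-terminated strings for equality via char_traits.
inline bool equal_cstr(const char* a, const char* b) {
    using traits = std::char_traits<char>;
    const std::size_t len = traits::length(a);
    return len == traits::length(b) && traits::compare(a, b, len) == 0;
}

// Returns the index of the first occurrence of c in [s, s + len),
// or len if c does not occur.
inline std::size_t index_of(const char* s, std::size_t len, char c) {
    const char* p = std::char_traits<char>::find(s, len, c);
    return p != nullptr ? static_cast<std::size_t>(p - s) : len;
}

}  // namespace sketch
```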
Portions of <cstring>.
- memcpy
- memmove
- strcpy
- strncpy
- strcat
- strncat
- memcmp
- strcmp
- strncmp
- memchr
- strchr
- strcspn
- strpbrk
- strrchr
- strspn
- strstr
- memset
- strlen
For C, include memccpy in <string.h>, in addition to what is mentioned above for <cstring>.
Portions of <cwchar>.
- wcscpy
- wcsncpy
- wmemcpy
- wmemmove
- wcscat
- wcsncat
- wcscmp
- wcsncmp
- wmemcmp
- wcschr
- wcscspn
- wcspbrk
- wcsrchr
- wcsspn
- wcsstr
- wcstok
- wmemchr
- wcslen
- wmemset
A small portion of <cmath> will be present.
- abs(int)
- abs(long int)
- abs(long long int)
Portions of <cinttypes> will be present.
- imaxabs
- imaxdiv
- abs(intmax_t)
- div(intmax_t, intmax_t)
errno is not included as it is global state. In addition, errno is best implemented as a thread-local variable.
error_code, error_condition, and error_category all have string in the interface.
Many string functions (strtol and family) rely on errno.
strtok and rand aren't required to use thread-local storage, but good implementations do. I don't want to encourage bad implementations.
assert is not included as it requires a stderr stream.
_Exit is not included as I do not wish to add more termination functions.
I hope to remove most of them in the future.
Program termination requires involvement from the operating system / environment.
<cctype> and <cwctype> rely heavily on global locale data.
<cwchar>
<cwchar> functions are implementable for freestanding environments.
The Microsoft and EFI ecosystems (EFI was the successor to BIOS and the predecessor to UEFI) use wchar_t extensively.
<cmath> has a dependency on errno.
errno and string functions like strtol
Although errno is global data, it isn't much global data.
Thread safety is a concern for those platforms that have threading, but don't have thread-local storage.
Environments that don't support arbitrary thread local data could special case errno.
<stdatomic.h>
C does not require <stdatomic.h> in freestanding implementations, but C++ requires std::atomic.
I don't currently recommend adding <stdatomic.h> to freestanding C implementations, as that would also require dealing with non-lock-free atomics.
If others feel strongly about unifying this aspect of C and C++ freestanding implementations, then the facilities could be added.
A freestanding implementation that provides support for this paper shall define the following feature test macros:
Name | Header | Notes |
---|---|---|
__cpp_lib_freestanding_char_traits | <string> | |
__cpp_lib_freestanding_charconv | <charconv> | |
__cpp_lib_freestanding_cinttypes | <cinttypes> | |
__cpp_lib_freestanding_cstdlib | <cstdlib> and <cmath> | The only freestanding parts of <cmath> are abs overloads that are also covered in <cstdlib> |
__cpp_lib_freestanding_cstring | <cstring> | |
__cpp_lib_freestanding_cwchar | <cwchar> | |
__cpp_lib_freestanding_errc | <cerrno> and <system_error> | Covers errc and <cerrno> #defines |
The above macros are useful for detecting the presence of various facilities.
The user can provide a hand-rolled replacement on old or non-conforming implementations, while using the toolchain's facilities when available.
These macros follow the policies proposed in P2198: Freestanding Feature-Test Macros and Implementation-Defined Extensions.
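The detect-or-replace pattern described above might look like the following sketch. It assumes a C++20 toolchain so that <version> is available, and it treats a hosted implementation (__STDC_HOSTED__ equal to 1) as also having <charconv>. SKETCH_HAS_STD_TO_CHARS, the namespace sketch, and format_unsigned are hypothetical names for this example. Only the __cpp_lib_freestanding_charconv macro comes from this paper.

```cpp
#include <cstddef>
#include <version>  // assumption: C++20, where feature-test macros live here

#if defined(__cpp_lib_freestanding_charconv) || \
    (defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1)
#  include <charconv>
#  include <system_error>
#  define SKETCH_HAS_STD_TO_CHARS 1
#else
#  define SKETCH_HAS_STD_TO_CHARS 0
#endif

namespace sketch {

// Writes value in decimal into [first, last). Returns one past the last
// character written, or nullptr if the buffer is too small.
inline char* format_unsigned(char* first, char* last, unsigned value) {
#if SKETCH_HAS_STD_TO_CHARS
    // Use the toolchain's facility when it is advertised.
    const std::to_chars_result r = std::to_chars(first, last, value);
    return r.ec == std::errc() ? r.ptr : nullptr;
#else
    // Hand-rolled replacement for old or non-conforming implementations.
    char tmp[sizeof(unsigned) * 3];  // upper bound: 3 decimal digits per byte
    std::size_t n = 0;
    do {
        tmp[n++] = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    if (static_cast<std::size_t>(last - first) < n) return nullptr;
    while (n != 0) *first++ = tmp[--n];  // digits were produced in reverse
    return first;
#endif
}

}  // namespace sketch
```

Callers see the same function either way, so the fallback can be dropped once every targeted toolchain defines the macro.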
The two forms of conforming implementation are hosted and freestanding. A conforming hosted implementation shall accept any strictly conforming program. A conforming freestanding implementation shall accept any strictly conforming program in which the use of the features specified in the library clause (Clause 7) is confined to ~~the contents of the standard headers <float.h>, <iso646.h>, <limits.h>, <stdalign.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, <stdint.h>, and <stdnoreturn.h>~~ freestanding library facilities. The strictly conforming programs that shall be accepted by a conforming freestanding implementation may include any standard library header that contains freestanding library facilities. A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program. All identifiers that are reserved when a standard header is included in a hosted implementation are reserved when it is included in a freestanding implementation.
Change paragraph 7 as follows:
The strictly conforming programs that shall be accepted by a conforming freestanding implementation that defines __STDC_IEC_60559_BFP__ or __STDC_IEC_60559_DFP__ may also use features in the contents of the standard headers <fenv.h> and <math.h> and the numeric conversion functions (7.22.1) of the standard header <stdlib.h>. All identifiers that are reserved when <stdlib.h> is included in a hosted implementation are reserved when it is included in a freestanding implementation.
<errno.h>
[...] or a program defines an identifier with the name errno, the behavior is undefined. EDOM, EILSEQ, and ERANGE are freestanding library facilities.
<float.h>
The macros in <float.h> are freestanding library facilities.
imaxabs function
The imaxabs function is a freestanding library facility.
imaxdiv function
The imaxdiv function is a freestanding library facility.
<iso646.h>
The macros in <iso646.h> are freestanding library facilities.
<limits.h>
The macros in <limits.h> are freestanding library facilities.
<stdalign.h>
The macros in <stdalign.h> are freestanding library facilities.
<stdarg.h>
The types and macros in <stdarg.h> are freestanding library facilities.
<stdbool.h>
The macros in <stdbool.h> are freestanding library facilities.
<stddef.h>
The types and macros in <stddef.h> are freestanding library facilities.
<stdint.h>
The types and macros in <stdint.h> are freestanding library facilities.
<stdlib.h>
[...] which is a structure type that is the type of the value returned by the lldiv function. div_t, ldiv_t, and lldiv_t are freestanding library facilities.
bsearch function
The bsearch function is a freestanding library facility.
qsort function
The qsort function is a freestanding library facility.
abs, labs, and llabs functions
The abs, labs, and llabs functions are freestanding library facilities.
div, ldiv, and lldiv functions
The div, ldiv, and lldiv functions are freestanding library facilities.
_Noreturn <stdnoreturn.h>
The macros in <stdnoreturn.h> are freestanding library facilities.
<string.h>
- memcpy function
- memccpy function
- memmove function
- strcpy function
- strncpy function
- strcat function
- strncat function
- memcmp function
- strcmp function
- strncmp function
- memchr function
- strchr function
- strcspn function
- strpbrk function
- strrchr function
- strspn function
- strstr function
- memset function
- strlen function
The __placeholder__ function is a freestanding library facility.
<wchar.h>
[...] which is declared as an incomplete structure type (the contents are described in 7.27.1). mbstate_t and wint_t are freestanding library facilities.
Add a sentence to paragraph 3:
[...] It is also used as a wide character value that does not correspond to any member of the extended character set. WEOF is a freestanding library facility.
For each of the following synopses...
- wcscpy function
- wcsncpy function
- wmemcpy function
- wmemmove function
- wcscat function
- wcsncat function
- wcscmp function
- wcsncmp function
- wmemcmp function
- wcschr function
- wcscspn function
- wcspbrk function
- wcsrchr function
- wcsspn function
- wcsstr function
- wcstok function
- wmemchr function
- wcslen function
- wmemset function
The __placeholder__ function is a freestanding library facility.
Wording is based off WG21 N4878 from 2020-12-15. This paper also assumes that P1642 and P2198 have been accepted and applied.
On a freestanding implementation, a freestanding deleted function is a function that has either a deleted definition or a definition meeting the corresponding requirements in a hosted implementation. In the associated header synopsis for such freestanding deleted functions, the items are followed with a comment that includes freestanding deleted.
[ Example: double abs(double j); // freestanding deleted
-end example]
Subclause | Header(s) | |
---|---|---|
[…] | […] | […] |
<cstdlib> | ||
[…] | […] | […] |
?.? [errno] | Error numbers | <cerrno> |
?.? [syserr] | System error support | <system_error> |
?.? [charconv] | Primitive numeric conversions | <charconv> |
?.? [string.classes] | String classes | <string> |
?.? [ratio] | Compile-time rational arithmetic | <ratio> |
?.? [c.strings] | Null-terminated sequence utilities | <cstring>, <cwchar> |
?.? [c.math] | Mathematical functions for floating-point types | <cmath> |
?.? [c.files] | C library files | <cinttypes> |
[…] | […] | […] |
div_t
ldiv_t
lldiv_t
bsearch
qsort
abs(int)
abs(long int)
abs(long long int)
labs
llabs
div
ldiv
lldiv
abs(float)
abs(double)
abs(long double)
Please add the following feature test macros to [version.syn]:
#define __cpp_lib_freestanding_char_traits new-val // freestanding, also in <string>
#define __cpp_lib_freestanding_charconv new-val // freestanding, also in <charconv>
#define __cpp_lib_freestanding_cinttypes new-val // freestanding, also in <cinttypes>
#define __cpp_lib_freestanding_cstdlib new-val // freestanding, also in <cstdlib>, <cmath>
#define __cpp_lib_freestanding_cstring new-val // freestanding, also in <cstring>
#define __cpp_lib_freestanding_cwchar new-val // freestanding, also in <cwchar>
#define __cpp_lib_freestanding_errc new-val // freestanding, also in <cerrno>, <system_error>
E2BIG
EACCES
EADDRINUSE
EADDRNOTAVAIL
EAFNOSUPPORT
EAGAIN
EALREADY
EBADF
EBADMSG
EBUSY
ECANCELED
ECHILD
ECONNABORTED
ECONNREFUSED
ECONNRESET
EDEADLK
EDESTADDRREQ
EDOM
EEXIST
EFAULT
EFBIG
EHOSTUNREACH
EIDRM
EILSEQ
EINPROGRESS
EINTR
EINVAL
EIO
EISCONN
EISDIR
ELOOP
EMFILE
EMLINK
EMSGSIZE
ENAMETOOLONG
ENETDOWN
ENETRESET
ENETUNREACH
ENFILE
ENOBUFS
ENODATA
ENODEV
ENOENT
ENOEXEC
ENOLCK
ENOLINK
ENOMEM
ENOMSG
ENOPROTOOPT
ENOSPC
ENOSR
ENOSTR
ENOSYS
ENOTCONN
ENOTDIR
ENOTEMPTY
ENOTRECOVERABLE
ENOTSOCK
ENOTSUP
ENOTTY
ENXIO
EOPNOTSUPP
EOVERFLOW
EOWNERDEAD
EPERM
EPIPE
EPROTO
EPROTONOSUPPORT
EPROTOTYPE
ERANGE
EROFS
ESPIPE
ESRCH
ETIME
ETIMEDOUT
ETXTBSY
EWOULDBLOCK
EXDEV
errc
entity.
to_chars_result
from_chars_result
to_chars_result to_chars(char* first, char* last, see below value, int base = 10);
from_chars_result from_chars(const char* first, const char* last, see below& value, int base = 10);
to_chars(char* first, char* last, float value)
to_chars(char* first, char* last, double value)
to_chars(char* first, char* last, long double value)
from_chars(const char *first, const char *last, float& value, chars_format fmt = chars_format::general)
from_chars(const char *first, const char *last, double& value, chars_format fmt = chars_format::general)
from_chars(const char *first, const char *last, long double& value, chars_format fmt = chars_format::general)
template<class charT> struct char_traits
template<> struct char_traits<char>
template<> struct char_traits<char8_t>
template<> struct char_traits<char16_t>
template<> struct char_traits<char32_t>
template<> struct char_traits<wchar_t>
memcpy
memmove
strcpy
strncpy
strcat
strncat
memcmp
strcmp
strncmp
memchr
strchr
strcspn
strpbrk
strrchr
strspn
strstr
memset
strlen
NULL
strcoll
strxfrm
strtok
strerror
size_t
mbstate_t
wint_t
wcscpy
wcsncpy
wmemcpy
wmemmove
wcscat
wcsncat
wcscmp
wcsncmp
wmemcmp
wcschr
wcscspn
wcspbrk
wcsrchr
wcsspn
wcsstr
wcstok
wmemchr
wcslen
wmemset
NULL
WCHAR_MAX
WCHAR_MIN
WEOF
struct tm
fwprintf
fwscanf
swprintf
swscanf
vfwprintf
vfwscanf
vswprintf
vswscanf
vwprintf
vwscanf
wprintf
wscanf
fgetwc
fgetws
fputwc
fputws
fwide
getwc
getwchar
putwc
putwchar
ungetwc
wcstod
wcstof
wcstold
wcstol
wcstoll
wcstoul
wcstoull
wcscoll
wcsxfrm
wcsftime
btowc
wctob
mbsinit
mbrlen
mbrtowc
wcrtomb
mbsrtowcs
wcsrtombs
abs(int)
abs(long int)
abs(long long int)
abs(float)
abs(double)
abs(long double)
imaxabs
imaxdiv
abs(intmax_t)
div(intmax_t, intmax_t)