CWG Issue 931

This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21 Core Issues List revision 114a. See http://www.open-std.org/jtc1/sc22/wg21/ for the official list.

2024-04-18

931. Confusing reference to the length of a user-defined string literal

Section: 5.13.9 [lex.ext]     Status: CD2     Submitter: Alisdair Meredith     Date: 6 July, 2009

[Voted into WP at March, 2010 meeting.]

5.13.9 [lex.ext] paragraph 5 says,

If L is a user-defined-string-literal, let str be the literal without its ud-suffix and let len be the number of characters (or code points) in str (i.e., its length excluding the terminating null character).

The length of a null-terminated string is defined in 16.3.3.3.4.2 [byte.strings] as the number of bytes preceding the terminator, but a single code point in a UTF-8 string can require more than one byte. The sentence is therefore inconsistent: the number of code points in str and its length excluding the terminating null character can differ. It needs to be revised to make clear which definition is intended.
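The mismatch can be seen in a minimal sketch, assuming a well-formed UTF-8 string held in a char array. The names count_code_points and check_e_acute are hypothetical helpers introduced only for this illustration; they are not part of the standard or of the issue text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative helper (not from the standard): counts UTF-8 code points by
// counting every byte that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes s points to well-formed, null-terminated UTF-8.
std::size_t count_code_points(const char* s) {
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        if ((static_cast<unsigned char>(*s) & 0xC0u) != 0x80u) {
            ++n;
        }
    }
    return n;
}

// Illustrative check: U+00E9 written as its two UTF-8 bytes, 0xC3 0xA9.
void check_e_acute() {
    const char* s = "\xC3\xA9";
    assert(std::strlen(s) == 2);        // bytes preceding the terminator
    assert(count_code_points(s) == 1);  // a single code point
}
```

For this string the two candidate readings of "len" give 1 and 2 respectively, which is the ambiguity the issue describes.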

Proposed resolution (October, 2009):

Change 5.13.9 [lex.ext] paragraph 5 as follows:

If L is a user-defined-string-literal, let str be the literal without its ud-suffix and let len be the number of ~~characters (or code points)~~ code units in str (i.e., its length excluding the terminating null character)...
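Under the revised wording, the len argument passed to a literal operator counts code units of the string's encoding. The following is a sketch only: the suffix _len is hypothetical, and the non-ASCII characters are written as explicit escapes so that the result does not depend on the source file encoding.

```cpp
#include <cstddef>

// Hypothetical suffix _len: each overload simply returns the len argument
// that the implementation passes for a user-defined-string-literal.
constexpr std::size_t operator""_len(const char*, std::size_t len) { return len; }
constexpr std::size_t operator""_len(const char16_t*, std::size_t len) { return len; }
constexpr std::size_t operator""_len(const char32_t*, std::size_t len) { return len; }

// Plain ASCII: one code unit per character.
static_assert("abc"_len == 3, "three char code units");

// U+00E9 as its UTF-8 bytes: one code point, two char code units.
static_assert("\xC3\xA9"_len == 2, "two char code units");

// U+1F600 in UTF-16 is a surrogate pair: one code point, two code units.
static_assert(u"\U0001F600"_len == 2, "two char16_t code units");

// The same code point in UTF-32 is a single code unit.
static_assert(U"\U0001F600"_len == 1, "one char32_t code unit");
```

In each case len excludes the terminating null character, and it matches the code point count only when every code point fits in one code unit.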