[text] Create "Text processing library" clause #5226

jensmaurer · 2022-01-21T19:57:24Z

This issue is further to #5124.

The proposal is to create a top-level clause entitled "Text processing library" [text] in the Working Draft at the current location of [localization] that contains the following, in order:

[charconv]
[localization]
[format]
[re]
C library
- [cctype.syn]
- [cwctype.syn]
- [cwchar.syn]
- [cuchar.syn]
- [c.mb.wcs]

About 90 pages in total.

Mick235711 · 2022-02-10T08:22:13Z

After this change, [strings] will only have ~40 pages and be the third-smallest library clause (just above [concepts] and [diagnostics]). Did string classes really fit in text formatting? At least they are tightly related.

I think we should probably either merge [strings] with [text] or at least place these two clauses next to each other (I lean towards adjacent but not merged, since strings are fairly self-contained). The currently proposed [localization] position seems too far from [strings], I'd say... (But that's just my personal opinion.)

jwakely · 2022-02-10T10:38:59Z

> Did string classes really fit in text formatting?

No. They are containers of characters, not necessarily text. I can have a std::string containing invalid UTF-8, for example.

tahonermann · 2022-02-10T22:54:24Z

> No. They are containers of characters, not necessarily text. I can have a std::string containing invalid UTF-8, for example.

I don't find that a persuasive argument for separating strings and text. The string-related features are clearly intended to hold and work with text despite the lack of invariants to ensure well-formedness with respect to any particular encoding. I imagine that if we introduce additional text-related containers in the future that do have strong encoding associations, they will likewise eschew enforcement of well-formedness due to performance considerations. In those cases, we'll probably relegate violations to library UB. Error handling (or lack thereof) doesn't strike me as a good basis for library separation.

That being said, I don't have strong opinions regarding this organization so long as it continues not to impact header or module naming.

jensmaurer · 2022-02-23T22:46:10Z

Postponed to C++26.

tkoeppe · 2022-02-23T23:19:44Z

Given that we'll only need to send the new draft in the March mailing, I wouldn't be entirely opposed to still doing this now, but we should feel unreservedly that we're making an improvement, without any caveats or regrets.

I'd be happy to hear alternative suggestions (e.g. how about merging strings and text), and also positions on the status quo (@tahonermann?).

jensmaurer · 2022-02-24T00:09:19Z

I'm ok with moving all of [strings] (~50 pages) into [text], with the idea that they are intended to represent text. This gets the total to ~140 pages, with plans to grow further (e.g. regex v2, encoding conversion facilities, possibly more from the scope of ICU).

However, I do like the general idea of having [text] cover everything text-related, even if that means we're heading for a fairly large clause.

We had suggestions to make [filesystem] top-level; this appears to be a reasonable idea, too, but seems not really urgent. Maybe future network facilities also fit under an input/output umbrella, or at least fit together with [filesystem] into a fresh clause.

cor3ntin · 2022-07-14T12:49:20Z

I really like this direction, but I agree with @jwakely: strings can remain their own section; that wouldn't be terrible. Otherwise they could fit in containers, but they are certainly orthogonal to Unicode / text.

tkoeppe · 2022-07-14T12:53:54Z

Maybe we can talk about this again at a future editorial meeting. Last time I asked there wasn't a lot of interest, but we can certainly try this again for 26.

AlisdairM · 2023-03-22T21:18:30Z

Another vote to move [strings] into [text] if we go in this direction, given that the vocabulary type for much of [text] would be defined in [strings]. Also, basic_string is no longer the only container not defined in the [containers] clause, so it would no longer be surprising to find a container elsewhere.

Failing that, I would hope to at least see [strings] and [text] as adjacent clauses in any such reorganization.

Agree with moving [filesystem] to a top level clause as part of such a restructuring.

cor3ntin · 2023-11-27T08:41:15Z

@tkoeppe Following discussion in Kona, [text.encoding] should also move there, but the rest of the organization outlined by Jens still looks good to me.
Let me know how we can move forward with that.

jensmaurer changed the title ~~Create "Text processing library" clause~~ [text] Create "Text processing library" clause Jan 24, 2022

jensmaurer pinned this issue Feb 6, 2022

jensmaurer added this to the C++23 milestone Feb 6, 2022

jensmaurer added this to Open in C++23 clause renumbering Feb 6, 2022

jensmaurer modified the milestones: C++23, C++26 Feb 23, 2022

jensmaurer unpinned this issue Feb 23, 2022

jensmaurer pinned this issue Feb 24, 2022

jensmaurer unpinned this issue Mar 22, 2022

jensmaurer mentioned this issue Mar 24, 2022: Restructuring clauses for C++26 #5315 (Open)
