[format.string.escaped] Fix invalid example #5890

Merged: 2 commits merged on Oct 10, 2022

@mordante mordante commented Oct 5, 2022

Example 5 should have only one invalid code unit; the second code unit is a valid code point. This issue was already present in the paper
P2286R8, Formatting Ranges.

@tkoeppe

tkoeppe commented Oct 5, 2022

@cor3ntin, could you please take a look?

@cor3ntin

cor3ntin commented Oct 5, 2022

@tkoeppe This is borderline design, as the wording does not specify what constitutes an invalid code unit sequence:
https://eel.is/c++draft/format.string.escaped#2.2

I think the proposed change is consistent with the SG16 consensus on what the behavior should be.
Either way, the change is better than the status quo given the associated wording.

A conservative approach would be to remove the example entirely.

@tahonermann

The proposed change illustrates the intended behavior. The example was previously discussed on the SG16 mailing list (see the thread with subject "Handling ill-formed Unicode in the library") and multiple people expressed support for the change while no one voiced opposition. Corentin is correct that the specification lacks a concrete description of how the boundaries of an ill-formed code unit sequence are determined. Addressing that will require normative changes to [format.string.escaped]. I'm confident that any changes made there would be consistent with the amended example; SG16 is highly unlikely to endorse any ill-formed code unit sequence definition that is not aligned with Unicode PR-121.
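
As a rough illustration of the boundary question discussed above, here is a minimal sketch of escaping only the code units that do not begin a well-formed UTF-8 sequence. The helpers `in_range`, `well_formed_length`, and `escape_ill_formed` are hypothetical names introduced here; they are not part of the standard library or of the [format.string.escaped] wording. The input bytes 0xC3 0x28 used in the note below are an illustrative choice, not quoted from the draft. The sketch ignores every other escaping rule (quotes, backslashes, non-printable characters). It advances one byte at a time after a failure, which should give the same escaped text as a maximal-subpart approach, because a continuation byte can never start a well-formed sequence.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper: true if s[k] exists and lies in [lo, hi].
static bool in_range(std::string_view s, std::size_t k, unsigned lo, unsigned hi) {
    if (k >= s.size()) return false;
    unsigned b = static_cast<unsigned char>(s[k]);
    return b >= lo && b <= hi;
}

// Hypothetical helper: length of the well-formed UTF-8 sequence starting at
// s[i] (per the Unicode table of well-formed byte sequences), or 0 if the
// bytes at s[i] do not form one.
static std::size_t well_formed_length(std::string_view s, std::size_t i) {
    unsigned b = static_cast<unsigned char>(s[i]);
    if (b <= 0x7F) return 1;
    if (b >= 0xC2 && b <= 0xDF)
        return in_range(s, i + 1, 0x80, 0xBF) ? 2 : 0;
    if (b >= 0xE0 && b <= 0xEF) {
        unsigned lo = (b == 0xE0) ? 0xA0 : 0x80;  // reject overlong forms
        unsigned hi = (b == 0xED) ? 0x9F : 0xBF;  // reject surrogates
        return (in_range(s, i + 1, lo, hi) && in_range(s, i + 2, 0x80, 0xBF)) ? 3 : 0;
    }
    if (b >= 0xF0 && b <= 0xF4) {
        unsigned lo = (b == 0xF0) ? 0x90 : 0x80;  // reject overlong forms
        unsigned hi = (b == 0xF4) ? 0x8F : 0xBF;  // reject values above U+10FFFF
        return (in_range(s, i + 1, lo, hi) && in_range(s, i + 2, 0x80, 0xBF) &&
                in_range(s, i + 3, 0x80, 0xBF)) ? 4 : 0;
    }
    return 0;  // lone continuation byte, 0xC0, 0xC1, or 0xF5..0xFF
}

// Hypothetical helper: copies well-formed sequences unchanged and writes each
// offending code unit as \x{hh}. All other escaping rules are omitted.
std::string escape_ill_formed(std::string_view s) {
    std::string out;
    for (std::size_t i = 0; i < s.size();) {
        std::size_t n = well_formed_length(s, i);
        if (n == 0) {
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\x{%x}",
                          static_cast<unsigned>(static_cast<unsigned char>(s[i])));
            out += buf;
            ++i;  // resume scanning at the very next code unit
        } else {
            out.append(s.data() + i, n);
            i += n;
        }
    }
    return out;
}
```

With this sketch, `escape_ill_formed("\xc3\x28")` yields `\x{c3}(`: the lead byte 0xC3 has no continuation byte after it and is escaped on its own, while 0x28 is an ordinary `(` and is left alone.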

@tkoeppe

tkoeppe commented Oct 10, 2022

OK, thanks, everyone! @tahonermann: are you tracking the specification deficiencies in an issue elsewhere? I'll apply this change and close this PR in the meantime.

@tkoeppe tkoeppe merged commit f6c5d45 into cplusplus:main Oct 10, 2022
@tahonermann

@tkoeppe, I am now: sg16-unicode/sg16#80.
