[lex.charset] Define 'valid encoding' #5101

Open
wants to merge 1 commit into
base: main
from

Conversation

jensmaurer
Member

and use the term for execution character sets introduced
in [character.seq].

Fixes #4924

@wg21bot added the "needs rebase" label (the pull request needs a git rebase to resolve merge conflicts) Nov 19, 2021
@jensmaurer removed the "needs rebase" label Nov 19, 2021
@tkoeppe
Contributor

tkoeppe commented Aug 19, 2022

@jensmaurer Is this PR still appropriate after the recent round of papers that touch encodings?

@tkoeppe added the "needs rebase" label Aug 19, 2022
@jensmaurer
Member Author

I think we still want this.

@tkoeppe
Contributor

tkoeppe commented Aug 19, 2022

OK! (I'll need a rebase then.)

@tkoeppe
Contributor

tkoeppe commented Aug 19, 2022

Also, @tahonermann, @cor3ntin, could you kindly take a look?

@cor3ntin
Contributor

The term "valid" is often used to describe a properly encoded sequence, or a sequence of bytes that can be cleanly decoded with a given encoding, so applying that term to the encoding itself is confusing.
I'd suggest a different term, such as "conforming encoding" or "compliant encoding".

And... all encodings described in the standard should be conforming, tautologically.
So... I question whether we need a definition at all.

All literal and execution encodings shall [bullet list of requirements - which is correct as modified]

I would keep "The ordinary and wide literal encodings are valid encodings, but are otherwise \impldef{ordinary and wide literal encodings}" in lib-intro.tex, as we have been trying not to mention them in the core wording too much. I.e., in lib-intro:

The sets of additional elements (if any) are locale-specific. The encodings of the execution character sets are locale-specific and implementation-defined.
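
To make the usual sense of "valid" concrete: validity is normally a property of a byte sequence relative to a given encoding, not a property of the encoding itself. The following is a minimal sketch, assuming UTF-8 as the encoding and following the well-formed byte ranges of the Unicode Standard. The names `is_well_formed_utf8` and `demo_validity` are hypothetical and appear neither in this PR nor in the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, for illustration only: returns true if `bytes` is a
// well-formed UTF-8 sequence (no overlong forms, no surrogates, nothing
// above U+10FFFF, no truncated trailing sequence).
constexpr bool is_well_formed_utf8(std::string_view bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;                  // length of this encoded code point
        unsigned char lo = 0x80, hi = 0xBF;   // allowed range of the second byte
        if (b0 <= 0x7F) {
            len = 1;
        } else if (b0 >= 0xC2 && b0 <= 0xDF) {
            len = 2;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            len = 3;
            if (b0 == 0xE0) lo = 0xA0;        // rejects overlong 3-byte forms
            if (b0 == 0xED) hi = 0x9F;        // rejects UTF-16 surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            len = 4;
            if (b0 == 0xF0) lo = 0x90;        // rejects overlong 4-byte forms
            if (b0 == 0xF4) hi = 0x8F;        // rejects values above U+10FFFF
        } else {
            return false;                     // 0x80..0xC1 and 0xF5..0xFF cannot lead
        }
        if (bytes.size() - i < len) return false;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            const unsigned char first = (k == 1) ? lo : 0x80;
            const unsigned char last = (k == 1) ? hi : 0xBF;
            if (b < first || b > last) return false;
        }
        i += len;
    }
    return true;
}

// The same text ("café") as two different byte sequences: one is a valid
// sequence in UTF-8, the other (Latin-1 bytes) is not.
inline void demo_validity() {
    assert(is_well_formed_utf8("caf\xC3\xA9"));
    assert(!is_well_formed_utf8("caf\xE9"));
}
```

In this sense the question "is it valid?" is asked of the bytes, with the encoding fixed; that is the reading the proposed term "valid encoding" would collide with.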

@tahonermann
Contributor

I agree with Corentin's review; I found the proposed wording with regard to valid encodings confusing. I do like the idea of naming the conditions an encoding must meet to be acceptable as a literal or execution encoding though. Perhaps "basis encoding"? The intent is to define a set of minimum requirements that can be extended; I think "basis" captures that intent. The wording might then look something like:

A basis encoding is a character encoding that satisfies the following conditions
...
The ordinary and wide literal encodings are \impldef{ordinary and wide literal encodings} basis encodings\iref{lex.charset}.
...
Those encodings are basis encodings\iref{lex.charset}.
...
The encodings of the execution character sets are locale-specific basis encodings\iref{lex.charset}.

@tkoeppe added the "decision-required" label (a decision of the editorial group or the Project Editor is required) Sep 16, 2022
@tkoeppe
Contributor

tkoeppe commented Nov 8, 2022

Editorial meeting: the objections to the overly generic term "valid" are well taken, and we should look for a better term. The overall goal of factoring out the core notion in question is valuable, but we need to find the right way to express it.

Ideas welcome!

@tkoeppe removed the "decision-required" label Nov 8, 2022
@tahonermann
Contributor

@tkoeppe,

"Ideas welcome!"

The prior comment included a suggestion.

@tkoeppe
Contributor

tkoeppe commented Nov 8, 2022

Ah yes! Hm, "basis" is only one letter away from "basic", which may have too much historic ballast in this domain? But yes, variations on that approach occurred to us, too. E.g. "admissible encoding".

@cor3ntin
Contributor

cor3ntin commented Nov 8, 2022

If we need to keep a terminology, "admissible", "conforming", "supported", or variations on that theme do seem like good solutions.

Labels
needs rebase The pull request needs a git rebase to resolve merge conflicts.
Development

Successfully merging this pull request may close these issues.

Follow-on to P2314R4: properly define encoding restrictions
5 participants