[lex] Unicode character names #5093

Closed
jensmaurer opened this issue Nov 6, 2021 · 1 comment · Fixed by #5094

@jensmaurer
Member

From Section 24.1, "Character Names List", of the Unicode Standard 14.0 (the upstream document, which seems to be well maintained):

Normative Aliases
A normative character name alias is a formal, unique, and stable alternate name for a character. In limited circumstances, characters are given normative character name aliases where there is a defect in the character name. These normative aliases do not replace the character name, but rather allow users to refer formally to the character without requiring the use of a defective name. For more information, see Section 4.8, Name.

Normative aliases which provide information about corrections to defective character names or which provide alternate names in wide use for a Unicode format character are printed in the character names list, preceded by a special symbol ※. Normative aliases serving other purposes, if listed, are shown by convention in all caps, following an “=”. Normative aliases of type “figment” for control codes are not listed. Normative aliases which represent commonly used abbreviations for control codes or format characters are shown in all caps, enclosed in parentheses. In contrast, informative aliases are shown in lowercase. For the definitive list of normative aliases, also including their type and suitable for machine parsing, see NameAliases.txt in the UCD.

This makes it clear that the text in parentheses is an abbreviation and not part of the official character name or alias. Clean up [lex], as modified by "P2314R4 Character sets and encodings" (commit 3505e2a), to remove the abbreviations.
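
For illustration only, here is a minimal sketch of how the abbreviations could be told apart from the other normative aliases by machine, assuming the semicolon-separated `code point;alias;type` layout of NameAliases.txt. The `SAMPLE` excerpt and the function names `parse_name_aliases` and `aliases_without_abbreviations` are hypothetical and are not part of the draft or of any tooling mentioned in this issue.

```python
from collections import defaultdict

# Small excerpt written in the NameAliases.txt layout (code point;alias;type).
# Assumption: the real file uses this layout, with '#' starting a comment.
SAMPLE = """\
# code point;alias;type
0000;NULL;control
0000;NUL;abbreviation
000A;LINE FEED;control
000A;LF;abbreviation
01A2;LATIN CAPITAL LETTER GHA;correction
FEFF;BYTE ORDER MARK;alternate
FEFF;BOM;abbreviation
"""


def parse_name_aliases(text):
    """Yield (code_point, alias, alias_type) for each data line."""
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        code, alias, alias_type = line.split(";")
        yield int(code, 16), alias, alias_type


def aliases_without_abbreviations(text):
    """Map each code point to its aliases, skipping type 'abbreviation'."""
    result = defaultdict(list)
    for code_point, alias, alias_type in parse_name_aliases(text):
        if alias_type != "abbreviation":
            result[code_point].append(alias)
    return dict(result)


if __name__ == "__main__":
    for code_point, names in aliases_without_abbreviations(SAMPLE).items():
        print(f"U+{code_point:04X}: {', '.join(names)}")
```

Filtering on the type field rather than on parentheses in the printed names list avoids depending on the typographic conventions quoted above.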

@jensmaurer
Member Author

See also #5032
