Should we change all "source character set" to "basic source character set"? #2857

Closed
xskxzr opened this issue May 1, 2019 · 3 comments · Fixed by #2928

@xskxzr

xskxzr commented May 1, 2019

I searched [lex] and found no definition of "source character set", yet the term is used in several places, for example:

[lex.phases]/5:
Each source character set member in a character literal or a string literal, ...

Unlike for the execution character set, an "extended source character set" would serve no purpose, since any source file character not in the basic source character set is replaced by a universal-character-name. So should we change all occurrences of "source character set" to "basic source character set", or vice versa?

@jensmaurer
Member

There are special cases where we actually consider characters outside of the basic source character set, for example in raw string literals [lex.pptoken] p3. The conversion to universal-character-names is reverted in that case.

So, in a raw string literal, we could have characters beyond the basic source character set, and those are (also) mapped to the execution character set.
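
A minimal sketch of the distinction, not taken from the thread: in an ordinary string literal the spelling `\u00e9` is a universal-character-name and denotes a single character, while inside a raw string literal the same six characters are kept verbatim. The helper `length` and the constant names are hypothetical, and the example spells the universal-character-name by hand so that it does not depend on the encoding of the source file.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a literal,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t length(const CharT (&)[N]) { return N - 1; }

// Ordinary UTF-16 literal: \u00e9 names one character (U+00E9),
// which is encoded as a single char16_t code unit.
constexpr std::size_t ordinary_len = length(u"\u00e9");

// Raw UTF-16 literal: the characters \ u 0 0 e 9 are not interpreted
// as a universal-character-name; all six are kept as written.
constexpr std::size_t raw_len = length(uR"(\u00e9)");

static_assert(ordinary_len == 1, "one code unit for U+00E9");
static_assert(raw_len == 6, "six characters kept verbatim");
static_assert(uR"(\u00e9)"[0] == u'\\', "raw literal starts with a backslash");
```

The same reasoning is what lets a character typed directly into a raw string literal survive as itself rather than as the spelling of its universal-character-name.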

The remaining (small) issue here is the missing definition of "source character set". We could say "set of physical source file characters"; see [lex.phases] p1.1.

jensmaurer added the "decision-required" label (a decision of the editorial group or the Project Editor is required) on May 2, 2019
jensmaurer removed the "decision-required" label on Jun 5, 2019
@jensmaurer
Member

Editorial meeting: Fix "source character set" -> "basic source character set" where that is obviously the correct fix. Leave other cases to eventual CWG cleanup.

@jensmaurer
Member

The remaining issue is whether h-char, q-char, and r-char can be characters outside of the basic source character set. It seems to me that the answer for r-char is clearly "yes"; the answer for h-char and q-char might be "no".
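
For orientation, a short sketch (an illustration, not part of the editorial discussion) of where the three grammar productions occur. The variable name `r_chars` is hypothetical; the quoted include relies on the standard fallback to the `<...>` search when no file of that name is found first.

```cpp
#include <cstddef>   // h-char-sequence: the characters between < and >
#include "cstring"   // q-char-sequence: the characters between the quotes

// r-char-sequence: the characters between the delimited parentheses
// of a raw string literal, here  a"b  with delimiter  x.
constexpr const char r_chars[] = R"x(a"b)x";

static_assert(sizeof(r_chars) - 1 == 3, "r-char-sequence has three characters");
static_assert(r_chars[1] == '"', "a double quote is an ordinary r-char here");
```

All characters above are in the basic source character set; the open question in the thread is only whether characters outside it are permitted in each of these positions.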

@jensmaurer jensmaurer self-assigned this Jun 11, 2019