Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[basic.fundamental] Lacking a definition of overflow #4857

Closed
AaronBallman opened this issue Sep 1, 2021 · 20 comments
Closed

[basic.fundamental] Lacking a definition of overflow #4857

AaronBallman opened this issue Sep 1, 2021 · 20 comments

Comments

@AaronBallman
Copy link
Contributor

WG14 has been discussing what constitutes overflow (for example, can an unsigned integer overflow or is there something about twos complement semantics that imply this cannot overflow due to the wrapping behavior?), so I checked the C++ standard to see whether we have a useful definition of the term. The Index lists overflow twice. They both link to different parts of [expr.pre]p4 where the word "overflow" never appears. Presumably we should define this term because it's used elsewhere in the standard, like [basic.fundamental]p2 Unsigned arithmetic does not overflow. Overflow for signed arithmetic yields undefined behavior ([expr.pre]).

@jensmaurer
Copy link
Member

It's a note in [basic.fundamental] p2, so it's supposed to be explanatory rather than normative. If you don't understand "overflow" there, just ignore the note; the preceding normative text tells you what to do ("arithmetic modulo 2^N").

Similar for [expr.pre] p5; the mentions of "overflow" there is also in a note. The preceding normative text in p4 says "If during the evaluation of an expression, the result is not mathematically defined or not in the range of
representable values for its type, the behavior is undefined."

The model is that we perform all arithmetic in the mathematical domain of infinite numbers and then try to map the result back to the type in which the result is supposed to fit. If it doesn't fit, it's undefined behavior.
(For unsigned arithmetic, we expressly say we use modulo 2^N arithmetic, which is a mathematical concept, and it so happens that mapping the value back to the type never yields a "doesn't fit" situation.)

I don't believe a normative definition of "overflow" would help anything.

@jensmaurer
Copy link
Member

(The very few normative uses of "overflow" in the standard library should probably be changed to "not representable in the range of type T".)

@jensmaurer jensmaurer added the decision-required A decision of the editorial group (or the Project Editor) is required. label Sep 1, 2021
@AaronBallman
Copy link
Contributor Author

The model is that we perform all arithmetic in the mathematical domain of infinite numbers and then try to map the result back to the type in which the result is supposed to fit. If it doesn't fit, it's undefined behavior.
(For unsigned arithmetic, we expressly say we use modulo 2^N arithmetic, which is a mathematical concept, and it so happens that mapping the value back to the type never yields a "doesn't fit" situation.)

That's been my understanding, but it's rather subtle. It's pretty easy to read [expr.pre]p4 as being UB for unsigned integers because [basic.fundamental]p2 says The range of representable values for the unsigned type is... and [expr.pre]p4 talks about when a mathematical value is outside the range of representable values. [basic.fundamental]p2 has a very load-bearing "arithmetic is performed" clause. I don't have a suggestion for improvement, however, so I think it's fine to ignore this "problem".

I don't believe a normative definition of "overflow" would help anything.

I'd be happy if we removed it from the index. I mostly am grousing about the fact that it's an overloaded term that's used (even if nonnormatively). It's used colloquially to describe (at least) overflow for floating-point computations that may result in an infinity, integer computations that may result in either UB or modulo-2 arithmetic, and pointer arithmetic that may result in UB.

@jensmaurer
Copy link
Member

"overflow" is in the index exactly because it's used colloquially, and the index points to the normative sections where overflow is discussed. [basic.fundamental] p2 has a note immediately following the load-bearing "arithmetic is performed" wording that clarifies the overflow behavior.

If you have specific improvement suggestions, I'm all ears.

@AaronBallman
Copy link
Contributor Author

WG14 discussed this today and one interesting thing that was pointed out is that overflow is actually a defined term of art in ISO 2382, which we normatively reference in [intro.refs]p1.1:

overflow
<arithmetic and logic operations> portion of a numeric word expressing the result of an arithmetic operation by which its word length exceeds the word length provided for the number representation

So our colloquial uses of the term seem to conflict with that definition in some places.

@jensmaurer
Copy link
Member

Yes. We should all be happy we're not using "overflow" normatively, so that there is not actually a conflict with ISO 2382. (Personally, I feel the use of the plain word "overflow" for what seems to be defined as an overflow flag is unfortunate.)

@AaronBallman
Copy link
Contributor Author

So the nonnormative uses aren't a problem despite using a different meaning than the defined term? That surprises me -- the whole impetus for this is because the ISO 2382 definition is what (at least some) people think the colloquial definition of overflow is, but then there are uses in the standard that use a novel, undefined meaning of the term causing some confusion.

@jensmaurer
Copy link
Member

jensmaurer commented Sep 2, 2021

The non-normative uses of the term "overflow" are certainly not great, because (as just discovered) they appear to be in conflict with ISO 2382 (a normative reference of C++). If we want to fix that, we need to reword the notes to avoid the term "overflow". I'm not sure that would improve clarity for the majority of readers in the C++ audience.

Could you paraphrase in your own words what you believe ISO 2382 claims "overflow" is? "Portion of a numeric word" certainly sounds like 1970s parlance and could be interpreted as "the part resulting from mathematical arithmetic that is larger than the actual representation". That's a noun.

C++ seems to use it as a verb: "does not overflow". Maybe that saves us from a conflict with ISO 2382; the latter only defines the noun, but not the verb.

@jwakely
Copy link
Member

jwakely commented Sep 2, 2021

I don't think the non-normative uses of "overflow" are novel. That use might be in conflict with a term defined in another standard, but it is widely understood and widely used in industry. I have never come across the use of "overflow" as defined in 2382 (I don't even know what a "numeric word" is).

Maybe we can use "integer overflow" for the verb used in the notes in the C++ standard, to distinguish it from the definition in 2382.

@jwakely
Copy link
Member

jwakely commented Sep 2, 2021

https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html is an example of "overflow" and "overflowing" as a verb in the field. I would expect the vast majority of C++ programmers to be a lot more familiar with that usage than whatever it is 2382 is trying to define.

@AaronBallman
Copy link
Contributor Author

@jwakely -- I think that the ISO 2382 definition of overflow matches the use in the GCC documentation regarding unsigned integers, while the C++ notion does not.

From the GCC docs:
The following built-in functions allow performing simple arithmetic operations together with checking whether the operations overflowed.

C++ claims unsigned integers do not overflow (in a note), so the unsigned operations in GCC's list should all claim there's no overflow by that usage of the word. With the ISO 2382 definition of the term as a noun (that can be verbified through the usual abuse of English), the unsigned operations in GCC's list all make sense to me.

Could you paraphrase in your own words what you believe ISO 2382 claims "overflow" is? "Portion of a numeric word" certainly sounds like 1970s parlance and could be interpreted as "the part resulting from mathematical arithmetic that is larger the actual representation". That's a noun.

I read the ISO 2382 definition as saying that the overflow (noun) is the bits that can't be stored when the mathematical result of an operation cannot be represented by the result type due to lack of sufficient storage capacity for the data. I reason that overflow (verb) is the act of creating such bits through evaluation of an expression.

@jensmaurer
Copy link
Member

I read the ISO 2382 definition as saying that the overflow (noun) is the bits that can't be stored when the mathematical result of an operation cannot be represented by the result type due to lack of sufficient storage capacity for the data.

Great. And the normative C++ text says that operations on unsigned integers are evaluated using "modulo 2^N" arithmetic, which is not the usual primary school arithmetic, but it's still a widely understood mathematical concept from Basic Algebra 101. And this arithmetic happens to never generate those excess bits, so no overflow (ISO 2382) appears.

Note that the gcc page referenced above says "These built-in functions promote the first two operands into infinite precision signed type and perform addition on those promoted operands. The result is then cast to the type the third pointer argument points to and stored there. If the stored result is equal to the infinite precision result, ..."

Note these built-ins actually perform signed operations ("infinite precision signed type"), which the C++ standard never claims to have special properties (beyond undefined overflow).

@jensmaurer jensmaurer changed the title Lacking a definition of overflow [basic.fundamental] Lacking a definition of overflow Sep 4, 2021
@languagelawyer
Copy link
Contributor

The non-normative uses of the term "overflow" are certainly not great, because (as just discovered) they appear to be in conflict with ISO 2382 (a normative reference of C++).

BTW, is it ok when the C++ Standard introduces own, non-equvalent definition for a term defined in ISO 2382?

@jensmaurer
Copy link
Member

jensmaurer commented Sep 4, 2021

I think it's ok if C++ makes a term defined in ISO 2382 more precise or otherwise applicable to C++, by providing a separte definition. We should try hard to avoid actual conflicts, though.
In the particular case of "overflow", further analysis suggests there isn't actually a conflict, so we're good.

@tkoeppe
Copy link
Contributor

tkoeppe commented Sep 24, 2021

@AaronBallman: It appears like there is no conflict between our use of "overflow", and also our lack of overflow in unsigned, modular arithmetic, with ISO 2382. Is there anything else you find incorrect or inaccurate?

@AaronBallman
Copy link
Contributor Author

I'm less convinced than others. The ISO definition says (in part) "...by which its word length exceeds the word length provided for the number representation." and [basic.fundamental]p2 says "The range of representable values for the unsigned type is..." So it still very much sounds to me like the issue is with the representation of the mathematical value. aka, you perform the notional arithmetic, you check whether the value is within the range of representable values, and if not overflow occurs and the value is changed to then fit within the range. If overflow does not occur, the value already fit within the range of representable values.

That said, I should note that WG14 has at least one paper in the works on this definition. We should probably watch WG14 N2811 (not yet published, but expected for the Nov 2021 WG14 mailing) to see what the committee sentiment is over there. I feel far less strongly about the definition of this than I do about ensuring the definitions match across C and C++.

@jensmaurer
Copy link
Member

jensmaurer commented Sep 24, 2021

As explained above, "perform the notional arithmetic" is not the primary school arithmetic for unsigned values, but is "modulo 2^N" arithmetic. By construction, there cannot be overflow when checking whether the result is representable, then.

I note that the current C++ wording has arrived through P1236R1 Alternative Wording for P0907R4 Signed Integers are Two's Complement. WG14 has not yet moved to restrict to two's complement, I believe.

@AaronBallman
Copy link
Contributor Author

WG14 restricted C23 to twos complement when we adopted http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf

@jensmaurer jensmaurer removed the decision-required A decision of the editorial group (or the Project Editor) is required. label Nov 19, 2021
@tkoeppe
Copy link
Contributor

tkoeppe commented Nov 19, 2021

@AaronBallman: After the present discussion, have we all agreed that the current note is accurate and meaningful (in the context of the modular arithmetic that applies here), and no further clarification of "overflow" is needed? I'll close this issue for now, but if you still have suggestions, please feel free to reopen with concrete change proposals. Thanks!

@tkoeppe tkoeppe closed this as completed Nov 19, 2021
@AaronBallman
Copy link
Contributor Author

I think it's fine to close this issue now. For clarity, WG14 saw two papers related to this today:

Can Signed Integers Overflow [N2817]
Clarifying integer terms v2 [N2837] (the follow-up to N2811)

We adopted the entirety of N2837 and rejected N2817, which I believe addresses my concerns about the lack of a definition here in terms of possible confusion across C and C++.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants