Document number: ISO/IEC/JTC1/SC22/WG21/P2717R1

Date: 2023-05-17

Audience: SG15

Reply-to: René Ferdinand Rivera Morell - grafikrobot at gmail dot com

1. Abstract

We propose to add a mechanism for C++ tools to communicate what capabilities a tool implements from the Ecosystem IS. [1]

2. Revision History

2.1. Revision 1 (May 2023)

Addition of scope, functionality levels, use cases, and wording. Simplified introspection and declaration interfaces to make implementing introspection trivial and declaration straightforward. The simplification removes the bounded introspection interface as superfluous.

2.2. Revision 0 (December 2022)

Initial text.

3. Motivation

C++ tools will implement the aspects of the Ecosystem IS [1] that are relevant to the particular tool. And when they implement those aspects they may implement a particular edition of them. In order to allow other tools to adjust their behavior to accommodate such differences we need a mechanism of introspection for all tools. Additionally when one tool requests to use another tool’s Ecosystem IS aspect it’s desirable to consistently communicate which edition(s) of that aspect it can use.

4. Scope

This proposal aims to specify a method for tools to communicate to consumers (either other tools or users) which specific aspects of the Ecosystem IS they support and adhere to. It does not prescribe which aspects of the Ecosystem IS the tools must support or adhere to, except that a tool supporting any capability of the Ecosystem IS must also support this one. Ultimately it wants to make it possible to address two cases:

  • What does the tool support and adhere to?

  • The tool should adhere to what the consumer asks if possible.

5. Design

There are two aspects that this proposal covers:

Introspection

A tool reporting its capabilities to a consumer.

Declaration

A consumer specifying the capability edition and version.

Introspection would allow a consumer to ask the target tool what versions of capabilities it supports. The target tool would respond with the range of capabilities, or nothing, that it supports. With that information the consumer can go ahead and follow the defined standard, in the Ecosystem IS [1], to further interact with the target tool.

For declaration a consumer can specify a particular capability and a version to interact with. And if the target tool recognizes the specification it can continue to process the consumer’s use of that capability.

Even though these are two separate functions they are by necessity tied to each other. In order for this pairing to work, and generally for tool interoperability to work, the tool consumers and target tools must operate on this minimal pair of functions to bootstrap their interactions. To make that possible, this design follows some basic tenets:

Minimal

The interface of the target tool is a single universal command line argument for each of the two operations.

Concise

The information communicated to and from the target tool and consumer is as brief as needed to convey the required information.

Robust

The interface and information should not result in failure conditions for either the consumer or target tool. Both ends of the interactions need to rely on the stability of the interface to then be able to interoperate.

5.1. Introspection

We used to include a bounded introspection option, but it turned out not to be worth the added complexity in the consumer and tool.

The consumer can use a single method to query the target tool and obtain all the capabilities that are available or specifically requested. The use case supported is for unbounded introspection of the available capabilities with a single valueless --std-info option.

Unbounded introspection simply returns everything the tool is capable of doing. The tool has the option to respond with either all minimal single (aka bare) versions or full version ranges. Either can be trivially implemented by tools, as most of the time it can be a hard-wired response text.

Running a tool with the option would look like the following:

$ tool --std-info

And could produce this as a minimal JSON output to indicate the single version of the capabilities it supports:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "1.0.0"
}

Or could produce this as a JSON output in the case of full version ranges:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1,2.5]"
}

This would indicate that the tool supports only the introspection capability, at versions "1.0.0" through "2.5.0".
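As a rough illustration of how little a tool needs in order to answer the query, the following Python sketch returns a hard-wired response when the valueless --std-info option is present. The names CAPABILITIES and handle_std_info are hypothetical and not part of this proposal; only the option name and the JSON members come from the text above.

```python
import json

# Hypothetical hard-wired capability table for an illustrative tool.
CAPABILITIES = {
    "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
    "std:info": "1.0.0",
}


def handle_std_info(argv):
    """Return the JSON text to print if a valueless --std-info is given, else None."""
    if "--std-info" in argv:
        return json.dumps(CAPABILITIES, indent=2)
    return None
```

A tool's entry point could call handle_std_info with its argument list and print the result when it is not None.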

5.2. Declaration

The consumer can inform, i.e. declare, to the target tool that specific capabilities should use particular versions when responding with information using one or more --std-info=<VersionSpec> options. The declarations can only exist in tandem with options for the mentioned capabilities. It’s expected that a consumer will first introspect a target tool to discover what it supports, and then declare to the target tool what version(s) of the capabilities it is willing to consume. The target tool can then either accept the declared capability versions or indicate an error.

An exchange between a consumer and target tool would begin with the introspection:

tool "--std-info"

With a target tool response:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1,2)",
  "gcc:extra": "[2.1]"
}

Which the consumer can use to declare the specific capability versions:

tool "--std-info=std:info=1.0.0" "--std-info=gcc:extra=2.1.0" ...
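On the tool side, accepting or rejecting a declaration could be sketched as follows. This is only an illustration under stated assumptions: it uses the --std-info=<VersionSpec> form described at the start of this section, the SUPPORTED table mirrors the example response above, and the function names are hypothetical.

```python
# Hypothetical table of what this illustrative tool supports:
# capability -> (low, low_inclusive, high, high_inclusive)
SUPPORTED = {
    "std:info": ((1, 0, 0), True, (2, 0, 0), False),  # "[1,2)"
    "gcc:extra": ((2, 1, 0), True, (2, 1, 0), True),  # "[2.1]"
}


def parse_version(text):
    """Turn "2.1" into (2, 1, 0), padding missing components with zero."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def check_declaration(option):
    """Return (capability, version) if accepted; raise ValueError otherwise."""
    prefix = "--std-info="
    if not option.startswith(prefix):
        raise ValueError("not a declaration option: " + option)
    capability, sep, version_text = option[len(prefix):].partition("=")
    if sep != "=" or capability not in SUPPORTED:
        raise ValueError("unknown capability: " + capability)
    version = parse_version(version_text)
    low, low_inc, high, high_inc = SUPPORTED[capability]
    above = low < version or (low_inc and low == version)
    below = version < high or (high_inc and version == high)
    if not (above and below):
        raise ValueError("unsupported version: " + version_text)
    return capability, version
```

Raising an exception stands in for "indicate an error"; a real tool would choose its own diagnostics and exit status.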

5.3. Levels

For some use cases it helps to simplify the extent of information the introspection understands. While it would be reasonable to expect a tool written in a modern general purpose programming language to fully implement all aspects of the introspection, it would not be practical to have a shell script parse version number ranges and match them together. To support such use cases the introspection has to support levels "min" and "full".

Obviously the "full" level equates to the tool understanding all the arguments and values. The "min" level only understands these:

  • Only introspection --std-info option.

  • Single version number in the responses for --std-info.

This has the effect that a tool which only supports the "min" level can only support specific versions of the capabilities it implements. But it also means that consumers will need to adjust their behavior to the tool instead of being able to ask the tool to adjust to the consumer. Consequently the consumer will likely have the more complex logic to do that adjustment.
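A consumer might decide which level it is talking to along the following lines. The sketch assumes the convention given in the wording later in this paper, where a single version as the value of std:info signals the minimum level and a version range signals the full level. The function name tool_level is hypothetical.

```python
import json


def tool_level(response_text):
    """Classify an introspection response as "min" or "full" from its std:info value."""
    value = json.loads(response_text)["std:info"]
    # A version range starts with an opening bracket; a single version does not.
    return "full" if value.startswith(("[", "(")) else "min"
```

With a "min" result the consumer would skip declarations entirely and adapt to the single versions reported.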

5.4. Capabilities

In this proposal a capability refers to any published coherent target tool interface. This can include any single interface, like a single target tool option. Or it can include a collective interface of the target tool that covers many options. A capability is specified as a series of "scoped" identifiers separated by colons (":"). The capability must match this regular expression: [2]

^[a-z_]+(:[a-z_]+)+$

At minimum a capability has two components. The first component is a general scope that identifies if the capability is one in the IS, or if it’s a tool vendor capability.

Standard

A capability with a scope of std indicates that it’s defined in the IS. [1]

Vendor

Any other capability, i.e. other than std, is available for vendors to use as extensions outside the IS. [1]
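The identifier rules can be checked with a few lines of code. The sketch below applies the regular expression given above and classifies the first component as standard or vendor; the function name capability_scope and the returned labels are illustrative assumptions.

```python
import re

# The capability identifier pattern from this section.
CAPABILITY_RE = re.compile(r"^[a-z_]+(:[a-z_]+)+$")


def capability_scope(identifier):
    """Return "standard" or "vendor" for a valid identifier, None if it is invalid."""
    if not CAPABILITY_RE.fullmatch(identifier):
        return None
    scope = identifier.split(":", 1)[0]
    return "standard" if scope == "std" else "vendor"
```

For example, capability_scope("std:info") yields "standard" and capability_scope("gcc:extra") yields "vendor", while a bare "std" is rejected because it has only one component.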

5.5. Version Specification

When indicating the version, or versions, to the target tool or the consumer the version information is specified in two possible forms: a single version, or a single version range.

5.5.1. Single Version

A single version in this proposal is composed of one to three whole numbers separated by dots. The numbers are expected to be strictly increasing, but otherwise do not impart any meaning to the components. Specifically, this does not impart any form of semantic meaning between versions, although the specification of the capabilities themselves may define such a semantic meaning. The format for the version must match the regular expression: [2]

^[0-9]+([.][0-9]+){0,2}$
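A minimal sketch of parsing a single version follows. It validates against the regular expression above and pads missing components with zero, matching the equivalence stated in the wording (N is equivalent to N.0.0). The name parse_version is hypothetical.

```python
import re

# The single version pattern from this section.
VERSION_RE = re.compile(r"^[0-9]+([.][0-9]+){0,2}$")


def parse_version(text):
    """Parse "1", "1.2" or "1.2.3" into a 3-tuple of ints, padding with zeros."""
    if not VERSION_RE.fullmatch(text):
        raise ValueError("not a version: " + repr(text))
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))
```

Representing versions as tuples keeps later comparisons simple, since each component is compared as a whole number rather than as text.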

5.5.2. Version Range

A version range in this proposal indicates a lower and upper bound of versions. It is composed of a pair of versions, separated by a comma, and bracketed by either an inclusive or exclusive symbol. This matches the intuition of a mathematical interval, but with the use of the version triplet number line. [3] As in interval notation, the () brackets indicate an exclusive point and the [] brackets indicate an inclusive point. As versions are decidedly not single integers we use a , (comma) to separate the start and end of the range instead of using "..". Hence the format for the version range must match the regular expression: [2]

^[[(][0-9]+([.][0-9]+){0,2},[0-9]+([.][0-9]+){0,2}[)\]]$
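Parsing a version range could look like the sketch below. It is an illustration only: the tuple layout (low, low_inclusive, high, high_inclusive) and the name parse_range are assumptions, and the opening bracket is written as \[ inside the character class because Python's re module warns about an unescaped [ there.

```python
import re

_NUM = r"[0-9]+(?:[.][0-9]+){0,2}"
# Opening bracket, low version, comma, high version, closing bracket.
RANGE_RE = re.compile(r"^([\[(])(" + _NUM + r"),(" + _NUM + r")([)\]])$")


def _triple(text):
    """Turn "2.5" into (2, 5, 0), padding missing components with zero."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def parse_range(text):
    """Parse "[1,2.5)" into (low, low_inclusive, high, high_inclusive)."""
    match = RANGE_RE.fullmatch(text)
    if match is None:
        raise ValueError("not a version range: " + repr(text))
    open_bracket, low, high, close_bracket = match.groups()
    return (_triple(low), open_bracket == "[", _triple(high), close_bracket == "]")
```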

5.6. Version Matching

When given two version specifications tools will need to match the two to determine the sub-range that is compatible with both. There are two aspects to doing that matching: comparing the two single versions, and evaluating the sub-range interval.

5.6.1. Single Version Comparison

Comparing two single versions equates to three-way comparing each of the components of both, a and b, as:

  1. If the whole numbers of the first components, i and j, are not equal the comparison is either a < b or a > b if i < j or i > j respectively. Otherwise,

  2. If the whole numbers of the second components, k and l, are not equal the comparison is either a < b or a > b if k < l or k > l respectively. Otherwise,

  3. If the whole numbers of the third components, m and n, are not equal the comparison is either a < b or a > b if m < n or m > n respectively. Otherwise,

  4. The versions are equal, i.e. a == b.
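The four steps above can be sketched as a small three-way comparison. The sketch assumes both versions have already been normalized to three whole-number components; the name compare_versions and the -1/0/1 return convention are illustrative choices.

```python
def compare_versions(a, b):
    """Three-way compare two versions given as 3-tuples of ints.

    Returns -1 if a < b, 1 if a > b, and 0 if they are equal.
    """
    for x, y in zip(a, b):
        # The first differing component decides the result (steps 1 to 3).
        if x != y:
            return -1 if x < y else 1
    # All components are equal (step 4).
    return 0
```

Comparing components as integers matters: (1, 10, 0) orders after (1, 9, 0), which a plain string comparison of "1.10.0" and "1.9.0" would get wrong.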

5.6.2. Range Comparison

Tools will need to compare either a single version to a version range, or a version range to another range to determine the overlapping version sub-range. The single version to a version range comparison can be reformulated to a range-to-range comparison. I.e. a comparison of a single version a to a range b is equivalent to a comparison of range [a,a] to range b. Hence we only need to consider the range-to-range comparison, although implementations may use special cases for comparing single-to-range and range-to-single. Range-to-range should follow something like the following to compare a range a,b to m,n, with some varied inclusive or exclusive ends:

  1. If b < m or n < a the range is empty.

  2. Otherwise, assign a partial range x,y = max(a,m), min(b,n).

  3. If a or m are inclusive, then:

    1. If b or n are inclusive, then the range is [x,y].

    2. Otherwise, the range is [x,y).

  4. Otherwise, if b or n are inclusive, then the range is (x,y].

  5. Otherwise, the range is (x,y).
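The following sketch transcribes the steps above literally. It is not a definitive implementation: the tuple layout (low, low_inclusive, high, high_inclusive) and the name intersect are assumptions, versions are taken to be 3-tuples of whole numbers, and the inclusivity rule is exactly the "or" of the steps as written. A tool may well need a stricter rule for mixed inclusive and exclusive ends.

```python
def intersect(r1, r2):
    """Return the overlapping sub-range of two ranges, or None if it is empty.

    Each range is (low, low_inclusive, high, high_inclusive).
    """
    a, a_inc, b, b_inc = r1
    m, m_inc, n, n_inc = r2
    # Step 1: no overlap at all.
    if b < m or n < a:
        return None
    # Step 2: the partial range x,y.
    x, y = max(a, m), min(b, n)
    # Steps 3 to 5: an end is inclusive if either input end is inclusive.
    return (x, a_inc or m_inc, y, b_inc or n_inc)
```

Python compares tuples component by component, which coincides with the single version comparison described earlier.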

5.7. Format

The information reported by introspection is a JSON [4] format document. Some advantages to using JSON:

  • It is widely used and available either natively or through libraries in many programming languages. Which is particularly important as C++ tools are written in an array of differing programming languages.

  • It is a simple format to understand by both programs and humans.

In maintaining our goals of the interface being minimal, concise, and robust, the format for communicating the capabilities is a single key/value collection, i.e. a JSON object. [4]

Capability Identifier

The key is a string with the capability identifier. The format of the key is as described in the Capabilities section.

Version Specification

The value indicates the versions supported by the tool for the capability. The value follows the format described in the Version Specification section.

In addition to the capability identifier / version specification members, there are additional special members:

Schema

The document can also specify a reference to a JSON Schema. [5] For this the key would be $schema, and the value would be a URI to a published stable schema (https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json).

There is one designated capability that is required to appear in the document: The std:info capability with a corresponding version specification. This requirement allows a consumer to identify the format of the rest of the document at all times.

This is a minimal conforming document:

{
  "std:info": "1.0.0"
}

This is also a minimal conforming document. But specifies a range of versions supported for the std:info capability:

{
  "std:info": "[1.0.0,2.0.0)"
}

This example adds a custom vendor capability and the schema reference:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1.0.0,2.0.0)",
  "gcc:extra": "1.5.0"
}

See the Wording for a JSON Schema for this format.
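A consumer that cannot rely on a JSON Schema validator could perform a basic check of the document by hand, roughly as follows. This is a sketch under stated assumptions: the function name check_document and the wording of the reported problems are hypothetical, and the patterns are the ones from the Capabilities and Version Specification sections.

```python
import json
import re

_NUM = r"[0-9]+(?:[.][0-9]+){0,2}"
CAPABILITY_RE = re.compile(r"^[a-z_]+(:[a-z_]+)+$")
# Either a single version or a bracketed pair of versions.
VERSION_SPEC_RE = re.compile(r"^(?:" + _NUM + r"|[\[(]" + _NUM + "," + _NUM + r"[)\]])$")


def check_document(text):
    """Return a list of problems found; an empty list means none were found."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        return ["root is not a JSON object"]
    problems = []
    if "std:info" not in doc:
        problems.append("missing std:info")
    for key, value in doc.items():
        if key == "$schema":
            continue  # special member, not a capability
        if not CAPABILITY_RE.fullmatch(key):
            problems.append("bad capability identifier: " + key)
        elif not isinstance(value, str) or not VERSION_SPEC_RE.fullmatch(value):
            problems.append("bad version specification for " + key)
    return problems
```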

5.8. Capability Versions

Capabilities and their versions are expected to work similarly to C++ feature-test macro versions ([version.syn]) in that they specify whether a feature of a standard is implemented and at what version. Although the meaning of the capability version is not defined, it’s recommended that it follow some simple rules:

  • The major-number should only change for large changes.

  • The minor-number should only change for fixes that are significant, but not large.

  • The patch-number should only change for fixes that are simple and small.

That is, it should roughly follow the industry understanding of semantic versioning. [6]

  • Each part of the version number should always increment, but;

  • The minor-number should reset to zero when the major-number increases, equivalently for the patch-number and minor-number.

These rules set it apart from the C++ feature macros in that they impart some meaning to a version relative to other versions.

5.9. Impact On The Standard

This specification adds new functionality that is partly required for programs. Other specifications that define program behavior will need to follow this specification for conformance to the Ecosystem IS.

6. Implementation Experience

None yet.

7. Polls

7.1. SG15: P2717R0 (2023-01-27)

SG15 wants to pursue defining in the Tooling IS a way for tools to provide portable information about which parts of the Tooling IS and vendor extensions they support.

SF  F  N  A  SA
5   3  0  0  0

8. Wording

Wording is relative to ecosystem-is/86cfbd7. [7]

8.1. Normative references

In [intro.refs] add:

JSON

ISO/IEC 21778:2017, Information technology — The JSON data interchange syntax

POSIX

ISO/IEC 9945:2009, Information technology — Portable Operating System Interface (POSIX®) Base Specifications, Issue 7

8.2. Specification: Conformance

Insert clause before Terms and definitions [intro.defs].

8.2.1. Conformance [cnf]

A conforming implementation shall meet the following criteria for conformance to this standard:

— An application shall support the minimum level functionality of introspection (intspct.min).

8.3. Definitions

Add the following to Terms and definitions [intro.defs].

8.3.1. application [defns.application]

a computer program that performs some desired function.

[ Note 1: From POSIX. — end note ]

8.3.2. capability [defns.capability]

an aspect of an overall specification that defines a subset of the entire specification.

8.3.3. file [defns.file]

an object that can be written to, or read from, or both.

[ Note 1: From POSIX. — end note ]

8.4. Specification: Introspection

Insert clause after Terms and definitions [intro.defs].

8.4.1. Introspection [intspct]

8.4.1.1. Preamble [intspct.pre]

This clause describes options, output, and formats that describe what capabilities of this standard an application supports. An application shall support the minimum level functionality (intspct.min). An application can support the full level functionality (intspct.full).

This clause specifies the std:info capability (intspct.cap).

8.4.1.2. Overview [intspct.overview]

application [--std-info[=declaration]] [--std-info-out=file]

8.4.1.3. Options [intspct.options]

The following options shall be supported:

--std-info [intspct.opt.info]

Outputs the version information of the capabilities supported by the application. The option can be specified zero or one time. The application shall support the option for minimum level (intspct.min) functionality.

--std-info-out=file [intspct.opt.out]

The pathname of a file to output the information to. If file is ‘-’, the standard output shall be used. The application shall support the option for minimum level (intspct.min) functionality. Not specifying this option while specifying the --std-info option (intspct.opt.info) shall be equivalent to also specifying a --std-info-out=- option.

The following options should be supported:

--std-info=declaration [intspct.opt.decl]

Declares the required capability version of the application. The option can be specified any number of times. The application shall support the option for full level (intspct.full) functionality.

8.4.1.4. Output [intspct.output]

An application shall output a valid JSON text file that conforms to the introspection schema (intspct.schema) to the file specified in the options (intspct.opt.out).

8.4.1.5. Schema [intspct.schema]

An introspection JSON text file shall contain one introspection JSON object (intspct.schema.obj).

8.4.1.5.1. Introspection Object [intspct.schema.obj]

The introspection object is the root JSON object of the introspection JSON text.

An introspection object can have the following fields.

JSON Schema Field [intspct.schema.schema]

Name: $schema
Type: string
Value: The value shall be a reference to a JSON Schema specification.
Description: An introspection object can contain this field. If an introspection object does not contain this field the value shall be a reference to the JSON Schema corresponding to the current edition of this standard.

capability [intspct.schema.cap]

Name: capability-identifier (intspct.cap)
Type: string
Value: The value shall be a version-number for minimum level functionality. Or the value shall be a version-range for full level functionality.
Description: An introspection object can contain this field one or more times. When the field appears more than one time the name of the fields shall be unique within the introspection object.

8.4.1.6. Capabilities [intspct.cap]
capability-identifier:

name scope-designator name sub-capability-identifier

sub-capability-identifier:

scope-designator name sub-capability-identifier

name: one or more of

a b c d e f g h i j k l m
n o p q r s t u v w x y z
_

scope-designator:

:

A capability-identifier is composed of two or more scope-designator delimited name parts.

The name std in a capability-identifier is reserved for capabilities defined in this standard.

Applications can specify vendor designated name parts defined outside of this standard.

8.4.1.7. Versions [intspct.vers]

A version shall be either a single version number (intspct.vers.num) or a version range (intspct.vers.range).

A single version number shall be equivalent to the inclusive version range spanning solely that single version number.

[ Note 1: That is the version number i.j.k is equivalent to version range [i.j.k,i.j.k]. — end note ]

8.4.1.7.1. Version Number [intspct.vers.num]
version-number:

major-number . minor-number . patch-number

major-number:

digits

minor-number:

digits

patch-number:

digits

digits: one or more of

0 1 2 3 4 5 6 7 8 9

A version number is composed of 1, 2, or 3 decimal numbers (digits) separated by a period (.).

A version number composed of 1 decimal number is equivalent to that decimal number followed by .0.0.

[ Note 1: That is the version number N is equivalent to N.0.0. — end note ]

A version number composed of 2 decimal number parts is equivalent to those decimal number parts followed by .0.

[ Note 2: That is the version number N.M is equivalent to N.M.0. — end note ]

Version numbers define a total ordering where version number a with parts i, j, k is ordered before version number b with parts p, q, r when: i < p, or i == p and j < q, or i == p and j == q and k < r.

Otherwise version number a is ordered after version number b when: i > p, or i == p and j > q, or i == p and j == q and k > r.

Otherwise version number a is the same as version number b.

8.4.1.7.2. Version Range [intspct.vers.range]
version-range:

version-range-min-bracket version-min-number , version-max-number version-range-max-bracket

version-min-number:

version-number

version-max-number:

version-number

version-range-min-bracket: one of

[ (

version-range-max-bracket: one of

) ]

A version range is composed of either one version number bracketed, or two version numbers separated by a comma (,) and bracketed.

[ Example 1:
[1.0.0]
A version range with a single version number. — end example ]

[ Example 2:
[1.0.0,2.0.0]
A version range with two version numbers. — end example ]

A version range a that is [i,j] makes i and j inclusive version range numbers.

A version range a that is (i,j) makes i and j exclusive version range numbers.

A version range a that is (i,j] makes i an exclusive version number.

A version range a that is [i,j) makes j an exclusive version number.

A version range with a single inclusive version number x is equivalent to the version range [x,x].

A version range with a single exclusive version number x is invalid.

An exclusive version number x does not include the version number x when compared to another version number y.

A version range a with version numbers i and j when compared to a version range b with version numbers m and n will result in an empty version range when: j < m or n < i.

Otherwise if i or m are inclusive version numbers and if j or n are inclusive version numbers the resulting range when a is compared to b is the inclusive version numbers "greater of i and m" and "lesser of j and n".

Otherwise if i or m are inclusive version numbers the resulting range when a is compared to b is the inclusive version number "greater of i and m" and the exclusive version number "lesser of j and n".

Otherwise if j or n are inclusive version numbers the resulting range when a is compared to b is the exclusive version number "greater of i and m" and the inclusive version number "lesser of j and n".

Otherwise the resulting range when a is compared to b is the exclusive version numbers "greater of i and m" and "lesser of j and n".

8.4.1.8. Minimum Level [intspct.min]

An application that supports the minimum level functionality indicates it by specifying a single version (intspct.vers.num) as the value of the std:info capability (intspct.cap).

[ Example 1:
{ "std:info": "1.0.0" }
end example ]

8.4.1.9. Full Level [intspct.full]

An application can support the full level functionality as defined in this section. An application that reports supporting the full level functionality shall support all of the functionality in this section.

An application that supports the full level functionality indicates it by specifying a version range (intspct.vers.range) as the value of the std:info capability (intspct.cap).

[ Example 1:
{ "std:info": "[1.0.0]" }
end example ]

8.4.1.10. Introspection Information [intspct.info]

An application shall output an introspection schema (intspct.schema) that contains one capability field for each capability that the application supports when given the --std-info option (intspct.opt.info).

An application shall indicate the single version (intspct.vers.num) or version range (intspct.vers.range) of each capability it supports as the value of the capability field.

8.4.1.11. Introspection Declaration [intspct.dcl]

An application that supports the full level functionality when given one or more --std-info=declaration options shall conform its functionality to the indicated edition of this standard in the given declaration version-number for the given capability.

declaration

capability-identifier = version-number

An application, when not given a --std-info=declaration option for a capability it supports, should conform its functionality to the most recent version of the standard it supports for that capability.

An application, when given a capability declaration option and the given version is outside of the version range that the application supports, should indicate an error.

8.5. JSON Schema

Insert clause before Bibliography.

8.5.1. Annex A (informative) Tool Introspection JSON Schema [intsjschm]

8.5.1.1. General [intsjschm.general]

This Annex defines the introspection capability schema (intspct.schema) in terms of a JSON Schema. A JSON Schema refers to the IETF RFC draft "JSON Schema: A Media Type for Describing JSON Documents" as specified in https://json-schema.org/draft/2020-12/json-schema-core.html.

This JSON Schema can be referenced as the $schema field with URI value of "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json".

8.5.1.2. JSON Schema Specification [intsjschm.spec]
{
	"$schema": "https://json-schema.org/draft/2020-12/schema",
	"$id": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
	"title": "Tool Introspection Version 1.0.0 JSON Schema",
	"required": [
		"std:info"
	],
	"$defs": {
		"VersionSpec": {
			"type": "string",
			"pattern": "^(([0-9]+([.][0-9]+){0,2})|([[(][0-9]+([.][0-9]+){0,2},[0-9]+([.][0-9]+){0,2}[)\\]]))$"
		}
	},
	"anyOf": [
		{
			"type": "object",
			"properties": {
				"$schema": {
					"description": "The URI of the JSON schema corresponding to the version of the tool introspection format.",
					"type": "string",
					"format": "uri"
				},
				"std:info": {
					"description": "The Tool Introspection format version.",
					"$ref": "#/$defs/VersionSpec"
				}
			}
		},
		{
			"type": "object",
			"propertyNames": {
				"type": "string",
				"pattern": "^[a-z_]+(:[a-z_]+)+$"
			},
			"patternProperties": {
				"": {
					"$ref": "#/$defs/VersionSpec"
				}
			}
		}
	]
}

9. Examples

9.1. Portable Command Lines

Assuming that the Ecosystem IS specifies a common set of portable command line compiler options an interaction between a build system (or user at a command prompt) and a compiler could look like:

The build system asks the compiler for supported capabilities:

$ c++ --std-info
{ "std:info": "[1]", "std:cli:c++": "[1]" }

The build system would likely want to cache that information as it’s likely to be static for the release of the compiler.

The build system could then declare and use any such portable compiler options:

$ c++ --std-info=std:cli:c++=1 -std=c++26 -I /home/user/boost -o myapp main.cpp

The interaction when the compiler tool only supports the minimum level would be:

$ c++ --std-info
{ "std:info": "1", "std:cli:c++": "1" }
$ c++ -std=c++26 -I /home/user/boost -o myapp main.cpp

This example predicts that it might be useful outside of the C++ ecosystem by using std:cli:c++ to indicate a C++ specific command line. It could be that Fortran uses a different, but possibly overlapping CLI:

$ gcc --std-info
{ "std:info": "1", "std:cli:c++": "1", "gcc:cli:fortran": "1" }
$ gcc -std=f2018 -o myapp main.fpp

1. https://wg21.link/P2656 C++ Ecosystem International Standard
2. ECMAScript® 2022 language specification, 13th edition, June 2022 (https://www.ecma-international.org/publications-and-standards/standards/ecma-262/)
3. Wikipedia: Interval (mathematics) (https://en.wikipedia.org/wiki/Interval_(mathematics))
4. ISO/IEC 21778:2017 Information technology — The JSON data interchange syntax, (https://www.iso.org/standard/71616.html)
5. JSON Schema: A Media Type for Describing JSON Documents (http://json-schema.org/latest/json-schema-core.html)
6. Semantic Versioning (https://semver.org/)
7. Working Draft, C++ Ecosystem International Standard 2023-04-01 (https://github.com/cplusplus/ecosystem-is/tree/86cfbd72b9fae73554935d5f7039a798b5583628)