Document number:

ISO/IEC/JTC1/SC22/WG21/P2717R0

Date:

2022-12-16

Audience:

SG15

Reply-to:

René Ferdinand Rivera Morell - grafikrobot at gmail dot com

1. Abstract

We propose a mechanism for C++ tools to communicate which capabilities of the Ecosystem IS [1] they implement.

2. Revision History

2.1. Revision 0 (December 2022)

Initial text.

3. Motivation

C++ tools will implement the aspects of the Ecosystem IS [1] that are relevant to the particular tool, and they may implement a particular edition of those aspects. To allow other tools to adjust their behavior to accommodate such differences we need a mechanism of introspection for all tools. Additionally, when one tool requests to use another tool’s Ecosystem IS aspect, it’s desirable to consistently communicate which edition(s) of that aspect it can use.

4. Design

There are two aspects that this proposal covers:

Introspection

A tool reporting its capabilities to a consumer.

Declaration

A consumer specifying the capability edition and version.

Introspection would allow a consumer to ask the target tool if it implements a particular set of capabilities. The target tool would respond with the range of capabilities, or nothing, that it supports. With that information the consumer can go ahead and follow the defined standard, in the Ecosystem IS [1], to further interact with the target tool.

For declaration a consumer can specify a particular capability and a version to interact with. And if the target tool recognizes the specification it can continue to process the consumer’s use of that capability.

Even though these are two separate functions they are by necessity tied to each other. In order for this pairing to work, and generally for tool interoperability to work, the tool consumers and target tools must operate on this minimal pair of functions to bootstrap their interactions. To make that possible, this design follows some basic tenets:

Minimal

The interface of the target tool is a single universal command line argument for each of the two operations.

Concise

The information communicated to and from the target tool and consumer is as brief as needed to convey the required information.

Robust

The interface and information should not result in failure conditions for either the consumer or target tool. Both ends of the interactions need to rely on the stability of the interface to then be able to interoperate.

4.1. Introspection

The consumer can use a single method to query the target tool and obtain all the capabilities that are available or specifically requested. The following two use cases are supported:

  1. Unbounded introspection of the available capabilities with a single valueless --std-info option.

  2. Bounded introspection of particular capabilities with a single query valued --std-info=<VersionSpec> option.

4.1.1. Unbounded

An unbounded introspection is the simplest form of obtaining the capabilities: it simply returns everything the tool is capable of doing. It is expected that this will be the most commonly used and implemented method of obtaining this information, because it is the easiest for tools to implement: the tool only needs a hard-wired result ready to output when needed. The drawback is that the consumer has more information to parse and compare to decide how to interface with the target tool.

Running a tool with the option would look like the following:

tool --std-info

And could produce this as a JSON output:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1.0.0,2.5.0]"
}

This minimally indicates that the tool supports only the introspection capability, at versions "1.0.0" through "2.5.0".
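
The hard-wired nature of the unbounded case can be illustrated with a minimal Python sketch of the tool side. The function name std_info_output and the CAPABILITIES table are hypothetical names chosen for illustration; only the --std-info option and the JSON members come from this proposal.

```python
import json

# Hard-wired capabilities of a hypothetical tool (values from the example above).
CAPABILITIES = {
    "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
    "std:info": "[1.0.0,2.5.0]",
}


def std_info_output(argv):
    """Return the JSON text for a valueless --std-info option, or None if absent."""
    if "--std-info" in argv:
        return json.dumps(CAPABILITIES, indent=2)
    return None


if __name__ == "__main__":
    text = std_info_output(["tool", "--std-info"])
    if text is not None:
        print(text)
```

A real tool would hook this into its existing option handling; the sketch only shows that no version logic is needed to answer an unbounded query.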

4.1.2. Bounded

A bounded introspection makes it possible to specify particular capabilities that a consumer is looking for in a target tool. By giving a query to the target tool the consumer can get an answer for just the capabilities they care about. This is particularly useful in cases where the consumer only supports some versions of a capability and prefers to not implement the version comparison logic to determine this from the unbounded introspection.

The bounded introspection works equivalently to the unbounded case except the --std-info option takes a version specification value to say what capabilities to filter the results by. For example running:

tool "--std-info=std:info=[1.0.0,2.1.0)"

Could produce this as a JSON output:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1.5.0,2.0.0]"
}

Here the tool is saying that it only supports a subset of what the consumer asked about. It should also be possible to query about multiple capabilities of the target tool by using multiple --std-info options.

tool "--std-info=std:info=[1.0.0,2.1.0)" "--std-info=gcc:extra=[2.0.0,2.1.0]"

In this example the target tool would return multiple capabilities, if supported:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1.0.0,2.0.0)",
  "gcc:extra": "2.1.0"
}
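
On the tool side, the first step of a bounded introspection is collecting the queries from the command line. The following Python sketch assumes that each option value has the form capability=version-specification, split at the first "=" after the option name, and that malformed queries are skipped rather than treated as errors (an assumption in the spirit of the "Robust" goal). The function name parse_std_info_queries is hypothetical.

```python
def parse_std_info_queries(argv):
    """Collect {capability: version_spec} from --std-info=<capability>=<spec> options."""
    prefix = "--std-info="
    queries = {}
    for arg in argv:
        if arg.startswith(prefix):
            # Split the option value at the first "=": capability on the left,
            # version specification on the right.
            capability, sep, spec = arg[len(prefix):].partition("=")
            if sep and capability and spec:
                queries[capability] = spec
            # Assumption: a query without "=" is ignored instead of failing.
    return queries


if __name__ == "__main__":
    print(parse_std_info_queries([
        "--std-info=std:info=[1.0.0,2.1.0)",
        "--std-info=gcc:extra=[2.0.0,2.1.0]",
    ]))
```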

4.2. Declaration

The consumer can declare to the target tool which versions of specific capabilities to use when responding, by passing one or more --std-decl=<VersionSpec> options. The declarations can only exist in tandem with options for the mentioned capabilities. It’s expected that a consumer will first introspect a target tool to discover what it supports, and then declare to the target tool what version(s) of the capabilities it is willing to consume. The target tool can then respond with the versions of the capabilities that satisfy both the consumer and its own preference.

An exchange between a consumer and target tool would begin with the introspection:

tool "--std-info=std:info=[1.0.0,2.1.0)" "--std-info=gcc:extra=[2.0.0,2.1.0]"

With a target tool response:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1.0.0,2.0.0)",
  "gcc:extra": "2.1.0"
}

Which the consumer can use to declare the specific capability versions:

tool "--std-decl=std:info=2.0.0" "--std-decl=gcc:extra=2.1.0" ...
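
On the consumer side, producing the declaration options is a matter of formatting the chosen versions. This Python sketch assumes the consumer has already picked one version per capability; how it picks is outside the sketch. The function name std_decl_args is hypothetical.

```python
def std_decl_args(chosen_versions):
    """Format {capability: version} as a list of --std-decl options."""
    return [
        f"--std-decl={capability}={version}"
        for capability, version in chosen_versions.items()
    ]


if __name__ == "__main__":
    # Versions taken from the declaration example above.
    command = ["tool", *std_decl_args({"std:info": "2.0.0", "gcc:extra": "2.1.0"})]
    print(command)
```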

4.3. Capabilities

In this proposal a capability refers to any published, coherent target tool interface. This can be a single interface, like a single target tool option, or a collective interface of the target tool that covers many options. A capability is specified as a series of "scoped" identifiers separated by colons (":"). The capability must match this regular expression: [2]

^[a-z_]+(:[a-z_]+)+$

At minimum a capability has two components. The first component is a general scope that identifies if the capability is one in the IS, or if it’s a tool vendor capability.

Standard

A capability with a scope of std indicates that it’s defined in the IS. [1]

Vendor

Any other capability, i.e. other than std, is available for vendors to use as extensions outside the IS. [1]
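
As an illustration, a consumer could validate a capability identifier and classify its scope as in the following Python sketch. The regular expression is the one given above; the function name capability_scope and the returned labels are hypothetical.

```python
import re

# Capability identifier pattern from this section.
CAPABILITY_RE = re.compile(r"^[a-z_]+(:[a-z_]+)+$")


def capability_scope(capability):
    """Return "standard" for std-scoped capabilities, "vendor" for any other
    scope, or None if the identifier is not well-formed."""
    if not CAPABILITY_RE.fullmatch(capability):
        return None
    scope = capability.split(":", 1)[0]
    return "standard" if scope == "std" else "vendor"


if __name__ == "__main__":
    for name in ("std:info", "gcc:extra", "std"):
        print(name, capability_scope(name))
```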

4.4. Version Specification

When indicating the version, or versions, to the target tool or the consumer the version information is specified in two possible forms: a single version, or a single version range.

4.4.1. Single Version

A single version in this proposal is composed of a dotted triplet of whole numbers. The numbers are expected to be strictly increasing, but otherwise no meaning is attached to the components. Specifically, this proposal does not impart any form of semantic meaning between versions, although the specification of the capabilities themselves may define such a semantic meaning. The format for the version must match the regular expression: [2]

^[0-9]+[.][0-9]+[.][0-9]+$
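
A minimal Python sketch of parsing a single version into a triple of integers, using the regular expression above. The function name parse_version is hypothetical, and raising ValueError on malformed input is an assumption of the sketch, not part of this proposal.

```python
import re

# Single version pattern from this section.
VERSION_RE = re.compile(r"^[0-9]+[.][0-9]+[.][0-9]+$")


def parse_version(text):
    """Parse "N.N.N" into a tuple of three ints; raise ValueError if malformed."""
    if not VERSION_RE.fullmatch(text):
        raise ValueError(f"not a version triplet: {text!r}")
    first, second, third = (int(part) for part in text.split("."))
    return (first, second, third)


if __name__ == "__main__":
    print(parse_version("1.5.0"))
```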

4.4.2. Version Range

A version range in this proposal indicates a lower and upper bound of versions. It is composed of a pair of versions, separated by a comma, and bracketed by either an inclusive or exclusive symbol. This matches the intuition of a mathematical interval, but on the number line of version triplets. [3] As in interval notation, the () brackets indicate an exclusive end point and the [] brackets indicate an inclusive end point. As versions are decidedly not single integers we use a , (comma) to separate the start and end of the range instead of using dots. Hence a version specification, either a single version or a version range, must match the regular expression: [2]

^(([0-9]+[.][0-9]+[.][0-9]+)|([[(][0-9]+[.][0-9]+[.][0-9]+,[0-9]+[.][0-9]+[.][0-9]+[)\]]))$
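
The following Python sketch parses a version specification into its bounds and their inclusivity. It uses an equivalent pattern with named groups rather than the expression above verbatim, and it represents a single version v as the range [v,v]; both choices, and the name parse_version_spec, are assumptions made for illustration.

```python
import re

_V = r"[0-9]+[.][0-9]+[.][0-9]+"
# Either a single version, or a bracketed pair of versions separated by a comma.
SPEC_RE = re.compile(
    rf"(?P<single>{_V})"
    rf"|(?P<open>[\[(])(?P<low>{_V}),(?P<high>{_V})(?P<close>[)\]])"
)


def _triple(text):
    return tuple(int(part) for part in text.split("."))


def parse_version_spec(text):
    """Return (low, low_inclusive, high, high_inclusive); raise ValueError if malformed."""
    match = SPEC_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a version specification: {text!r}")
    if match.group("single") is not None:
        version = _triple(match.group("single"))
        return (version, True, version, True)
    return (
        _triple(match.group("low")),
        match.group("open") == "[",
        _triple(match.group("high")),
        match.group("close") == "]",
    )


if __name__ == "__main__":
    print(parse_version_spec("[1.0.0,2.1.0)"))
```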

4.5. Version Matching

When given two version specifications, tools will need to match the two to determine the sub-range that is compatible with both. There are two aspects to doing that matching: comparing two single versions, and evaluating the sub-range interval.

4.5.1. Single Version Comparison

Comparing two single versions, a = i.k.m and b = j.l.n, equates to three-way comparing each of their components in order:

  1. If the whole numbers of the first components, i and j, are not equal the comparison is either a < b or a > b if i < j or i > j respectively. Otherwise,

  2. If the whole numbers of the second components, k and l, are not equal the comparison is either a < b or a > b if k < l or k > l respectively. Otherwise,

  3. If the whole numbers of the third components, m and n, are not equal the comparison is either a < b or a > b if m < n or m > n respectively. Otherwise,

  4. The versions are equal, i.e. a == b.
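
The steps above can be sketched in Python as a component-by-component comparison. The function name compare_versions and the -1/0/1 return convention are assumptions of the sketch, and the inputs are assumed to already be well-formed version triplets.

```python
def compare_versions(a, b):
    """Three-way compare two "N.N.N" versions: -1 if a < b, 1 if a > b, 0 if equal."""
    a_parts = [int(part) for part in a.split(".")]
    b_parts = [int(part) for part in b.split(".")]
    # Walk the first, second, and third components in order; the first
    # pair that differs decides the result.
    for x, y in zip(a_parts, b_parts):
        if x != y:
            return -1 if x < y else 1
    return 0


if __name__ == "__main__":
    print(compare_versions("1.2.3", "1.10.0"))
```

Comparing the components as whole numbers, not as text, matters: "1.10.0" is greater than "1.2.3" even though it sorts earlier as a string.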

4.5.2. Range Comparison

Tools will need to compare either a single version to a version range, or a version range to another range, to determine the overlapping version sub-range. The single-version-to-range comparison can be reformulated as a range-to-range comparison, i.e. a comparison of a single version a to a range b is equivalent to a comparison of range [a,a] to range b. Hence we only need to consider the range-to-range comparison, although implementations may use special cases for comparing single-to-range and range-to-single. Range-to-range should follow something like the following to compare a range a,b to m,n, with some varied inclusive or exclusive ends:

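  1. If b < m or n < a the range is empty.

  2. Otherwise, assign a partial range x,y = max(a,m), min(b,n).

  3. The lower end x is inclusive if every input range whose lower bound equals x is inclusive at that bound. Otherwise it is exclusive.

  4. The upper end y is inclusive if every input range whose upper bound equals y is inclusive at that bound. Otherwise it is exclusive.

  5. If x == y and either end is exclusive, the range is empty. Otherwise the range is x,y with the ends bracketed as determined in the previous two steps.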
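
The following Python sketch computes the overlap of two ranges. It is an illustration, not a normative algorithm, and it assumes the conventional interval rule: each end of the result takes its inclusivity from whichever input bound is selected, and when both inputs share the same bound that end is inclusive only if both inputs include it. The tuple layout (low, low_inclusive, high, high_inclusive) and the function name intersect are hypothetical.

```python
def intersect(r1, r2):
    """Return the overlap of two ranges, or None if the overlap is empty.

    A range is (low, low_inclusive, high, high_inclusive), with versions
    given as tuples of three ints so that they compare component by component.
    """
    a, a_inc, b, b_inc = r1
    m, m_inc, n, n_inc = r2
    if b < m or n < a:
        return None
    # Lower end: the larger of the two lower bounds.
    if a == m:
        x, x_inc = a, a_inc and m_inc
    elif a > m:
        x, x_inc = a, a_inc
    else:
        x, x_inc = m, m_inc
    # Upper end: the smaller of the two upper bounds.
    if b == n:
        y, y_inc = b, b_inc and n_inc
    elif b < n:
        y, y_inc = b, b_inc
    else:
        y, y_inc = n, n_inc
    # A single point only survives if both ends include it.
    if x == y and not (x_inc and y_inc):
        return None
    return (x, x_inc, y, y_inc)


if __name__ == "__main__":
    tool = ((1, 0, 0), True, (2, 5, 0), True)       # [1.0.0,2.5.0]
    consumer = ((1, 0, 0), True, (2, 1, 0), False)  # [1.0.0,2.1.0)
    print(intersect(tool, consumer))
```

A single version v can be passed as (v, True, v, True), matching the [a,a] reformulation described above.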

4.6. Format

The information reported by introspection is a JSON [4] format document. Some advantages to using JSON:

  • It is widely used and available, either natively or through libraries, in many programming languages. This is particularly important as C++ tools are written in an array of differing programming languages.

  • It is a simple format to understand by both programs and humans.

In maintaining our goals of the interface being minimal, concise, and robust, the format for communicating the capabilities is a single key/value collection, i.e. a JSON object. [4]

Capability Identifier

The key is a string with the capability identifier. The format of the key is as described in the Capabilities section.

Version Specification

The value indicates the versions supported by the tool for the capability. The versions follow the format described in the Version Specification section.

In addition to the capability identifier / version specification members, there are additional special members:

Schema

The document can also specify a reference to a JSON Schema. [5] For this the key would be $schema, and the value would be a URI to a published stable schema (https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json).

There is one designated capability that is required to appear in the document: The std:info capability with a corresponding version specification. This requirement allows a consumer to identify the format of the rest of the document at all times.

This is a minimal conforming document:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "1.0.0"
}

This is also a minimal conforming document, but it specifies a range of versions supported for the std:info capability:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1.0.0,2.0.0)"
}

This example adds a custom vendor capability:

{
  "$schema": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
  "std:info": "[1.0.0,2.0.0)",
  "gcc:extra": "1.5.0"
}

See the Wording for a JSON Schema for this format.
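
Without a JSON Schema validator at hand, a consumer could still perform a lightweight check of a response using only the rules in this section. The Python sketch below is such a check under stated assumptions: it requires the std:info member, skips the $schema member, and verifies every other key and value against the capability and version specification formats. The function name check_std_info_document and the wording of the returned messages are hypothetical.

```python
import json
import re

_V = r"[0-9]+[.][0-9]+[.][0-9]+"
CAPABILITY_RE = re.compile(r"[a-z_]+(:[a-z_]+)+")
VERSION_SPEC_RE = re.compile(rf"{_V}|[\[(]{_V},{_V}[)\]]")


def check_std_info_document(text):
    """Return a list of problems found in the document; an empty list means none."""
    try:
        document = json.loads(text)
    except json.JSONDecodeError as error:
        return [f"not valid JSON: {error}"]
    if not isinstance(document, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    if "std:info" not in document:
        problems.append("missing required std:info member")
    for key, value in document.items():
        if key == "$schema":
            continue  # special member, not a capability
        if not CAPABILITY_RE.fullmatch(key):
            problems.append(f"invalid capability identifier: {key!r}")
        elif not isinstance(value, str) or not VERSION_SPEC_RE.fullmatch(value):
            problems.append(f"invalid version specification for {key!r}")
    return problems


if __name__ == "__main__":
    print(check_std_info_document('{"std:info": "[1.0.0,2.0.0)", "gcc:extra": "1.5.0"}'))
```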

4.7. Impact On The Standard

5. Implementation Experience

None yet.

6. Polls

None yet.

7. Wording

None yet.

7.1. Schema

{
	"$schema": "https://json-schema.org/draft/2020-12/schema",
	"$id": "https://raw.githubusercontent.com/cplusplus/ecosystem-is/release/schema/std_info-1.0.0.json",
	"title": "Tool Introspection Version 1.0.0 JSON Schema",
	"required": [
		"$schema",
		"std:info"
	],
	"$defs": {
		"VersionSpec": {
			"type": "string",
			"pattern": "^(([0-9]+[.][0-9]+[.][0-9]+)|([[(][0-9]+[.][0-9]+[.][0-9]+,[0-9]+[.][0-9]+[.][0-9]+[)\\]]))$"
		}
	},
	"anyOf": [
		{
			"type": "object",
			"properties": {
				"$schema": {
					"description": "The URI of the JSON schema corresponding to the version of the tool introspection format.",
					"type": "string",
					"format": "uri"
				},
				"std:info": {
					"description": "The Tool Introspection format version.",
					"$ref": "#/$defs/VersionSpec"
				}
			}
		},
		{
			"type": "object",
			"propertyNames": {
				"type": "string",
				"pattern": "^[a-z_]+(:[a-z_]+)+$"
			},
			"patternProperties": {
				"": {
					"$ref": "#/$defs/VersionSpec"
				}
			}
		}
	]
}

8. Acknowledgements

None yet.


1. https://wg21.link/P2656 C++ Ecosystem International Standard
2. ECMAScript® 2022 language specification, 13th edition, June 2022 (https://www.ecma-international.org/publications-and-standards/standards/ecma-262/)
3. Wikipedia: Interval (mathematics) (https://en.wikipedia.org/wiki/Interval_(mathematics))
4. ISO/IEC 21778:2017 Information technology — The JSON data interchange syntax, (https://www.iso.org/standard/71616.html)
5. JSON Schema: A Media Type for Describing JSON Documents (http://json-schema.org/latest/json-schema-core.html)