Named arguments

Document number: N4172
Authors: Ehsan Akhgari (Mozilla), <ehsan@mozilla.com>
         Botond Ballo (Mozilla), <botond@mozilla.com>
Date: 2014-10-07

Abstract

We propose a new mechanism for associating function arguments with function parameters: by name, with the syntax name : value. We believe using named arguments will increase the readability of function calls, as well as making it easier to write them and reducing the likelihood of mistakes. Parameter names are associated with a function for the purpose of a call by looking at the declarations of the function visible from the call site; use of named arguments is allowed if all declarations use consistent names.

Motivation

Function arguments are currently associated with function parameters by position. This has implications for writing and reading function calls.

When writing a function call, one needs to be careful to write the arguments in the same order as the parameters in the function declaration; depending on the types of the parameters, the compiler can catch some mistakes, but other times they lead to runtime bugs that are hard to diagnose and prevent.

When reading a function call, the only information about the purpose of the argument that is present at the call site is its value and order relative to other arguments; for more information, the reader needs to refer to the corresponding parameter's name in the function declaration.

Function declarations are typically "far away" from function calls in code, meaning elsewhere in the file or, more often, in a different file. Therefore, we believe that a mechanism for associating function arguments with function parameters that reduces the need to refer to the function declaration when writing and reading function calls, would make both jobs easier.

This is particularly the case for large software projects, where there can be thousands of functions in play at once, and reliance on conventions (such as "top, right, left, bottom") doesn't scale; and particularly the case for C++, where gotchas such as implicit conversions mean that extra care is required for things like argument order.
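As a minimal sketch of the implicit-conversion hazard in current C++, consider the following. The `Style` struct and the `make_style` function are hypothetical names invented for this illustration. Because `bool` and `int` convert to each other implicitly, a call with the two arguments swapped compiles and silently produces the wrong values.

```cpp
// Hypothetical type and function, for illustration only.
struct Style {
    int line_width;
    bool dashed;
};

Style make_style(int line_width, bool dashed) {
    return Style{line_width, dashed};
}

Style swapped_call() {
    const int width = 3;
    const bool dashed = false;
    // The arguments are written in the wrong order. This still compiles:
    // `dashed` (false) converts to the int 0, and `width` (3) converts to true.
    return make_style(dashed, width);
}
```

Here `swapped_call()` yields a line width of 0 and a dashed style, which is not what the author meant. With the syntax proposed in this paper, the call could be written as `make_style(line_width: width, dashed: dashed)`, and the order of the arguments would no longer matter.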

Another example of where this proposal is useful is with boolean arguments. Some large software projects discourage the usage of the bool type as an argument and require authors to use enums or consts to convey what the boolean arguments mean at the call site. With this proposal, however, boolean arguments can be used without the loss of readability at the call site.
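The enum workaround mentioned above might look roughly like the sketch below in current C++. All names (`FillMode`, `draw_rect_bool`, `draw_rect_enum`) are hypothetical, and the functions return a pixel count as a stand-in for real drawing so that the example stays self-contained.

```cpp
// Hypothetical API, for illustration only.
enum class FillMode { Outline, Filled };

// bool version: a call such as draw_rect_bool(4, 3, true) does not say
// what `true` means without looking up the declaration.
// Returns the number of pixels touched (assumes width, height >= 2).
int draw_rect_bool(int width, int height, bool fill_rect) {
    return fill_rect ? width * height : 2 * (width + height) - 4;
}

// enum version: the call site is self-describing, at the cost of an
// extra type that exists only for readability.
int draw_rect_enum(int width, int height, FillMode mode) {
    return draw_rect_bool(width, height, mode == FillMode::Filled);
}
```

Under this proposal, the plain `bool` overload could be called as `draw_rect_bool(4, 3, fill_rect: true)`, which conveys the same information as `FillMode::Filled` without the extra enum.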

We specifically propose that function arguments can be associated with function parameters by using the parameter's name at the call site.

Example

Below is an example of the kind of code that can be written with what is being proposed here.

    void draw_rect(int left, int top, int width, int height, bool fill_rect = false);

    int main()
    {
        // Both of the function calls below pass the same set of arguments to draw_rect.
        draw_rect(top: 10, left: 100, width: 640, height: 480);
        draw_rect(100, 10, height: 480, fill_rect: false, width: 640);
    }

Proposal

Syntax

The syntax of expression-list is extended to accept elements of the form identifier: expression. Such elements are only allowed as arguments to a function call. Arguments written this way are referred to as named arguments. Non-named arguments are referred to as positional arguments.

Basic semantics

Positional arguments cannot appear after a named argument. If a function has an ellipsis parameter or is a variadic template function, all of its arguments at call sites must be positional. If a function declaration includes parameters with default arguments, named arguments may give values to some but not all of the optional parameters. Here is an example:

    void foo(int a, char b, std::string c);
    void bar(int a, char b='x', float c=0.1);
    void baz(int a, ...);
    template <class ...T> void qux(T... a);

    void test()
    {
        foo(1, 'c', "s");        // valid
        foo(1, b: 'c', c: "s");  // valid
        foo(a: 1, 'c', "s");     // invalid -- named argument followed by positional
        foo(1, c: "s", b: 'c');  // valid
        foo(1, 'c', c: "s");     // valid
        bar(1, c: 0.2);          // valid
        bar(1, b: 'c');          // valid
        bar(1, c: 0.2, b: 'c');  // valid
        bar(a: 1);               // valid
        bar(a: 1, 'c');          // invalid -- named argument followed by positional
        baz(a: 1);               // invalid -- ellipsis parameter
        baz(1, foo: 1);          // invalid -- ellipsis parameter
        qux(a: 1);               // invalid -- variadic template function
    }

If the same function is declared more than once, it can only be called using named arguments if the names, or lack thereof, for each parameter are exactly the same in each declaration. Here is an example:

    void foo(int a, char b, std::string c);
    void foo(int a, char b, std::string x);
    void bar(int a, char b);
    void bar(int x, char b);

    void test()
    {
        foo(1, 'c', "s");     // valid
        foo(1, 'c', c: "s");  // invalid -- different names for an argument in multiple declarations
        foo(1, 'c', x: "s");  // invalid -- different names for an argument in multiple declarations
        bar(1, b: 'c');       // invalid -- different names for an argument in multiple declarations
    }

If the parameter list for a function omits some of the parameter names, let parameter N be the last parameter that has neither a name nor a default argument. All well-formed calls to that function must then provide at least N positional arguments. Here is an example:

    void foo(int, char b, std::string c);
    void bar(int, char b='c', float c=0.1);

    void test()
    {
        foo(1, c: "s", b: 'c');  // valid
        bar(1, c: 0.2);          // valid
        bar(c: 0.2);             // invalid - number of positional arguments (0) less than
                                 // the index of the last unnamed parameter (1)
    }

As an exception, if a declaration does not provide names for any of its parameters, it is ignored. Here is an example:

    void foo(int a, char b, std::string c);
    void foo(int, char, std::string);

    void test()
    {
        foo(a: 1, b: 'c', c: "s");  // valid
    }
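The consistency rules above can be modelled in ordinary C++ as a small check over the declarations visible at a call site. This is only a sketch of our reading of those rules, not proposed wording. The `ParamNames` alias and the `usable_names` function are hypothetical, and an empty string stands for an unnamed parameter.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// One declaration's parameter names; an empty string models an unnamed parameter.
using ParamNames = std::vector<std::string>;

// Returns the parameter names usable for named-argument calls, or
// std::nullopt if the declarations disagree or none of them names anything.
std::optional<ParamNames> usable_names(const std::vector<ParamNames>& declarations) {
    std::optional<ParamNames> result;
    for (const ParamNames& decl : declarations) {
        // Exception from the text: a declaration that names no parameter is ignored.
        const bool names_nothing = std::all_of(
            decl.begin(), decl.end(),
            [](const std::string& name) { return name.empty(); });
        if (names_nothing) {
            continue;
        }
        if (!result) {
            result = decl;        // first declaration that provides names
        } else if (*result != decl) {
            return std::nullopt;  // names, or lack thereof, differ
        }
    }
    return result;
}
```

For the declarations `foo(int a, char b, std::string c)` and `foo(int, char, std::string)`, this sketch returns the names `a`, `b`, `c`. For `bar(int a, char b)` and `bar(int x, char b)`, it returns no value, matching the "invalid" calls in the examples above.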

Function definitions do not affect calling functions with named arguments in any way. Example:

    void foo(int a, char b, std::string c);

    void test()
    {
        foo(c: "s", a: 1, b: 'c');  // valid
    }

    void foo(int x, char y, std::string z) {}

This is true even if the definition is located prior to the call site:

    void foo(int a, char b, std::string c);
    void foo(int x, char y, std::string z) {}

    void test()
    {
        foo(c: "s", a: 1, b: 'c');  // still valid
        foo(x: 1, y: 'c', z: "s");  // still invalid
    }

However, if a function definition is the first declaration of a function, the names of the parameters in the definition are considered. Example:

    void foo(int a, char b, std::string c) {}

    void test()
    {
        foo(c: "s", a: 1, b: 'c');  // valid
    }

If a following declaration then defines different names for the parameters, using named arguments to call that function would be an error. Example:

    void foo(int a, char b, std::string c) {}
    void foo(int x, char y, std::string z);

    void test()
    {
        foo(c: "s", a: 1, b: 'c');  // invalid -- different names for an argument in multiple declarations
    }

All of the above applies to function declarations/definitions in a single translation unit. The same function may be declared in different translation units with different parameter names, and each translation unit can have call sites using the argument names that match the declaration in that translation unit.

Overload resolution

Overload resolution needs to be modified slightly in order to handle named arguments correctly. We do not propose to modify the rules for determining the set of candidate functions and argument lists that are the inputs to the overload resolution process [over.match.funcs], nor the rules for selecting the best viable function [over.match.best]. We only propose to modify the rules for determining which candidates are viable to begin with [over.match.viable], and our modifications only apply to calls that involve named arguments.

Specifically, for calls that involve named arguments, we define rules for matching arguments to parameters, and consider candidates where not all arguments and parameters are matched according to these rules, to be not viable.

Let's consider a function call with M positional arguments and N named arguments. A given candidate function is viable if:

For a candidate function which is viable according to the above rules, we adjust the argument list as follows: the N named arguments are reordered to match the order of the parameters they have been matched with; if, for some i < M + N, the i'th parameter of the function has no matching argument - in which case, according to the above rules, it must have a default argument - we match it with a synthetic argument using the value of said default argument.

After this adjustment, the rest of the overload resolution process proceeds as before.
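The matching and adjustment step can be sketched in current C++ as a function over a simplified model of a candidate and a call. The viability conditions used here are assumptions pieced together from the "Basic semantics" rules: positional arguments bind by position and may not follow a named argument, each named argument must match a distinct parameter by name, and every parameter left unmatched must have a default argument. The types `Param` and `Arg`, the constant `kUseDefault`, and the function `match_arguments` are hypothetical. As a simplification, the sketch marks every unmatched parameter as "use default", not only those below position M + N.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of one parameter of a candidate function.
struct Param {
    std::string name;  // empty if the declaration leaves it unnamed
    bool has_default;
};

// Hypothetical model of one argument at the call site.
struct Arg {
    std::string name;  // empty for a positional argument
};

constexpr int kUseDefault = -1;

// For each parameter, returns the index of the argument bound to it, or
// kUseDefault if its default argument is used. Returns std::nullopt if the
// candidate is not viable under the assumed rules.
std::optional<std::vector<int>> match_arguments(const std::vector<Param>& params,
                                                const std::vector<Arg>& args) {
    std::vector<int> slots(params.size(), kUseDefault);
    std::vector<bool> matched(params.size(), false);
    bool seen_named = false;

    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i].name.empty()) {
            // Positional arguments bind by position and may not follow a named one.
            if (seen_named || i >= params.size()) return std::nullopt;
            slots[i] = static_cast<int>(i);
            matched[i] = true;
            continue;
        }
        seen_named = true;
        bool found = false;
        for (std::size_t p = 0; p < params.size(); ++p) {
            if (params[p].name == args[i].name) {
                if (matched[p]) return std::nullopt;  // parameter bound twice
                slots[p] = static_cast<int>(i);
                matched[p] = true;
                found = true;
                break;
            }
        }
        if (!found) return std::nullopt;  // no parameter with that name
    }

    // Every parameter left unmatched must have a default argument.
    for (std::size_t p = 0; p < params.size(); ++p) {
        if (!matched[p] && !params[p].has_default) return std::nullopt;
    }
    return slots;
}
```

Modelling `void bar(int a, char b='x', float c=0.1)` and the call `bar(1, c: 0.2)`, the sketch binds argument 0 to `a`, uses the default for `b`, and binds argument 1 to `c`. The call `bar(a: 1, 'c')` is rejected because a positional argument follows a named one.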

Note that for function templates, the above matching and adjustment process happens before template argument deduction (which requires matched argument/parameter pairs).

Interactions with other language features

Constructor calls with { ... } syntax

As part of C++11 "uniform initialization", constructors can be called with a { } syntax in addition to a ( ) syntax:

    struct S
    {
        S(int a, std::string b);
        ...
    };

    S s(1, "foo");
    S s{0, "bar"};

We propose allowing the use of named arguments with both syntaxes:

    S s(a: 1, b: "foo");  // valid
    S s{a: 0, b: "bar"};  // also valid

Aggregate initialization

However, we do not propose allowing the use of named arguments with aggregate initialization:

    struct S
    {
        int a;
        std::string b;
    };

    S s{1, "foo"};        // valid C++11
    S s{a: 1, b: "foo"};  // not valid

The rationale is that the names a and b here are names of data members, whereas our proposal is about associating arguments with parameter names.

We realize that this choice implies that the presence or absence of a constructor in an aggregate class becomes more noticeable to users than it was before. However, we believe that allowing the use of data member names in the same syntax as named arguments has its own problems. For example, adding a constructor to an aggregate class where the constructor parameter names are different from the data member names could change the meaning of the program. We believe disallowing this is the lesser of two evils.

Calls through function pointers

Consider the following code:

    void func(int a, int b);
    using funcptr_t = void (*)(int x, int y);
    funcptr_t f = &func;
    f(2, 3);

Is it reasonable to allow a call to f to use named arguments? If so, what names would be allowed?

Clearly, the names a and b are out of the question, as they are the parameter names of the function that the function pointer points to, and which function a function pointer points to is a runtime property of the program that the compiler cannot reason about.

However, perhaps it's reasonable to use the names x and y, that is, the names of the parameters used in the declaration of the function pointer, or (in case of a declaration using a typedef, as here) in the declaration of the function pointer type?

We propose not allowing x and y either, because it would require compilers to track parameter names in typedef declarations and associate different declarations of a typedef with each other to verify that the names are consistent. This would cover new ground by giving typedef names a use they don't currently have.

In summary, we propose disallowing named arguments for calls through function pointers altogether.
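For context, parameter names in a function pointer type carry no meaning in current C++: they are not part of the type. The static assertions below illustrate this, using `func` and `funcptr_t` from the example above plus two hypothetical aliases added for comparison.

```cpp
#include <type_traits>

void func(int a, int b);

using funcptr_t = void (*)(int x, int y);
// Hypothetical aliases, for comparison only.
using funcptr_other_names_t = void (*)(int first, int second);
using funcptr_no_names_t = void (*)(int, int);

// All three aliases denote the same type; the parameter names are ignored.
static_assert(std::is_same<funcptr_t, funcptr_other_names_t>::value,
              "parameter names do not affect the pointer type");
static_assert(std::is_same<funcptr_t, funcptr_no_names_t>::value,
              "omitting parameter names does not affect the pointer type");
// The type of &func does not record the names a and b either.
static_assert(std::is_same<funcptr_t, decltype(&func)>::value,
              "the pointed-to function's parameter names are not in its type");
```

Allowing `x` and `y` at call sites would therefore attach meaning to names that the type system currently discards.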

Perfect forwarding

We've had a request to interoperate with perfect forwarding in such a way that the following works:

    struct Rect
    {
        Rect(int width, int height);
        ...
    };

    auto p = make_shared<Rect>(width: 100, height: 200);

While we agree that having this work would be desirable, unfortunately we are not aware of a way to make this work without making parameter names part of a function's type, something we definitely do not propose (see Design Choices below).
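The difficulty can be seen in a minimal forwarding factory written in current C++. The template `make_shared_like` below is a hypothetical stand-in for `make_shared`, and `Rect` is given public data members and the helper `rect_area` purely so the example is self-contained. The only parameter the template declares is the pack `args`; it deduces the argument types and nothing else, so the names `width` and `height` are not available to it.

```cpp
#include <memory>
#include <utility>

struct Rect {
    Rect(int width, int height) : width(width), height(height) {}
    int width;
    int height;
};

// Hypothetical stand-in for std::make_shared, for illustration only.
// Its signature mentions Args&&... and nothing about Rect's constructor,
// so the constructor's parameter names never reach this template.
template <class T, class... Args>
std::shared_ptr<T> make_shared_like(Args&&... args) {
    return std::shared_ptr<T>(new T(std::forward<Args>(args)...));
}

int rect_area() {
    // Positional arguments forward without any problem.
    auto p = make_shared_like<Rect>(100, 200);
    return p->width * p->height;
}
```

Note also that a forwarding function of this kind is a variadic template function, so under the basic semantics above all of its arguments must be positional.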

Backwards Compatibility

The changes in this proposal are fully backwards-compatible. All existing programs will be valid with named arguments added to the language. Only call sites which use named arguments are affected by this proposal.

Design Choices

Syntax

The name : value syntax was chosen because there is precedent in other languages for using it to associate a key and a value (e.g. Python dictionaries), and because it does not create any ambiguity. name=value was considered, but rejected because "=" is an operator, so an argument of that form would be ambiguous with an assignment expression. A ":" currently only appears at the beginning of a ctor-initializer, after an access-specifier, or in a labeled-statement, all contexts which are disjoint from expression-list, so there is no ambiguity.
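The ambiguity with "=" can be seen in current C++, where an assignment is already a valid argument expression. In the sketch below, `scale` and `call_with_assignment` are hypothetical names used only for illustration.

```cpp
// Hypothetical function, for illustration only.
int scale(int factor) {
    return factor * 10;
}

int call_with_assignment() {
    int factor = 0;
    // `factor = 3` is an ordinary assignment expression: it stores 3 in the
    // local variable `factor` and passes the result to scale(). Giving this
    // spelling a named-argument meaning would change existing code.
    return scale(factor = 3);
}
```

By contrast, `scale(factor: 3)` is not a valid expression today, so giving it a meaning does not affect existing programs.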

Not making parameter names part of a function's type

In this proposal, the association of parameter names with a function for the purpose of making calls with named arguments to that function happens locally in each translation unit, and new declarations within a translation unit can change a function's ability to be called with named arguments at call sites below the new declaration (by using different names than a previous declaration).

A possible alternative would have been to encode parameter names in a function's type, and thus have them be coupled more closely to the function as an entity, possibly including across translation units.

We chose not to pursue this alternative because we felt it would be a far more complex and invasive change to the language, with relatively little gain. It would also likely be non-backwards-compatible both in terms of source code backwards compatibility and ABI compatibility considering mangling of names having to take the argument names into account.

Automatically allowing existing functions to be called via named arguments

This proposal allows calling existing functions with named arguments, without the author of the function having to "opt-in" to this in any way.

We chose this because we believe this feature has sufficient value that programmers should be able to start using it right away, without having to wait for libraries and APIs to opt-in to named arguments. We also believe that the potential for use in a way the author doesn't intend is low.

Potential objections, and our responses

Objection #1: This feature caters to having functions with many parameters - a programming style that should be discouraged.

We agree that having functions with many parameters should be discouraged. However, the reality is that many legacy APIs, which programmers have to work with now and will have to work with for a long time, have functions with many parameters. Making it easier to deal with such functions would solve a real problem and be materially useful.

More importantly, however, this feature has a lot of value for functions with few arguments, too. Even for functions with few arguments, when reading a call site one has relatively little information about the roles of the arguments. Having names of arguments present would make call sites more readable, regardless of the number of arguments.

Objection #2: This proposal competes with C99 designated initializers, which are being proposed for C++

This proposal and designated initializers have in common that they both allow the use of names to allow specifying a list of entities in an order that's not necessarily the order of their declaration.

For C99 designated initializers, the entities being named are data members of a structure (and this is reflected in the .name = value syntax). In our proposal, the entities being named are always parameters of a function. Note that while we support constructor calls via the { ... } syntax, we do not support aggregate initialization (see "Interactions with other language features" above).

The two proposals therefore have no overlap, and we believe they are complementary.

Objection #3: Named arguments make parameter names part of a function's interface, so that changing the parameter names can affect call sites

We were initially concerned about this as well. To get an idea about the magnitude of this problem, we surveyed some large open-source libraries to see how often parameter names change in the declarations of the public API functions (i.e., the names that users would use when making calls to these functions).

For each library, we chose a recent release and a release from several years ago, examined the diff between the two versions, and recorded the number of parameter name changes. Only name changes that preserved the types of the parameters were considered, as a change to the type of a parameter typically breaks a function's interface already. The table below summarizes our findings. For more details, please see [1].

Library            Version range start    Version range end    Number of parameter name changes
Eigen              2.0.0 (2009)           3.0.0 (2011)          4
wxWidgets          2.5.0.1 (2003)         3.0.0 (2013)         12
Boost.Asio         1.50.0 (2012)          1.55.0 (2013)         1
Boost.DateTime     1.50.0 (2012)          1.55.0 (2013)         1
Boost.Filesystem   1.45.0 (2010)          1.55.0 (2013)         0
Boost.GIL          1.45.0 (2010)          1.55.0 (2013)         0
Boost.Math         1.45.0 (2010)          1.55.0 (2013)        22
Boost.Regex        1.45.0 (2010)          1.55.0 (2013)         0
Boost.Thread       1.45.0 (2010)          1.55.0 (2013)         4

Given the low number of parameter name changes relative to the sizes of these libraries over periods of several years, we believe that code breakage due to parameter name changes would not be a significant problem in practice.

Implementation Status

This proposal is not implemented yet. Most of the effort would be in changes to the overload resolution phase of compilers. One of the authors of this proposal has looked at the overload resolution implementation of clang, and we believe the changes will be relatively straightforward to implement. We have tried to ensure that the proposal makes as few changes to the language as possible to avoid unnecessary implementation complexity.

Standard Wording

There is no standard wording for this proposal yet. At this stage, we are mostly interested in getting general feedback.

Possible extensions and spin-off ideas

Non-trailing default arguments

With named arguments, it might become sensible to allow default arguments in non-trailing positions. For example,

void f(int a = 10, int b);

could be called like so:

f(b: 5);

We have no objections to this, but would prefer leaving it to a follow-up proposal to avoid scope creep.

Named template arguments

The arguments in favour of allowing named arguments for function calls also apply to template arguments. We can imagine a similar proposal that allows specifying template parameters via named arguments. For example:

    template <typename Key, typename Value>
    struct map
    {
        ...
    };

    map<Key: int, Value: string> m;

We think it makes sense to pursue this as a separate proposal, independent of this one.

Acknowledgements

We are grateful for the feedback received from numerous members of the Mozilla community as well as the C++ standards community.

References

  1. A study performed on the changes to parameter names of the functions in the public interface of some large open-source projects.