Skip to content

Returning multiple values: syntax

This is an announcement of an upcoming feature.

One of experimental, but almost universally supported WASM features is multi-value, i.e. the ability of a WASM function to return more than one value directly, that is without allocating storage and passing around references to storage. WCPL uses return-no-values variant for void functions, but we would like to be able to support more than one value too.

To add this feature to WCPL, we will need a way to represent it in the syntax of the language, and do it unambiguously and without affecting backward compatibility. This means that we can’t use comma-separated values in round parentheses, because this syntax would be undistinguishable from a use of comma operator:

  return (true, x+2); /* comma operator! */
  return true, x+2;   /* comma operator! */

We can use a keyword as a prefix, or use square brackets instead of parentheses, but there is a better way: we can repurpose the display notation used to initialize compound values such as structures and arrays:

  return {true, x+2}; /* looks vaguely familiar */

Ideally, we’d like to be able to use the same notation on the receiving end, e.g.:

void foo() {
  {bool ok, int val} = ret2(/*...*/); /* receive 2 values from ret2 call */
  /* ... */
}

This would not only look nice and match the multi-value return syntax, but also be easily distinguishable from block statement, because in both C and WCPL block statements cannot end in an identifier and be followed by =.

One complication is parsing: block statements can start with declarations, so we won’t be able to tell what we are dealing with until we parse the first declaration. But, since we already relaxed the declaration syntax, we now can parse the whole list before we see the closing } which tells us that we’re dealing with multi-value declaration, not with a block. Naturally, a shortcut notation can be used too:

void bar() {
  {bool ok, int x, y} = ret3(/*...*/); /* init 1 bool, 2 int vars */
  /* ... */
}

A similar parsing trick can be used while parsing left-hand-side of an assignment: if a block consists only of comma-separated argument expressions ending in } followed by =, WCPL treats it as a left-hand-side of a multiple-value assignment, not as a block, e.g.:

void bar() {
  bool ok; int a[2];
  {ok, a[0], a[1]} = ret3(/*...*/); /* assign 1 bool, 2 ints */
  /* ... */
}

The remaining piece of the puzzle is declaring functions and function types. Fortunately, the same trick works here: we can parse the declaration in a relaxed way, allowing it to omit names of the variables, and consider it a tuple type if we hit the closing ‘}’. If, while doing this, we meet an initializer or step on closing semicolon, we’ll know that this is the first declaration of a block statement (and check that all vars are named). Here is a complete example:

{bool, int, int} ret3(int x) {
  if (x > 0) return {true, x, -x};
  return { false, 0, 0 };
}

void foo(int a) {
  {bool ok, int x, y} = ret3(a);
  if (ok) printf("OK, x = %d, y = %d\n", x, y);
  else printf("FAILURE!\n");
}

There is one corner case that we missed: what if we want to return zero values? With no declarations inside, ‘{}’ can be either a block statement or a nullary tuple type. However, this ambiguity is of purely theoretical nature: we already have void type which can serve this purpose well. The case of singular tuple is indistinguishable from the regular return, so we can insist on our tuples to have two or more elements without losing any functionality.

There is also a question of macros: macro uses look like function calls, so it would be nice if a form receiving multiple values can have a macro use on the right side. If such a macro expands into a call, things are easy. But what if we want to supply the values directly? To support this, we may allow macros to return displays as if they were not just initializers, but bona fide expressions, also allowing displays on the right side of multi-value initialization:

#define FZZ() ({false, 0, 0))

void bar() {
  {bool ok, int x, y} = FZZ();
  ...
}

More on this later…