(Also see the discussion page.)
This proposal extends the for-in
loop in ES1-3 and the for-each-in
loop from E4X (ECMA-357) by imitating Python’s iteration protocol. It builds on this protocol with support for generators (see PEP 255 and PEP 342) and array comprehensions modeled on Python’s list
comprehensions. Where language differences or sufficiently better ideas provide strong enough reason, this proposal differs in substance from the Python prior art.
The premises of this proposal are:
for-in
loop is broken, because it does enumeration, not iteration, of property identifiers.for-in
loops and are anathema to Ajax gurus.string
type, when it could more usefully remain, e.g., uint
for Array
elements.for-in
loops.for-in
should work as before.for-each-in
loops (from E4X) also enumerate, but (this is the first premise again) were intended to iterate.XMLList
element values.for-in
iterates over Array
values, not indexes. Resolvable by using for-each-in
, but more usably fixed if the language supported customizable iteration.
The conclusion embodied by this proposal is that there should be an iteration protocol used by the for-in
loop for all object types, defaulting to enumeration, which may be overridden. The ES4 spec and new code may define objects that are iterators, which implement the protocol. Further, there should be a way to write generator functions that yield values via special syntax, and that may participate in the iteration protocol. Finally, comprehension syntax for populating arrays from iterators is proposed as a nicety familiar to Python and other programmers.
Spec location: Chapter 6 Types: Iterator types
Let iterator
name a built-in namespace:
namespace iterator;
In this proposal, the following pragma is in effect except where overridden explicitly:
use default namespace iterator;
Let IterableType
be the following structural type:
type IterableType.<T> = { iterator::get: function (boolean=) : IteratorType.<T> };
The iterator::get
method finds or creates an iterator for the iterable. Its boolean
parameter, which by convention defaults to false
, controls whether the iterator considers prototype objects. Getting an iterator for an iterable is thus delegated to the iterable’s implementation.
Let IteratorType
be the following structural type:
type IteratorType.<T> = { next: function () : T };
By convention, an iterator is iterable and returns itself as the result of its iterator::get
method. So viewed as an iterable, an iterator is its own iterator.
Not all iterables are iterators. Only those iterables that both
iterator::get
methods, andIteratorType
, that is, have a next
method taking no arguments and returning a valueare iterators.
Let StopIteration
be a global constant pre-bound to a singleton exception object:
class StopIterationClass { public function toString() : string "[object StopIteration]"; } public const StopIteration : StopIterationClass = new StopIterationClass;
An iterator returns zero or more values from its next
method, throwing StopIteration
when it is exhausted. Repeated calls to next
on an exhausted iterator throw StopIteration
for each call.
Note that there is no Iterator
nominal type (contrast with JS1.7). It is misleading to programmers unfamiliar with structural type systems to have such a type; such programmers are likely to want to subclass Iterator
to implement an iterator type, when there is no such requirement.
The for-in
loop specified in ES3 12.6.4 can be written as a while
loop using the iteration protocol:
for (i in o) ...
is equivalent to:
let ($it = o.iterator::get(true)) { while (true) { try { i = $it.next(); } catch (e : iterator::StopIterationClass) { break; } ... } }
where $it
is guaranteed not to collide with any name bound in ...
, or shadow any name used in ...
.
Enumerator
is a parameterized class that may be instantiated with the iteration result type T
. Once instantiated, an Enumerator.<T>
type may be constructed via new
with a result function f
to get an iterator over the enumerable properties of an object v
, and optionally (depending on the e
flag) of objects along the initial object’s prototype chain. Each iteration step invokes the result function on the property’s identifier and the initial object value passed to the constructor (v
, if it is of type Object!
). The Enumerator
class is callable as a shorthand for constructing a new instance.
type EnumerableId = (int, uint, string, Name); type EnumerableIdArray = [EnumerableId]; class Enumerator.<T> { type ResultFun = function(EnumerableId, Object!) : T; function Enumerator(v, f : ResultFun, e : boolean = false) { initial_obj = (v is Object) ? v : null; current_obj = initial_obj; current_ids = magic::getEnumerableIds(initial_obj); result_fun = f; enumerate = e; } function call Enumerator(v) : IteratorType.<T> new Enumerator(v); iterator function get(e : boolean = false) : IteratorType.<T> (e == enumerate) ? this : new Enumerator.<T>(inital_obj, result_fun, e); public function next() : T { if (current_obj === null) throw StopIteration; loop: while (true) { if (current_index === current_ids.length) { if (!enumerate) throw StopIteration; // No more properties in current_obj: try walking up the prototype chain. current_obj = magic::getPrototype(current_obj); if (current_obj === null) throw StopIteration; current_ids = magic::getEnumerableIds(current_obj); current_index = 0; } let id : EnumerableId = current_ids[current_index++]; // Check for a shadowing property from initial_obj to current_obj on the prototype chain. for (let obj : Object = initial_obj; obj !== current_obj; obj = magic::getPrototype(obj)) { if (magic::hasOwnProperty(obj, id)) continue loop; } // Check whether name is still bound in order to skip deleted properties. if (magic::hasOwnProperty(current_obj, id)) return result_fun(id, initial_obj); } private var initial_obj : Object, current_obj : Object, current_ids : EnumerableIdArray, current_index : uint, result_fun : ResultFun, enumerate : boolean; }
The magic::getEnumerableIds
primitive has type function (Object) : EnumerableIdArray
and returns an array of values identifying enumerable properties in its argument object, or an empty array for null
. The identifiers are arrayed in property creation order. If an identifier fits in int
, it is stored using type int
, likewise for uint
; otherwise it is stored as a string
unless it is a namespace-qualified identifier, in which case it is stored as a Name object
(this is prefigured by the QName
object in E4X).
The magic::getEnumerableIds
primitive is a specification device only. Implementations may avoid ever allocating anything like the array that it returns, so long as they observably behave the same as the reference implementation that is based on this primitive.
All objects are iterable by the for-in
loop. If an object does not customize iterator::get
, it delegates to Object.prototype.iterator::get
, which has the following implementation (note that Object
is defined in the unnamed package, not in the iterator
namespace):
dynamic class Object { ... prototype.iterator::get = iterator::DEFAULT_GET; prototype.propertyIsEnumerable(Name(iterator, 'get'), false); }
The DEFAULT_GET
constant function is defined (in default namespace iterator
) as follows:
type EnumeratedId = (string, Name); const function DEFAULT_GET(e : boolean = false) : IteratorType.<EnumeratedId> new Enumerator.<EnumeratedId>(this, function (id : EnumerableId, obj : Object!) : EnumeratedId (id is Name) ? id : id to string, e);
Thus the for-in
loop enumerates as specified by prose in ECMA-262 Edition 3, section 12.6.4, the last two paragraphs. In contrast to ES1-3, the current proposal specifies and implements enumeration using self-hosted code based on only the iteration protocol and the new magic::getEnumerableIds
primitive.
(Note that Object.prototype.iterator::get
could be left undefined, and the logic underlying for-in
loops could be modified to test whether iterator::get
is bound in the right operand of in
, and default to enumeration if there is no such method. This is how Python detects its __iter__
method, which is not defined for all Python types. Nevertheless, for philosophical unity of iteration and enumeration both when teaching or learning ES4, the current proposal defines Object.prototype.iterator::get
.)
E4X specifies the for-each-in
loop as iterating over values of enumerable properties, rather than names (see ECMA-357 section 12.3 and note the erratum: deleted properties not yet visited MUST NOT be visited, per ECMA-262 12.6.4, the next-to-last paragraph). Call this kind of iteration “itemization”. Since ES4’s grammar includes for each
loop syntax, and it is part of several existing implementations (AS3, JS1.6+), we propose to support it as follows:
type KeyType = EnumerableId, ValueType = *, ItemType = [KeyType, ValueType]; const function values(v : *, e : boolean = false) : IteratorType.<ValueType> { if (v is ItemizableType.<KeyType, ValueType, ItemType>) return v.iterator::getValues(e); return null; } const function DEFAULT_GET_VALUES(e : boolean = false) : IteratorType.<ValueType> new Enumerator.<ValueType>(this, function (id : EnumerableId, obj : Object!) : ValueType obj[id], e);
As with iterator::get
, there is a default iterator::getValues
method defined on Object.prototype
:
dynamic class Object { ... prototype.iterator::getValues = iterator::DEFAULT_GET_VALUES; prototype.propertyIsEnumerable(Name(iterator, 'getValues'), false); }
Using iterator::values
, a for-each-in
loop can be rewritten as a for-in
loop. This loop:
for each (v in o) ...
is equivalent to this loop:
for (v in iterator::values(o)) ...
The destructuring assignment proposal introduces another variant of the for-in
loop:
for ([k, v] in o) ...
This form iterates k
and v
over the key (index, identifier, or qualified name) and value of each enumerable property in o
. To generalize it via the iteration protocol, we introduce iterator::getItems
in the same way that iterator::getValues
underlies for-each-in
:
dynamic class Object { ... prototype.iterator::getItems = iterator::DEFAULT_GET_ITEMS; prototype.propertyIsEnumerable(Name(iterator, 'getItems'), false); }
In the default iterator
namespace:
const function items(v, e : boolean = false) : IteratorType.<ItemType> { if (v is ItemizableType.<KeyType, ValueType, ItemType>) return v.iterator::getItems(e); return null; } const function DEFAULT_GET_ITEMS(e : boolean = false) : IteratorType.<ItemType> new Enumerator.<ItemType>(this, function (id : EnumerableId, obj : Object!) : ItemType [id, obj[id]], e);
ISSUE: users of JS1.7 (see Mozilla bug 366941) want general destructuring of iterated values, e.g. for ([s,v,o] in triple_db) ...
and are surprised that using a [k, v]
pattern has special meaning, or is the only destructuring pattern allowed.
[k, v]
itemization special case?[k, v]
itemization special form?
This proposal specifies that general destructuring is allowed, but that there is not a [k, v]
special case. Thus, in general, for-in
loops with destructuring patterns on the left of in
rewrite from:
for (pattern in o) ...
to:
let ($it = o.iterator::get(true)) { while (true) { try { pattern = $it.next(); } catch (e : iterator::StopIterationClass) { break; } ... } }
Users who want to iterate over key-value items must write for ([k, v] in iterator::items(o)) ...
or equivalent. This avoids any confusing double-meaning of the length-2 array pattern.
For symmetry with values
and items
, there is a keys
function in iterator
, which delegates to a getKey
method on its v
parameter:
function keys(v, e : boolean = false) : IteratorType.<KeyType> { if (v is ItemizableType.<KeyType, ValueType, ItemType>) return v.iterator::getKeys(e); return null; } const function DEFAULT_GET_KEYS(e : boolean = false) : IteratorType.<KeyType> new Enumerator.<KeyType>(this, function (id : EnumerableId, obj : Object!) : KeyType id, e);
The iterator::getKeys
method has a default in Object.prototype
, as usual:
dynamic class Object { ... prototype.iterator::getKeys = iterator::DEFAULT_GET_KEYS; prototype.propertyIsEnumerable(Name(iterator, 'getKeys'), false); }
The getKeys
, getValues
, and getItems
protocols imply a structural type for itemizable objects:
type ItemizableType.<K, V, I> = { iterator::getKeys: function (boolean=) : IteratorType.<K>, iterator::getValues: function (boolean=) : IteratorType.<V>, iterator::getItems: function (boolean=) : IteratorType.<I> };
Given the default values assigned to these names in Object.prototype
, all objects are itemizable. But if an object were to override any method name with an incompatibly-typed value, that object would no longer be itemizable. This may be desirable, e.g., for an unordered set data type.
Note that DEFAULT_GET_KEYS
and DEFAULT_GET_ITEMS
return an iterator yielding (in whole or in part) EnumerableId = (int, uint, string, Name)
– contrast with DEFAULT_GET
which returns EnumeratedId = (string, Name)
as required for backward compatibility with the ES1-3 for-in
loop.
Using the itemizable protocol, collection classes can customize all the loop forms. To customize for-in
, set iterator::get
and iterator::getKeys
to the same function that yields keys. To customize for-each-in
, set iterator::getValues
. To customize for ([k, v] in o)
loops and comprehensions, set iterator::getItems
.
In any event, iterator::get
may be set to make the shortest kind of for-in
loop perform the default or natural iteration for the given type on the right of in
. Some types will want for-in
to iterate values by default, others keys – still others items. This proposal intentionally supports all combinations.
As specified above under Enumeration, magic::getPropertyIds
returns an array of values identifying properties indexed in the order in which those properties were created. This matches behavior required for inter-operation among browser-based ES1-3 implementations.
Strict compatibility with ES1-3 as written (see 12.6.4 step 3, and 9.9 ToObject
) requires that a TypeError
be thrown for any loop over null
or undefined
, but inter-operation on the Web imposes an additional requirement beyond visiting properties in creation order: browser implementations must tolerate for (i in null)
and for (i in undefined)
, iterating zero times. ES4 should take this bug fix by amending the translation from for (i in o)
loop to a while
loop as follows:
let ($v = o) { if ($v !== null && $v !== undefined) { let $it = $v.iterator::get(true); while (true) { try { i = $it.next(); } catch (e : iterator::StopIterationClass) { break; } ... } } }
Note that ES3 specifies the in
operator as throwing a TypeError
on null
and undefined
, and no inter-operation pressure exists to change its behavior on these inputs. Thus for-in
and in
disagree at these edge cases.
Apart from the null
and undefined
edge cases, for-in
and the in
operator agree by default when enumerating. That is,
let a = [3, 4, 5]; for (let i in a) { assert(i in a); assert(a.indexOf(a[i]) != -1); }
Iteration and in
-tested membership are coherent.
E4X’s for-each-in
did not alter in
or add an each in
(sic) operator, resulting in loss of coherent meaning of in
when itemizing:
for each (let v in a) { let i = a.indexOf(v); assert(i != -1); assert(i in a); }
Customization of the iteration protocol can result in similarly incoherent in
operator results. Therefore ES4 supports a companion protocol for extending the in
operator, which is distinct from the operators that can be overridden using static class methods. This difference is intended to facilitate retrofitting, e.g.:
Array.prototype.iterator::get = iterator::DEFAULT_GET_VALUES; Array.prototype.iterator::contains = function (v) this.indexOf(v) != -1;
Note how operand order from operator in
is reversed when calling iterator::contains
: a in b
may translate to b.iterator::contains(a)
.
The implementation of the in
operator checks for a binding of iterator::contains
on its right operand and invokes that method if found. For backward compatibility with ES3 11.8.7, Object.prototype
contains a default binding as follows:
dynamic class Object { ... prototype.iterator::contains = iterator::DEFAULT_CONTAINS; prototype.propertyIsEnumerable(Name(iterator, 'contains'), false); }
In the default iterator
namespace:
const function DEFAULT_CONTAINS(v) : boolean { let id : EnumeratedId = (v is Name) ? v : v to string; let obj : Object = this; do { if (magic::hasOwnProperty(obj, id)) return true; obj = magic::getPrototype(obj); } while (obj !== null); return false; }
Thus there exists a container structural subtype of the iterable type (in default namespace iterator
as usual):
type ContainerType.<T> = { iterator::get: function (boolean=) : IteratorType.<T>, iterator::contains: function () : boolean };
ISSUE: as with the iteration_protocol, an object used on the right of the in
operator may fail to be compatible with this type. Mutable prototypes make it easy to break core language semantics. This is true to some extent in ES1-3, e.g. by overriding Object.prototype.toString
or Object.prototype.hasOwnProperty
. One way to enforce default semantics would be to say that if the right operand of in
is incompatible with iterator::ContainerType
, then in
defaults to ES3 semantics. But if we say this, there is no point in defining Object.prototype.contains
.
The same argument could be made, as noted above re: Python’s __iter__
probing, for the case where a for-in
loop head’s right operand is incompatible with iterator::IterableType
.
For an interesting attempt to unify iteration and contains-testing, see JMatch, which uses modal declarations for forward (contains-testing) and backward (iteration) directions of operation; boolean formulas may be used to unify both modes in one expression.
Let a new reserved identifier keyword yield
introduce a variant of AssignmentExpression:
AssignmentExpression ::= "yield" AssignmentExpression
A function containing a yield
expression is a generator function, which when called binds formal parameters to actual arguments but evaluates no part of its body. Instead, it returns a generator-iterator of nominal type Generator
:
class Generator.<O, I, E> { public function next() : O; public function send(i: I) : O, public function throw(e : E) : O, public function close() : void };
The aspects of this class are as follows:
O
(”output”) is the type of values being generated via evaluated operands to yield
.I
(”input”) is the type of values being sent to the generator via parameters to send
.E
(”exception”) is the type of exceptions being thrown at the generator via parameters to throw
.send(x)
resumes evaluation of the generator function with the value x
as the result of the last yield
, and returns the operand of the next yield
.next()
is equivalent to the method call send(undefined)
.throw(x)
forces the generator to throw the exception value x
in its current suspended context (as if the yield
that suspended were a throw x
statement), and then returns the result of the next yield
.close()
forces the generator to close itself. The effect of close()
is to “return” from the generator where it’s suspended. This is so that finally
clauses active in the generator function can be run. If such a finally
clause throws an exception other than StopIteration, the exception is propagated to the caller of close()
.
When it is written above that a method call “returns the operand of the next yield” the call may instead terminate abruptly with a StopIteration
exception.
A generator must be started by a method call to next()
or send(undefined)
. It is a TypeError
to send
a value other than undefined
to a newborn generator.
An example that prints the first ten Fibonacci numbers:
function fib() { let i = 0, j = 1; while (true) { yield i; [i, j] = [j, i + j]; } } let g = fib(); for (let i = 0; i < 10; i++) print(g.next());
Another example, a natural number generator:
function naturals(n : uint) { for (let i : uint = 0; i < n; i++) yield i; }
The method close()
is called automatically only for a generator iterator that is started by a for-in
or for-each-in
loop or comprehension, upon normal or abrupt termination of the loop. All but the first call to close()
have no effect. Thus the for-in
loop over a generator iterator can be translated from:
for (i in naturals(20)) ...
to:
let ($gen = naturals(20)) { try { while (true) { try { i = $gen.next(); } catch (e : iterator::StopIterationClass) { break; } ... } } finally { $gen.close(); } }
Note that close
is called whether or not the generator iterator denoted by $gen
can be reached via other references.
This translation is a specification device only. Implementations cannot analyze all for-in
loops and discern generator iteration statically, so must check at runtime for a newborn generator iterator on the right of in
.
ES4 provides syntactic sugar for the common operation of populating a new Array
instance using an expression that depends on an iterator. The initialiser below shows an example of an array comprehension:
let squares = [i * i for (i in naturals(10))];
The squares
declaration is equivalent to:
let $arr = []; for (let i in naturals(10)) $arr[$arr.intrinsic::length] = i * i; let squares = $arr;
(Again assume that $arr
is a generated, non-colliding name.)
Note that the variable i
is scoped as if by let
but only within the []
enclosing the comprehension. Note also the evaluation of the expression i * i
in the body of the loop, and the use of intrinsic::length
.
for-in
loop heads may be nested within a comprehension:
let flat_mat = [i * j for (i in naturals(8)) for (j in naturals(8))];
A final if (
condition)
may follow the last for-in
head to filter elements out of the resulting array:
let funny_mat = [i * j for (i in naturals(8)) for (j in naturals(8)) if (i != 0 && j != 0)];
The comprehension variables are lexically bound within a single block scope conceptually delimited by the []
brackets.
Each variable name in a for-in
head may be followed by an optional type annotation.
The each
contextual keyword may be used after for
to iterate over values rather than names when itemizing:
let odd_squares = [i * i for each (i in [1, 3, 5, 7])];
Here’s an example of a common DOM programming style that’s made simpler with iterators and generators: imagine an animation that flashes a DOM element on and off n times:
function animate(n) { let i = 0; let id = window.setInterval(function() { if (i >= n) window.clearInterval(id); else { if (isHighlighted()) unhighlight(); else { highlight(); i++; } } }, 100); } animate(3);
Using yield
, we could implement this with a loop instead:
let id; function makeAnimator(n) { let i = 0; while (i <= n) { if (isHighlighted()) unhighlight(); else { highlight(); i++; } yield; } window.clearInterval(id); } id = window.setInterval(makeAnimator(3), 100);
(This assumes a slight DOM extension for window.setInterval
to accept an IteratorType
as its first argument. Otherwise you’d just use function(){animator.next()}
.)