Iterators

One of the factors which motivated this revision of the Collection classes was a discussion on PEG which convinced me that I should allow for multiple iterators. In the Java Collection classes, Iterator is a separate interface, and the collection's GetIterator() method (iterator() in Java) returns an object that implements that interface for any given collection class. Since the Iterator is a separate object and not limited to one instance, this structure can provide multiple simultaneous iterators over the same collection. In starting to work on this revision, we assumed that we would take this approach, and you can see this reflected in the previously published material on this project.

However, in approaching the actual coding of the project, we realized that there is an inherent problem because of the "intimacy" of iterators with respect to their collections. How can an iterator navigate the implementation structure of the collection without either making parts of that implementation public, which would violate encapsulation, or giving the collection its own navigation methods, which would seem to violate normal form, since both the collection and the iterator would then expose very similar operations?

One proposed solution would be to make the navigation methods on the collection accessible only to the iterator. That would still violate notions of OO purity, though less blatantly, but in the end this is an academic question, since ABL has no level of protection which would provide this kind of limited access.

Another proposed solution was to have the navigation methods on the collection take an "ID" which identifies the "current" record for a particular iterator, i.e., GetNext(ID) is GetNext relative to ID. If this ID were the actual object identifier, i.e., int(Obj), then the Iterator could determine the ID for the current element on its own. Otherwise, it would have to obtain this ID from the collection. Unfortunately, with the proposed array implementation for the most common type of collection, an object ID would require an expensive search of the array to locate the current record. Only the element number of the current element would be efficient there.
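The cost difference between the two kinds of ID can be sketched as follows. This is a minimal illustration in Python rather than ABL, and the names ArrayCollection, get_next_by_object and get_next_by_index are hypothetical, chosen only for this sketch.

```python
class ArrayCollection:
    """Array-backed collection. All names here are hypothetical."""

    def __init__(self, items):
        self._items = list(items)

    def get_next_by_object(self, current):
        """GetNext relative to an object: the array must be searched first."""
        pos = -1
        if current is not None:
            # Linear scan by identity, analogous to searching for int(Obj).
            for i, item in enumerate(self._items):
                if item is current:
                    pos = i
                    break
            else:
                raise ValueError("object is not in this collection")
        nxt = pos + 1
        return self._items[nxt] if nxt < len(self._items) else None

    def get_next_by_index(self, current_index):
        """GetNext relative to an element number: direct access, no search."""
        nxt = current_index + 1
        if nxt < len(self._items):
            return nxt, self._items[nxt]
        return None, None
```

With the object as the ID, every call pays for a scan of the array. With the element number as the ID, each call is a single array access, but the iterator has to be handed that number by the collection.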

It was then observed that, while multiple iterators might sometimes be required of some types of collections, they seem highly unlikely in a relationship collection. They are unlikely because a relation is a parent-child connection between two object types, and the normal expectation is that if the parent does some operation on the children, it will do it to each of them, i.e., that one will navigate from the beginning to the end, and most often within the scope of a single method.

This raised the possibility that we would move the iterator back into the collection instead of making it something separate. Once thought of, this was immediately appealing, since it would guarantee that the navigation would be efficient relative to the specific implementation while all details of the implementation remained encapsulated within the class. This led me to think that the Java structure is actually a less than ideal decomposition of the problem space, since the need for navigation is inherent in the collection and separating the two requires an overly intimate knowledge to be shared.
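A collection that carries its own single cursor might look roughly like this. Again this is a Python sketch under assumptions, not the ABL design itself: RelationshipCollection, reset, has_next and get_next are hypothetical names.

```python
class RelationshipCollection:
    """Collection that owns its one cursor. All names are hypothetical."""

    def __init__(self):
        self._items = []      # implementation detail, never exposed
        self._current = -1    # positioned before the first element

    def add(self, item):
        self._items.append(item)

    def reset(self):
        """Reposition the cursor before the first element."""
        self._current = -1

    def has_next(self):
        return self._current + 1 < len(self._items)

    def get_next(self):
        if not self.has_next():
            raise IndexError("no more elements")
        self._current += 1
        return self._items[self._current]


# Typical use: the parent walks all children within one method.
children = RelationshipCollection()
for name in ("line 1", "line 2", "line 3"):
    children.add(name)

visited = []
children.reset()
while children.has_next():
    visited.append(children.get_next())
```

Nothing outside the class sees how the elements are stored, so the storage could be swapped for another implementation without touching callers.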

Having made this observation, it has now occurred to me that it would be possible to support multiple iterators within the collection itself by the use of named or numbered iterators. For example, GetIteratorID() could return a sequentially assigned integer which identifies a particular iterator. A small array in the collection, indexed by this integer, could then contain a value for the current record on that iterator. For an array implementation, it might be the current element number. For a work-table implementation, it might be the object ID to use in a FIND. For a temp-table implementation, it might be the same object ID or a RECID. Alternatively, for a temp-table, one might have a separate query per iterator, but that seems like it would add weight. While the object ID and RECID would require a FIND, this would be a C-level operation and only one ABL instruction.
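For the array implementation, the numbered-iterator idea can be sketched as below. This is a hedged Python illustration, not ABL: MultiCursorCollection and its method names are hypothetical, get_iterator_id stands in for the GetIteratorID() mentioned above, and IDs are zero-based here only because Python lists are.

```python
class MultiCursorCollection:
    """Collection holding one cursor per iterator ID. Names are hypothetical."""

    def __init__(self, items):
        self._items = list(items)
        self._cursors = []  # _cursors[id] = current element number for that iterator

    def get_iterator_id(self):
        """Hand out the next sequential iterator ID, positioned before the start."""
        self._cursors.append(-1)
        return len(self._cursors) - 1

    def has_next(self, iterator_id):
        return self._cursors[iterator_id] + 1 < len(self._items)

    def get_next(self, iterator_id):
        if not self.has_next(iterator_id):
            raise IndexError("no more elements for this iterator")
        self._cursors[iterator_id] += 1
        return self._items[self._cursors[iterator_id]]
```

Two callers that each obtain their own ID can then advance independently over the same collection. For a work-table or temp-table implementation, the per-iterator slot would presumably hold an object ID or RECID in place of an element number.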

Revision of other parts of this specification is pending confirmation of our adoption of this approach.