Exploring Iteration - 7.9 Looping Through Multiple Iterables in Parallel
Problem
You want to traverse multiple iteration methods simultaneously, probably to match up the corresponding elements in several different arrays.
Solution
The SyncEnumerator class, defined in the generator library, makes it easy to iterate over a bunch of arrays or other Enumerable objects in parallel. Its each method yields a series of arrays, each array containing one item from each underlying Enumerable object:
require 'generator'
enumerator = SyncEnumerator.new(%w{Four seven}, %w{score years},
                                %w{and ago})
enumerator.each do |row|
  row.each { |word| puts word }
  puts '---'
end
# Four
# score
# and
# ---
# seven
# years
# ago
# ---
enumerator = SyncEnumerator.new(%w{Four and}, %w{score seven years ago})
enumerator.each do |row|
  row.each { |word| puts word }
  puts '---'
end
# Four
# score
# ---
# and
# seven
# ---
# nil
# years
# ---
# nil
# ago
# ---
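The generator library is no longer part of the standard library in Ruby 1.9 and later, so SyncEnumerator is not available there. As a rough sketch of the same row-by-row behavior on such a Ruby, the built-in Enumerator class can be stepped manually. The helper name sync_each is hypothetical and not part of any library:

```ruby
# Hypothetical helper (an assumption, not a library method): yields one row
# per step, padding exhausted sources with nil until every source is done.
def sync_each(*enumerables)
  enums = enumerables.map(&:to_enum)      # external enumerators over #each
  done  = Array.new(enums.size, false)    # which sources have run dry
  loop do
    row = enums.each_with_index.map do |e, i|
      next nil if done[i]
      begin
        e.next
      rescue StopIteration
        done[i] = true
        nil
      end
    end
    break if done.all?
    yield row
  end
end

rows = []
sync_each(%w{Four and}, %w{score seven years ago}) { |row| rows << row }
rows.each { |row| p row }
# ["Four", "score"]
# ["and", "seven"]
# [nil, "years"]
# [nil, "ago"]
```

Tracking exhaustion in a separate array, rather than checking for nil, lets the sources themselves contain nil elements.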
You can reproduce the workings of a SyncEnumerator by wrapping each of your Enumerable objects in a Generator object. This code acts like SyncEnumerator#each, only it yields each individual item instead of arrays containing one item from each Enumerable:
def interosculate(*enumerables)
  generators = enumerables.collect { |x| Generator.new(x) }
  done = false
  until done
    done = true
    generators.each do |g|
      if g.next?
        yield g.next
        done = false
      end
    end
  end
end
interosculate(%w{Four and}, %w{score seven years ago}) do |x|
  puts x
end
# Four
# score
# and
# seven
# years
# ago
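On a Ruby without the generator library, the same interleaving can be sketched with Enumerator objects. Enumerator has no next? method, so this version relies on the StopIteration exception that Enumerator#next raises at the end. The name interosculate_enum is an assumption chosen for illustration:

```ruby
# Hypothetical variant of interosculate built on Enumerator instead of
# Generator. Each pass takes one item from every source that still has one.
def interosculate_enum(*enumerables)
  enums = enumerables.map(&:to_enum)
  until enums.empty?
    # Keep only the enumerators that produced an item on this pass.
    enums = enums.select do |e|
      item = begin
        e.next
      rescue StopIteration
        next false            # this source is exhausted; drop it
      end
      yield item
      true
    end
  end
end

result = []
interosculate_enum(%w{Four and}, %w{score seven years ago}) { |x| result << x }
puts result.join(' ')
# Four score and seven years ago
```

The rescue wraps only the call to next, so a StopIteration raised inside the caller's block is not silently swallowed.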
Discussion
Any object that implements the each method can be wrapped in a Generator object. If you’ve used Java, think of a Generator as being like a Java Iterator object. It keeps track of where you are in a particular iteration over a data structure.
Normally, when you pass a block into an iterator method like each, that block gets called for every element in the iterator without interruption. No code outside the block will run until the iterator is done iterating. You can stop the iteration by writing a break statement inside the code block, but you can’t restart a broken iteration later from the same place, unless you use a Generator.
Think of an iterator method like each as a candy dispenser that pours out all its candy in a steady stream once you push the button. The Generator class lets you turn that candy dispenser into one which dispenses only one piece of candy every time you push its button. You can carry this new dispenser around and ration your candy more easily.
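The one-piece-at-a-time dispenser can be tried out on Ruby 1.9 and later with the built-in Enumerator class, which offers a similar next method. A minimal sketch, assuming such a Ruby; the variable names are illustrative only:

```ruby
words = %w{Four score and seven}

# Calling each without a block returns an Enumerator, which can be
# advanced one element at a time.
dispenser = words.each

first = dispenser.next     # one push of the button
puts "Got #{first}; other code can run here before the next push."
second = dispenser.next    # resumes where the iteration left off

puts second
# score
```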
In Ruby 1.8, the Generator class uses continuations to achieve this trick. It sets bookmarks for jumping out of an iteration and then back in. When you call Generator#next, the generator “pumps” the iterator once (yielding a single element), sets a bookmark, and returns control back to your code. The next time you call Generator#next, the generator jumps back to its previously set bookmark and “pumps” the iterator once more.
Ruby 1.9 uses a more efficient implementation based on threads. This implementation calls each Enumerable object’s each method (triggering the neverending stream of candy), but it does it in a separate thread for each object. After each piece of candy comes out, Ruby freezes time (pauses the thread) until the next time you call Generator#next.
It’s simple to wrap an array in a generator, but if that’s all there were to generators, you wouldn’t need to mess around with Generators or even SyncEnumerators. It’s easy to simulate the behavior of SyncEnumerator for arrays by starting an index into each array and incrementing it whenever you want to get another item from a particular array. Generator methods are truly useful in their ability to turn any type of iteration into a single-item candy dispenser.
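The index-based simulation for plain arrays might look like the following sketch. The method name each_row_by_index is hypothetical, and the code assumes every argument is an Array (or at least responds to size and []):

```ruby
# Hypothetical helper: walks several arrays in parallel using a shared index.
# Reading past the end of a shorter array returns nil, which pads the row.
def each_row_by_index(*arrays)
  longest = arrays.map(&:size).max || 0
  longest.times do |i|
    yield arrays.map { |a| a[i] }
  end
end

index_rows = []
each_row_by_index(%w{Four and}, %w{score seven years ago}) { |row| index_rows << row }
index_rows.each { |row| p row }
# ["Four", "score"]
# ["and", "seven"]
# [nil, "years"]
# [nil, "ago"]
```

This works only because arrays support random access; it does not help with objects whose only interface is an iterator method.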
Suppose that you want to use the functionality of a generator to iterate over an array, but you have an unusual type of iteration in mind. For instance, consider an array that looks like this:
l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]
Let’s say you’d like to iterate over the list but skip the “junk” entries. Wrapping the list in a generator object doesn’t work; it gives you all the entries:
g = Generator.new(l)
g.next # => "junk1"
g.next # => 1
g.next # => "junk2"
It’s not difficult to write an iterator method that skips the junk. Now, we don’t want an iterator method (we want a Generator object), but the iterator method is a good starting point. At least it proves that the iteration we want can be implemented in Ruby.
def l.my_iterator
  each { |e| yield e unless e =~ /^junk/ }
end
l.my_iterator { |x| puts x }
# 1
# 2
# 3
Here’s the twist: when you wrap an array in a Generator or a SyncEnumerator object, you’re actually wrapping the array’s each method. The Generator doesn’t just happen to yield elements in the same order as each: it’s actually calling each, but using continuation (or thread) trickery to pause the iteration after each call to Generator#next.
By defining an appropriate code block and passing it into the Generator constructor, you can make a generator object out of any piece of iteration code, not only the each method. The generator will know to call and interrupt that block of code, just as it knows to call and interrupt each when you pass an array into the constructor. Here’s a generator that iterates over our array the way we want:
g = Generator.new { |g| l.each { |e| g.yield e unless e =~ /^junk/ }}
g.next # => 1
g.next # => 2
g.next # => 3
The Generator constructor can take a code block that accepts the generator object itself as an argument. This code block performs the iteration that you’d like to have wrapped in a generator. Note the basic similarity of the code block to the body of the l#my_iterator method. The only difference is that instead of the yield keyword we call the Generator#yield method, which handles some of the work involved with setting up and jumping to the continuations (Generator#next handles the rest of the continuation work).
Once you see how this works, you can eliminate some duplicate code by wrapping the l#my_iterator method itself in a Generator:
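Ruby's built-in Enumerator class accepts a block in much the same way: the block receives a yielder object and pushes values into it. The following is a sketch only, assuming a Ruby that provides Enumerator.new with a block (1.9 and later). It tests for junk with String#start_with? rather than =~, because recent Rubies no longer define =~ on integers:

```ruby
l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]

# The block receives a yielder; "y << value" plays the role that
# "g.yield value" plays with a Generator.
e = Enumerator.new do |y|
  l.each do |item|
    junk = item.is_a?(String) && item.start_with?("junk")
    y << item unless junk
  end
end

a = e.next
b = e.next
c = e.next
p [a, b, c]
# [1, 2, 3]
```

A further call to e.next raises StopIteration, since only junk remains in the list.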
g = Generator.new { |g| l.my_iterator { |e| g.yield e } }
g.next # => 1
g.next # => 2
g.next # => 3
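With Enumerator, wrapping an existing iterator method needs no explicit block, because Object#to_enum takes the name of the method to call. A hedged sketch, assuming Ruby 1.9 or later; the junk test is rewritten with String#start_with? for the same reason as any modern port would need it:

```ruby
l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]

# Same iterator as before, defined on this one array only.
def l.my_iterator
  each do |e|
    yield e unless e.is_a?(String) && e.start_with?("junk")
  end
end

# to_enum(:my_iterator) builds an Enumerator that drives my_iterator
# instead of the default each.
wrapped = l.to_enum(:my_iterator)
values = [wrapped.next, wrapped.next, wrapped.next]
p values
# [1, 2, 3]
```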
Here’s a version of the interosculate method that can wrap methods as well as arrays. It accepts any combination of Enumerable objects and Method objects, turns each one into a Generator object, and loops through all the Generator objects, getting one element at a time from each:
def interosculate(*iteratables)
  generators = iteratables.collect do |x|
    if x.is_a? Method
      Generator.new { |g| x.call { |e| g.yield e } }
    else
      Generator.new(x)
    end
  end
  done = false
  until done
    done = true
    generators.each do |g|
      if g.next?
        yield g.next
        done = false
      end
    end
  end
end
Here, we pass interosculate an array and a Method object, so that we can iterate through two arrays in opposite directions:
words1 = %w{Four and years}
words2 = %w{ago seven score}
interosculate(words1, words2.method(:reverse_each)) { |x| puts x }
# Four
# score
# and
# seven
# years
# ago
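The step that turns either kind of argument into a steppable object can also be sketched with Enumerator on a Ruby that lacks the generator library. The helper name external is hypothetical, and the pairing loop below is just one way to consume the results:

```ruby
# Hypothetical helper: returns an Enumerator for either a Method object
# (which is called with a block) or any object that responds to each.
def external(source)
  if source.is_a?(Method)
    Enumerator.new { |y| source.call { |e| y << e } }
  else
    source.to_enum
  end
end

forward  = external(%w{Four and years})
backward = external(%w{ago seven score}.method(:reverse_each))

# Take one item from each side, three times over.
pairs = 3.times.map { [forward.next, backward.next] }
puts pairs.flatten.join(' ')
# Four score and seven years ago
```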
See Also
- Recipe 7.5, “Writing an Iterator Over a Data Structure”
- Recipe 7.6, “Changing the Way an Object Iterates”
This article is excerpted from chapter eight of the Ruby Cookbook, written by Lucas Carlson and Leonard Richardson (O'Reilly, 2006; ISBN: 0596523696).