Wednesday, August 16, 2006
Monday, August 14, 2006
Eager Comprehensions for Black Belts - 4. Advanced Generators
4. Advanced Generators
Local variables
Consider this scenario, where we want to produce a list of the even results of a long computation:
The above solution will call long-computation twice with the same value. Introducing a local variable with the :let-generator (:let <vars> <expression>) solves this.
Parallel loops
The generator (:parallel <generator>*) runs several generators in parallel. It advances its subgenerators in one step.
The (:parallel ...) generator stops, when any of its sub-generators stop.
The generator (:while <generator> <expression>) runs the generator <generator> while <expression> return non-#f. This can be used to stop a generator before it is done.
Since :while stops the inner generator, it is better to use :while than to use a qualifier to skip all values after a certain point. Compare these timings:
The :while-generator tests the expression before advancing the generator. The :until-generator tests after.
The simple :do loop
The generator :do is a "do-while"-generator (it runs while the test is true - in contrast the normal Scheme do operator is a "do-until", it runs until the test becomes true).
The simple form is (:do (<lb>*) <ne1?> (<ls>*)). Here <lb>* is zero or more loop bindings, <ne1?>; is the "while" expression and determines whether the loop stops. Finally <ls>* are "loop-steppers", expressions whose evaluation results are bound to the loop variables.
A simple example (:do () #f ()) has no loop variables, and stops immediately, since the "while"-expression is #f.
In contrast (:do () #t ()) will give an infinite loop.
A loop that counts upwards forever: (:do ((x 0)) #t ((+ x 1))). To test it we put it inside a :while.
It is however better to use that :do is a "do-while".
Updating two variables by repeatedly subtracting the smaller from the larger leads to Euclid's algorithm for calculating the greatest common divisor. The :do-generator makes it easy to store the intermediate steps in the calculation.
so the greatest common divisor of 8333 and 10897 is 641.
Just as in a normal do-loop all the step-expressions are evaluated before they are bound to the loop- variables.
If x had been updated before (+ x y) were evaluated the result would have been
((0 1) (1 2) (2 4) (4 8) (8 16)).
The syntax of the simple :do-generator is
Lets take a look under the hood. [We'll need that later]
The generator expands to this loop:
where payload depends on the enclosing comprehension.
E.g. if the outer comprehension is (list-ec (:do ...) x) then the payload is (set! result (cons x result)).
The :do-generator is the one to turn to, if none of the builtin generators suit your needs. However the simple form turns out to be too simple for some cases.
Consider this attempt to write :list in terms of :do :
The problem here is that 4 is missing from the result list.
The advanced :do loop
This section is for black belts. The advanced :do generator is meant to be used only by people interested in defining their own custom generators - and even then, in most cases a custom generator can be built without the advanced :do generator.
The advanced :do generator is more flexible than the simple, but requires a little time getting used to. It is worth learning though, as we will se a little later.
In fact all the pre-defined generators are defined in terms of the advanced :do-generator. Let us examine how the :list-generator works.
The (cleaned up version of) the expansion of
We notice that list-ec sets up a result list, into which elements are consed using (set! result (cons x result)). When the list is returned it is reversed.
The loop itself has
The syntax of the advanced :do is:
That is (list-ec (:list x '(1 2 3 4)) x) is equivalent to
Consider now the expansion of the general
The outer bindings is used to set up helper variables, before the loop.
The outer commands for once-only side effects before the loop starts.
The loop bindings <lb>* are used for variables updated in the loop.
The before "while"-expression <ne1?> is tested before the payload.
The inner bindings <ib*> is used to set up temporary variables
for use in the loop body.
The inner commands <ic>* is used for side effects which is to be
executed in the loop.
The payload is work to do be done for the comprehension.
The after "while"-expression <ne1?> is tested after the payload.
The loop step expressions determine the next value of the loop
variables.
Lost your breath? Sebastian Egner puts it like this: "The design of :DO is a compromise between how complicated structures can be expressed, while retaining the ability to merge the loops in parallel". The following example will hopefully clear things up. It will later be reused to illustrate how to make a custom generator.
Combinations
The problem of generating all k combinations of the n numbers
0,1,...,n-1 provides a nice example of the advanced :do-generator.
The list of 3,5-combinations are
The first combination is #(0 1 2) and the last combination is #3(2 3 4).
Given helper funcions first-combination, last-combination?, and
next-combination we can use the advanced :do-generator as follows.
Here are the helper functions.
Local variables
Consider this scenario, where we want to produce a list of the even results of a long computation:
(list-ec (:list x '(1 2 3 4 5))
(if (even? (long-computation x)))
(long-computation x))
The above solution will call long-computation twice with the same value. Introducing a local variable with the :let-generator (:let <vars> <expression>) solves this.
(list-ec (:list x '(1 2 3 4 5))
(:let r (long-computation x))
(if (even? r))
r)
Parallel loops
The generator (:parallel <generator>*) runs several generators in parallel. It advances its subgenerators in one step.
(list-ec (:parallel
(: x '(one two three))
(: y '(1 2 3)))
(list x y))
; => ((one 1) (two 2) (three 3))
The (:parallel ...) generator stops, when any of its sub-generators stop.
(list-ec (:parallel
(: x '(one two three))
(: y '(1 2)))
(list x y))
; => ((one 1) (two 2))
The generator (:while <generator> <expression>) runs the generator <generator> while <expression> return non-#f. This can be used to stop a generator before it is done.
(list-ec (:while (: i 1 100000)
(< i 5))
i) ;=> (1 2 3 4)
Since :while stops the inner generator, it is better to use :while than to use a qualifier to skip all values after a certain point. Compare these timings:
(time (sum-ec (:while (: i 1 10000000)
(< i 100000))
i))
; cpu time: 47 real time: 47 gc time: 0
#;
(time (sum-ec (: i 1 10000000)
(if (< i 100000))
i))
; cpu time: 3672 real time: 3687 gc time: 0
The :while-generator tests the expression before advancing the generator. The :until-generator tests after.
(list-ec (:until (: i 1 100000)
(= i 5))
i) ;=> (1 2 3 4 5)
The simple :do loop
The generator :do is a "do-while"-generator (it runs while the test is true - in contrast the normal Scheme do operator is a "do-until", it runs until the test becomes true).
The simple form is (:do (<lb>*) <ne1?> (<ls>*)). Here <lb>* is zero or more loop bindings, <ne1?>; is the "while" expression and determines whether the loop stops. Finally <ls>* are "loop-steppers", expressions whose evaluation results are bound to the loop variables.
A simple example (:do () #f ()) has no loop variables, and stops immediately, since the "while"-expression is #f.
(list-ec (:do () #f ())
1) ; => ()
In contrast (:do () #t ()) will give an infinite loop.
(list-ec (:do () #t ())
1) ; => loops till ram is exhausted
A loop that counts upwards forever: (:do ((x 0)) #t ((+ x 1))). To test it we put it inside a :while.
(list-ec (:while (:do ((x 0)) #t ((+ x 1)))
(< x 5))
x)
;=> (0 1 2 3 4)
It is however better to use that :do is a "do-while".
(list-ec (:do ((x 0)) (< x 5) ((+ x 1)))
x) ;=> (0 1 2 3 4)
Updating two variables by repeatedly subtracting the smaller from the larger leads to Euclid's algorithm for calculating the greatest common divisor. The :do-generator makes it easy to store the intermediate steps in the calculation.
(define (update a b) (if (< a b) a (- a b)))
(list-ec (:do ((x 8333)
(y 10897))
(not (= x y))
((update x y)
(update y x)))
(list x y))
; => ((8333 10897) (8333 2564) (5769 2564) (3205 2564)
; (641 2564) (641 1923) (641 1282))
so the greatest common divisor of 8333 and 10897 is 641.
Just as in a normal do-loop all the step-expressions are evaluated before they are bound to the loop- variables.
(list-ec (:do ((x 0) (y 1))
(< x 10)
(y ; -> x
(+ x y))) ; -> y
(list x y))
; => ((0 1) (1 1) (1 2) (2 3) (3 5) (5 8) (8 13))
If x had been updated before (+ x y) were evaluated the result would have been
((0 1) (1 2) (2 4) (4 8) (8 16)).
The syntax of the simple :do-generator is
(:do (<lb>*) ; loop bindings
<ne1?> ; "while"-expression
(<ls>*)) ; loop-step-expression
Lets take a look under the hood. [We'll need that later]
The generator expands to this loop:
(let loop (<lb>*)
(if <ne1?>
(let ()
payload
(loop <ls>*))))
where payload depends on the enclosing comprehension.
E.g. if the outer comprehension is (list-ec (:do ...) x) then the payload is (set! result (cons x result)).
The :do-generator is the one to turn to, if none of the builtin generators suit your needs. However the simple form turns out to be too simple for some cases.
Consider this attempt to write :list in terms of :do :
(list-ec (:do ((xs (list 2 3 4))
(x 1))
(not (null? xs))
((cdr xs) ; -> xs
(car xs))) ; -> x
x)
; => (1 2 3)
The problem here is that 4 is missing from the result list.
The advanced :do loop
This section is for black belts. The advanced :do generator is meant to be used only by people interested in defining their own custom generators - and even then, in most cases a custom generator can be built without the advanced :do generator.
The advanced :do generator is more flexible than the simple, but requires a little time getting used to. It is worth learning though, as we will se a little later.
In fact all the pre-defined generators are defined in terms of the advanced :do-generator. Let us examine how the :list-generator works.
The (cleaned up version of) the expansion of
is:
(list-ec (: x '(1 2 3 4)) x)
(reverse
(let ((result '()))
(let loop ((t '(1 2 3 4)))
(if (not (null? t))
(let ((x (car t)))
(set! result (cons x result))
(loop (cdr t)))))
result))
We notice that list-ec sets up a result list, into which elements are consed using (set! result (cons x result)). When the list is returned it is reversed.
The loop itself has
- a set of loop-bindings ((t '(1 2 3 4)))
- a "while"-expression (not (null? t))
- a set of inner bindings ((x (car t)))
- a payload from the comprehension (set! result (cons x result))
- a set of loop-step-expressions (cdr t)
The syntax of the advanced :do is:
(:do (let (<ob>*) <oc>*) (<lb>*) <ne1?> (let (<ib>*) <ic>*) <ne2?> (<ls>*))
<ob>* outer bindings
<oc>* outer commands
(let loop ((t '(1 2 3 4))) <lb>* loop bindings
(if (not (null? t)) <ne1?>
(let ((x (car t))) <ib>* inner bindings
<ic>* innner commands
(set! result (cons x result)) payload from comprehension
(if #t <ne2?>
(loop (cdr t)))))) <ls>* loop steps
That is (list-ec (:list x '(1 2 3 4)) x) is equivalent to
(list-ec (:do (let ())
((t '(1 2 3 4)))
(not (null? t))
(let ((x (car t))))
#t
((cdr t)))
x)
; => (1 2 3 4)
Consider now the expansion of the general
(:do (let (<ob>*) <oc>*) (<lb>*) <ne1?> (let (<ib>*) <ic>*) <ne2?> (<ls>*))
(let (<ob>*)
<oc>*
(let loop (<lb>*)
(if <ne1?>
(let (<ib>*)
<ic>*
payload
(if <ne2?>
(loop <ls>*) )))))
The outer bindings is used to set up helper variables, before the loop.
The outer commands for once-only side effects before the loop starts.
The loop bindings <lb>* are used for variables updated in the loop.
The before "while"-expression <ne1?> is tested before the payload.
The inner bindings <ib*> is used to set up temporary variables
for use in the loop body.
The inner commands <ic>* is used for side effects which is to be
executed in the loop.
The payload is work to do be done for the comprehension.
The after "while"-expression <ne1?> is tested after the payload.
The loop step expressions determine the next value of the loop
variables.
Lost your breath? Sebastian Egner puts it like this: "The design of :DO is a compromise between how complicated structures can be expressed, while retaining the ability to merge the loops in parallel". The following example will hopefully clear things up. It will later be reused to illustrate how to make a custom generator.
Combinations
The problem of generating all k combinations of the n numbers
0,1,...,n-1 provides a nice example of the advanced :do-generator.
The list of 3,5-combinations are
(#3(0 1 2) #3(0 1 3) #3(0 1 4) #3(0 2 3) #3(0 2 4) #3(0 3 4)
#3(1 2 3) #3(1 2 4) #3(1 3 4)
#3(2 3 4))
The first combination is #(0 1 2) and the last combination is #3(2 3 4).
Given helper funcions first-combination, last-combination?, and
next-combination we can use the advanced :do-generator as follows.
(list-ec (:do (let ((k 3) (n 5)))
((c (first-combination k n)))
c ; first-combination returns #f if k<=0 or k>n
(let ())
(not (last-combination? k n c))
((next-combination k n c)))
c)
Here are the helper functions.
(define (vr v i) (vector-ref v i))
(define (vs! v i x) (vector-set! v i x))
(define (incrementable? v i k n) (< (vr v i) (+ n (- k) i)))
(define (last-combination? k n v) (= (vr v 0) (- n k)))
(define (first-combination k n)
(if (<= 1 k n)
(vector-ec (: i 0 k) i)
#f))
(require (lib "43.ss" "srfi")) ; for vector-copy
(define (next-combination k n v)
(last-ec #f ; default, when there is no next combination
(:let v (vector-copy v))
; find the last incrementable index
(:let i (last-ec #f (:until (: i (- k 1) -1 -1)
(incrementable? v i k n))
i))
(if i)
; increment index i and fix indices to the right of i
(:parallel (: j i k)
(: vj (+ (vr v i) 1) n))
(begin (vs! v j vj))
; if all indices is fixed we have a new combination
(if (= j (- k 1)))
; return the new combination
v))
Labels: eager comprehension
Saturday, August 12, 2006
Eager Comprehensions for Black Belts - 3. Multiple generators (nested loops) and qualifiers (filtering)
3. Multiple Generators (nested loops) and qualifiers (filtering)
Comprehensions allow multiple generators. Consider:
Consider the problem of finding small integers such that:
The following qualifiers are available for filtering:
(begin <sequence>) and (nested <qualifier>*) are also qualifiers.]
Comprehensions allow multiple generators. Consider:
The inner (last) generator spins faster than the outer generator. Since the scope of variables bound by a generator begins after the generator expression, it is possible to refer to variables bound by previous generators.
(list-ec (: x 2)
(: y 3)
(list x y)) ; => ((0 0) (0 1) (0 2)
; (1 0) (1 1) (1 2))
At this point we are capable of generating loads of values, but sometimes we want to skip some of them.
(list-ec (: x 3)
(: y (+ x 1))
(list x y)) ; => ((0 0)
; (1 0) (1 1)
; (2 0) (2 1) (2 2))
Consider the problem of finding small integers such that:
x^2 + y^2 = z^2 .A valid strategy is to let x, y and z run from, say, 1 to 100, and skip the triples which doesn't fulfill the equation. To mimic this strategy we will use qualifiers.
The following qualifiers are available for filtering:
[Besides the filtering qualifiers all generators, as well as
(if <test>)
(not <test>*)
(and <test>*
(or <test>*)
(begin <sequence>) and (nested <qualifier>*) are also qualifiers.]
The common cases (if (not <test>)), (if (and <test>*)), and (if (or <test>*)) are abbreviated by (not <test>), (and <test>*), and (or <test>*).
(list-ec (: x 1 100)
(: y x 100)
(: z y 100)
(if (= (+ (* x x) (* y y))
(* z z)))
(list x y z)) ; => ((3 4 5)
; (5 12 13)
; (6 8 10)
; ...
; (60 63 87)
; (65 72 97))
(list-ec (: x 1 4)
(: y 1 4)
(not (= x y))
(list x y)) ; => ((1 2) (1 3)
; (2 1) (2 3)
; (3 1) (3 2))
Labels: eager comprehension
Wednesday, August 09, 2006
Eager Comprehensions for Black Belts - 2. Basic Comprehensions
2. Basic Comprehensions
Comprehensions accumulate the values generated into a result. So far the only comprehension used have been list-ec. There are multiple comprehensions to choose from, when deciding how to accumulate the computed values.
The comprehensions vector-ec and string-ec are analogous to list-ec. The accumulated values of string-ec must all be characters.
There are two boolean comprehensions any?-ec and every?-ec. They test whether at-least-one or all of the computed values were true.
Comprehensions accumulate the values generated into a result. So far the only comprehension used have been list-ec. There are multiple comprehensions to choose from, when deciding how to accumulate the computed values.
The comprehensions vector-ec and string-ec are analogous to list-ec. The accumulated values of string-ec must all be characters.
If the length of the result vector is known ahead of time, one can use vector-of-length-ec, which is more efficient than vector-ec.
(vector-ec (: i 5)
i) ; => #5(0 1 2 3 4)
(string-ec (: c '(#\c #\a #\r))
c) ;=> "car"
Instead of first generating a list of lists, or a list of strings and then applying append or string-append, it is more convenient to use append-ec and string-append-ec.
(vector-of-length-ec 3 (: x 3)
x) ; => #3(0 1 2)
The comprehensions sum-ec, product-ec, min-ec and max-ec work on numbers.
(append-ec (: x '((1 2) (3 4 5) (6)))
x) ; => (1 2 3 4 5 6)
(string-append-ec (: x '("foo" "bar" "qux"))
x) ; => "foobarqux"
For min-ec and max-ec the the generated sequence of values must be non-empty.
(sum-ec (: x '(1 2 3 4))
x) ;=> 10
(product-ec (: x '(1 2 3 4))
x) ; => 24
(min-ec (: x '(1 2 3 4))
x) ; => 1
(max-ec (: x '(1 2 3 4))
x) ; => 4
There are two boolean comprehensions any?-ec and every?-ec. They test whether at-least-one or all of the computed values were true.
Note: any?-ec and every?-ec (just as their cousins or and and) are "early stopping". As soon as any?-ec encounters a #t it stops. every?-ec stops when a #f is seen.
(every?-ec (: x 1 10)
(even? x)) ; => #f
(any?-ec (: x 1 10)
(even? x)) ; => #t
Labels: eager comprehension
Tuesday, August 08, 2006
Eager Comprehensions for Black Belts - 1. Basic Generators
1. Basic Generators
Range generators
A generator binds a variable (or more) to a sequence of values.
The generator :range is used to generate the values of a numerical range. The simplest form is (:range <vars> <stop>) which generates the values 0, 1, ..., <stop>-1.
start value is different from 0.
[See also :integers and :char-range]
Element generators
A very common use of generators is to run through the elements of a data structure such as a list, a string or a vector.
The simplest form of :list runs through the elements of a list.
Scope of variables bound by a generator
A variable bound by a generator can be used after the closing parenthesis of the generator expression and extends to the end of the comprehension.
Range generators
A generator binds a variable (or more) to a sequence of values.
The generator :range is used to generate the values of a numerical range. The simplest form is (:range <vars> <stop>) which generates the values 0, 1, ..., <stop>-1.
(list-ec (:range i 5)The form (:range <vars> <start> <stop>) is used when the
i) ; => (0 1 2 3 4)
start value is different from 0.
(list-ec (:range i 3 7)The form (:range <vars> <start> <stop> <step>) allows other step sizes than +1.
i) ; => (3 4 5 6)
(list-ec (:range i 0 8 2)The generator stops when <stop> is reached or crossed. This rule also applies when the step size is negative:
i) ; => (0 2 4 6)
(list-ec (:range i 0 9 2)
i) ; => (0 2 4 6 8)
(list-ec (:range i 5 0 -1)Note that :range works for integers only, simple stepping leads to accumulation of rounding errors when using reals. To generate a sequence of reals, use :real-range instead. It calculates the value from the index.
i) ; => (5 4 3 2 1)
(list-ec (:real-range i 0 2 0.5)The special generator : which uses the types of its arguments to dispatch to various generators, uses :range and :real-range when given numerical arguments.
i) ; => (0.0 0.5 1.0 1.5)
(list-ec (: i 5 0 -1)The dispatch happens at run time, so : is slightly slower than using :range directly.
i) ; => (5 4 3 2 1)
(list-ec (: i 0 2 0.5)
i) ; => (0.0 0.5 1.0 1.5)
[See also :integers and :char-range]
Element generators
A very common use of generators is to run through the elements of a data structure such as a list, a string or a vector.
The simplest form of :list runs through the elements of a list.
(list-ec (:list x '(1 2 3))It is possible to run through more than one list at a time:
(* x 2)) ; (2 4 6)
(list-ec (:list x '(1 2) '(3 4) '(5 6))The generators :vector and :string works in the same way, but on vectors and strings.
(* x 2)) ; => (2 4 6 8 10 12)
(list-ec (:vector x '#(1 2 3))The dispatching generators uses :list, :vector and :strings, when the type of its arguments are lists, vectors and strings respectively.
x) ; => (1 2 3)
(list-ec (:string x "abc" "def")
x) ; => (#\a #\b #\c #\d #\e #\f)
(list-ec (: x '(1 2 3))
(* x 2)) ; => (2 4 6)
Scope of variables bound by a generator
A variable bound by a generator can be used after the closing parenthesis of the generator expression and extends to the end of the comprehension.
Labels: eager comprehension
Eager Comprehensions for Black Belts - An SRFI 42 Tutorial
Eager comprehensions from SRFI-42 provide a convenient notation for writing loops generating a sequence of values, and optionally accumulating the values into a result. Inspired by Peter Seibel's chapter "LOOP for Black Belts" in "Practical Common Lisp" the following is an attempt to show the utility of eager comprehensions.
Eager comprehensions (think: loops) can sequentially bind variables to numbers in a numerical range
The generator (: i 5) binds in turn i to 0, 1, 2, 3, and 4.
The expression (* i 2) is evaluated for each binding of i.
The computed values 0, 2, 4, 6 and 8 are accumulated by the comprehension list-ec. The list comprehension list-ec accumulates the computed values into a list.
The master plan is now as follows:
References:
Eager comprehensions (think: loops) can sequentially bind variables to numbers in a numerical range
- sequentially bind variables to elements in a data structure
- execute arbitrary Scheme expressions
- conditionally perform parts of the loop
- a comprehension, that accumulates the result
- some generators, that produces sequences of values
- an expression that uses the generated values, to
compute a value, the comprehension can accumulate
(require (lib "42.ss" "srfi"))
(list-ec (: i 5)
(* i 2))
; => (0 2 4 6 8)
The generator (: i 5) binds in turn i to 0, 1, 2, 3, and 4.
The expression (* i 2) is evaluated for each binding of i.
The computed values 0, 2, 4, 6 and 8 are accumulated by the comprehension list-ec. The list comprehension list-ec accumulates the computed values into a list.
The master plan is now as follows:
- Basic generators
- Basic comprehensions
- Multiple generators (nested loops) and qualifiers (filtering)
- Advanced Generators
- Multiple accumulators
- Defining custom comprehensions
References:
- SRFI 42 by Sebastian Egner
- "Eager Comprehensions in Scheme: The design of SRFI-42" by Sebastian Egner
Labels: eager comprehension