List Comprehension PatternsMany programming languages have some variant of list comprehension syntax. For example Haskell:
[ 2 * x | x <- [2,3,5,7,11]]The part after the
|is the scan specifier
x <- [2,3,5,7,11], where the scan variable
xranges over the elements of the scan sequence
[2,3,5,7,11]. The result expression
2 * xis evaluated for each element of the resulting sequence, using the scan variables from the scan specifier(s).
Kawa's new (and experimental)
List Comprehension Patterns
separate these two parts by making the scan specifier part of
the more general pattern-matching feature.
The result expression can be widely separated from the scan specifier,
as long is it within the lexical scope of any scan variable it uses.
For example they can be in separate REPL commands
#|kawa:N# is the REPL prompt).
First, some simple variable definions, using the Kawa-specific
(which is like
define extended to patterns):
#|kawa:1|# (! A [2 3 5 7 11]) #|kawa:2|# (! [A0 A1 A2 A3 A4] A)The first line binds the variable
Ato the result of evaluating the
[2 3 5 7 11]. The second illustrates matching the pattern
[A0 A1 A2 A3 A4]against the value of
A, which causes
A4to be matched against corresponding elements of
#|kawa:3|# (+ A1 A4) 14Finally, here is a list comprehension (or scan) pattern:
#|kawa:4|# (! [a ...] A)This declares
ato be a scan variable that ranges over the elements of the scan sequence
A. You can only use a scan variable inside a scan context:
#|kawa:5|# a /dev/stdin:2:1: using scan variable a while not in scan contextA
scan contextis an expression that is followed by ellipsis. The expression is evaluated once for each value of the corresponding scan variable(s), and the result sequence is
splicedinto its context. Here the scan context is the expression
(* 2 a), which doubles the corresponding element
#|kawa:6|# [(* 2 a) ...] [4 6 10 14 22]
This use of ellipsis is inspired by the
macro transformers of Scheme.
A scan variable can be used multiple times in the same scan context (ellipsis expression):
#|kawa:7|# [a ... a ...] [2 3 5 7 11 2 3 5 7 11] #|kawa:8|# [(* a a) ...] [4 9 25 49 121]Ellipses expressions are useful not just in sequence literals, but in the argument list of a procedure call, where the resulting sequence is spliced into the argument list. This is especially useful for functions that take a variable number of arguments, because that enables a convenient way to do fold/accumulate/reduce operations. For example:
#|kawa:9|# (+ a ...) 28because 28 is the result of
(+ 2 3 5 7 11).
An elegant way to implement dot product:
(define (dot-product [x ...] [y ...]) (+ (* x y) ...))When an ellipse expression references two or more distinct scan variables then they are scanned
Sub-patterns in ellipsis patterns
The form before
... in a pattern is itself a (sub-)pattern.
For example, the sub-pattern may include a type specifier,
which applies to each element:
#|kawa:11|# (define (isum [x::integer ...]) (+ x ...)) #|kawa:12|# (isum [4 5 6]) 15 #|kawa:12|# (isum [4 5.1 6]) Argument #1 (null) to 'isum' has wrong type at gnu.mapping.CallContext.matchError(CallContext.java:189) at atInteractiveLevel-6.isum$check(stdin:11) ...(The stack trace line number
stdin:11is that of the
You can nest ellipsis patterns, allowing matching against sequences whose elements are sequences.
#|kawa:31|# (define (fun2 [[x ...] ...] [y ...]) #|.....32|# [[(+ x y) ...] ...]) #|kawa:33|# (fun2 [[1 2 3] [10 11 12]] [100 200]) [[101 102 103] [210 211 212]]
x is double-nested, while
y is singly-nested.
The following does not work at time of writing, but will soon:
(! [[x y] ...] sequence-of-pairs) ;;Not yet working
Filtering the scan sequence
Often you only want to use some of the elements from the scan sequence.
For example SQL's
select has a
where with a
boolean expression; only elements for which the expession is true are included.
With scan patterns we have the option to filter the values either where the
scan variables are introduced (the pattern), or when they are used.
(The examples in this section are not working - and the design is undecided.)
Kawa has a splice operator
that takes a sequence and spreads it out to multiple arguuments.
It is useful to allow splice operators in scan expressions,
and even more useful to allow them in one or more branches of an
The following expressions selects the odd elements from
doubling them, but drops even elements.
It does the latter by returning a 0-length splice for each even element:
[(if (odd? a) (* 2 a) @) ...]
Some reasonable syntatic sugar would be to allow leaving out
else expression of an
if, having it default
to a zero-length splice:
[(if (odd? a) (* 2 a)) ...]
APL has an
expand operator that repeats elements of an array
based on another count array.
This functionality can be achived using splices in scan expressions.
For example, the following repeats each odd element, and skips even elements:
[(if (odd? a) @[a a] @) ...]Equivalently (parsing issues aside):
[@(if (odd? a) [a a] ) ...]
It is also useful to filter at the pattern level.
The following hypothetical syntax would create a scan variable
from those elements of the scan sequence
satisfy the boolean expression
(! [a ...] A #!if (P a)) ;; hypotheticalNote the sequence of
avalues used to evaluate
(P a)would be different than the values in scan expressions. That might be an issue.
Instead of processing multiple sequences in parallel,
sometimes you want the
multiplicative combination of elements
from all the sequences.
This is called Cartesian product (or outer product in APL),
or an cross join in database query languages.
This could be handled with a (not implemented) outer-product function:
(outer-product [10 20] [1 2 3]) ⇒ [[10 1] [10 2] [10 3] [20 1] [20 2] [20 3] [30 1] [30 2] [30 3]]We could use it like this:
(! [[x y] ...) (outer-product X Y))
The above syntax is a bit awkward. Some syntatic sugar would help. If so, it should include an option for filtering only the desired combinations based on a boolean expression.
Comprehension as looping
Scan expression can form the basis of a looping framework
in a programming language.
Assume a function
ignores its arguments.
Then this is one way to write a
using its scan expression for its side effects:
(let (([i ...] [0 <: 10])) (ignore (do-something-with i) ...))The scan valuable
irange of the range expression
[0 <: 10]. The scan expression
(do-something-with i)is evaluated for each
A general looping construct needs a way to exit the loop
based on a condition, as in a
Than can be done by having a scan expression return
This assumes the scan elements are evaluated sequentially, which is a reasonable default. It would be useful to have some way to request that evaluation be done in parallel. This could be implemented by translating the code to use Java 8 streams, though for performance reasons we only want to do that when parallel execution is wanted.
A scan variable is implemented as either a
java.util.List or a (Java) array.
Using an array may be more efficient, partly because it easier to avoid boxing;
on the other hand it may require more copying..
Scan patterns are expanded by the compiler to a loop which initializes
the array. Scan expressions expand to
which are generally inlined to loops.
Numerous optimizations and performance improvements are possible.