Per Bothner
Q is primarily written by Per Bothner (bothner@cygnus.com). However, it uses software from a number of other sources. Some code comes from the Free Software Foundation. It follows that Q is subject to the GNU general Public License, which should be in the file COPYING in the Q distribution.
The way you run Q is very similar to how you run a typical shell.
Q [options] [file args]
First Q processes the options (these begin with a hyphen).
If there are no more arguments after processing the options, Q enters interactive mode: If prompts for an expression, evaluates it, and print it, before prompting for the next expression, and so on.
If there are arguments remaining after the options,
the first argument must name a Q source file.
Q evaluates that source file non-interactively,
setting the argv
array to whater arguments are left over.
These options are currently supported by Q:
-e expression
expresion
-f filename
-i
-Lnumber
--Lisp
--lisp
--Lisp
--Scheme
--scheme
--Scheme
--
Identifiers are used to name things in a program.
An identifier consists of one or more of the following:
0
though 9
).
.
) or underscore (_
).
\-
means the same as -
,
except that the -
as thought of as a letter.
An identifier cannot start with a digit, nor can the second character be a digit if the first character is a period. This is because such words are parsed as numbers.
Note that the identifiers .
and ..
are reserved
(they are used to access the filesystem).
A comment is started by #
, and continues to the end
of the line. Note that the comment symbol #
should be followed by
a space (or other white space), though a letter or number is also OK.
The reason is that #
followed by certain special symbols
has special (different) meanings.
For example, #(
begins a nested comment,
which extends to the next (unmatched) #)
.
To include the text in the file named filename, do: #<filename>
. (This doesn't work quite right.)
Statements may be delimited by :
or (more usually) by a new line.
Indentation is significant. ...
A line may be continued with \
.
A \
(unless inside a string literal written using double quotes,
or at the end of the line) means that the next character is
treated as if it were a letter.
Spaces are significant:
Q1> 2 + 3*4 14 Q2> 2+3 * 4 20
A token is a simple expression which doesn't contain any sub-expressions. In this section we will examine the rules for how tokens are combined into more complex expressions.
A primary is either a token or an expression surrounded by parentheses, brackets, or braces.
A word is one or more primaries, with no spaces or other white space characters.
A statement is one or more words, separated by newlines or semi-colons.
Each word or statement consists of a number of sub-expressions (tokens or words, respectively).
Q's mechanisms for control structure are uncommon. This is largely because Q is a logic programming language with functions.
Q is an expression language, which means it does not distinuish statements from expressions. However, it is not a purely functional language: an expression may have a side effect, such as writing a value to a file. So it is useful to combine two or more expressions, evaluating them in order. Such a statement sequence is written as a list of expressions separated by semi-colons or new-lines.
The value that results from evaluating a statement sequence is essentially the concatenation of the values of the sub-expressions. This makes sense because the result of a command that runs a program is the text printed (on standard output) by the command, so the result of running two commands in sequence should be the concatendation of their outputs.
Declarations, assignments, and constraints are kinds of expressions that we evaluate for their side-effects. If they "succeed," we don't want anything printed. We do this by having such expressions evaluate to the null sequence. Concatenating a value with the null sequence yields the orginal value. It is useful to be able to evaluate a series of declarations, assignments, and constraints, and then evaluate a final expression, returning that final result. We can do this if when "concatenating" the sub-expressions of a statement sequence, if all of the results but one are null, return the non-null result. This is not what is normally meant by "concatenation," but it is a convenient extension.
So, in detail, the effect of evaluating expr1; expr2
is the following:
sprintf "%s%s" value1 value2
.
Note: this definition satisfies the following properties:
(a; b); c
= a; (b; c)
[];a
= a;[]
= a
("string1"; "string2")
= "string1string2"
sprintf "%s" (a;b)
= (sprintf "%s" a; sprintf "%s" b)
Note that a statement list delimits the scope of a declaration within it: See section Block structure. Also, if any declaration defined within the list is exported, the block evaluates to a record.
Normally, an expression evaluates to a single result. However, an expression may yield NO results. This is interpreted as "false." All other results are "true." This is similar to the Icon programming language Raising an exception is the same as yielding no results.
Operations that are consider "boolean" conventionally return the empty sequence to signify truth. (The rationale is that we don't constraints that are true to print anything.) (Contrast this to Lisp, where the empty list means false.)
An expression may also yield multiple values, perhaps
using the |
operator:
Q1> 3|4 3|4 Q2> print_readable:=1 Q3> (3>2)|5 []|5 Q3> (3<2)|5 5If the backtracking implementation of multiple values is used, multiple results are yielded on demand.
The implementation of multiple results is very liable to change.
The if-then-else construct has the folowing syntax:
Q4> if 3 > 4 => 5 || 6 6
Logic variables are also implemented.
The _
evaluates to a new unbound logic variable cell.
There is some preliminary support for logical constraints.
Q does not have any special forms for writing loops. Instead, as a number of forms for manipulating sequences. These can be combined in various ways to express iterations.
Examples: ...
Get the elements of X, in order, until the end is reached.
All the the elements are the converted to strings, which
are then concatenated together.
Informally, the same as: (X 0; X 1; X 2; ...)
.
Usually, do
is evaluated for the side effects, when
all the elements evaluate to ""
.
Q has a fairly complete set of numeric types:
Integers can be arbitrarily large.
A fairly complete set of operations on Integers has been implemented.
Other bases than 10 can be specified with the r
prefix:
Q1> 16rff 255For convenience, the
0x
prefix used in C is also recognized
as being the same as 16r
:
Q2> 0xFF 255But a
0
prefix does not indicate an octal number:
0177 = 177
The variable print_base
controls the base used for output:
Q3> print_base:=16 Q4> 255 FF Q5> print_readable:=1 Q6> 255 0xFF Q7> print_base:=8 Q8> 255 8r377
(Implementation note: Arbitrary-precision unsigned integers are implemented using the GNU Multi_Precision Package. Twos-complement integers are implemented using this package. FixInts are BigNums that fit in one word (usually, 32 bits). The most common FixInts (about -100 to +1000) are pre-allocated.)
Fractions are always normalized:
Q1> 6/4 3/2
Basic arithmetic on floating-point numbers is implemented, but many functions (such as the transcental ones) are missing.
Complex numbers may be constructed with the %+
operator:
Q1> 3+%4 * 0+%1 -4+%3
Basic arithmetic on complex numbers is implemented, but many functions (such as the transcendental ones) are missing.
There is preliminary support for "quantities" i.e. numbers involving "physical" units and dimensions:
Q1> 2ft+2yd 8ft
Standard arithmetic operators: a+b a-b a*b a/b
Q1> -3 gcd 6 3
These operations work on integers. Each operand is treated as an semi-infinite list of the bits in the integers twos-complement representation.
All 16 bit-wise operations of two integers are available. These can be specified either by the name of the operation, or by its "number":
x bit'opname y
x (bit opnumber) y
x bit'clr y
== x (bit 0) y
: always 0
x bit'and y
== x (bit 1) y
: "and" of x and y
etc
a shift amount
== floor (a * 2 ** amount)
[x0 ... xn] decode y
[x0 ... xn] decode [y0 ... yn]
[xn .. x0] base y
NOT IMPLEMENTED YET: X encode Y X antibase Y
A symbol is a atomic identifier.
It is denoted with a '
followed by a word.
For example, 'foo
is a symbol.
You can create symbol at runtime from a string.
If s
has the value "abc"
,
then '$s
evalues to 'abc
.
A package system as in Common Lisp has been largely implemented, but isn't really used yet.
A character object is actually a kind of symbol
that is interned in the alphabet
package. (Needs fixing!)
There are no character literals.
Use an expression such as ("A"0)
instead.
A mapping maps a key into a value.
A sequence is a special case of a mapping where
all the keys are consequtive integers starting at 0.
For example, the vector [5 4 3]
is a mapping
with 3 values, and whose keys are 0, 1, and 2.
A sequence is a finite or infinite list of objects. The objects are numbered starting with 0.
There are a number of different kinds of sequence, but these are considered to be different ways of implementing the same abstract sequence type.
Higher-order arrays are considered to be sequences of sequences. (But ...)
A vector can be constructed with the []
constructor:
Q2> [[2 3 4] [5 6 7 4+4]] 2 3 4 5 6 7 8
Indexing uses an implicit "adjacency" operator, just like function calls:
Q1> [3 4 5] 2 5
Returns the number of elements in seq: size [3 4 5]
is 3.
Returns the number of elements in seq: [3 4 5] length
is 3.
Allocate an updatable fixed-length vector of length N.
Initialize the elements using the sequence Q.
Default value for N is the Q length
.
To select the n'th element of a sequence s do: s n
.
Indexing starts at zero, thus the first element is s 0
.
Negative subscripts count from the back, where s -2
is the last element, s -3
is the penultimate element,
and so on. The index -1
represents the non-existent element
at the end of the sequence.
Sometimes it is more helpful to think of the indexes as marking positions between elements. Thus index 0 represents the beginning of the sequence, index 1 is the position just after the first element, and so on. Thus a non-negative index counts the elements from the beginning of the sequence. Then the index -1 represents the end of the sequence (the position just after the last element), and a negative index represents the number of elements from the end of the sequence, counting down from -1.
If an "index" is a sequence, it is used to select
a sequence of elements:
S [I1 I2 ... In]
evaluates to: [(S I1) (S I2) ... (S In)]
.
If extra arguments are given, they are distributed:
S [I1 I2 ... In] X ... Z
evaluates to the same value as:
[(S I1 X ... Z) (S I2 X ... Z) ... (S In X ... Z)]
.
(However, the expressions X ... Y are only evaluated once.
Also, the result applications are evaluated on demand.)
This feature is useful to get a "slice" of a multi-dimentional array:
A [3 4] [1 2]
evaluates to:
[[(A 3 1) (A 3 2)] [(A 4 1) (A 4 2)]]
.
The where
operator is used to select specified elements
of a sequence:
Function: Q where B
Here Q is a sequence of values, and B is a sequence of conditions. The result is those elements of Q whose corresponding element of B is true.
As a convenience, if an element of B is a function
it is first applied with the corresponding element of Q as
left argument. For example:
Q where {F x}
== Q where {Q? F x}
.
As usual, if either argument is not a sequence, it is extended to a sequence. Thus the following selects the positive element of Q:
Q where (> 0)since it is the same as
Q where {> 0}
, which is the same
as Q where {Q? > 0}
.
The result of a where
is a lazy sequence. (??).
The while
operation is used to select the initial elements
of a sequence, but leaving out the first element that fails to satisfy
a condition, as well as all subsequent elements.
Truncates Q from with the first element (inclusive)
that fails the condition B.
See the usage of where
for more details.
Similar as while
, but with the condition negated:
Truncate Q leaving out all elements after
the first element that does satisfy B.
Note that the first element (only) that satisfies B
is included in the output.
Note that while
can be used in place of the while
loops
in many programming languages:
{expr1} while {expr2} do
corresponds exactly to C's:
while (expr2) { expr1; }
Similarly, until
is like the repeat ... until
or
do .. until
of many programming languages:
{expr1} until {expr2} do
corresponds exactly to C's:
do { expr1; } until (expr2);
Q2> [3 4 5 -4 8 -1 10] while (> 0) 3 4 5 Q3> [3 4 5 -4 8 -1 10] until (> 0) 3 Q4> [3 4 5 -4 8 -1 10] until (<= 0) 3 4 5 -4
The for
operator returns a sequence containing
a sub-range of the integers:
Q1> 1 for 10 1 for 10 Q2> (1 for 10)+100 101 102 103 104 105 106 107 108 109 110
Returns an infinite sequence of integers starting with init, increasing by step.
Same as: Q while {<= Limit}
Example: 2 upto 5
== [2 3 4 5]
.
I upto Limit
== (I by 1) upto Limit
.
Example: 1 by 2 upto 10
== [1 3 5 7 9]
Same as: Q while {>= Limit}
I downto Limit
== (I by -1) downto Limit
.
Example: 5 downto 2
== [5 4 3 2]
.
I to J
is like I upto J-1
.
That is, the upper bound J
is not included.
Furthermore, the result is recognized specially by indexing.
Thus for a sequence X, the expression X (I to J)
select the sub-sequence of X between the indexes I and J.
One or both of I and J can be negative, meaning
indexing from the end of X.
For example, X (I to -1)
is the same as X drop I
, if I>=0.
A string is a sequence of characters.
String constants are written using double quotes:
"This is a string"
A dollar-sign ($
) in a string constant prefixes an
expression that should be evaluated. The result is
converted to a string, and inserted in place:
:a="abc" "XX$(a)YY" # Evaluates to "XXabcYY"
Special characters can be notated using \
, as in C.
There are a few useful extensions:
\^X
-- represents the control character Ctrl-X.
A newline following a \
means to skip the newline,
and any following spaces or tabs.
A space or tab following a \
means to skip all spaces and tabs,
then, if the next character is a newline, skip the newline,
and any following spaces and tabs.
Macro: string match pattern [replacement]
Match string against the regular expression pattern. If string matches the entire pattern, return replacement, except that substitution occurs in replacement, as discused <<below>>. If the match fails, raise a comparison failure.
The string is evaluated; pattern and replement are not.
Certain special characters in a word have special meaning.
?
means to match any single character,
except NewLine and NUL. Additionally, when matching against
files names (as in a command line), the ?
doesn't
match a /
or an initial .
.
*
means means to match any number of characters,
with the same exceptions as for ?
.
*(pattern)
means to match any number of matches
for pattern. Note that the left parenthesis must
explicitly follow the star.
This feature subsumes some of the use of the Unix find
program
when its used for matching files names:
Q1> ls -i ../dld/*(*/)Makefile 53639 ../dld/Makefile 51872 ../dld/test/overlay/Makefile 32863 ../dld/test/Makefile 36295 ../dld/test/reload/Makefile 64017 ../dld/test/add1/Makefile 53631 ../dld/test/simple/Makefile 41491 ../dld/test/general/Makefile
?(pattern)
means to optionally match the pattern
(i.e. zero or one times).
+(pattern)
means to match one or more matches of pattern.
[charset]
means to matches any single
character in the charset.
pattern1|pattern2
Quoting a character with \
turns off globbing for it:
Q1> echo Makefi\* Makefi*Alternatively, you can use string quotes:
Q1> echo "Makefi*" Makefi*
Unix and many other operating systems view files as just an unstructured sequence of bytes, Q views such a files as a string (which just happens to be stored in the file system, instead of in main memory).
A word that begins with /
or ./
or ../
is considered to be a filename. The word is not evaluated.
As a convenience, an identifier that is unbound, but which names an existing file, is also treated as a file name.
Note that filenames follow the emacs convention,
where two /
-es means start again from the
file system root, ignoring anything before.
Thus foo//bar/baz
means /bar/baz
.
Hence, if the variable fn is a string,
then ./$(fn)
is the file named by the string fn.
An example of the use of file objects:
./a := ./bwill copy file
b
to a
, just like
cp b aActually, since no flags are preserved, the semantics are more like the shell command:
cat <a >b # /bin/sh syntax, not Q syntax!
One way to byte-reverse a file:
Q1> size ./Makefile 110 Q2> ./Makefile (109 downto 0) Z.rat.Q>c- sserpmoc|derat-non elifekaM \ tset knilQ elifekaM/xinu cod im derat-non X- - fc- v- rat :Z.rat.Q :Z.rat.Q
A cyclic sequence is an infinite sequences whose values repeat in a cyclic manner.
Both F and C are finite sequences. The result is an infinite sequence starting with the elements of F, followed by an infinite number of copies of C.
Return the length of the finite prefix of Q.
Return the length of the repeating prefix of Q. If Q is finite, return 0.
A lazy vector uses the {}
construct.
This takes an expression that gives the vector element
corresponding for each index (given by the ?
construct).
Each element is only evaluated when it is needed, but once
it has been evaluated, its value is remembered for the future:
Q2> :a = [3 4 5] Q3> :b = {a? + 100} Q4> b # No elements are known yet. ... Q5> b 2 105 Q6> b _ _ 105 ... Q7> b 1 104 Q8> b _ 104 105 ...
Note that the operation ?
form is the "inverse" of the operation {...}
.
That is: {exp?}
computes the same value as: exp
,
if exp
evaluates without side-effects to a sequence.
In fact, if the form ?
follows an expression,
that expression is evaluated once only (eagerly),
not once per item (lazily). The compiler is essence treats
{... Exp? ...}
as if it were:
:Temp=CoerceToSequence(Exp); {... Temp? ...}
.
Identifiers declared directly (i.e. not further nested within
parentheses or function definitions) in a {...}
form
are visible to all {...}
in the same block.
k is a vector of integers, which define the bounds of the resulting array. The elements of the aray are taken from the sequence q. (Similar to APL's rho operator.)
(This feature may be removed in favor of using non-intered symbols.)
The expression :%name
evaluates to a new atom,
having the name given by the identifier name.
The atom is different from all other atoms
even those having the same name.
The name is bound to the new atom globally
within the module containing :%name
.
Therefore, a module cannot contain duplicate
instances of :%name
for any name.
(Also note that the new atom is allocated at compile time.
There should also a run-time genatom string
.)
The expression s || t
evaluates to a union type.
y POSTFIX (a || b || ... || z)
means
(U y a) | (U y b) | ... | (U yz)
where (U y a)
is y
if a
is scalar
and y POSTFIX a
if a
is non-scalar.
The form :var
is used to declare a new variable var.
It is an expression which evaluates to the "value" of var.
That value may be unknown when we evaluate :var
,
but at some point in the future it may get a value,
so what we get is a "potential" value.
Operationally, :var
creates a new memory location,
binds name name var to that location, and returns that location.
Usually, the first thing we may want to do with a new variable is to
give it a value. This is easy using the =
operator:
:x = 10
declares that the variable x
has the
same value as 10
. Operationally, the =
operator
is implemented as unification, which causes the value 10
to be placed into the memory location for x
.
(You can think of the expression x=y
as a specification
that x
and y
are equal.)
Through the power of unification, we can do complex pattern-matching:
[:x :y]=[4 5]
has the effect of setting x
to 4
,
and y
is set to 5
. Here, [4 5]
is a list
containing the two elements (4
and 5
), and [:x :y]
yields a list whose two elements are the locations for x
and y
.
Then the unification specifies that the two lists are equal,
which means that x
must equal 4
and y
must equal 5
.
Q is statically scoped.
A variable x
is visible in the entire scope containing
its declaration :x
, not just after the :x
is evaluated.
So the "memory location" for all variables declared in a scope
are actully allocated when the scope is entered.
There can be only one defining instance of a variable in a scope
(i.e. the name of variable must be prefixed by :
once and once only).
It is conventional that the defining instance be the first one,
but that is not required. (It may not be possible in the case
of mutually recursive function.) An example:
(:x = y; :y = 10; x)
evaluates to 10.
(We can say that all declarations are mutually recursive. The one exception is for Q commands typed in interactively. In that case, the declaration is not visible until the command contianing it has been read in, of course!)
A declaration (i.e. a form like :identifier
) is owned
with smallest block containing it. The declaration is
visible ("in scope") inside the entire block (unless it
is overridden by another declararation inside a smaller block).
Blocks are: (incomplete list - may change slightly):
;expr
.)
if
statement).
(The details have not been formalized yet.)
Example: The expression [:x x]
will match a
two-element sequence if and only if the two elements match.
Thus [:x x]=[3 3]
succeeds, but [:x x]=[3 4]
fails.
These expressions also declare x
in the current (surrounding)
scope, because [:x x]
is not a block.
This may be undesirable; we would like to limit the scope
of the x
. Just ([:x x])
will not prevent
the unwanted declaration, but (;[:x x])
will do it.
(NOTE: WORK IN PROGRESS)
A declaration may have the form :|identifier
instead of the usual :identifier
.
The presence of a |
declares the identifier
to be exported.
If a block owns (contains) a declaration that exports a name,
the block is said to export that name.
The result of a block that exports one or more names
is a structure that binds each name to the whatever
value it gets in the block. One can later use the
notation structure'name
to extract
the bound value.
For example, the block in the parentheses here:
:B=([:|a :|b]=[100 200]; :x=10; :|c=(x+a))exports the names
a
, b
, and c
.
The block evaluates to a record with those three names.
Hence, we get that B'a=100
, B'b=200
,
and B'c=110
.
Note that the presence or absence of an exported declaration makes all the difference:
Q1> (:a=3; :b=a+1) Q2> (:a=3; :|b=a+1) [b: 4] Q3> (:|a=3; :|b=a+1) [a: 3 b: 4] Q4> (:a=3; a+1) 4 Q5> (:|a=3; a+1) # Bad! [a: 3]
Note that line 5 is poor style: None of the statements in an exporting block should evaluate to other than the null sequence. (This will probably be prohibited be some point.)
If there is an export declaration in a parameter list, the name is added to the export list of the body. (More ...)
(NOTE: WORK IN PROGRESS)
A Q program can either be typed in as interactive commands, or can be read from a file. A Q program in a file should have the form of an exporting block. Such a program is called a module. (By the way, a session of interactive commands is also a kind of module.)
You can "load" in a module either by naming it on
the command line, or by using a load
command.
Note: the module_name is quoted.
The named module is searched for (using search rules
not yet specified). The searching can resolve to
a source file or a binary (previously compiled) file.
Either way, the value resulting from the load
is (normally?) a record.
Loading a module involved a number of steps. If the module_name is a plain identifier or simple quoted string literal, then the initial parts of loading is done at parse time. This is normally the case. It is also required when there a several modules that mutually reference each other.
Loading a source files includes parsing,
tree-traversal, and evaluation.
If the module_name is known, then
the module is parsed just after the load
is parsed, and tree-traversal is done at the
same time as for the module containing the load
.
In that case, the names exported by the module
are known early, and the names can be inherited
int some other block. Otherwise, all of the actions
have to be postposed until evaluation time,
and the type of the module is unknown until then.
Loading a binary file includes reading in the text and data segments, remembering the labels defined and needed, performing relocation, and running initialization code. If the module_name is known, most of these steps can be done at parse time.
Here is a simple example of how to define a new function:
Q1> :(Factorial :x) = if x=0 => 1 || x * (Factorial x-1)This is the classical factorial function. Here is how to call it:
Q2> Factorial 30 265252859812191058636308480000000
:(F :x1 :x2)=x1-x2 # Definition of function F. F 10 3 # Call of function F; yields 7.
A postfix function takes a single parameter specified before the function name:
:(:arg G)=arg*arg # Definition of function G. 5 G # Call of function G; yields 25.
It is also possible for a function to take parameters specified both before and after the function name. This allows infix functions:
:(:x mod :y) = x - y*floor(x/y) -8 mod 3 # Yields 1.The arguments before the function name are called the left parameters, while those after are called the right parameters. In the
mod
example, x is a left parameter, and
y is a right parameter.
Having both prefix and postfix functions could lead to ambiguity. See <not written> on the details.
:(positional_params name positional_params keyword_params)=body
Here name is the name of the function being defined; the first positional_params is the list of parameter specifiers for the left parameters; while the second positional_params together with the keyword_params is the list of parameter specifiers for the right parameters.
A positional_params is a list of positional_param, with one optional tuple_param.
Parameter lists appear in a number of places:
In general. each parameter is an expression which matches the value supplied when a function is called or an object is created. However, these expression may be interpreted in non-standard ways.
The most common parameter has the form:
:ident~typeHere, ident is an identifier, and type is some type-expression. This means that the actual parameter is coerced to the type type, and the result becomes the value of ident.
Given a sequence X, the form X@
yields
a tuple of the elements of X.
A tuple is a list of elements, just like a sequence.
The main difference is that a tuple of one elements
is equal to that element: [X]@ = X
.
Tuples are not sequences, though you can convert
between them with loss of information.
Tuples are used for the parameter lists of functions, as well the results of multi-valued functions. They are volatile, not real objects, and cannot be assigned to variables or be parts of data structures. (1)
If a parameter is a tuple, it is treated as if there were many parameters, one for each element of the tuple. Thus the following are equivalent:
F 10 [1 2 3]@ 20 F 10 1 2 3 20In both cases,
F
is applied with 5 arguments.
In this respect, the list constructor form [...]
is considered to be a function. Thus:
[10 [1 2 3]@ 20]
evaluates to [10 1 2 3 20]
, a sequence of 5 elements.
Note that [...]
@
cancel each other:
If X is a sequence, then: [X@] == X
.
If T is a tuple expression (there are no tuple objects,
strictly speaking), then [T]@ == T
.
A formal parameter can also be a tuple. This is how you define functions that take variable number of arguments. The tuple formal is bound to the arguments that are "left over" after the other arguments have been bound.
Q1> :(f :x :T :y)= printf "x=%S; T=%S; y=%S" x T y Q2> f 2 3 4 5 6 7 x=2; T=[3 4 5 6]; y=7Only one formal can be a tuple. Note that a tuple formal gives sequence value to the parameter name (T, in the example). This is how it works: The formal
:T@
is unified with the sub-tuple
[3 4 5 6]@
from the actual argument list.
After cancelling the @
, we get the unification:
:T = [2 3 4 5]
.
Note that @
is also use for reduction: See section Reduce.
This usage is related: You can think of the reduction
converting the argument sequence to a tuple.
(NOT IMPLEMENTED!)
Positional parameters correspons to normal tuples. But what about keyword parameters? One possibility is to defined keyword tuples.
An environment is a mapping from a set of names
to values, just like a sequence is a mapping from
(part of) the integers to values.
The tupling operator @
yields a normal tuple
from a sequence; we can also define it to
yield a keyword tuple from an environment.
A keyword parameter name:value
is a keyword tuple (with one element/binding).
There can now be two tuple formals.
The first bundles up the unused positional parameters,
and defines a sequence.
The second (if any) bundles up the keyword parameters,
and defines an environment.
(In addition, there can be a tuple formal for left-hand arguments.)
(If you want no positional parameters, just make the first
tuple formal be []@
.)
[x:3 y:4]
== (:|x=3; :|y=4)
.
A macro is a function that is called at compile time. It is given some input expressions, and uses them to create a new expression. That expression is then compiled as if it were part of the original input.
A macro definition has the form:
:(macro left_arg name right_args)=bodyThis defines name to be a macro function bound to name. A macro function is just like a regular function, except that it is executed by the compiler at bind-time, not at run-time. (Bind-time is the stage after parsing. and before execution or dumping.) When the compiler sees name, it calls the function, passing the other expressions in the same application as parameters. The result should be an expression, which is then traversed further.
An example should make the idea clear, but first we mention
the parse
function which is very useful when writing macros.
Each argi must be either a string or an expression. Informally, all the arguments are concatenated together. The result is parsed, yielding an expression. When, during the parse, an argi that is an expression is seen, that expression is inserted into the resulting expression.
Here are the actual definitions from the Q runtime system,
defining the cd
macro, which is use to change working directories.
:|(macro cd :x@)= parse "__cd (quote " x@ ")" :|(__cd :args)=external CdActionThe function
__cd
does the actual "work:" Its argument
is a string vector, usually of length one, which names
the desired new directory.
An instance of cd
:
cd ~/binThe binder invokes the
cd
macro with an array
whose single element is the expression parsed for ~/bin
.
The parse
function is invoked, and the result
is as if the user had written:
__cd (quote ~bin)which evaluates to the same as:
__cd ["~bin"]
When a Q function is compiled, the compiler emits various
data structures the describe the function object (such as
the kind of parameters it expects). It also emits a C++ function
for the actual body of the Q function. Normally, the compiler
generates a name for the emitted C++ function, but if you
write external name
after the parameter list,
the emitted C++ function will be called name.
Example:
:(foo :x) external Foo = 3 + x
This emits C++ code something like:
Root * Foo(Root *x) { return /* Code to calculate 3 + x */; } // Data structures that define foo function object // These include the symbol 'foo, and a pointer to Foo.
The compiler also emits a .h
file that includes a
declaration for Foo
. You can include this .h
is some other C++ source program file, and in that program
you can call Foo
as if it were written in C++.
The converse problem is when you have a function written
in C++ that you want to use in Q. You can do this by writing
a dummy function definition whose body is just extern name
:
:(bar :x)=external Bar
You can then call bar
is your Q code, just as if it were
a Q function, assuming you provide it a C++ function something like:
Root *Bar(Root *x) { // Code you write to implement bar. }
In both forms of external
the name can be a quoted
string instead of an identifier. In that case the string is
the assembler label for the generated/needed C++ function, as
opposed to the C++ name of the function.
(NOTE: WORK IN PROGRESS!)
To reduce a sequence using a binary infix function
is to combine the elements of the sequence by place the
operator "between" each element. For example,
the reduction of [2 3 4]
using +
is 2+3+4
.
The symbol @
is used for reduction. This use of @
is related to its use for making tuples (See section Tuples).
A recursive definition:
[] @OP R == R (x,L) @OP R == x OP (L @OP R) [] @OP == left_identity OP # If defined [x] @OP == x (x,L) @OP == x OP (L @OP)
L OP@ [] == L L OP@ (x,R) == (L OP x) OP@ RNote that right(?)-reduction
OP@
uses
the same syntax as tupling: X@
.
However, the former requires OP
to be
a binary function, while the latter requires
X
to be a sequence.
Note that @,
is append
,
while @@,
appends all the elements.
The operator |
is syntax, not a function.
However, it is convenient to define @|
and |@
using the same definitions as before (NOT IMPLEMENTED).
The left_identity
and right_identity
of |
are assumed to be (;)
(i.e. the empty sequence).
It is also convenient to define
~|@ EXPR
to collect each result from
evaluating EXPR, and make a sequence of them all.
The inverse of a function f is written ~f
.
The inverse depends on what kind of a function f is.
Generally, for a postfix or infix function f:
if x ~f a ...
is y, then y f a ...
is x.
If f is a prefix function (or is applied as one),
then if ~f x a ...
is y, then f y a ...
is x.
If the operator ~
is applied infix, as in x~f
,
the result is the same as x(~f)
, except that
any application rule for x is ignored.
A stream is a sequence of values combined with a current position. You can read (or write) only the sequence element at teh current position; however, the position is changed by various stream options described below.
Reading characters in from files or writing characters out to files uses character streams. These are often called files.
Get the next value from the stream, moving the current position forward. At end of file, return ???.
[NOT IMPLEMENTED?]
Same as get
, not does not change the current position.
Get the next n values from the stream, returning a list of the n values read. If the there are less than n characters after the current position, just return as many as are available. (That is, if the stream was at end of file, return an empty array.) Move the current position forward as many characters as were read.
Write value to the stream; that is, set the current element to value and move the current position forward. Returns the null sequence.
Function: stream put value1 ... valuen
Same as:
(stream put value1; ...; stream put valuen)
Move the current position of the stream to the position pos, an integer offset relative to the beginning of the stream.
Move the current position of the stream to the integer offset pos relative to the current position of the stream.
Move the current position of the stream to the integer offset pos relative to the end of the stream.
To format data as a string, do:
sprintf format_string arg_1 ... arg_nThe args are formatted according to formatting directives in the string format_string.
To format data as a string, and send the result to the output text stream out_stream do:
out_stream printf format_string arg_1 ... arg_nIf out_stream is omitted, it defaults to the standard output file. Except for some pathological cases (and efficiency), this is the same as:
out_stream put (sprintf format_string arg_1 ... arg_n)@
An example:
:x=12 sprintf "[x=%05d]" 12 # Yields the string "[x=00012]"The text
"%05d"
is a format directive which indicates
that the argument value (in this case 12) is to be written
using 5 decimal digits, filled with zeros on the left.
The rest of the format string ("[x="
and "]"
)
is emitted verbatim into the output.
A format control string is written in a form (syntax) similar to
the printf
control strings in the C language.
However, the semantics is closer to the format
control
strings in Common Lisp: formatting is is quite powerful,
and can take arguments that are typed.
Each character in a format string is sent verbatim to
the output, until a percent-sign (%
) is seen.
The %
indicates the start of a format directive.
The format directive is read and interpreted according to the
rules given later. This usually "uses up" one or more
of the given format arguments. Then the remaining characters
of the format string are sent to the destination, with
remaining format directives interpreted as before. This
continues until there are no more characters in the format string.
A format directive begins with a %
.
Then there may be an optional a #
, which usually
means that the argument should be printed in a "readable"
format (that is: if the output is read back, the result
should be equal to the orginal argument).
Finally, the format directive is terminated by a single format specification characer, which species what kind of formatting to do.
The format directive %%
uses no arguments, and
outputs a single %
.
These all use a single argument, which should be a number.
%pd
("decimal")
Print the argument as a decimal integer using at least p degits.
The default for p is 0.
%pf
("fixed-point float")
%pe
("exponential float")
%pg
("general float")
%a
("anything")
Print the argument (any type) in a nice format.
Do not print quotation marks or other escape characters.
(Similar to Common Lisp's ~A
.)
%s
("string")
Same as %a
.
%-width.precisions
Print the first precision characters of the printed representation
of the argument, followed by enough spaces to bring the
total length to width,
The default for width is 0, and for precision it is -1.
If the -
is missing, remove excess width at the left,
and add needed padding at the left.
%#s
Print the argument in a format that can be read back in.
(Similar to Common Lisp's ~S
.)
%c
("character")
Same as %c
, except that if the argument is an integer,
its character value is printed instead.
%#c
Same as %c
, except that if the argument is an integer,
its character value is printed instead.
%?
Indirection
The ?
format directive uses up two arguments.
It calls format
recursively, using the first argument
as a format control string, and the second argument
(which must be a sequence) as the list of arguments to process.
Q1> printf "%s%?%s" "Before<" "(%d;%#x)" [10 20] ">After." Before<(10;0x14)>After.
%#?
Indirection.
Same as %?
, except that one one argument is consumed, which
is used as a format control string to a recursive call to (s)printf
.
The recursive call may use arguments from the original call:
Q2> printf "%s%#?%s" "Before<" "[%d;%#x]" 10 20 ">After." Before<[10;0x14]>After.
%#count{actions%}
Iteration
Keep repeating the actions, until there are no more arguments.
Repeat at most count times (default is infinity).
%count{actions%}
Iteration
Same as %#count{actions%}
, but uses one
argument, which must be a sequence. That sequence used whenever
arguments are needed by the actions.
Q1> printf "%{(%s)%}" [3 4 5] (3)(4)(5) Q2> printf "%#{(%s)%}" 3 4 5 (3)(4)(5)
%^
Break from iteration
A structure expression is a statement list, where at least of one the identifiers declared is exported.
An assignable variable is an object that has changable state.
NEW
creates a new assignable object.
Its value can be changed with the assignment operator:
:v = NEW
v := 10
:u = NEW
u := v + 5
V value
extracts the current value of V.
The :=>
operator is as follows:
V :=> a ... z
assigns V to a,
the old value of a to b, and so on, finishing by
assigning the old value of y to z.
The returned value is the old value of z.
All of the assignments are done in parallell.
To increment an assignable variable, use the +:=
contructor,
as in x +:= 2
or x(+:=)1
.
This is a special of :=
used as a postfix operator,
where the operand is a function. In that case,
v F:= x1 .. xn
is the same as:
v := ((v value) F x1 .. xn)
except that v is only evaluated once.
A design goal Q has to make it convenient to use as a command language. It has many of the features found in typical shells (sh, csh, bash, as well as languages like perl and rexx), and it can interact with a Unix-like system in many of the same ways.
To run a program, do: run program args
.
The program (which is quoted) must name an executable file.
The args (which are also quoted) are passed to the program.
Q1> run wc Q.C 308 950 7986 Q.CIf the program is a simple identifier that is not in the scope of a Q variable, and the identifier names an executable file in the current search path, then the
run
may be omitted.
Q2> wc Q.C 308 950 7986 Q.C(Note that determination that program is an executable program is done at compile time, so you cannot omit
run
if
the program hasn't been created yet.)
The standard output from run
is coerced to a string,
which you can use as you use any other string:
Q3> :X = wc Q.C Q4> printf "{%#a}" X {" 308 950 7986 Q.C\n"}
The left-hand argument to a run
command is coerced to a string,
and the result is used as the standard input of the program:
Q5> "Hello World!\n" run /usr/bin/tr a-zA-Z A-Za-z hELLO wORLD!
If an argument contains the "globbing" characters *?[]()|
these are use to match against file names, as in most shells:
Q1> echo Makefi* Makefile Makefile~(
echo
is a builtin command that just prints into arguments.)
The globbing syntax is an extension of that used for most shells;
see See section Regular Expression Patterns for details.
Tilde expansion is also done.
Q1> "ab cde fgh" run wc 0 3 10
To replace Q by program, do: exec program args
.
By the way, before execution Q translates
run program args
into something very much like:
primitive_run "program" (globlist (quote args))
.
(However, you can't call invoke primitive_run
directly.)
If you run Q interactively, you can run multiple jobs at once.
Move the specified job into the foreground.
Continue executing the specified job in the background.
List current jobs
You can interrupt the current forground job with ^Z or kill it with ^C.
Execute the current command in the background.
(Most shells use command&
syntax for this.)
Returns the current working directory (as a string).
Charges the current working directory to dir (which is quoted).
If dir is relative (does not begin with a /
),
and the CDPATH
environment variable is defined,
then cd
will search each of the directories in CDPATH
for a sub-directory dir. Here, CDPATH
is a string
of directory names, separted by colons; it should begin with .
.
cd
will try to hide symbolic links: If you cd
down a symbolic link, and later cd ..
, you will get back
where you started. In contrast, the chdir
routine will
follow the physical ..
link in the file system.
If dir is -
, it is replaced by the value of $OLDPWD
.
Thus cd -
is the same as cd $OLDPWD
.
Change the current working directory to the user's home directory.
Charges the current working directory to dir (which is evaluated).
If dir is ""
, same as chdir env'HOME
.
The chdir
routine is a low-level routine that ignores CDPATH
,
and does not so anything special for symbolic links.
Current working directory as set by cd
or chdir
.
Previous working directory as set by cd
or chdir
.
The arguments in the command line used to invoke Q are in argv
,
which is an array of strings. The first element (i.e. argv 0
)
is the name of the program running, usually just Q
.
If Q is invoked on as file (as in: Q -flags file arg1 ... argn
),
the value of argv
is the file name followed by the non-flags
arguments: ["file" "arg1" .. "argn"]
.
This is compatible with most shells.
The "environment" passed to Q (in the Unix sense; see "man 7 environ")
is available in the variable env
. This is an updatable
table, containing strings that are indexed by symbols. When a
program is executed, the environment passed to it is taken
from env
. For example:
Q1> env'TERM dumb Q2> env'TERM:="xterm" Q3> run /bin/sh $ echo TERM= $TERM TERM= xtermTo remove an element from the environment, assign
NONE
to it:
env'TERM:=NONE
Terminates the Q program. (Same as the Unix call exit(code)
.)
The default for code is 0.
Same as exit 0
.
The echo
builtin prints its argument to the output.
The -n
flag suppresses a final new-line.
The -q
flag formats each argument as a quoted string
literal that can be read in again:
Q1> echo prefix(XX YY)"\012"postfix prefixXX postfix prefixYY postfix Q2> echo -q prefix(XX YY)"\012"postfix "prefixXX\npostfix" "prefixYY\npostfix"
Informally, the quote
macro
puts string quotes areound each word in words...;
return resulting code evaluates to a vector of strings.
For example quote 4+9 is $(4+9).
becomes
["4+9" "is" "$(4+9)."]
, which evaluates to the vector:
["4+9" "is" "13."]
.
These routines are used internally (by run
and other commands):
Function: glob pattern
Return the list of file names that match the pattern
.
The input should be a string, and the result is a vector of strings.
If there is no match, the result is []
.
Q1> printf "<%S>" (glob "parse*") <["parserule.o" "parsemacros.o" "parse.o"]>
Call glob
on each element of names,
which should be a sequence of strings.
Append all the results into one long vector of strings.
However, any elements of names that have no matches,
are inserted directly into the output list (after parentheses
and quoting characters have been removed).
Q2> globlist ["*.b" "parse*" "foo\"+\"(abc)"] *.b parserule.o parsemacros.o parse.o foo+abcThis behavior mimics that used by
/bin/sh
.
An expression in Lisp/Scheme syntax may entered by
prefixing it with the Scheme
keyword.
The expression is read using a programmable Lisp reader.
Then the resulting Lisp object is converted to a Q expression
for evaluation.
Q1> print_readable:=1 Q2> Scheme (quote (:foo #(3 4 5))) ['foo [3 4 5]]
To print an object in Lisp/Scheme syntax, set the print_lisp
variable:
Q3> print_lisp:=1 Q4> Scheme '(:foo #(3 4 5))) (:foo #(3 4 5))
The only special form that has been implemented is quote
,
so the Scheme
keyword has limited usefulness.
A number of issues also need to be resolved, such as
the representation of the empty list/sequence
(Scheme 'nil
vs. Scheme '()
vs. []
).
A number of functions from Common Lisp and the Scheme R3 report have been implemented. You can use them in Q code:
Q5> Scheme (symbol->string 'abc) "abc" Q6> (Lisp #'find-package) "common-lisp" package "common-lisp"
Right now, you can also just name Lisp function directly in Q code, since Scheme, Lisp, and Q use the same global name space. This will probably change:
Q7> symbol\-\>string 'abc "abc"Note that
->
had to be "quoted" with \
s.
The following Common Lisp functions and values have been implemented: car, cdr, cddr, cadr, cons lispread (calls the Lisp reader) nil, t vector integer-length getprop, putprop, remprop, symbol-plist, symbol-package make-symbol find-package symbol-function, symbol-value, set, boundp, fboundp, makunbound, fmakunbound
The following Scheme functions have been implemented: car, cdr, cddr, cadr, cons vector, make-vector, vector-fill! symbol->string, string->symbol
The only recognized declare
form is a non-standard one:
(declare (external Foo))
in the body of a function declares that
when the function is compiled, the name of the emitted C++ function should
be Foo
.
(declare (external "Foo"))
is the same, except that it declares
that the assembler label of the function is Foo
.
If the there are no forms in the function following a
(declare (external ...))
form then the function is assumed
to be defined in some external C++ file. See section C interface
(Section numbers refer to: Guy Steel jr. "Common Lisp: The Language", second edition, 1990.)
The "Predicates on Numbers" (from Section 12.2) are all implemented.
True if number is zero.
True if number is positive
True of number is negative
True if integer is odd.
True if integer is even.
The "Logical Operations on Numbers" (from Section 12.7) are all implemented. These all consider an integer as a semi-infinite bit string.
Function: logior &rest integers
Take the "inclusive or" of the integers.
Function: logxor &rest integers
Take the "exclusive or" of the integers.
Function: logand &rest integers
Take the logical "and" of the integers.
Function: logeqv &rest integers
Take the "exclusive nor" (bit-wise logival equivalence) of the integers.
Function: lognand integer1 integer2
Same as (lognot (logand integer1 integer2))
.
Function: lognor integer1 integer2
Same as (lognot (logor integer1 integer2))
.
Function: logandc1 integer1 integer2
Same as (logand (lognot integer1) integer2)
.
Function: logandc1 integer1 integer2
Same as (logand (lognot integer1) integer2)
.
Function: logandc2 integer1 integer2
Same as (logand integer1 (lognot integer2))
.
Function: logorc1 integer1 integer2
Same as (logor (lognot integer1) integer2)
.
Function: logorc2 integer1 integer2
Same as (logor integer1 (lognot integer2))
.
Function: boole operation integer1 integer2
Does the bit-wise boolean operation indicated by operation, on the two operands integer1 and integer2. The operation is one of the 16 constants below.
Returns the bit-wise logical "not" of integer.
Function: logtest integer1 integer2
Same as (not (zerop (logand integer1 integer2)))
.
Function: logbitp index integer
True iff bit number index in integer is one.
Does an arithmetic shift by count on integer
.
Shift left is count is positive; otherwise, shift right.
Count the number of 1-bits in integer, if it is non-negative. If integer is negative, count number of 0-bits.
Function: integer-count integer
Return number of bits needed to represent integer.