Is it a binary file? Is it a text file? It's a Blob

Let's say we want to get the contents of a file as a value, using a simple function, without using a port or looping. Kawa has a function to do that:
(path-data path)
The path can be a Path object, or anything that can be converted to a Path, including a filename string or a URL.

You can also use the following syntactic sugar, which is an example of SRFI-108 named quasi-literals:

&<{pname}
This syntax is meant to suggest the shell input redirection operator <pname. The meaning of &<{pname} is the same as (path-data &{pname}), where &{pname} is a SRFI-109 string quasi-literal. (This is almost the same as (path-data "pname") using a traditional string literal, except for the rules for quoting and escaping.)

What kind of object is returned by &<{pname}? And what is printed when you type that at the REPL? Fundamentally, in modern computers the contents of a file is a sequence of uninterpreted bytes. Most commonly, these bytes represent text in a locale-dependent encoding, but we don't always know this. Sometimes they're images, or videos, or word-processor documents. It's like writing assembly code: you have to know the types of your values. At best we can guess at the type of a file based on its name or extension or looking for magic numbers. So unless we have more information, we'll say that path-data returns a blob, and we'll implementing it using the gnu.lists.Blob type.

$ cat README
Check doc directory.
$ kawa
#|kawa:1|# (define readme &<{README})
#|kawa:2|# readme:class
class gnu.lists.Blob
You can explicitly coerce a Blob to a string or to a bytevector:
#|kawa:3|# (write (->string readme))
"Check doc directory.\n"
#|kawa:4|# (write (->bytevector readme))
#u8(67 104 101 99 107 32 100 111 99 32 100 105 114 101 99 116 111 114 121 46 10)
#|kawa:5|# (->bytevector readme):class
class gnu.lists.U8Vector

The output of a command (which we'll discuss in the next article): is also a blob. For almost all programs, standard output is printable text, because if you try to run a program without re-direction, and it spews out binary data, it may mess up your terminal, which is annoying. Which suggests an answer to what happens when you get a blob result in the REPL: The REPL should try to print out the contents as text, converting the bytes of the blob to a string using a default encoding:

#|kawa:6|# &<{README}
Check doc directory.
It makes sense look at the bytes to see if we can infer an encoding, especially on Windows which doesn't use a default encoding. Currently Kawa checks for a byte-order mark; more sniffing is likely to be added later.

What if the file is not a text file? It might be reasonable to be able to configure a handler for binary files. For example for a .jpg image file, if the the console can display images, it makes sense to display the image inline. It helps if the blob has a known MIME type. (I believe a rich text console should be built using web browser technologies, but that's a different topic.)

Writing to a file

The &<{..} operation can be used with set! to replace the contents of a file:

(set! &<{README} "Check example.com\n")

If you dislike using < as an output operator, you can instead using the &>{..} operation, which evaluates to function whose single argument is the new value:

(&>{README} "Check example.com\n")

You can use &>> to append more data to a file:

(&>>{README} "or check example2.com\n")

The current directory

Functions like path-data or open-input-file or the sugar we seen above all use the current directory as a base directory for relative pathname. You get the current value of the current directory with the expression (current-path). This returns a path object, which prints as the string value of the path.

The initial value of (current-path) is the value of the "user.dir" property, but you change it using a setter:

(set! (current-path) "/opt/myApp/")
A string value is automatically converted to a path, normally a filepath.

The procedure current-path is a parameter, so you can alternatively call it with the new value:

(current-path "/opt/myApp/")
You can also use the parameterize form:
(parameterize ((current-path "/opt/myApp/"))
  (list &<{data1.txt} &<{data2.txt}))
Tags: