To start, here’s sample Clojure code rewritten in Animus:
; Clojure
;A test program exploring how to structure GUI code in Clojure
;The GUI draws whatever you type in the text field nicely in the panel below.
;license: Public domain

(import '(javax.swing JFrame JLabel JTextField JButton JPanel)
        '(java.awt.event ActionListener)
        '(java.awt GridBagLayout GridBagConstraints Color Font RenderingHints))

(defn make-model []
  (ref "Hello MVC!"))

(defn make-graphics-panel [model]
  (let [panel (proxy [JPanel] []
                (JPanel [] (println "in constructor"))
                (paint [g]
                  (doto g
                    ;clear the background
                    (.setColor (. Color black))
                    (.fillRect 0 0 (.getWidth this) (.getHeight this))
                    ;draw the text
                    (.setRenderingHint (. RenderingHints KEY_ANTIALIASING)
                                       (. RenderingHints VALUE_ANTIALIAS_ON))
                    (.setFont (Font. "Serif" (. Font PLAIN) 40))
                    (.setColor (. Color white))
                    (.drawString @model 20 40))))]
    ;repaint when the model changes
    (add-watch model "repaint" (fn [k r o n] (.repaint panel)))
    panel))

(defn make-text-field [model]
  (doto (JTextField.)
    (.setText @model)
    (.addActionListener
      (proxy [ActionListener] []
        (actionPerformed [e]
          (let [new-text (.getActionCommand e)]
            (dosync
              (ref-set model new-text))))))))

(defn make-text-field-constraints []
  (let [c (GridBagConstraints.)]
    (set! (.fill c) (. GridBagConstraints HORIZONTAL))
    (set! (.weightx c) 1)
    c))

(defn make-panel-constraints []
  (let [c (GridBagConstraints.)]
    (set! (.gridy c) 1)
    (set! (.weighty c) 1)
    (set! (.fill c) (. GridBagConstraints BOTH))
    c))

(defn make-gui-panel [model]
  (let [gridbag (GridBagLayout.)
        text-field (make-text-field model)
        panel (make-graphics-panel model)]
    ;set up the gridbag constraints
    (doto gridbag
      (.setConstraints text-field (make-text-field-constraints))
      (.setConstraints panel (make-panel-constraints)))
    ;add the components to the panel and return it
    (doto (JPanel.)
      (.setLayout gridbag)
      (.add text-field)
      (.add panel))))

(defn show-in-frame [panel width height frame-title]
  (doto (JFrame. frame-title)
    (.add panel)
    (.setSize width height)
    (.setVisible true)))

(show-in-frame (make-gui-panel (make-model)) 300 110 "GUI Test")

; Animus
; A test program exploring how to structure GUI code in Clojure
; The GUI draws whatever you type in the text field nicely in
; the panel below.
; license: Public domain

import
    .p javax.swing
        JFrame JLabel JTextField JButton JPanel
    .p java.awt.event
        ActionListener
    .p java.awt
        GridBagLayout GridBagConstraints Color Font RenderingHints
    .sub add addChild

defunc make-model
    ref "Hello MVC!"

defunc make-graphics-panel model
    let panel
        proxy JPanel
            .m JPanel
                println "in constructor"
            .m paint g
                doto g
                    ; clear the background
                    setColor Color.BLACK ; an imported class’s statics get
                                         ; mapped to Class.name
                    fillRect 0 0 (getWidth) (getHeight)
                    ; draw the text
                    ; you can optionally specify to import statics by own name
                    ; rather than qualified by their class
                    setRenderingHint KEY_ANTIALIASING VALUE_ANTIALIAS_ON
                    setFont (Font "Serif" Font.PLAIN 40)
                    setColor Color.WHITE
                    drawString d'model 20 40
        ; repaint when the model changes
        add-watch model "repaint" (func a b c d (repaint panel))
        ,panel

defunc make-text-field model
    doto (JTextField)
        setText d'model
        addActionListener
            proxy ActionListener
                .m actionPerformed e
                    let new-text (getActionCommand e)
                        dosync (ref-set model new-text)

defunc make-text-field-constraints
    let c (GridBagConstraints)
        set! (fill c) GridBagConstraints.HORIZONTAL
        set! (weightx c) 1
        ,c

defunc make-panel-constraints
    let c (GridBagConstraints)
        set! (gridy c) 1
        set! (weighty c) 1
        set! (fill c) GridBagConstraints.BOTH
        ,c

defunc make-gui-panel model
    let gridbag (GridBagLayout)
        ,text-field (make-text-field model)
        ,panel (make-graphics-panel model)
        ;set up the gridbag constraints
        doto gridbag
            setConstraints text-field (make-text-field-constraints)
            setConstraints panel (make-panel-constraints)
        ;add the components to the panel and return it
        doto (JPanel)
            setLayout gridbag
            addChild text-field
            addChild panel

defunc show-in-frame panel width height frame-title
    doto (JFrame frame-title)
        addChild panel
        setSize width height
        setVisible true

show-in-frame (make-gui-panel (make-model)) 300 110 "GUI Test"

Parentheses
Let’s first see how Animus cuts down on the number of parentheses. Consider the special form let in Scheme:
(let ((a 3) (b 4)) (+ a b)) ; Scheme
Each binding in a Scheme let is in its own pair of parentheses, and all the bindings are surrounded in another pair of parentheses so as to distinguish the bindings from the body. When a special form or macro uses parentheses in this way to denote special arguments, it’s called grouping. Not only is grouping ugly, it’s confusing for learners because it violates the normal expectation that parentheses denote a call, macro, or special form, and so reading through the parentheses remains quite difficult until the programmer becomes very familiar with the special forms and standard macros.
Clojure sensibly makes grouping in let more readable by putting the bindings in a vector literal rather than a list and by not surrounding each individual binding in its own parentheses:
(let [a 3 b 4]
(+ a b)) ; Clojure
But in Animus, we go further and drop grouping altogether:
(let a 3 b 4 (+ a b)) ; Animus
The compiler can tell where the body starts because, reading left-to-right, it reads through pairs of names and values until it finds a list instead of a symbol.[1]
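As a rough illustration of that left-to-right scan, here is a minimal Python sketch (not Animus’s actual reader). It assumes the form has already been read into a Python list, with symbols modelled as str and nested forms as list; the function name split_let is hypothetical, and a real reader would need a distinct type for string literals.

```python
def split_let(args):
    """Split the arguments of an ungrouped let into (bindings, body).

    Assumption for this sketch: symbols are str, nested forms are list.
    """
    bindings = []
    i = 0
    # Consume name/value pairs while the next item is a symbol and
    # another item follows it to serve as its value.
    while i + 1 < len(args) and isinstance(args[i], str):
        bindings.append((args[i], args[i + 1]))
        i += 2
    # The first item that is not a symbol (i.e. a list) starts the body.
    return bindings, args[i:]


# (let a 3 b 4 (+ a b))
bindings, body = split_let(["a", 3, "b", 4, ["+", "a", "b"]])
print(bindings)  # [('a', 3), ('b', 4)]
print(body)      # [['+', 'a', 'b']]
```

Note that a value may itself be a list, as in let c (GridBagConstraints) ..., because only the name position is checked for being a symbol.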
Similarly, instead of:
(fn [a b c] (print a b c)) ; Clojure
…we can just write:
(fn a b c (print a b c)) ; Animus
A problem, though, is that Clojure’s fn optionally starts with a name for the function (so as to be able to refer to itself in its body):
(fn foo [a b c] (print a b c) foo) ; Clojure
Without the grouping, the name will not be distinguishable from a parameter. One solution is to simply write a separate macro to accommodate this minority case, but if we’re not willing to do that, we can simply use a designated keyword to distinguish the name argument:
(fn :name foo a b c (print a b c) foo)
If long keywords are unacceptable for frequently used options, we can keep these keywords brief:
(fn :n foo a b c (print a b c) foo)
Flags
In some cases, we might run into an ambiguity if we need a special form or macro to accept a keyword literal as argument. For this reason and also for the sake of style, we have a separate type of keyword used for grouping and denoting optional arguments. In Animus, I call these flags and write them as symbols beginning with a dot rather than a colon:
(fn .n foo a b c (print a b c) foo)
With flags, we can ditch grouping entirely and still have macros that allow special minority cases.
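To make the idea concrete, here is a small Python sketch of how a macro might pick apart the arguments of fn when the optional name is marked by the .n flag. It is only an illustration under the same assumptions as before (symbols as str, nested forms as list); parse_fn and the returned dictionary layout are hypothetical, not part of Animus.

```python
def parse_fn(args):
    """Parse the arguments of (fn [.n name] param... body...)."""
    name = None
    i = 0
    # An optional leading flag marks the next symbol as the function's name.
    if len(args) >= 2 and args[0] == ".n":
        name = args[1]
        i = 2
    # Parameters are the run of symbols that precedes the first list.
    params = []
    while i < len(args) and isinstance(args[i], str):
        params.append(args[i])
        i += 1
    return {"name": name, "params": params, "body": args[i:]}


# (fn .n foo a b c (print a b c) foo)
parsed = parse_fn([".n", "foo", "a", "b", "c", ["print", "a", "b", "c"], "foo"])
print(parsed["name"])    # foo
print(parsed["params"])  # ['a', 'b', 'c']
```

Because the flag is consumed before the parameter scan starts, the name never gets mistaken for a parameter, which is the ambiguity the flag exists to remove.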
Flags can also help us mitigate the need to quote so often. The Clojure function in-ns, for instance, requires coders to quote the symbol naming the namespace because the coder occasionally might want to pass the symbol to the function from a variable, meaning we need to distinguish between a symbol standing for itself and a symbol standing for a variable. If we make in-ns a macro, we can leave the symbol unquoted and, when we wish the symbol to stand for a variable rather than for itself, simply precede it with a flag denoting this special case:
; Clojure
(in-ns 'cat)
; Animus
(in-ns cat)
; Clojure
(def x (quote cat))
(in-ns x)
; Animus
(def x (quote cat))
(in-ns .v x) ; “v” for “variable”
(Or, again, we could simply create a separate macro or function for this minority case.)
Indentation
To get rid of even more parentheses, we introduce an indentation rule that allows us to leave some parentheses implicit:
; Clojure
(defn factorial [n]
  (loop [cnt n acc 1]
    (if (= cnt 0)
      acc
      (recur (dec cnt) (* acc cnt)))))
; implicit parentheses
defn factorial n
    loop cnt n acc 1
        if (= cnt 0)
            ,acc
            recur (dec cnt) (* acc cnt)
The rule is that every line beginning with a symbol but not a comma is implicitly surrounded in a pair of parentheses such that the end parenthesis comes after the last line indented underneath.
(Note that, unlike in Clojure, commas cannot be inserted at stylistic whim. Also note that all arguments on succeeding lines must be indented in by a standard four spaces. This regularity is not only more readable than the mystifying, inconsistent indentation typical in Lisp, it frees coders from thinking too much about an irrelevant style issue.)
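The rule is simple enough to sketch in a few lines. The following Python function is a hypothetical illustration, not the Animus reader: it makes the implicit parentheses explicit, assuming four-space indentation, no trailing comments on code lines, and that every line not starting with a comma starts with a symbol.

```python
def add_parens(source, indent=4):
    """Make the implicit parentheses of indentation-based code explicit."""
    out = []          # rewritten lines
    open_depths = []  # depths of implicit forms still waiting for ")"
    for raw in source.splitlines():
        text = raw.strip()
        if not text or text.startswith(";"):
            continue  # skip blank lines and whole-line comments
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        # A form ends after the last line indented underneath it, so close
        # every open form that this line is not nested inside.
        while open_depths and open_depths[-1] >= depth:
            open_depths.pop()
            out[-1] += ")"
        pad = " " * (depth * indent)
        if text.startswith(","):
            out.append(pad + text[1:])    # comma: no implicit parentheses
        else:
            out.append(pad + "(" + text)  # symbol: open an implicit form
            open_depths.append(depth)
    if out:
        out[-1] += ")" * len(open_depths)  # close whatever is still open
    return "\n".join(out)


factorial = """\
defn factorial n
    loop cnt n acc 1
        if (= cnt 0)
            ,acc
            recur (dec cnt) (* acc cnt)
"""
print(add_parens(factorial))
```

Run on the factorial example, this yields the fully parenthesized form, with acc left bare because of its leading comma.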
The apparent downside of our indentation rule is that we can’t do formatting like this:
; Clojure
(send-off backup-agent (fn [filename]
                         (spit filename snapshot)
                         filename))
To get the arguments of the form starting with fn on to their own lines, we must start fn on its own line as well:
; Animus
send-off backup-agent
    fn filename
        spit filename snapshot
        ,filename
This gets especially bad when an interior list is deeply nested because that means we have to use up many more lines. Here, for instance, to get doseq on to its own line, we have to put everything containing it onto its own line as well:
; Clojure
(map (fn [t] (fn [] (dotimes n niters (dosync
                                        (doseq r refs
                                          (alter r + 1 t))))))
     (range nthreads))
; Animus
map
    fn t
        fn
            dotimes n niters
                dosync
                    doseq r refs
                        alter r + 1 t
    range nthreads
For a long time, I tried to devise some special exception to the rule that avoided this stair-step effect, but I eventually concluded this limitation is a feature, not a bug: cases like this last one are not common, so at most this stricture costs an extra line or two here and there, and this price is compensated by the improved regularity and clarity.
Still, if in some particular case you find the significant indentation to be bothersome, you can start a line with ,( to resort to free-form syntax:
map
    ,(fn t (fn (dotimes n niters (dosync
                                   (doseq r refs
                                     (alter r + 1 t))))))
    range nthreads
Another issue with our indentation rule is how it affects Clojure’s vector and hashmap literal notation. We could simply prescribe that a line beginning with [ or { denotes the start of a vector or hashmap with an implicit ] or }, e.g.:
; Clojure
[a
 b
 c]
...could be written:
; Animus
[a
    ,b
    ,c
However, this asymmetry looks a bit odd and is inconsistent with the parentheses rule. Consider, though, that while we need vectors and hashmaps to stand out a little bit in code for the sake of writing JSON-like data, we don’t need them to visually stand out all that much if they aren’t used for grouping like in Clojure. So as a compromise, we write vectors and hashmaps as regular forms, not literals, but with eye-catching symbolic names:
; Clojure
[a b c]
; Animus
^ a b c
; the function ^ creates a vector
; Clojure
{:a 3 :b 6}
; Animus
% :a 3 :b 6
; the function % creates a hashmap
(If it turns out that we really need vectors and hashmaps to be recognized at reader time, we could have the reader specially look for forms beginning with these symbols. Either that, or just go back to [ and {, which wouldn’t be so bad, really.)
Finally, there’s an issue with forms written starting with something other than a symbol:
; Clojure
(:cat x)
; return the value associated with key :cat in the hashmap x
We have trouble writing this in Animus because a line starting with a keyword is not implicitly surrounded in parentheses, and if we write a line as (:cat x) in Animus, it is interpreted as ((:cat x)). To get around this, use the macro &, which simply calls its first argument as a function, passing to it the remaining arguments:
; Animus
& :cat x
; (& :cat x) is effectively equivalent to Clojure’s (:cat x)
Reader macros
Reader macros in Lisp are effectively a way to sneak syntax back into the language. This extra syntax serves two purposes:
1) Syntactical compression: Some things occur so commonly that even the syntactical compression of a regular macro is not enough. For instance, if we must quote often, the burden of a full quote form is relatively onerous compared to just an apostrophe.
2) Sigils: ASCII symbols often effectively serve as visual markers in code that help certain forms stand out.
The question is how much compression do we really need and where, and what in code really deserves to stand out? In Animus, we have the further concern that reader syntax needs to be as general as possible so that we can flexibly embed other languages. The solution in Animus is to replace Clojure’s reader macros in two ways:
First, we can replace a reader macro with just a regular macro with a short name. So for instance, the reader macro # in Clojure is replaced by the regular macro fn (consequently, Clojure’s fn is renamed func in Animus):
; Clojure
(fn [] (foo a b c))
; Animus
func (foo a b c)
; Clojure
#(foo a b c)
; Animus
fn foo a b c
Second, we can replace reader macros with what I call prefix macros, which have the form:
identifier'
The whole point of prefixes is to take up very little space, so their names are generally only one or two characters long. For instance, the prefix macro for quote is q' :
q'fox ; (quote fox)
A prefix is pretty much like a regular macro except: 1) a prefix is always written with no space between the apostrophe and its first argument; 2) the parentheses of a prefix macro starting in the middle of a line are left implicit if the prefix macro has just one argument:
e'bar ; invoke macro e' with argument bar
ack e'bar ; call ack with argument (macro e' with argument bar)
Similar to prefix macros are string macros. A string macro is denoted by an identifier directly preceding a string (which is either in single- or double-quotes). For instance, the re string macro precedes a regular expression string:
re'Dear (Sir|Madame),' ; re makes this string a regex, not a plain string
re"Dear (Sir|Madame)," ; same as above
Because string and prefix macros look similar, you can’t have, say, both a prefix macro named foo and a string macro named foo in the same namespace.
Java interop
In Clojure, Java methods and fields are accessed with the special form . (dot), and objects are instantiated with the special form new, but on top of this Clojure adds special syntax and a way to add classes into the namespace; the preferred syntax ends up looking like this:
; Clojure
(.bar foo 5) ; instance method: foo.bar(5)
(.bar foo)   ; either instance method: foo.bar()
             ; or instance field: foo.bar (depends on whether bar is a method or a field)
(Cat/meow 5) ; static method: Cat.meow(5)
(Cat/meow)   ; static method: Cat.meow()
Cat/meow     ; static field: Cat.meow
My objections to Clojure’s arrangement here are that there are too many different ways to express the same thing and that the way these convenience forms are enabled is a little convoluted. In particular, the static call (Cat/meow) may look simple, but it actually requires a special rule that allows classes to be directly bound into namespaces (rather than through a var), and then it requires a special rule for / (slash) in a symbol suffixing a class name.[2] I also object to the aesthetic of Clojure’s interop syntax: while some might argue that Java interop should stand out in code, I believe that it should blend in.
So the Animus equivalent to the above Clojure code looks like this:
; Animus
bar foo 5
bar foo
Cat.meow 5
Cat.meow
,Cat.meow ; the comma, recall, suppresses the implicit parentheses
The way this works is that Java methods and fields are imported directly into the namespace as macros that simply expand into interop (Animus’s equivalent to Clojure’s . form). So, for instance, when we import the class x.y.Cat, its static method meow comes into the namespace as the symbol Cat.meow bound to a macro such that:
Cat.meow 5 ; expands into (interop x.y.Cat meow 5)
(If we import two classes both with a method or field of the same name, this is not a problem because the two resulting macros will be exactly the same, so it doesn’t matter which gets assigned to that name in the namespace.)
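To sketch how such an import might populate a namespace, here is a hypothetical Python model in which the namespace is a dict and a “macro” is a function that returns the expanded form as a list. Only static members are covered; the names import_class and static_members, and the decision to bind the class name to a plain string, are assumptions made for illustration.

```python
def import_class(namespace, qualified_name, static_members):
    """Bind one class and its static members into a namespace dict."""
    short_name = qualified_name.rsplit(".", 1)[-1]
    # The class itself is bound under its short name (here just a string).
    namespace[short_name] = qualified_name
    for member in static_members:
        # Each static member becomes a macro named Class.member that
        # expands into an interop form on the fully qualified class.
        def expand(*args, _member=member):
            return ["interop", qualified_name, _member, *args]
        namespace[f"{short_name}.{member}"] = expand


ns = {}
import_class(ns, "x.y.Cat", ["meow"])
print(ns["Cat.meow"](5))  # ['interop', 'x.y.Cat', 'meow', 5]
print(ns["Cat.meow"]())   # ['interop', 'x.y.Cat', 'meow']
```

Whether the resulting interop form denotes a method call or a field access is left to the interop form itself, which matches the article’s point that the surface syntax need not distinguish the two.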
To assign a value to a field, we could just provide a third argument to the field’s macro:
nixon fillmore 3 ; expands to equivalent of fillmore.nixon = 3
The trouble is that we want state changes to stand out (especially in functional programming!). One solution is to instead do assignment through another macro given the same name but with an ! added to its name:
nixon! fillmore 3
Or we could just have the set! special form expect its first argument to be an interop form:
set! (nixon fillmore) 3 ; (set! (interop nixon fillmore) 3)
When importing a class, the class object itself is, by default, bound to a symbol of the same name, e.g. importing x.y.Cat binds the class object to Cat (and unlike in Clojure, this binding is through a var).
JSON
Consider some typical JSON:
[
    {
        "title": "Empire Burlesque",
        "artist": "Bob Dylan",
        "country": "USA",
        "company": "Columbia",
        "price": "10.90",
        "year": 1985
    },
    {
        "title": "Heroes",
        "artist": "David Bowie",
        "country": "UK",
        "company": "RCA",
        "price": "7.50",
        "year": 1977
    }
]
Expressing this same structure in Lisp is easy. In Animus, we could write:
^
    %
        :title 'Empire Burlesque'
        :artist 'Bob Dylan'
        :country 'USA'
        :company 'Columbia'
        :price '10.90'
        :year 1985
    %
        :title 'Heroes'
        :artist 'David Bowie'
        :country 'UK'
        :company 'RCA'
        :price '7.50'
        :year 1977
Of course, what we can do in JavaScript and Lisp (but can’t do in XML) is express common data structures more compactly with functions:
^
    Cd ; function Cd returns a new Cd object
        'Empire Burlesque'
        'Bob Dylan'
        'USA'
        'Columbia'
        '10.90'
        1985
    Cd
        'Heroes'
        'David Bowie'
        'UK'
        'RCA'
        '7.50'
        1977
The readability of such functions, though, hinges upon your ability to remember the number, significance, and order of their parameters.
XML
When imitating XML in Lisp, we can ditch XML’s distinction of attributes and content and just treat both kinds of thing as arguments to the tag (which is either a function or macro). However, we run into a problem with mark-up style data because, frequently in mark-up, a tag has other tags interspersed in its content. This then seems to require us to awkwardly break up our content into multiple strings interspersed with other forms. Consider this XHTML:
<div>
    <p>Lorem ipsum dolor sit <em>amet</em>,
    consectetur adipisicing elit, </p>
</div>
We could write this in standard Lisp syntax like so:
(div (p 'Lorem ipsum dolor sit' (em 'amet')
        ', consectetur adipisicing elit,'))
This isn’t terrible, but it gets a little worse with Animus’s indentation scheme, which tends to require us to use more lines:
div
    p 'Lorem ipsum dolor sit'
        em 'amet'
        ', consectetur adipisicing elit,'
Actually, we could fix this by just stringing it all on to a single line:
div
    p 'Lorem ipsum dolor sit' (em 'amet') ', consectetur adipisicing elit,'
But this is still a bit ugly because of the way we must split the content string to intersperse it with forms. To pretty this up a bit, Animus has a special kind of string literal which I call a content string, denoted by an opening / (slash) and terminated by the end of the line[3]:
/bla bla bla bla
The newline that ends the string is not considered part of the content, and like in XML tags, a series of contiguous whitespace characters is treated as a single space. The special thing about content strings is that you can intersperse forms in them using { }:
div
    p /Lorem ipsum dolor sit {em /amet}, consectetur adipisicing elit,
Notice that the em tag includes its own content string, which terminates at the matching }. Because a content string only terminates at } or newline, one is only written either last on its line or as the last argument inside { }.
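Here is a minimal Python sketch of how a reader might handle one content string, offered as an illustration under assumptions rather than as Animus’s implementation. It takes the text after the opening slash, returns a list that mixes text chunks with nested forms, and models both symbols and text as str for brevity (a real reader would keep them distinct). The names parse_content, parse_form, and squeeze are hypothetical.

```python
import re


def squeeze(chunk):
    """Collapse each run of whitespace into a single space."""
    return re.sub(r"\s+", " ", chunk)


def parse_content(text, pos=0):
    """Parse a content string; pos is just after its opening slash.

    Returns (parts, pos), stopping at end of input or at an unmatched "}".
    """
    parts, start = [], pos
    while pos < len(text) and text[pos] != "}":
        if text[pos] == "{":
            if pos > start:
                parts.append(squeeze(text[start:pos]))
            form, pos = parse_form(text, pos + 1)
            parts.append(form)
            start = pos
        else:
            pos += 1
    if pos > start:
        parts.append(squeeze(text[start:pos]))
    return parts, pos


def parse_form(text, pos):
    """Parse an embedded {head arg... /content}; pos is just after "{"."""
    form = []
    while pos < len(text) and text[pos] != "}":
        if text[pos].isspace():
            pos += 1
        elif text[pos] == "/":
            # A nested content string runs up to the matching "}".
            content, pos = parse_content(text, pos + 1)
            form.extend(content)
        else:
            start = pos
            while pos < len(text) and not text[pos].isspace() and text[pos] != "}":
                pos += 1
            form.append(text[start:pos])
    return form, pos + 1  # step over the closing "}"


line = "Lorem ipsum dolor sit {em /amet}, consectetur adipisicing elit,"
parts, _ = parse_content(line)
print(parts)
# ['Lorem ipsum dolor sit ', ['em', 'amet'], ', consectetur adipisicing elit,']
```

The mutual recursion mirrors the two termination rules in the text: a content string stops at a newline or at the } of the form that contains it.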
If you feel that what you want to intersperse in a string is too much for one line, just make it a regular argument and continue the content on the next line:
div
    p /Lorem ipsum dolor sit
        em /amet
        /, consectetur adipisicing elit,
In my experiments with formatting random XHTML files in this syntax, things work out quite neatly, but I think your acceptance level largely depends on how you feel about your editor wrapping lines. To my mind, the editor would ideally only wrap these content strings, and the wrapped lines would be specially colored and automatically indented to line up with the start of the string:
div
    p /Lorem ipsum dolor sit amet, consectetur
       adipisicing elit,
        em /sed do
        /eiusmod tempor incididunt ut labore et
         dolore magna aliqua. Ut enim ad minim
         veniam, quis nostrud exercitation ullamco
         laboris nisi ut aliquip ex ea commodo
         consequat. Duis aute irure dolor in
         reprehenderit in voluptate velit esse
         cillum dolore eu fugiat nulla pariatur.
         Excepteur sint occaecat cupidatat non
         proident, sunt in culpa qui officia
         deserunt mollit anim id est laborum.
Even with these affordances, we still have a bit of a problem because some mark-up documents deeply nest the bulk of their content, meaning we’d end up horizontally scrolling a lot due to the rigid indentation scheme. One solution is for text editors to intelligently elide deep levels of indentation. For instance, a line that’s indented a dozen levels deep could be written up against the left margin but somehow given a visual indication of its “true” indentation, something like this:
foo
    bar
        foo
            bar
                foo
                    bar
                        foo
                            bar
                                foo
←  →bar
To be honest, this whole issue of mark-up data seems one area in which Animus requires further thought and experimentation.
Highlighting
The simplest highlighting of Lisp would color comments, literals, and symbols like def, fn, if, etc. Ideally, though, we would also use highlighting to distinguish standard names, local names, module names (stuff defined in the current namespace), and foreign names (stuff defined in other namespaces):
defunc make-gui-panel model
    let gridbag (GridBagLayout)
        ,text-field (make-text-field model)
        ,panel (make-graphics-panel model)
        ;set up the gridbag constraints
        doto gridbag
            setConstraints text-field (make-text-field-constraints)
            setConstraints panel (make-panel-constraints)
        ;add the components to the panel and return it
        doto (JPanel)
            setLayout gridbag
            addChild text-field
            addChild panel
The highlighting here is:
· Foreign names in black.
· Module names in orange (bold where defined).
· Local names in red (bold where defined).
· Standard names in blue (bold for special forms and for standard macros with bodies).
By putting these things in different colors, the reader can quickly distinguish names they should be very familiar with from those they shouldn’t. I find quickly identifying local names particularly useful because what tends to happen as I read an unfamiliar function is that I have trouble loading the local names into my short-term memory, so I end up thinking too much about where the names are coming from (even including those names whose declarations I just read a moment ago). This highlighting helps me quickly orient myself when reading code both familiar and unfamiliar.
Name completion, name refactoring, and name checking
In the great debate of static typing versus dynamic typing, my personal experience in dynamic languages has been this: I don’t make type errors. OK, I do make type errors occasionally, but in a whole year working in Javascript, I made maybe half a dozen type errors total, and none of them took long to track down and correct.
However, the dynamic language errors I do make all the time are name errors: sometimes I just recall a name incorrectly, or sometimes I just make a typo. These mistakes occur most often for method and property names, e.g. I’ll write x.foom when I should write x.foo. Now, in Clojure, even if we’re imitating an OOP style, the methods and properties simply live in the regular namespaces rather than the namespaces of the types, so the issue is not (as it’s usually framed) about fancy code analysis to determine the type of objects denoted by names; rather, we just need our editing environment to complete names from the active namespaces and to point out where we’ve written undefined names. If our editing environment can get a handle on names, it can also rename identifiers without error.
What makes getting a handle on names difficult with dynamic code is that there’s always the potential for names to change at runtime. It’s especially tricky in Lisp: macros might play odd games, and macros don’t really exist until runtime because they may rely upon runtime state. Consequently, for our pre-execution code analysis to really work, we have to at least partially run the code: we must run the program without really running it. This requires two accommodations on the coder’s part:
First, a namespace must be confined to a single module rather than arbitrarily spread amongst the source files (as Lisp currently allows). This basically means ‘one namespace equals one file’, like in Python. (Clojure already has module-like things it calls libs, but Clojure doesn’t require you to organize your code into libs.)
Second, a module must be explicitly split into its two parts: 1) definitions; 2) code to execute when the module is ‘invoked’. This is done simply by putting the “runtime code” in a special block called runtime:
defunc show-in-frame panel width height frame-title
    doto (JFrame frame-title)
        addChild panel
        setSize width height
        setVisible true
; pre-execution analysis will know not to actually run this code
runtime (show-in-frame (make-gui-panel (make-model)) 300 110 "GUI Test")
Like tests, code analysis becomes essentially just an alternate execution path through the code. This code path produces meta information to be used by the editing environment.
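A toy version of that alternate path can be sketched in Python. This is only an illustration under strong assumptions: a module is a list of already-read top-level forms, definitions are recognised purely by a def or defunc head, and the helper names analyze and complete are hypothetical. A real analysis pass would partially evaluate the definitions, as described above, rather than just inspect them.

```python
def analyze(forms):
    """Collect the names a module defines without running its runtime block."""
    defined = []
    for form in forms:
        head = form[0]
        if head == "runtime":
            continue  # analysis path: never execute runtime code
        if head in ("def", "defunc"):
            defined.append(form[1])
    return defined


def complete(prefix, names):
    """Offer completions for a prefix from the names found by analysis."""
    return sorted(name for name in names if name.startswith(prefix))


module = [
    ["defunc", "make-model", ["ref", "Hello MVC!"]],
    ["defunc", "make-gui-panel", "model", ["doto", ["JPanel"]]],
    ["defunc", "show-in-frame", "panel", ["doto", ["JFrame"]]],
    ["runtime", ["show-in-frame", ["make-gui-panel", ["make-model"]]]],
]

names = analyze(module)
print(complete("make-", names))  # ['make-gui-panel', 'make-model']
```

The same list of names could drive undefined-name warnings and renames, subject to the caveat about runtime monkey business.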
Understand the limitation of this scheme: when code completion tells you that foo refers to such-and-such function, this cannot be guaranteed with certainty because there’s always the potential for monkey business at runtime. So long as code completion is understood to not be a guarantee, it’s still extremely useful.[4]
Embedded languages
If embedding other programming languages in Lisp is a viable option, why is it not done already? Well, to an extent, it is done, but usually with new custom languages rather than with established languages. For example, this article describes a custom assembly language written as a Lisp (which is not quite the same as embedding a language in an existing Lisp, but it’s close). The reasons why this is not done for established languages, though, are quite reasonable:
· First, for those already comfortable with a language and its tools as they exist, there’s a big upfront cost in learning and adjusting to a new syntax and toolchain.
· Second, it would take many years for a replacement toolchain to catch up with the old, and in the meantime, the new and old parts may not work well together. For instance, until we write a new debugger, any C++ we generate from Lisp will be awkward and confusing to debug.
But consider the potential benefits:
· Lisp macros. Macro-like features introduced in other languages (e.g. the preprocessor in C, annotations in Java) tend to be unwieldy and simply aren’t as powerful as Lisp macros. This has consequences. In Java, for example, we not only have to wait until Java 7 to get a feature like string cases in switch statements, we waste a lot of time debating whether to add such things into the language. Lisp macros allow us to implement as library code whole classes of functionality that otherwise would require everyone to agree on making changes to the core language.
· Prettier, clearer code. Pretty syntax minimizes line noise and is formatted in a consistent layout that avoids consuming too many lines and keeps the lines down to moderate length; clear code avoids filler and convolutions. Sadly, most languages include some degenerate syntax that is neither pretty nor clear (C’s declaration syntax comes to mind). Embedding a language in Animus allows us to fix these mistakes and to use macros to cut down on boilerplate.
· Cleaner cross-language interop. In even the best cases, getting one language to interoperate with another involves a lot of boilerplate. When two languages are expressed in the same syntax and in the same VM (at least in pre-compiled form), this opens up possibilities for a much more elegant interop scheme.
· Unified tool chain. Not only should languages be embedded in Lisp, their tools should be embedded as well. Tools each tend to come with their own syntax and rules to learn, so a build tool, for instance, requires learning a bunch of commands and command-line options along with learning how to write the build scripts themselves (and learning what these files are expected to be called and where they are expected to be found in the file system). If we punted this sort of functionality into an Animus library, then for example, a maven build script, rather than being written as pom.xml, would be written as just another Animus file. If all our tools—compilers, debuggers, profilers, version control, doc tools, and so on—took this approach, they’d be easier to create and to learn to use, so we’d see a more rapid pace of innovation, better interoperability, and more transference of knowledge and skills.
So how does Animus better enable language embedding compared to a more traditional Lisp, such as Clojure? Well the simple changes in syntax—fewer parentheses, a more general replacement for reader macros—make a big difference. Yes, a programmer confronted with this change-over still must pay a sizable transition cost, but at least the end result looks appealing.
So here’s an example of what C might look like embedded in Animus. This is based on a random sample I found in some Linux driver code:
c-defunc RADEONPostInt10Check void .static pScrn ScrnInfoPtr ptr void .p
    ; declarations begin with var
    var info RADEONInfoPtr (RADEONPTR pScrn)
    var RADEONMMIO uchar .p (MMIO info)
    var pSave RADEONInt10Save ptr
    var CardTmp uint32_t
    ; rest of this function omitted…
The general strategy here is that our C-in-Animus library has macros and functions for defining objects representing the semantics of our embedded code, and then at some point these objects will be passed to a compiler component. So the c-defunc macro here creates an object representing a C function, which is bound in the module to the name RADEONPostInt10Check. After the function name, we put the return type followed by any function modifiers and then followed by the parameters, so this function returns void, is declared static, and has two parameters, pScrn of type ScrnInfoPtr and ptr of type void pointer.
Note that we’ve reversed the C convention: in a declaration here, names come before their type. Also, be clear where the names used are coming from: the symbol void is recognized in c-defunc as effectively a reserved word, and ScrnInfoPtr is defined in another module as a variable bound to an object representing a C type.
In the body of the function, var is recognized by c-defunc as a reserved word denoting a declaration form, so we’re declaring four locals: info, RADEONMMIO, pSave, and CardTmp, with the types RADEONInfoPtr, uchar pointer, RADEONInt10Save, and uint32_t, respectively. The first three declarations are given initial values from the expressions (RADEONPTR pScrn), (MMIO info), and ptr, respectively.[5]
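To illustrate the “objects passed to a compiler component” idea for these declaration forms, here is a hedged Python sketch that turns a var form (already read into a list) into a line of C. It is an assumption-laden toy: emit_var and emit_expr are hypothetical names, type names are passed through untouched (so uchar is assumed to be a typedef), and only the .p pointer flag is handled.

```python
def emit_expr(expr):
    """Render a nested call form such as ["MMIO", "info"] as C source."""
    if isinstance(expr, list):
        head, *args = expr
        return f"{head}({', '.join(emit_expr(arg) for arg in args)})"
    return str(expr)


def emit_var(form):
    """Render (var name type [.p] [init]) as a C declaration."""
    _, name, ctype, *rest = form
    pointer = ""
    if rest and rest[0] == ".p":  # the .p flag marks a pointer type
        pointer = "*"
        rest = rest[1:]
    decl = f"{ctype} {pointer}{name}"
    if rest:                      # optional initial value
        decl += f" = {emit_expr(rest[0])}"
    return decl + ";"


body = [
    ["var", "info", "RADEONInfoPtr", ["RADEONPTR", "pScrn"]],
    ["var", "RADEONMMIO", "uchar", ".p", ["MMIO", "info"]],
    ["var", "pSave", "RADEONInt10Save", "ptr"],
    ["var", "CardTmp", "uint32_t"],
]
for form in body:
    print(emit_var(form))
# RADEONInfoPtr info = RADEONPTR(pScrn);
# uchar *RADEONMMIO = MMIO(info);
# RADEONInt10Save pSave = ptr;
# uint32_t CardTmp;
```

Note how the generator restores C’s type-before-name order even though the embedded syntax writes the name first.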
Here’s another example of our faux C:
c-defunc RADEONGetRec Bool .static pScrn ScrnInfoPtr
    if (driverPrivate pScrn)
        return TRUE
    = (driverPrivate pScrn)
        (xnfcalloc (sizeof RADEONInfoRec) 1)
    ; pScrn->driverPrivate = xnfcalloc(sizeof(RADEONInfoRec), 1);
    return TRUE
The function RADEONGetRec returns Bool, is declared static, and has the parameter pScrn of type ScrnInfoPtr.
In the function, if is a reserved word recognized by c-defunc, so it’s not processed as the Animus special form of the same name.
In place of the C syntax for dealing with struct members, we have the struct-member form, but rather than use it straight, we more often use the member names themselves, which are macros expanding into this form, e.g. (driverPrivate pScrn) becomes (struct-member driverPrivate pScrn). As shown in the fourth line, a struct-member form can be used as the target of the = assignment form.
Is this crazy?
Assuming this whole idea of language embedding pans out, the question is how long a full transition would take. Well, even if a fully workable replacement were available today for a popular language, I still expect it would take a full decade before the majority of that language’s use was done in Animus rather than in its native form. In the short run, though, the real potential benefit of embedding is for not-so-popular languages struggling with adoption. Once an audience is in place that understands Clojure/Animus, it should be much easier for that audience to learn a language’s semantics expressed as a library than to learn a whole new syntax and build chain.
[1] You might object that this makes it too hard to quickly identify the names being bound, but code highlighting will fix this, as we’ll see shortly.
[2] In general, I have a strong distaste for anything that complicates namespacing. Convoluted namespacing is a greatly underestimated source of complexity and confusion.
[3] This means we can’t have comments on these lines, but my current thinking is that that’s OK.
[4] Another limitation is that, if a module uses a name that is shared by two or more imported classes (e.g. two classes both with a method named foo), then that name cannot be automatically renamed because the tools can’t tell which uses of the name refer to which source. When the user wants to do a rename in such a case, they must manually resolve the ambiguities.
[5] The highlighting you see comes from the c-defunc macro, which, at analysis time, passes meta information to the environment telling it how to highlight its code.