Stop creating new languages

30 Aug

Every couple of months, an announcement for a new language pops up on ProgReddit or Hacker News. While some of these languages might have interesting ideas, their ideas rarely justify whole new languages, so mostly these languages seem like arbitrary remixes of existing ones. Consequently, these languages’ authors often come off a bit like crackpots: ‘Look, everyone! I’ve rearranged the bookshelves with my new classification system. Once you master it, you’ll find browsing of biographies 6% more efficient and reshelving of autobiographies 11% more efficient! *ahem* Once you master it.’

Some observers react to this steady nuisance of quixotic pet projects by dismissing the need for better languages entirely. This is sensible in the short term because new things in programming rarely constitute big enough improvements over the day’s status quo to justify the transition costs. In the long run, however, it’s myopic: the languages and tools of today are generally significantly superior to what we were using a generation ago, so it’s not unreasonable to expect further significant advances.

In one reading of the history, though, the improvements we’ve seen in the last twenty-odd years come entirely from the realization of old ideas (automatic garbage collection, full object-orientation, functional programming, etc.), and so it’s claimed that no one has had any really new ideas for decades now. There’s something to this observation, but we still shouldn’t reject new languages out of hand:

  • First, the original formulations of the old ur-ideas prompted many practical questions, but many of our answers to these questions still remain sketchy, leaving open the possibility of more fundamental changes to come.
  • Second, while I think it very unlikely that, at this late date, someone will identify a new programming paradigm, it always seems naïve to declare the End of History and rule out any future potential for big, transformative ideas.
  • Third, and most importantly, I don’t believe languages must only advance on big ideas, for little details matter—they add up. Even if what most new languages largely do is just rearrange the furniture for the sake of aesthetics and minor efficiencies, after a few rounds of 5% improvement, you begin to see a real qualitative difference. Python, for instance, is semantically not all that different from Perl, but what a difference sane syntax makes.

So am I saying we should tolerate the crackpots? To a point. Any new language warrants major skepticism, no matter the source, but especially a language coming from an unknown. It’s for a good reason that we have a natural tendency to treat the opinions and ideas of established voices much more charitably—both in time and sympathy—than those of unknown quantities: without this bias, we’d waste a lot more time on crap than we do already, for it simply can take a lot of time and effort to discern the difference between a crackpot and someone worth listening to.

So as an unknown with something Important To Say, you must be very careful in how you present yourself and your ideas so as not to be dismissed as a crackpot. I have two pieces of advice. First off:

Don’t be a crackpot.

Obvious, perhaps, but it’s surprising how many people miss this one. Second:

Be as clear as possible.

Only when reading a name-brand author am I willing to accept that difficulties in comprehension are my own fault rather than the author’s. Not so for an unknown. If James Joyce hadn’t written Dubliners, it’s doubtful anyone would ever have read Finnegans Wake, let alone called it brilliant.

In the particular case of introducing a new programming language, it’s especially critical to be very clear about the problems your language addresses. What’s the point of this thing? How is it supposedly actually better? Before I continue reading what you have to say, I want to know that you’re not just re-arranging the furniture.1

So it’s with full awareness and trepidation that I admit that I, too, have tried my hand at designing a programming language. Following my own advice, I’ll try to be up front about what I’m pushing: a Lisp people will actually learn and use.2

Here’s what I want in a Lisp, in order of ambition:

  • Easy to learn: The standard dialects of Lisp tend to be taught ineffectually and to be unnecessarily confusing. (Yes, that includes Scheme.) I won’t go into details here, but suffice it to say that ease of learning really matters, not just for the sake of getting more people to use the language, but for the sake of getting those who use the language to truly understand it.
  • Readable syntax: As everyone knows, Lisp has a problem with parentheses. Proponents argue that you just get used to it, and this is true, but the preponderance of parentheses constitutes a lot of line noise that I believe hinders readability (and editability) even for experienced eyes. Additionally, some Lisp dialects get a bit too noisy with reader macros, such as having the apostrophe for quoting all over the place. Furthermore, I find that the irregularity of the standard indentation style of current Lisps is unnecessarily difficult for learners to grok and leaves too much to stylistic choice.
  • Syntax highlighting, code assist, and assisted refactoring: Programmers working in Java and C# have become accustomed to conveniences that keep their code neat, that provide quick access to documentation, and that free them from having to remember minute details such as type taxonomies, function signatures, and precise identifier names. Providing those same conveniences in a dynamic language is much more challenging and error-prone because something as simple as renaming an identifier often requires that the tools make risky assumptions about what’s going on at runtime. Up to now, solutions to this problem have relied upon very sophisticated code analysis that still doesn’t work right much of the time. I believe there’s a simpler solution.
  • Push-button debugging: Programmers working in Java and C# become accustomed to no-hassle debugging, where setting a breakpoint requires just a click and where the IDE takes you through the code as you step through. This level of ease is lacking in most other languages, but especially in Lisp, where macros complicate the process.
  • Embedded data: Lisp’s tree-based syntax makes it usable as a structured-data format, meaning we don’t have to punt data into a separate format, such as XML or JSON. Instead, data can be expressed in Lisp using an ordinary library rather than a special syntax that requires special processing and tools. This could spare us from perverse data languages, like XSLT, which inevitably contort into full-fledged—and crappy—programming languages. The trouble is that standard Lisp syntax doesn’t work well for data dominated by text, i.e. documents. So, for instance, while you might use a current Lisp in place of JSON, you probably wouldn’t use one in place of HTML.
  • Embedded languages: While some languages arguably shouldn’t exist at all (some haters say this about Java, for instance), other languages, like C and C++, clearly exist for a reason. But the fact that these languages fill necessary semantic niches doesn’t mean that they need their own syntaxes: instead, the right dialect of Lisp could “host” the complete semantics of a foreign language as a library. Consider a C program, which is typically written as a mish-mash of C code, preprocessor directives, and build files (makefiles, etc.). We could create a Lisp library that allows us to write C semantics in Lisp and produces the same end product (executables and object code) but which would elegantly integrate the equivalent functionality of the preprocessor and build chain in a way that is cleaner, more flexible, and easier to learn. If a way can be found for Lisp syntax and macros to provide the ideal amount of syntactical concision for all possible languages, future language designers can forget about syntax and just focus on semantic innovations.3
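
To make the last item a little more concrete, here is a minimal sketch of the idea, written in Python rather than Animus. Nested Python lists stand in for Lisp forms, and every name in it (emit_expr, emit_stmt, emit_function, square, MAX_FN) is invented for this illustration; none of it is taken from Animus or from any real library. It handles only a tiny subset of C and emits source text rather than object code.

```python
# Hypothetical sketch: a fragment of C held as nested lists (standing in
# for Lisp forms) and rendered to C source text. All names are invented
# for illustration and cover only a tiny subset of C.

BINARY_OPS = {"+", "-", "*", "/", "<", ">", "=="}

def emit_expr(form):
    """Render an expression form as C source."""
    if isinstance(form, (int, str)):
        return str(form)                      # numeric literal or identifier
    op, *args = form
    if op in BINARY_OPS:
        left, right = args
        return f"({emit_expr(left)} {op} {emit_expr(right)})"
    # Any other head is treated as a function call.
    return f"{op}({', '.join(emit_expr(a) for a in args)})"

def emit_stmt(form, indent=1):
    """Render a statement form (return, if, or expression statement)."""
    pad = "    " * indent
    op, *args = form
    if op == "return":
        return f"{pad}return {emit_expr(args[0])};"
    if op == "if":
        cond, then = args
        body = emit_stmt(then, indent + 1)
        return f"{pad}if ({emit_expr(cond)}) {{\n{body}\n{pad}}}"
    return f"{pad}{emit_expr(form)};"

def emit_function(form):
    """Render ["defn", return_type, name, [(type, name), ...], *body]."""
    _, ret_type, name, params, *body = form
    signature = ", ".join(f"{t} {n}" for t, n in params)
    lines = [f"{ret_type} {name}({signature}) {{"]
    lines += [emit_stmt(stmt) for stmt in body]
    lines.append("}")
    return "\n".join(lines)

def square(x):
    """A 'macro': an ordinary function that builds a form from a form."""
    return ["*", x, x]

MAX_FN = ["defn", "int", "max", [("int", "a"), ("int", "b")],
          ["if", [">", "a", "b"], ["return", "a"]],
          ["return", "b"]]

if __name__ == "__main__":
    print(emit_function(MAX_FN))
    print(emit_expr(square("n")))
```

Because the program is ordinary data until it is emitted, a helper like square can do the job that a preprocessor macro would do in C, and the step that writes out the result could just as well drive the compiler, which is roughly the role I have in mind for the build chain.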

Now, as it turns out, the Lisp I want in all other respects resembles Clojure, so really what I’m proposing is specifically a Clojure dialect. In fact, implementation of my dialect won’t require much more than swapping out Clojure’s reader, wrapping some of its macros and functions, adding one or two data types, and creating editor assistance.

I’m calling my Clojure dialect Animus. Animus is still very much in flux, but I describe it in its current form here. Also take a look at some experiments with various languages to see what they might look like embedded in Animus.

  1. Or at least, if you are just rearranging furniture, I’d much rather you be honest about it: if you yourself realize that that’s what you’re doing, then you at least have a chance of delivering an actual, if small, improvement to the status quo.
  2. This isn’t actually what I set out to design. When I first started thinking about a language a few years ago, my favorite language was Python, and I didn’t know Lisp, so for a long time I was simply thinking of ways to improve upon Python. At some point, I accepted the idea of prefix notation and macros, and things progressed from there.
  3. Haskell strikes me as a language that could greatly benefit from embedding in Lisp. The few times I’ve attempted to pick up Haskell, I’ve been offended by the ridiculous Perl-like syntax of ad hoc convenience piled upon ad hoc convenience. If there’s something worthy in Haskell’s semantic model, it’s obscured under a mess of syntax.

3 Responses to “Stop creating new languages”

  1. Don Stewart September 1, 2009 at 12:00 am #

    > If there’s something worthy in Haskell’s semantic model, it’s obscured under a mess of syntax

    That’s a very superficial analysis.

    The reason it doesn’t make sense to embed Haskell in Lisp is that it is
    statically typed, so we can use that to optimize it extremely
    aggressively before compiling to native code. It undergoes full type
    erasure statically, resulting in a significant performance benefit over
    runtime dynamic typing, a la Lisp.

    http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=all

    i.e. 1 – 5x faster

    Finally, the worthy semantics of note is referential transparency by
    default, making parallelisation trivial. Something that’s not going to
    happen in Lisp anytime soon.

  2. Brian Will September 1, 2009 at 5:53 pm #

    > The reason it doesn’t make sense to embed Haskell in Lisp is that it is statically typed

    I don’t think you understand the idea. C is static and native compiled too, of course, but an in-memory representation of C code in Lisp can be compiled out to object code. The Lisp layer becomes just syntax and a build system, effectively.

    Languages with their own runtime are a bit trickier. If you wanted, say, to write Python in Animus but target CPython, Animus somehow has to get its hooks into running and managing CPython. Or say you’re writing JavaScript in Animus but targeting the browser: there’s no way to do it, really, except to generate actual JavaScript source (like GWT, I suppose). In any of these scenarios, you’re bridging over “legacy” layers, which is never ideal. The idea, though, is that we would eventually ditch these intermediate layers. But in the meantime, I admit, things would be a bit messy.

  3. Nicholas Harris September 6, 2009 at 7:18 am #

    New languages should be privately tested by their creators by applying them to the task of building whole applications, user interfaces or operating systems. If this resultant software proves to be a success (user-centred, rapidly-developed, efficient and robust) then the language-author can “come out of hiding” and tell the world what weird and wonderful homebrew language they used to make it all a resounding success without fear of ridicule.

    Just creating a freeware game with your new language as a proof of concept may be sufficient to convince others of its utility. You would at least attract potential “modders” who would be prepared to learn your language in order to script new AI behaviours, etc.

    I say this because I have been working on an entirely new language for very many years, struggling to find the exact ingredients to “bake the perfect cake”. My fear being that I will need some feature or programming paradigm in the future and curse the tools at my disposal for not supporting those semantics – something that would be made all the more infuriating by having the knowledge that I was in the ideal position to avoid all of these problems if I had only anticipated them in the design at the outset. That said, I also wanted to keep my language “small” in the sense that it could be described by a thin manual as I felt that there were many ‘full featured’ languages out there already that just seemed to have musty annexes that one didn’t come into regular enough contact with to retain a degree of mastery and which, as a result, forced you back into reading a huge reference manual.

    I don’t want to sound negative as I quite understand the desire to tell people who may be “in a position to understand what you are on about” what you have been able to achieve. I know no one with whom I can talk about the specific details of what I have been working on and have, eventually, come to accept this. In fact, if I were to find that someone really wanted to know, I wouldn’t tell them because I would not want them to steal my ideas and implement something similar to my work long before I got around to doing it (as I am quite a slow worker). I also want my ideas to be taken as a package, rather than “cherry-picked”, as it is how they all fit together that makes it significant.

    Anyway, best of luck with Animus. I can say that designing a language makes for an extremely diverting intellectual pursuit – even if there is no money at the end of it…
