Syntax does/doesn’t matter
January 31, 2007 – 8:35 am
Syntax doesn’t matter: any good programmer works with multiple languages over their lifetime, most of these languages expressing basically the same ideas in mostly arbitrarily different ways; any serious student of programming will come to the same conclusion once they learn their third or fourth language.
I’ve seen this in another context, trying to learn Arabic. This Slate article explains the difficulties of this undertaking very well, and it points out what I experienced myself: reading the script is the easy part, and actually not all that hard. Excluding the elaborate calligraphies, English-speaking learners of Arabic should only need a month or two to feel reasonably comfortable quickly identifying the characters and reading them chained together. Additionally, it is surprisingly easy to adjust to reading right-to-left: after a short time, the brain simply makes the switch. (Remembering to open books from what seems like the back takes a lot longer.) Programmers develop this same ‘switch’ when working with different syntaxes.
But syntax, of course, does matter on two ends: programmers have to write in a syntax, and learners have to learn a syntax. Really, when people say ‘syntax doesn’t matter’, they mean that programmers (and those learning to program) shouldn’t think in syntax. This is true: whatever the syntax, it is not going to change the semantics of what the programmer codes, nor is it going to change the core concepts of a language which the student must learn.
My main purpose in proposing a new educational programming language is to give learners a language that expresses the core features common to modern languages in the syntactically most direct and obvious way possible. This principle should be applied to the libraries of the language as well, so for instance, abbreviations that are non-obvious to a non-programmer must be excised from the language, including the libraries, e.g. the language can’t have anything like C’s “stdio” (“standard input/output”) or “sprintf” (“string print formatted”). (Such abbreviations would be more forgivable if full names were given in learning materials, but out of the dozens of tutorials and references of the C language I have read, only a couple do so.) Taken individually, shortcuts like these may seem to the initiated like small hurdles, but only because the initiated can no longer see how non-obvious such shortcuts are. The small hurdles quickly add up. Every quirk, every historical legacy, every shortcut is one more thing which at some point is going to cause the learner frustration, possibly halting their progress.
To transition learners from Pygeon (my educational language) into C and Java, it therefore makes a lot of sense to give learners intermediary languages, ones which are as free as possible of the quirks, historical legacies, and shortcuts found in C and Java. Call these intermediary languages PygeonC and PygeonJava (calling them CPygeon and JPygeon would imply we are talking about particular implementations of Pygeon, like with CPython and JPython). Syntactically starting from a base of Pygeon, these languages would be directly translatable into valid C and Java (in fact, this would be the simplest and best implementation method).
To give you an idea, I’ll discuss a few of the syntactical foibles of C and how the semantic content behind the foibles might be more obviously (though likely more verbosely) expressed in a Pygeon-derived syntax:
The first question is whether PygeonC would need to introduce the control flow features of C not found in Pygeon (this includes for, switch, do-while). You can program C just as well without these constructs as you can with them, so it seems OK to omit them. On the downside, not including these constructs delays introducing them to students until they encounter real C, but I feel this is the right choice, as PygeonC is meant to introduce learners to the concepts of C, and these constructs arguably are just syntactical conveniences.
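To make the "just conveniences" claim concrete, here is a small sketch in plain C that pairs each construct with an equivalent written using only while and if/else. The function names and the example tasks are made up for illustration, and the sketch ignores subtleties such as how continue behaves inside a for loop.

```c
/* Sum the integers 1..n with a for loop. */
int sum_for(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* The same sum using only while: the initialisation moves before
   the loop and the increment moves to the end of the body. */
int sum_while(int n) {
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* Days in a month of a non-leap year, using switch. */
int days_switch(int month) {
    switch (month) {
    case 2:
        return 28;
    case 4: case 6: case 9: case 11:
        return 30;
    default:
        return 31;
    }
}

/* The same decision as an if/else chain. */
int days_if(int month) {
    if (month == 2) {
        return 28;
    } else if (month == 4 || month == 6 || month == 9 || month == 11) {
        return 30;
    } else {
        return 31;
    }
}

/* Count decimal digits with do-while: the body always runs once,
   so 0 is correctly reported as having one digit. */
int digits_do_while(unsigned int n) {
    int count = 0;
    do {
        count++;
        n /= 10;
    } while (n != 0);
    return count;
}

/* The same count with while: the first pass of the body is
   written out by hand before the loop. */
int digits_while(unsigned int n) {
    int count = 1;
    n /= 10;
    while (n != 0) {
        count++;
        n /= 10;
    }
    return count;
}
```

Each pair returns the same result for the same argument; the while and if/else versions are only somewhat longer, which is the trade-off accepted above.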
Goto and labels, however, aren’t quite just conveniences, as they often open up significantly different ways of expressing some particular logic, so I feel their inclusion is warranted. More importantly, the fine-grained control offered by goto and labels is in the spirit of C, even though their use is best avoided in almost all cases even in real C. PygeonC’s goto statement will look just like C’s, but labels will be declared with the label keyword and not followed by a colon, e.g. label foo. A label must go by itself on the line preceding the statement it labels.
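One common case where goto changes how the logic is expressed is leaving two nested loops at once. The sketch below is ordinary C, not PygeonC; the function names and the flat-array layout of the grid are assumptions chosen for the example. The second function shows the same search without goto, which needs an extra flag tested in both loop conditions.

```c
/* Search a rows x cols grid (stored row by row in a flat array)
   for target. Returns the flat index of the first match, or -1. */
int find_with_goto(const int *grid, int rows, int cols, int target) {
    int index = -1;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            if (grid[r * cols + c] == target) {
                index = r * cols + c;
                goto found;   /* leave both loops in one jump */
            }
        }
    }
found:   /* in PygeonC this would be a line reading: label found */
    return index;
}

/* The same search without goto: a flag has to be checked by
   both loops so that they stop once a match is seen. */
int find_with_flag(const int *grid, int rows, int cols, int target) {
    int index = -1;
    int found = 0;
    for (int r = 0; r < rows && !found; r++) {
        for (int c = 0; c < cols && !found; c++) {
            if (grid[r * cols + c] == target) {
                index = r * cols + c;
                found = 1;
            }
        }
    }
    return index;
}
```

Both functions return the same index for the same input; the difference is only in how the early exit is written.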
I’ll discuss more C/PygeonC syntax in my next post, What’s the matter with C?.