Archive | July, 2009

Learn to program in 20+ easy steps

22 Jul

As it stands, my comprehensive Introduction to Programming series is organized into parts as follows:

1) A first language

To program, you must learn a programming language, so we start by introducing a language called Pigeon. Pigeon was created expressly for students: its grammar is as simple as I believe possible while still reflecting the concepts found in “real” languages. Learning Pigeon first should make tackling your first real language much easier.

2) Numbers

Computers are deeply mysterious until you understand how information is represented as bits. We start with numbers because their bit representation is used as a basis for representing other kinds of data.

3) Text

Text is represented using what are called character sets. The most commonly used character sets are ASCII and Unicode.
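
To make this concrete, here’s a quick sketch in Python (a language not covered in this series, but readable enough): under the hood, each character is just a number defined by the character set.

```python
# Each character is stored as a number defined by the character set.
# In ASCII, 'A' is the number 65, stored as the bit pattern 01000001.
for ch in "Hi!":
    print(ch, ord(ch), format(ord(ch), "08b"))
```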

4) Images

In this part, we briefly discuss some elementary concepts of computer graphics.

5) Hardware and operating systems

Here we cover essential concepts in computer hardware and operating systems.

6) Languages and tools

There is a wide array of programming languages in existence. We’ll survey the most popular ones and discuss their major differences. We’ll also discuss the associated tools, e.g. debuggers.

7) The Javascript language

The popular language Javascript (not to be confused with Java, another popular language) is very close semantically to Pigeon, so it’s a natural choice for our first real language.

8) The Internet and the web

Here we’ll discuss the basic structure of the Internet and the protocols on which it runs.

10) Structured data formats

Structured data is data in which individual pieces of data (numbers, pieces of text, etc.) are related in an organized way. A person, for instance, can be represented as structured data: a name (text), an age (number), an address (text), etc. We have standard formats for such data, such as XML (Extensible Markup Language) and JSON (Javascript Object Notation), among others.
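
For instance, here’s a person as structured data in JSON, built and parsed with Python (a language not covered in this series; the names and values are made up for illustration):

```python
import json

# A person as structured data: named fields holding text and numbers.
person = {"name": "Ada Lovelace", "age": 36, "address": "London"}

text = json.dumps(person)       # serialize the data to a JSON string
restored = json.loads(text)     # parse the string back into structured data
print(text)
```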

11) HTML and CSS

Webpages are documents composed primarily of HTML (Hypertext Markup Language) and CSS (Cascading Style Sheets). We’ll also cover the role of Javascript in webpages.

12) The Unix command line shell

Before graphical user interfaces, computer users interacted with the computer using a command line shell—basically, an interactive programming language in which each command the user types is immediately executed. Today, shells are still powerful tools used by programmers and system administrators. In this unit, we’ll focus on the shell language used most commonly on Unix systems, Bash (the “Bourne-again shell”). We’ll also discuss the command-line programs commonly available on Unix systems and how these programs can be tied together through the shell.

13) Assembly language

Assembly language represents the lowest level of programming. Here we’ll cover assembly language for x86 processors using the NASM assembler.

14) The C language

The C programming language is one of the oldest and most influential languages still in use today. Unlike most other languages (including Pigeon and Javascript), C gives programmers a fine degree of control over the hardware, making it suitable for writing systems software (such as operating systems, like the Linux kernel) and for programs requiring high performance, such as the latest computer games.

15) Data structures and algorithms

There are only so many fundamental ways of organizing data. We’ll discuss these data structures and the algorithms associated with them.

16) Object-oriented programming

Object-oriented programming is a style of programming in which the programmer, rather than focusing on actions, focuses on establishing types of data and the actions associated with those types. This style is strongly encouraged in a number of so-called “object-oriented languages”, including Java.
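
As a tiny illustration (sketched in Python rather than Java, with made-up details), here is a type of data bundled together with its associated actions:

```python
# A type of data (a bank account) together with its associated actions.
class Account:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

account = Account("Ada")
account.deposit(100)
account.withdraw(30)
print(account.balance)  # 70
```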

17) The Java language

Since the late 1990s, Java has been the most commonly used programming language. Java’s success has spawned an imitator from Microsoft called C# (“C sharp”), which differs in many details but is fundamentally similar. We cover Java instead of C# mainly because Java is somewhat simpler and still more popular.

18) Encryption, security, and compression

We’ll discuss the basics of encryption, security, and compression, which aren’t nearly as arcane as you might imagine.

19) Graphical interfaces

To write a program with a GUI (graphical user interface), a programmer uses a library called a GUI toolkit. We’ll focus on one such toolkit for Java called Swing.

20) Version control

When we write code, it’s very nice to be able to keep track of all of our changes so that we can always go back to an earlier version when we mess something up. It’s also really important to coordinate our changes with others working on the same code. For these reasons, programmers use programs called version control systems to manage their code. We’ll focus on two such popular systems, Subversion (abbreviated as “svn”) and Git.

21) Databases

A database is a specialized program for storing large amounts of data in a way that can be searched and retrieved efficiently. Databases are used everywhere: any popular shopping website, for instance, uses databases to store product and customer information. The most commonly used databases are relational databases, meaning they structure data in the style of the relational model. The programs we write typically communicate with a relational database using a query language called SQL (pronounced “sequel”, Structured Query Language).
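
Here’s a small sketch using Python’s built-in sqlite3 module (SQLite is a simple relational database; the table and rows here are made up for illustration):

```python
import sqlite3

# Create an in-memory relational database with one table of customers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (name TEXT, city TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)",
               [("Ada", "London"), ("Alan", "Wilmslow")])

# An SQL query: find the names of all customers in London.
rows = db.execute("SELECT name FROM customers WHERE city = 'London'").fetchall()
print(rows)  # [('Ada',)]
```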

22) Regular expressions

Regular expressions (sometimes abbreviated as “regex”) are a sophisticated tool for finding patterns of characters in text. For instance, using a regular expression, I could easily remove from a text all instances of the word “curry” following the word “lemon”.
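
In Python’s regular expression syntax (the exact pattern here is my own phrasing of the example, not from the series), that lemon-curry removal looks like this:

```python
import re

text = "I like lemon curry, chicken curry, and lemon curry ice cream."

# Remove 'curry' (and the space before it) wherever it follows 'lemon'.
cleaned = re.sub(r"lemon\s+curry", "lemon", text)
print(cleaned)  # I like lemon, chicken curry, and lemon ice cream.
```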

23) The Clojure language

Many regard Lisp as the most elegant of all languages, and Clojure is a particularly elegant recent variant of Lisp. Clojure gives us an opportunity to introduce functional programming, a style of programming in which we avoid “state change” as much as possible.

24) Automating the build process

The whole process of translating source code and data files into a working program is called the build process. In most software projects, we end up building the project many times over as we develop the code, fix bugs, and change features, so it makes sense to automate this process as much as possible. In this part, we’ll discuss popular build tools, such as the Unix make program.

Reinventing the desktop (for real this time) – Part 1

20 Jul

Being of a presumptuous nature, I tend to get big ideas, and among those big ideas are notions of how to “reinvent the desktop”, notions which I call collectively Portals (a play on Windows).

Ain’t broke?

Before I explain Portals in detail, we should establish whether anything is really wrong at all with the modern desktop or if desktop “reinvention” is just a chimera of UI-novelty seekers. This is only prudent because, if we can’t clearly identify deficiencies of the status quo, we may fall into the trap of replacing the status quo with something not truly better, just arbitrarily different.

So let’s first consider what functionality comprises a GUI desktop. A desktop consists of:

  • An interface for starting applications, for switching between open applications, and for allotting screen space between open applications.
  • A common set of interface elements for applications, often including guidelines for the use thereof to achieve a cross-application standard look-and-feel.
  • A data-sharing mechanism between apps (copy and paste).
  • A common mechanism for application associations—what applications should be used to open such-and-such file or send a new email, etc.
  • A set of system-wide keys, e.g. ctrl+alt+delete on Windows.

And because most users don’t/can’t/won’t use a command-line, desktops include a minimum set of apps:

  • File management.
  • System configuration and utilities.
  • Program installation/removal.

Since the 1980s, this functionality has been presented to users on most systems with only minor variations upon the standard WIMP (Windows, Icons, Menus, Pointer) model handed down from Xerox PARC and the first Mac, so, obviously, the modern desktop is not really broken: people have been getting by with essentially the same design for decades now. Still, there is a perennial longing for something better, so the question is: what motivates this feeling?

What’s wrong?

Scaling issues

A fundamental difference between the computing experience of 1984 and the computing experience of twenty-five years later is that users simply do a lot more with their computers: more diversity of tasks, more tasks at once, and a lot more data, both on the user’s local machine(s) and out there on the network. In particular, the window management and file management that made sense for 1984’s attention load just don’t hold up in an age of web distractions and half-terabyte hard drives.

Lack of sensory stimulation and tactile interaction

Only librarians want to live in a grey, motionless, silent world of text, but for a long time, that’s what the computing experience was. Then came icons and windows, and they could move! Quickly this novelty wore off, so today our menus slide, our workspaces spin in three dimensions, and our windows cross the event horizon every time we minimize them. And our iPhones fart.

Moreover, we increasingly expect interfaces to entertain our hands. Touch screens! Multi-touch! Surface top! Gestures! I’ll admit that these developments are exciting, but they’re exciting mainly because we don’t really know what will come of them—our hopes at this point remain very vague. As clearly as we can define it, our hope is that computer interaction can be made satisfying in the same way that a good hit on a tennis ball is satisfying or in the same way that closing a well-made car door is satisfying.

Sadly, these ideas may turn out to be like virtual reality: worlds of possibilities, none of the possibilities very useful. So we may be in just another cycle of the permutations of fashion. Still, aesthetics and feel really do matter to an extent, for a good layout of information and good use of typography tends to be aesthetically pleasing, and good tactile feel, such as proper mouse sensitivity, definitely facilitates usability.

We should acknowledge, though, that computing is no longer a dull, grey world, mostly thanks to the web, not changes in the desktop. This suggests, then, that the best way forward for an aesthetically pleasing and stimulating desktop is to minimize the interface: the less screen real estate occupied by the interface’s “administrative debris”, the less there is that we need to make look good and therefore the less opportunity that we have to fail.

Administrative debris

Edward Tufte coined the term “administrative debris” to denote all of the elements of a UI not directly conveying the information the user really cares about. For instance, the menus and toolbars of most apps are almost entirely administrative debris. Such debris is problematic because:

  • Debris takes up precious screen real estate, which would be better used to present information.
  • Debris distracts the user.
  • Debris requires the user to learn its layout and how to navigate in and around it.
  • Debris is aesthetically displeasing and intimidating because it suggests complexity, both in terms of information clutter and conceptual difficulties.
  • Debris often has to be managed by the user, thereby creating more “meta work”.

Meta work

Meta work is any work which the interface imposes upon the user in addition to the user’s actual work. Meta work is terribly displeasing, the mental equivalent of janitorial work.

Some meta work is hard to imagine getting rid of, such as scrolling through a list of information, for if we really intend to present more information than fits on screen, the user must scroll or page through it somehow. Most interface meta work, however, comes from two sources:

  • Positioning things and navigating. In particular, moving and resizing windows and navigating through menus and dialogs. This also includes any kind of collapsible or adjustable information display. I find file browsers, for instance, to require constant adjustment because the directory tree view and the columns of the grid view are half the time either too wide or too narrow.
  • Debris. When the debris can’t all fit on screen at once, we require mechanisms for the user to manage the debris. The Office 12 ribbon, for instance, requires the user to manage which strip of controls he is viewing at any moment.

Most disconcerting, meta work perniciously tends to beget more meta work because the mechanisms introduced to manage information and controls often themselves take up space and require management.

Indirection

Interactions with information through debris are indirect, so Tufte’s general prescription for minimizing administrative debris and meta work is to make interactions with information direct. For instance, rather than editing properties in a dialog, users should edit values in a screen element directly attached to the affected object or, ideally, edit the object itself.

Direct interactions have the additional virtue that how to perform them is generally more obvious than with indirect interactions. On the other hand, most users aren’t familiar with direct interaction as a convention, so it may not occur to them to try it.


Hierarchy

Because we must hide a lot of things for the sake of limited screen space, a lot of information and administrative debris gets buried in hierarchical trees, meaning users end up spending a lot of time and mental energy navigating (which is really just another kind of meta work). For instance, to change my mouse settings in Windows, I follow the chain Start->Control Panel->Mouse. Or, say, to open a file, I must recall its drive, its directory path, and then finally its name. This hierarchical recall—and the ensuing navigation action—is mentally taxing and error prone.

The usual justification for using a tree is to avoid stuffing everything into one big flat list, but this is generally a misguided tradeoff. Consider a typical hierarchical menu, first in the usual pull-down/pop-out configuration, second in one big scrolling list with dividers between sections. Which is easier to learn? Which is easier to explore? Which is easier for recall? I believe you’ll find the flat list is better on all measures but perhaps one: a long list may be a bit intimidating at first glance compared to a hierarchy that hides the items in submenus by category.

(Actually, the flat list may be better even on this count because a menu which hides complexity is daunting in its own way: the user browsing such a menu quickly finds lots of complexity which they’ll have to recall how to find again later. Besides, the “first contact” shock of a long list can be mitigated with visual design that appropriately emphasizes the right elements. So flat lists arguably win on all counts.)

Now consider file hierarchies. Rather than having to remember that your Twin Peaks / Doctor Who crossover fan fiction is stored as e:/fanfic/twinpeaks_doctorwho.txt, it would be far better if you could just textually filter down by a query for twin peaks who or any other query terms that occur to you by free association. In fact, it would be nice when creating the file if you didn’t have to decide between twinpeaks_doctorwho.txt and doctorwho_twinpeaks.txt and didn’t have to decide whether to place this file in fanfic or some other directory. The lesson here is that:

  1. Hierarchical recall is mentally taxing and error prone. What we really want is free-associative recall.
  2. Hierarchical naming and placement are mentally taxing and error prone. What we really want are tagging and full-text search.

(See Clay Shirky on hierarchy.)
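
A minimal sketch (in Python, with made-up file names) of what free-associative search looks like: keep one flat list and filter it by whichever terms come to mind.

```python
def matches(query, entry):
    """True if every term in the query appears somewhere in the entry."""
    entry = entry.lower()
    return all(term in entry for term in query.lower().split())

files = [
    "twin peaks doctor who crossover fanfic",
    "doctor who episode guide",
    "2009 tax return",
]

# Any terms that come to mind by free association will do.
print([f for f in files if matches("twin peaks who", f)])
print([f for f in files if matches("who guide", f)])
```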

Frustrating discovery and recall

Perhaps the biggest frustration in using software is knowing what you want the software to do and knowing that your software can do it but not being able to figure out how to get the software to do it. These frustrations typically stem from an inability to guess what the developers decided to name a feature and where they decided to place it in a hierarchical menu or dialog chain. For instance, the user looking for a program’s options dialog has to guess whether to look under File->Preferences, Edit->Preferences, Edit->Options, Help->Options, Tools->Options, or some other path.

The general solution here is, again, a big, flat list filtered by textual query. Like disambiguation pages and redirection in Wikipedia, a single item should be associated with any synonyms so that users need not recall the single precise name favored by the developers, e.g. preferences should show up in a query for options and settings.
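
Sketched in Python (the synonym table here is entirely hypothetical), the idea is to attach likely synonyms to each item and match queries against all of them:

```python
# Hypothetical synonym table: each feature lists the names a user might try.
SYNONYMS = {
    "preferences": {"preferences", "options", "settings", "configuration"},
    "print": {"print", "printing", "printer"},
}

def find_features(query, table):
    """Return the features whose synonyms contain the query as a substring."""
    query = query.lower().strip()
    return sorted(name for name, aliases in table.items()
                  if any(query in alias for alias in aliases))

print(find_features("options", SYNONYMS))   # ['preferences']
print(find_features("printer", SYNONYMS))   # ['print']
```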

Redundancy

Thinking up features is easy, but thinking up features that obviate other features is hard. Moreover, once a feature is added to a program, it takes a lot of political will to remove it. Consequently, many interfaces are laden with redundancy.

A degree of redundancy often serves a legitimate purpose, for many tasks should be equally doable by either keyboard or mouse, and common tasks often warrant shortcuts that make up in convenience what they lack in discoverability. In many cases, though, designers have simply let redundancy proliferate unchecked. A typical Windows application, for example, presents the user with at least four ways of closing the application using the mouse:

  • Via the X in the top right.
  • Via the right-click menu of the window on the taskbar.
  • Via the icon menu in the top left.
  • Via the menubar.

Additionally, users can close an application using the keyboard:

  • Via alt+F4.
  • Via ctrl+w.
  • Via accelerator keys for the icon menu.
  • Via accelerator keys for the menubar.

That makes at least eight ways to close an application. This particular case of redundancy is maybe not so bad because most users have a favored method which they use by reflex, but the redundancy still clutters the interface, not just in screen space but in documentation space and mental space.

At its worst, redundancy isn’t just clutter, it’s more meta work heaped upon the user. Not only are such choices more management work, the bother of having to make these choices often lingers on the user’s mind. As Barry Schwartz discusses in The Paradox of Choice, choices are often a hidden source of unhappiness: when presented with a choice, people fret because they want to believe that the choice has a correct answer, even when none exists and even when the disparity of outcomes is inconsequential.

Most choices in interfaces impose very small burdens individually, but together they add up, and too often, designers underestimate this burden of choice. When users find themselves making little choices about the best way to do something, it’s quite likely the interface should be making those choices for them.

Thwarted reflexes

The opposite of making a choice is to act upon reflex. Enabling good reflexes and consistently rewarding them gives users a very satisfying feeling of control.

Ideally, a good reflex action should be context-free, meaning it shouldn’t require a particular desktop or application state. For instance, alt+tab is a desktop-level reflex that is supposed to work in all contexts such that, at any time, the user can hit alt+tab to get back to the window that last had their focus. Unfortunately, this reflex doesn’t work in some contexts, such as in some fullscreen games that either don’t respond to this command or only do so very slowly. Another aggravating example is Flash in the browser, which often steals keyboard focus and thus blocks the alt+d, ctrl+k, and ctrl+t commands.

Some reflexes, though, users pick up like bad habits. In Windows, I’m in the reflexive habit of hitting windows+e every time I wish to browse to a folder even if I already have that folder open as a window, thereby creating more meta work for myself in the form of another folder to close. A better designed reflex action would get me to my desired folder while somehow avoiding this duplication, for well-designed reflex actions don’t lead users down the wrong path.


Virtuality

Because hierarchies suck, designers frequently provide shortcut paths to various nodes in hierarchies. For instance, file dialogs in Windows Vista provide shortcut buttons to standard directories like Documents and Pictures. Or, for example, the display settings in Windows can be accessed via right-clicking the desktop rather than going into the Control Panel, but both paths take you to the same dialog.

The problem is that this virtuality not only introduces redundancy, it also presents an inconsistent and disorienting picture to users and burdens them with more arbitrary crap to remember. Virtuality makes hierarchies more confusing, not less, because the same “shape” is presented in many different alternate forms, obscuring the “true” shape and thereby hindering discovery and spatial recall. Furthermore, when the user can’t picture at least the outline shape of the possibilities open to her, she feels surrounded by hidden pitfalls and paralyzed by choice.

Textual search is technically a virtual kind of access, but it doesn’t share these problems. If I access my Doctor Who / Twin Peaks crossover fan fiction by searching for who peaks, this isn’t another bit of arbitrariness for me to have to recall later, it’s just the set of terms that occurred to me at the moment by free association.

Burdened and stolen focus and attention

There’s a word for a person who repeatedly calls your name and taps you on the shoulder: annoying. We also have a word for someone who tries to hand you something when your hands are full already: asshole. So it’s not surprising that the most commonly cited interface annoyances are those obnoxious little pop-up windows that demand your attention and steal your keyboard focus.

Obviously, having your attention actively stolen is bad. Less obviously, meta work in all forms steals attention, but usually passively and in small chunks: after all, attention focused on meta work is attention taken away from actual work.

If there’s something many people feel increasingly short on in the networked world, it’s attention. A well-designed interface enables the user to focus on their own actual work, switching between tasks with little friction.

All your conventions suck

Now let’s get into some concrete criticisms of actual mechanisms commonly used today:

Icons suck

To a large extent, icons exist just as an excuse for designers to introduce eye candy, but the usual justification designers give for using icons is the truism that ‘simply having users point at the very thing they want is the simplest and most intuitive kind of selection.’ This is misguided:

  • Pictographs do not scale as well as text because you can’t alphabetize or do searches on images.
  • As you add more and more icons, the visual distinctiveness of each icon quickly gets murky and ambiguous.
  • Icons are generally not “the very thing” that users are looking for. A pictograph typically provides hints about the thing it represents but is not synonymous with the thing itself.
  • Worst of all, interpreting pictographs is more mentally taxing than reading a word or two, especially when the semantic content is even mildly abstract.

The crux here is that it is far easier for people to recall the general qualities of a picture—its dominant colors and overall shape—than it is to recall its precise details. Also, compared to abstract images, it is much easier to recall the details of images of recognizable objects because we can mentally fill in the blanks with our assumptions of what such objects look like. For instance, if shown a picture of a car, a viewer immediately discerns the notion of a car, not because the viewer quickly absorbs all the visual detail but because she immediately registers a few key details and then her mind fills in the missing pieces. This explains why most icons in software are so bad: most are small, indiscernible messes, so users fail to recognize what they depict and learn to think of them as abstract shapes.

Now suppose I know what I want my software to do but don’t remember at all how the interface designers decided to label that function with text or an icon. If I’m looking for a label, I have to figure out what words the designers chose to describe it, which often requires consulting my mental thesaurus. In contrast, if I’m looking for an icon, I have to figure out what words the designers chose to describe the feature and then figure out how the designers chose to represent those words as an image. While the number of synonyms for a particular concept can be frustratingly many and elusive, the possible visual representations of a concept are innumerable: even if you narrow down the concrete object(s) being depicted, there are still the variables of perspective, composition, style, and color.* Moreover, users can always fall back on actually reading a list of words till they find a likely match; this is reasonably doable, in contrast to “reading” a list of icons, which is painful and slow.

* (Sure, many real-life objects only come in one color, but many don’t. In fact, looking over the icons in a few applications, I notice that a strong majority have basically random color assignments, either because of the nature of what they depict or because of the need to make them stand in contrast to their neighbors.)

To the extent you do use icons, follow these guidelines:

  1. All but the most frequently encountered icons should be labeled with text. Many applications omit text labels because small, unlabeled icons minimize the space used by buttons (see Photoshop). This is a poor trade-off: first because of the image-recall issues outlined above, and second because even the best designed icons rarely communicate their function as clearly as a word or two of text. In fact, the real virtue of icons is that their shape and color make them noticeable to peripheral vision or visual scanning, so they help users find points of focus and do an initial culling of their possible options. After that initial culling stage, however, users have only narrowed their options and so prefer the relative precision of words to help them make their final selection.
  2. Icons should be simple in shape, distinct in silhouette, have contrasting interior lines, and almost never use more than two dominant colors.
  3. Icons should be as big as necessary to make them conform to rule 2.
  4. The number of icons that it is acceptable to use is proportional to how large and distinct they are, vis-a-vis rules 2 and 3. The array of icons found in today’s typical complex apps, like word processors and Photoshop, is about three times too large.

Icon view sucks

Compared to the detailed-list view of files, the icon view is a paragon of form over function. Not only should icon view not be the default folder view, icon view should not exist. It’s flat out stupid. Not only is the browse-ability of a list in one dimension far superior to a list in two dimensions, a two-dimensional listing must be rearranged when the view width changes, meaning icons end up changing their horizontal positions, thereby disorienting the user and thwarting his spatial recall.

(A thumbnail view of pictures is a special exception to this rule.)

Thumbnail previews suck

Continuing with the theme of pictures being a false cure-all, thumbnail previews of windows and tabs rarely justify their use:

  • First, most such previews are triggered by a delayed reaction to a mouse hover, which tends to mean they pop up too soon half the time and too late the other half.
  • Second, even with great anti-aliasing, a two or three square inch representation of a full window or tab is often just too small to make out clearly.
  • Third, most documents and tabs consist mainly of text and so very often look pretty much the same, especially when shrunk down to a small preview.
  • Fourth, the user may expect to see one portion of a scrolled document and so not quickly recognize the document when the preview shows another portion.

For previews to be worth the mental burden, they need to be instant and large, perhaps even full-sized.

Animations suck

Currently, much work is going into GUI toolkits to make it easy to add UI animations, such as having elements that slide around. The inevitable problem with animations, though, is that they introduce action delays and so must be kept very short; yet the shorter the animation, the more it defeats its original intent, which is to convey to users where elements go to and come from. (See Philip Haine’s critique of Apple FrontRow.)

Settings management sucks

Desktop settings management exhibits virtuality gone mad. On the one hand, Windows has Control Panel and Gnome has a Settings menu—central places to do configuration—but centrality is deemed too inconvenient for some cases, so we sprinkle special access mechanisms ad hoc throughout the desktop. In Windows 7, for instance, the start menu includes both Control Panel and Devices and Printers even though Devices and Printers is just an item in the Control Panel. Or, for instance, the Network and Sharing Center is an item in the Control Panel, but it’s also accessible via Network in the left panel of the file browser. Worse, some settings are not found in the Control Panel at all, e.g. folder options are in Tools->Folder Options of the file browser but not in the Control Panel. Most ridiculous and aggravating, though, is how these ad hocisms change with each release such that the user’s hard-learned arbitrary nonsense becomes useless. In the end, the path to every setting becomes an ad hoc incantation, a little piece of version-specific arcana to document in user manuals with a dozen screen shots.

The Desktop itself sucks

Interface design is largely about rationing precious screen real estate, and…

…hey, everyone! Here’s this big blank surface going unused! Let’s give it a random assortment of redundant functionality to make up for the inadequacy of our main controls! Sure, the start menu already has a frequently-used program list, but it’s too orderly. And users already have a home directory, but they can’t see its contents at the random moments that their un-maximized windows are positioned just so. Users love messes! Hmm, now we just need umpteen different special mechanisms for hiding all these windows that obscure this precious space.

*Ahem*…yeah. Put another way:

  • The desktop creates clutter by encouraging people to use it as a dumping ground for files.
  • The desktop contains ‘My Computer’ but itself is contained by ‘My Computer’. Well done, Microsoft, for helping make the concept of files and directories clear, and so much for the metaphor of files as physical objects (which isn’t a good metaphor to begin with, but if you’re trying to go with a metaphor, stick with it).
  • The desktop as a working surface necessitates mechanisms to get at it easily from behind all of these damn windows.
  • The desktop compensates for inadequacies of the start menu and file browser by duplicating some of their functionality, so users are presented with the silly choice of whether to put an application shortcut or file on their desktop and/or in the start-menu/dock, and then later they have to remember where they put it and possibly make an arbitrary choice of which to use.

Menu bars suck

The drop-down, pop-out style of menu found in application menu bars is optimized for minimal obtrusiveness (both in terms of visible space and visibility time) and for minimal mousing (both in terms of motion and clicking). Unfortunately, these optimizations are ultimately inadequate:

  • First, as most applications have conceded, users simply don’t like using the menu bar for frequent accesses, so applications add redundant shortcuts, such as toolbars, for frequently used items.
  • Second, many users find mousing through these menus frustrating despite refined mousing affordances.
  • Third, these standard menus have an artificially limited vocabulary—both visual and functional (e.g. sliders and textfields can’t be menu items*)—so all but the simplest features get shunted into pop-up dialogs.

* (Clicking an item is supposed to dismiss the menu overlay every time, which wouldn’t work for textfields or sliders as items.)

Worst of all, menu bars are not only hierarchical, they present their hierarchy confusingly: their various menus and submenus overlap and flash in and out as the user mouses, and because floating dialogs are untethered from the items which open them, users quickly forget how to get back to dialogs.

Context menus suck

Pop-up context menus suffer most of the same ills as menu bars, and they introduce redundancy. In Firefox, for example, the context menu of the page includes back, forward, reload, stop, and several other items also found in the menu bar.

On the plus side, a context menu doesn’t suffer from the same hierarchical recall problems as menu bars (unless the context menu includes many submenus). However, each context menu effectively presents a virtual view into the menu bar: the menu bar is where all my controls live, but right-clicking different things shows me different mixes of those controls, and sometimes it even shows me things not in the menu bar. This virtuality is bad for all the reasons discussed above.

Dialogs suck

Developers love dialogs because dialogs allow developers to avoid hard decisions of positioning and sizing. Don’t know where to place a feature? When in doubt, stuff it into a dialog.

Yet most users hate dialogs:

  • First, navigating to dialogs is often a frustrating discovery, recall, and mousing process.
  • Second, dialogs not only steal focus, they often block interactions with their parent windows.
  • Third, dialogs have a tendency to get lost behind other windows because they’re generally small and don’t show up in the taskbar list.
  • Fourth, it’s often unclear how users should close a dialog. For instance, clicking X in the top-right is sometimes effectively the same as clicking cancel but sometimes effectively the same as clicking OK.

If there’s anything worse than a dialog, it’s a dialog spawned from another dialog. Thankfully, most of today’s applications have learned to avoid that particular sin.

Toolbars suck

Application developers resort to redundantly placing menu bar items in toolbars mainly because menu bars suck. The redundancy this introduces is aggravating enough, but on top of this, toolbars usually consist mainly of icons (which, recall, also suck), and just like menu bars, most toolbars artificially restrict themselves to simple buttons and thereby end up punting complexity into dialogs. Triple suck score.

In simple applications, like web browsers, the redundancy is not so bad, but as applications get more complex, the number of convenience icons tends to grow (think Word or Photoshop) until the redundancy becomes a nuisance to newbie and experienced users alike: newbies find the preponderance of overlapping choices confusing and distracting; experienced users find repeatedly making the arbitrary choice of whether to look in the menu bar or toolbars bothersome and distracting.

The taskbar sucks

Like the web browser tab bar, the taskbar suffers from an intractable dilemma: in the horizontal configuration, it scales poorly past 7-9 items; in the vertical configuration, more items fit naturally, but each item has less space for its title unless you’re willing to make the bar a few hundred pixels wide. Widescreen monitors alleviate the space problem in both configurations, but not sufficiently to dissolve the problem.

The start menu sucks

Since Windows 95, the start menu has been arranged in a hierarchy of aggravating pull-out menus, with each program typically getting its own folder. Vista has sensibly moved towards textual query over a flat list, but the flat list is only flat-ish because folders remain. Not only do the folders mean that most items in the list have unhelpfully identical folder icons, but virtually all the folders have no reason for being. I don’t need a folder that contains X and Uninstall X: if I want to uninstall X, I’ll use Programs and Features in the Control Panel like I’m supposed to. And if a folder contains items other than the program itself, those items can simply be standalone entries or can be moved into the application’s own menu or splash dialog (World of Warcraft does this).

So if I had control of the Windows 7 start menu, I would simply:

  • Put every item in one big scrolling list, doing away with All Programs.
  • Get rid of folders.
  • Add section dividers.
  • Make the whole menu taller, if not the whole height of the screen, and make the program list section wider so that long names are more presentable.
  • Move the items on the right side of the menu into the left side or simply get rid of them, e.g. Shut Down and Control Panel go in the program list. (If users really need to access these features so quickly—which I don’t think is the case—just add shortcut keys.)

You might object that getting rid of categorical hierarchy means programs can’t be browsed by type, but this is not really the case. First, programs should be arranged into appropriate sections with titles. Second, when menu items are textually filtered, they can be filtered on tags as well as names, e.g. filtering on game should show any game program whether or not it’s in the section games or has game in its title.
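The tag-filtering idea above can be sketched in a few lines of Python. The program names and tags here are invented for illustration; the point is only that a query matches against tags as well as names:

```python
# Hypothetical program records: a name plus a set of descriptive tags.
PROGRAMS = [
    {"name": "Solitaire", "tags": {"game", "cards"}},
    {"name": "Minesweeper", "tags": {"game", "puzzle"}},
    {"name": "Notepad", "tags": {"editor", "text"}},
]

def filter_programs(query, programs=PROGRAMS):
    """Return names of programs whose name or tags match the query."""
    q = query.lower()
    return [p["name"] for p in programs
            if q in p["name"].lower() or any(q in tag for tag in p["tags"])]
```

With this, filtering on “game” surfaces both games even though neither has “game” in its name, which is exactly the behavior the section titles alone can’t provide.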

Application windows suck

The primary reason to put applications in free-floating windows is so that users will be able to put applications side-by-side, even though doing so is, in truth, at best a niche use case. The problem is that positioning and sizing windows takes a lot of bothersome meta work, especially when maximizing a window’s space usage.

Furthermore, window overlap requires the user to make annoying random choices of how to get at a particular window. Shall the user move or minimize other windows to get at the window underneath? Or should the user alt-tab directly to the window? Or use the taskbar/dock?

In the end, windows burden users with meta work and unnecessary choices for virtually no real benefit. Of course we should have the capability to see applications side-by-side, but we shouldn’t build the whole desktop around the idea.

Drag-and-drop sucks

For drag-and-drop to work efficiently, the drag source and drop target must be in view, but this is very rarely the case without burdensome pre-planning on the user’s part, especially when dragging from one application to another. Nearly as bad, users often mess up drags because drop targets are often unclear or finicky, resulting in unintended actions that must be undone. Users also sometimes simply change their mind mid-drag but are given no obvious way to safely abort the action. Finally, drag-and-drop actions are often poorly discoverable. In iTunes, for instance, the only way to move individual tracks to a device is by drag-and-drop, which many users fail to figure out on their own.

Virtual desktops suck

Floating application windows suck, hierarchies suck, and the desktop itself sucks, ergo virtual desktops suck. (And note how virtual desktops make drag-and-drop suck even more than it already does.)

Gadgets/Widgets/Gizmos/Plazmoids/Desklets/Applets all suck

Application windows suck and the desktop itself sucks, but applets are fucking ridiculous.

OK, I’ll walk that back a bit. Little status/info panel thingies? Fine, but let’s neatly organize them into some proper window rather than dump them onto the desktop surface (which, recall, needs to die).

If an applet is something the user actually interacts with at length, such as a game, there’s no reason whatsoever not to make it a proper application.

Wrong track

Before finally laying out Portals, let’s examine the good and bad interface reform ideas currently in circulation. First, the bad ideas follow four general themes:

Eye candy

Elitism is an essential part of human aesthetics. For instance, while we normally think of the criteria that make a good-looking person good-looking as objective, much of the attraction towards that person hinges on the rarity of their looks, not the looks themselves, per se. Similarly, gold is shiny, but an essential part of its worth is its rarity.

We see this in graphic design as well: what we consider stylish design hinges a lot on what is simply hard to duplicate. In the 60’s, this meant curved plastic furniture; in the 80’s, this meant cheesy computer video effects; today, this means web pages with rounded corners and glossy effects.

On the desktop, today, elite style means using hardware graphics acceleration because, five years ago, no desktop had it. As it stands right now, none of the major desktops have totally sorted out the infrastructure to make acceleration work ubiquitously, nor has the software caught up to make use of the new toy.

The trouble is that the set of new possibilities which acceleration opens up includes a lot of distracting, silly ideas which actually detract from usability. The obvious example of falling into this trap is Compiz and similar projects. Even aside from the purely aesthetic toys in these projects (such as drawing flames on the desktop), many of the features clearly exist purely for the sake of ooh…shiny.

Virtual physicality

Graphics acceleration has also led designers to create physical-simulation abominations: 3D desktops such as Real Desktop and Grape.

This review of Real Desktop sums up the problem:

We can’t count the number of times we wished our Windows desktop was as messy as a regular desk. You know, because we’ve never really wished for that. But that’s exactly what Real Desktop lets you do. Oh yeah, it also turns your desktop into a 3D workspace.

While the 3D desktop is certainly pretty, we’re not sure it’s particularly useful. You can move icons around the screen with a left click. Click both of your mouse buttons to “pick up” an icon, or click the edge to rotate it. Probably the most fun you can have is when you highlight a bunch of icons and then drag them into another group of icons and watch them scatter like bowling pins.

Of these desktops, Grape is the least offensive because it mainly sticks to two dimensions, but it still exhibits everything bad about icons and drag-and-drop and imposes a heap of meta work upon the user in the form of innumerable icons, boxes, and text labels to create, position, and manage.

After a little thought and experimentation, it should be evident that treating virtual things as if they are like physical things is satisfying only up to the point where it becomes maddening, for the physical world simply does not scale the way the virtual world can. Sure, these desktops look neat and manageable when you have a couple dozen files, but who has just a couple dozen files anymore?

Manual, transitory organization

When people work in a physical space, they develop organization habits and strategies to cope with the mess of things before them. On your desk, for example, you might keep your personal stuff segregated from your business stuff, which makes sense because, as you work in one domain, you don’t want interference from another domain.

In the virtual world, however, such interference is not a problem: if I don’t have personal documents open at the moment, they don’t in any sense get in the way of the business documents I’m working on. If I do have a personal document open, presumably it’s because I’m switching my attention back and forth to that document. If I were to segregate my current items of attention, I wouldn’t solve the problem that I simply have only one focus of attention to give.

Interfaces that allow users to group or order items for the sake of coping with their number are imposing meta work on the user. Worse, grouping introduces hierarchy such that, to select an item, the user first must recall what group it’s in.

These burdens on the user often make sense when the user is organizing persistent state (e.g. files), but not transitory state. So, for instance, users shouldn’t order their browser tabs and group them into separate browser windows. Rather, the interface should automatically help users cope with dozens of open tabs in a way that obviates this manual work.

Half of the new interface design proposals I see assume that users would like doing manual, transitory organization, I think because the idea seems like it reflects the “natural” way people think and work. This probably stems from a sort of grass-is-greener fallacy: having worked on computers for so long, people begin to feel they’ve lost the virtues of physical paper work, forgetting why they moved away from paper in the first place.

Special pleading

In many desktop and web browser proposals, certain often-used applications and often-used sites are given special priority, usually in the form of convenient-access mechanisms. For instance, a number of design proposals for GNOME and netbook Linuxes elevate personal contacts—IM, email, address book, etc.—to first-level status on par with applications and file directories. Such proposals may have a proper motivation, for perhaps our current general mechanisms really don’t suit a particular common task or workflow. However, we should always try to rethink our general mechanisms before introducing special cases. For one thing, special exceptions tend to please one set of users to the great annoyance of others. For another, each exception is a design complication that all users must learn (or at least learn to ignore) and which inevitably becomes a barrier to change.

Steal from the best

Despite what the previous six thousand words might convey, I don’t actually hate everything. In fact, Portals largely synthesizes a number of ideas from existing stuff, the most notable being:

  • The Firefox AwesomeBar
  • Quicksilver/Enso/Ubiquity
  • Wikipedia, Google, and various other sites

The things these examples do right fall under a few general themes:

  • Responsive, text-based navigation and action (e.g. search, text links, and commands)
  • Tags, not hierarchies
  • Lists sorted by recency and frequency
  • Chrome-minimal design
  • Typography-focused design
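
On “lists sorted by recency and frequency”: Firefox’s AwesomeBar ranks items by a score along these lines (often called “frecency”). The formula below is my own simplified illustration of the idea, not Firefox’s actual algorithm:

```python
import time

def frecency(timestamps, now=None, half_life=7 * 24 * 3600):
    """Score an item by how often and how recently it was used.

    timestamps: Unix times at which the item was accessed.
    Each access contributes 1 point, halved for every half_life
    seconds of age, so frequent recent use beats stale heavy use.
    """
    now = time.time() if now is None else now
    return sum(0.5 ** ((now - t) / half_life) for t in timestamps)

def sort_by_frecency(history, now=None):
    """Sort (item, timestamps) pairs, best-scoring first."""
    return sorted(history, key=lambda kv: frecency(kv[1], now), reverse=True)
```

An item used three times this week outranks one used once months ago, which is the behavior a good launcher or location bar wants.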

Having already trashed the alternatives, I won’t give these ideas detailed justifications, but “typography-focused” requires some explanation:

Whether you like the term Web 2.0 or not, we definitely did see a quiet revolution in web design somewhere around 2002. This new style is associated superficially with rounded corners and shiny gloss, but there’s more substance to it.

In the web’s first decade, designers strove to imitate magazine layout, wherein eye candy is stuffed into an asymmetric grid of boxes surrounded by cluttered, omnipresent headers and navbars. This style was motivated mainly by:

  • An aversion to simple flow layouts. No self-respecting designer wants their stuff to look like a Geocities page. By fighting the natural bias of HTML/CSS for flow layout, you get a look that’s hard to reproduce and therefore “professional”.
  • An inability to decide what’s really important. Business people in particular have a hard time coming to terms with the fact that, for some things to stand out, other things must be deemphasized. Of course you want visitors to partake of all your wares, but what do visitors want?

Today, good web design is typified by generously spaced and well-formatted text in one, two, or, occasionally, three columns that are allowed to flow down the page rather than being divided into unnecessary widget boxes.

To be clear, “typography-focused” doesn’t always mean ditching images and widgetry in favor of more text. Take Amazon, for example: not an exemplar of the new style, but it exhibits subtle improvements when you compare the site of 2009 to that of 2000. Like many shopping and portal sites, Amazon still retains much of a cluttered magazine layout, but you can see how the site today better uses images, colors, boxes, and spacing to avoid a ‘mass-of-text’ look.

The point here is that typography is about the complete presentation of the text—its context—not just the text itself. When text is presented well, you can do more with it, as many web designs in this decade have shown.

First principles

Before finally getting into the actual design of Portals, I’ll summarize the design philosophy in four slogans:

Don’t make me think

The title of Steve Krug’s book, Don’t Make Me Think, works as a great design mantra because it succinctly states that:

  1. The most important thing in interface design is the user’s thought process.
  2. Users would rather not have a thought process.

Obvious, perhaps, but easy to lose sight of when caught up in design details.

The explanation is the design

Is your design hard for users to understand? Does user proficiency hinge upon hours of practice and study? The best way to answer these questions is to start by writing the manual. Sometimes this will lead to changes in design, but often all that’s required are some changes in wording or terminology. In any case, your first concern should be how to explain the design to its users, not other designers and programmers.

The right features and only the right features

As I stated above in passing, it’s easy to devise new features, but it’s hard to devise features that make other features unnecessary.

It’s not worth it

Lastly, ‘it’s not worth it’ is a handy, all-purpose way for me to shout down anything I don’t like:

Me: It’s not worth it!
You: What’s not worth it?
Me: It!

But the mantra has a non-abusive purpose as well. Ask yourself, say, ‘Why have we stuck with menu bars for so long?’ Well, when anyone argues that menu bars suck, the perfectly correct reply comes back that a menu bar is the optimal way to minimize mousing over a hierarchy of things. The problem is that this argument hinges upon the hidden assumption that efficiently mousing over hierarchies is of primary importance. Such hidden assumptions are the “it” to which I refer. What you think is so important perhaps isn’t.

‘It’s not worth it’ also works for cases where users themselves lose track of what’s really important. For instance, I advocate getting rid of the desktop surface, but I just know some people will object. ‘Users love wallpapers,’ they’ll say, never mind that wallpapers exist solely to (literally) paper over an unnecessary problem. The proper reply here is that good design requires balancing users’ desires to give them what they really want, and sometimes that means disregarding some desires for the sake of others.

Continued in part 2.


4 Jul

Did Kristol ghostwrite Palin’s speech?:

The odds are against her pulling it off. But I wouldn’t bet against it.

That’s some shrewd betting strategy, betting on the outcome you yourself find least likely.