I originally posted this as a screencast, but I figure a lot of people want to scan rather than sit through a whole 40 minute presentation, so here’s the same stuff (somewhat abridged) in text form.
In part 1, I made a negative case against the desktop interface as it currently exists, but I promised to make a positive case for my solutions. Because it would take at least a few weeks to put together a complete presentation, I thought it more timely if I instead present the ideas in installments (and hey, more reddit karma whoring this way). Most of the pushback (both constructive and vitriolic) to part 1 concerned my ideas about lists, so I’ll start there.
Lists good, hierarchies bad
Many of the most notable recent innovations in software have revolved around lists:
- Before Google, people had the idea to organize the web in a catalog, a big hierarchy of everything, e.g. the Yahoo directory. After Google, it became clear that a list of search results is far superior, and now such directories are mostly remembered with head-shaking bemusement (to the extent they’re remembered at all).
- Gmail greatly deemphasizes the notion of sorting mail into separate folders and instead organizes mail by tagging and search.
- Before iTunes and its imitators, users would play their music by navigating into folders, e.g. ‘music\artist\album\’. Today, iTunes simply presents everything in one big list that is textually filtered.
- A blog is basically any site on which new content appears strictly in a chronological list: new stuff comes in the top, old stuff goes out the bottom. So, for instance, on a non-blog like Slate.com, some attempt is made to hand-editorialize the presentation of content on the front page, as in a magazine, but on Boingboing.net, the authors just create new content and post it into the stream.1
- Link-driven sites, like Slashdot and Reddit, also revolve around lists.
- So do many social sites, like Twitter and Facebook.
The way these examples use lists differently is mainly in how they order their items. For instance, in Google search, results are ordered by relevance to the query whereas, in Reddit, items are ordered by a combination of chronology and user votes. The key lesson here is that, if you can find the right way to order and filter things, you probably are best off presenting them in just a big, flat list.
My favorite example of this is the AwesomeBar introduced in Firefox 3. The AwesomeBar filters my history and bookmarks as I type and orders items by “frecency”, a combination of the recency and frequency with which I’ve accessed the items. This means that I can type, say, ‘sl’, and my Slashdot.org bookmark will reliably appear at the top of the list. So when I want to visit Slashdot, I just reflexively type <alt+d>, ‘sl’, <down>, and <enter>. I don’t have to navigate a menu of any kind, I just act on reflex. This works so well, in fact, that I don’t use the regular bookmarks menu at all anymore.
The AwesomeBar isn’t without flaws, however. Consider that there are three different basic cases of search:
- In some cases, I know specifically what I want.
- In other cases, I only know generally what I want, e.g. I want to play some game, but I haven’t decided on a game, and perhaps I’m not sure about my options.
- In the remaining cases, I just want to browse. Sometimes this is because I’m just bored and looking for something to do, but often I browse because I just want a refresher on what things exist, e.g. I browse my calendar because I need to see if there’s anything there I’ve forgotten.
While the AwesomeBar is awesome when I specifically know what I want, it’s somewhat less awesome when I only know generally, and it’s not at all awesome when I know not at all. In particular, I want a way to browse the sites which I’ve bookmarked but haven’t returned to, because many of these urls are things I didn’t have time to consume at the time but bookmarked so as to consume at a later date.
One solution would be to perhaps create a distinct kind of bookmark for sites I intend to consume later rather than visit on a regular basis. Another solution would be to make the Firefox “library” window (“Bookmarks–>Organize Bookmarks…”) more usable and fix its behavior: currently when you delete history, the ‘last visit’ date for each bookmark is lost, meaning you can’t afterward browse just the sites which you’ve bookmarked but forgotten about.
Launching programs without hierarchies
The core mechanisms for program launching in Windows and Linux are hierarchical start menus. In Windows, an individual application is generally placed in its own folder in the start menu, but in Gnome, applications are sorted into categories. The problem is that such sorting is largely a fool’s game. Consider:
Sure I might think to look under Sound & Video when I want to burn an audio CD, but if I just want to burn data, it’s not going to occur to me to look there. Why put the disc burner there and not under Accessories? Well, in fact, Brasero is listed under Accessories too, but there it’s called CD/DVD Creator.
Why is Sound & Video one category and not two? Well that would leave us with two categories, each containing just one or two items, which would be silly.
These sorts of dilemmas tend to abound with categorization, leading us to settle for compromise solutions, such as:
- OpenOffice.org Drawing is the only OpenOffice.org app listed under Graphics and not Office.
- Evolution is listed under both Office and Internet.
- We have all this miscellaneous stuff, and, hey, it’s gotta go somewhere, so we stuff it under Accessories.
Combine these faults with the fact that many users find it difficult to mouse through cascading menus, and the end result is that people don’t like using the start menu, so we make up for these deficiencies by piling on other conveniences:
- Shortcuts on the desktop.
- The QuickLaunch menu on the taskbar.
- The system tray.2
- The recently-opened programs list.3
The one addition I really like, though, is the text search/filtering added to Vista’s start menu. This allows for AwesomeBar-like behavior, e.g. I can type ‘fi’ and hit enter to launch Firefox. Also really nice is that I can type some term and see all relevant Control Panel items whether my term strictly matches those items or not: for example, I can type “reso” and get “Adjust screen resolution” even though there’s no Control Panel item of that name.
Simplify, simplify, simplify
So the question is, might we be better off giving users just one or two mechanisms for launching programs rather than half a dozen? I believe we would, but my solution requires accepting a few somewhat unconventional premises:
- First, as I’ve already described, organizing things into categories is largely a fool’s game.
- Second, when mouse-only mechanisms seem too inefficient, designers tend to introduce additional mouse-oriented mechanisms, which not only create redundancy, these mechanisms often challenge users with poor mousing skills and almost always involve adding new screen elements. If we could somehow make keyboard interactions easier to discover and recall, we could stop trying to get overly clever with the mouse and could clean up some of our messes. I believe this is doable with real-time textual filtering and a few other tricks.
- Third, if you’re going to present a list, don’t be afraid to let it take up a proper amount of screen space so users can actually read and scan the damn thing. Some designers think big lists scare users, so they scrunch lists into small boxes, requiring the user to scroll a lot and manually resize columns. This is silly: if something is too scary for users to deal with, don’t present it at all. You aren’t helping by making the information hard to view.4
So what’s the solution? Well let’s start by flattening Ubuntu’s Applications menu out into one big list, and while we’re at it, let’s throw in the shutdown and settings items:
Too long, right? Maybe, but it seems pretty decent to me. It’s only about twice the height of the typical screen resolution, and as long as the most frequently used stuff is in the top half, is it really going to kill the user to occasionally scroll down? Besides, most users who care about efficiency will pick up the habit of opening most applications by filtering:
Here, the user types ‘w’, and so the list only shows the items matching a term starting with ‘w’; the word processor is listed first because that’s the item which the user has most frequently selected in the past when they type ‘w’. Users can also filter on terms that describe a program but aren’t necessarily in its title:
Here, the user’s query ‘ga’ matches the tag ‘game’, so the user sees all items with that tag.5
So that’s basically it. With a big filtered list, I don’t see a need for shortcuts in a QuickLaunch menu, shortcuts on the desktop, shortcuts pinned to the start menu, shortcuts to recently opened programs in the start menu, or shortcuts pinned to a dock/taskbar.6
I should note that this isn’t terribly radical, and in fact, it isn’t all that different from the direction Gnome and KDE have been heading. The Gnome shell prototype, for instance, introduces textual filtering. What I find odd, though, is that both projects seem very attached to the idea of categorized menus. Here, for instance, is a recent KDE screenshot:
In this design, the categories slide into view rather than pop out. Sadly, this make navigation among the categories no less annoying, just annoying in a different way.
If we can reduce program launching to just a big filtered list, could we do the same to the traditional menu bars in applications? Well, here’s what you get if you stuff everything from the menu bar of a moderately complicated program, Paint.NET, into one big list:
This is about the same length as our program menu, but for application controls, it doesn’t seem as acceptable. The fix is to pack things horizontally7:
The question, then, is how to add textual filtering. We could simply have matching items show up in a one-dimensional list, as usual:
Here the user types ‘b’, and so items beginning with ‘b’ show up, with the most frequently used items showing up first. Alternatively, we could simply highlight all items that match the query:
The solution I like best, though, is to combine these two such that we highlight the matching items but filter out sections without any matching items:
It may have occurred to you that this idea bares some resemblance to the Microsoft Office 2007 “ribbon” interface: just take the individual ribbon tabs, array them vertically, and add a text field on top:
(For our purposes, ignore that this is an offensively complicated array of controls. Obviously you wouldn’t want to bombard a user with something like this.)
The thing I really like about the ribbon is that, unlike the traditional menu bar, the ribbon directly contains complex controls, so a lot of stuff which would otherwise get punted into annoying dialog boxes can be done directly in the ribbon (or at least in little pop-out overlays, which aren’t nearly as annoying as dialogs). This is something menus going forward should imitate.
On the other hand, the most annoying part of the ribbon is that it’s modal: the user has to often switch the currently-viewed tab to get at a control. In contrast, with a pull-down menu, the user is always oriented at the same place (the top) every time it’s opened. I also believe that a big scroll is easier to scan and better facilitates the user’s spatial memory: more is visible at once, your eyes can track as you scroll, and everything is in clear spatial relation to everything else.
A pull-down menu obviously has a disadvantage, though. In the ribbon, related functionality tends to live together on the same tab, and the last-used tab stays visible; consequently, a lot of tab switching is avoided that otherwise would be required. In a pull down, while it’s nice that the menu is hidden when not needed, quickly repeated actions annoyingly require opening the menu (and potentially scrolling) for each action. The solution to this—without resorting to toolbars—I’ll discuss in a later installment.
Ubiquity is a Firefox add-on which adds a command line. Unlike a traditional command line, Ubiquity effectively guesses what the user is trying to say rather than requiring the user to precisely recall the full names of commands and their precise syntax, and it does this basically by treating the user-entered text as a query to filter the set of commands. In the next installment, I’ll describe how something very much like Ubiquity would work at the desktop level rather than just confined to the browser.8
One text field to rule them all
So it looks like we’re going to have a bunch of text fields in our desktop for doing different things:
- entering urls and searching bookmarks and history
- searching the web
- searching our filesystems
- launching programs
- searching application menus
- executing commands
Ideally, we could combine these all into just one universal text field such that I can just reflexively hit a keyboard shortcut, start typing, and then decide what kind of action to perform—whether a web search, a command, or whatever. I’ll discuss how this is managed in the next installment, which will primarily cover window management.
Continued in part 3 (coming soon).
- And notably, Slate has moved in recent years towards a more blog-like front page. [↩]
- The system tray, of course, is supposed to be for status indicators, but many programs end up abusing it. [↩]
- Found in the start menu since Windows XP. [↩]
- Be clear that there’s a distinction between hiding controls and hiding information: a bunch of controls, obviously, can intimidate and overwhelm a user, so it makes sense to be careful about how many controls the user sees at once. [↩]
- The user needs some sort of indication of how an item matches a query, so here, perhaps the tag ‘game’ should appear highlighted next to each item. [↩]
- A lot of people like how the OS X dock keeps an application’s icon always in the same place, allowing for reflexive program switching. As I’ll describe in the next installment, my design retains this affordance in a different way. [↩]
- In a list where the set of items changes, arraying things in two dimensions is generally bad because it means things tend to shift around in a confusing way; when the set is fixed, things aren’t going to move around [↩]
- This isn’t original, of course: Ubiquity actually derives from Enso and Quicksilver, which are basically command lines for the desktop. [↩]