Reinventing the desktop (part 2): I heard you like lists… [text version]

3 Aug

I originally posted this as a screencast, but I figure a lot of people want to scan rather than sit through a whole 40-minute presentation, so here’s the same stuff (somewhat abridged) in text form.

In part 1, I made a negative case against the desktop interface as it currently exists, but I promised to make a positive case for my solutions. Because it would take at least a few weeks to put together a complete presentation, I thought it more timely to present the ideas in installments instead (and hey, more reddit karma whoring this way). Most of the pushback (both constructive and vitriolic) to part 1 concerned my ideas about lists, so I’ll start there.

Lists good, hierarchies bad

Many of the most notable recent innovations in software have revolved around lists:

  • Before Google, people had the idea to organize the web in a catalog, a big hierarchy of everything, e.g. the Yahoo directory. After Google, it became clear that a list of search results is far superior, and now such directories are mostly remembered with head-shaking bemusement (to the extent they’re remembered at all).
  • Gmail greatly deemphasizes the notion of sorting mail into separate folders and instead organizes mail by tagging and search.
  • Before iTunes and its imitators, users would play their music by navigating into folders, e.g. ‘music\artist\album\’. Today, iTunes simply presents everything in one big list that is textually filtered.
  • A blog is basically any site on which new content appears strictly in a chronological list: new stuff comes in the top, old stuff goes out the bottom. So, for instance, on a non-blog like Slate.com, some attempt is made to hand-editorialize the presentation of content on the front page, as in a magazine, but on Boingboing.net, the authors just create new content and post it into the stream.1
  • Link-driven sites, like Slashdot and Reddit, also revolve around lists.
  • So do many social sites, like Twitter and Facebook.

The way these examples use lists differently is mainly in how they order their items. For instance, in Google search, results are ordered by relevance to the query whereas, in Reddit, items are ordered by a combination of chronology and user votes. The key lesson here is that, if you can find the right way to order and filter things, you probably are best off presenting them in just a big, flat list.

My favorite example of this is the AwesomeBar introduced in Firefox 3. The AwesomeBar filters my history and bookmarks as I type and orders items by “frecency”,  a combination of the recency and frequency with which I’ve accessed the items. This means that I can type, say, ‘sl’, and my Slashdot.org bookmark will reliably appear at the top of the list. So when I want to visit Slashdot, I just reflexively type <alt+d>, ‘sl’, <down>, and <enter>. I don’t have to navigate a menu of any kind, I just act on reflex. This works so well, in fact, that I don’t use the regular bookmarks menu at all anymore.
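To make the frecency idea concrete, here is a minimal sketch in Python. It is not Firefox's actual algorithm: the `Place` record, the scoring formula (visit count decayed with a 30-day half-life), and the sample history are all assumptions picked for illustration.

```python
from dataclasses import dataclass


@dataclass
class Place:
    title: str
    url: str
    visit_count: int         # how many times the item has been opened
    days_since_visit: float  # age of the most recent visit, in days


def frecency(place, half_life_days=30.0):
    """Hypothetical score: visit count, halved for every half-life since the last visit."""
    decay = 0.5 ** (place.days_since_visit / half_life_days)
    return place.visit_count * decay


def suggest(places, typed, limit=5):
    """Keep places whose title or URL contains the typed text, highest score first."""
    needle = typed.lower()
    matches = [p for p in places
               if needle in p.title.lower() or needle in p.url.lower()]
    return sorted(matches, key=frecency, reverse=True)[:limit]


# Made-up history: one site visited often and recently, two visited rarely or long ago.
places = [
    Place("Sleep tips", "http://example.com/sleep", visit_count=40, days_since_visit=300),
    Place("Slate", "http://slate.com", visit_count=15, days_since_visit=3),
    Place("Slashdot", "http://slashdot.org", visit_count=200, days_since_visit=1),
]

print([p.title for p in suggest(places, "sl")])
```

With this made-up history, typing ‘sl’ puts Slashdot first because it is both frequently and recently visited, while the stale "Sleep tips" entry sinks to the bottom despite its 40 visits.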

The AwesomeBar isn’t without flaws, however. Consider that there are three different basic cases of search:

  • In some cases, I know specifically what I want.
  • In other cases, I only know generally what I want, e.g. I want to play some game, but I haven’t decided on a game, and perhaps I’m not sure about my options.
  • In the remaining cases, I just want to browse. Sometimes this is because I’m just bored and looking for something to do, but often I browse because I just want a refresher on what things exist, e.g. I browse my calendar because I need to see if there’s anything there I’ve forgotten.

While the AwesomeBar is awesome when I specifically know what I want, it’s somewhat less awesome when I only know generally, and it’s not at all awesome when I know not at all. In particular, I want a way to browse the sites which I’ve bookmarked but haven’t returned to, because many of these urls are things I didn’t have time to consume at the time but bookmarked so as to consume at a later date.

One solution would be to create a distinct kind of bookmark for sites I intend to consume later rather than visit on a regular basis. Another solution would be to make the Firefox “library” window (“Bookmarks–>Organize Bookmarks…”) more usable and fix its behavior: currently, when you delete history, the ‘last visit’ date for each bookmark is lost, meaning you can’t afterward browse just the sites which you’ve bookmarked but forgotten about.

Launching programs without hierarchies

The core mechanisms for program launching in Windows and Linux are hierarchical start menus. In Windows, an individual application is generally placed in its own folder in the start menu, but in Gnome, applications are sorted into categories. The problem is that such sorting is largely a fool’s game. Consider:

[Image: sound and video]

Sure, I might think to look under Sound & Video when I want to burn an audio CD, but if I just want to burn data, it’s not going to occur to me to look there. Why put the disc burner there and not under Accessories? Well, in fact, Brasero is listed under Accessories too, but there it’s called CD/DVD Creator.

Why is Sound & Video one category and not two? Well, that would leave us with two categories, each containing just one or two items, which would be silly.

These sorts of dilemmas tend to abound with categorization, leading us to settle for compromise solutions, such as:

  • OpenOffice.org Drawing is the only OpenOffice.org app listed under Graphics and not Office.
  • Evolution is listed under both Office and Internet.
  • We have all this miscellaneous stuff, and, hey, it’s gotta go somewhere, so we stuff it under Accessories.

Combine these faults with the fact that many users find it difficult to mouse through cascading menus, and the end result is that people don’t like using the start menu, so we make up for these deficiencies by piling on other conveniences:

  • Shortcuts on the desktop.
  • The QuickLaunch menu on the taskbar.
  • The system tray.2
  • The recently-opened programs list.3

The one addition I really like, though, is the text search/filtering added to Vista’s start menu. This allows for AwesomeBar-like behavior, e.g. I can type ‘fi’ and hit enter to launch Firefox. Also really nice is that I can type some term and see all relevant Control Panel items whether my term strictly matches those items or not: for example, I can type “reso” and get “Adjust screen resolution” even though there’s no Control Panel item of that name.

[Image: resolution]

Simplify, simplify, simplify

So the question is, might we be better off giving users just one or two mechanisms for launching programs rather than half a dozen? I believe we would, but my solution requires accepting a few somewhat unconventional premises:

  • First, as I’ve already described, organizing things into categories is largely a fool’s game.
  • Second, when mouse-only mechanisms seem too inefficient, designers tend to introduce additional mouse-oriented mechanisms, which not only create redundancy but also often challenge users with poor mousing skills and almost always involve adding new screen elements. If we could somehow make keyboard interactions easier to discover and recall, we could stop trying to get overly clever with the mouse and could clean up some of our messes. I believe this is doable with real-time textual filtering and a few other tricks.
  • Third, if you’re going to present a list, don’t be afraid to let it take up a proper amount of screen space so users can actually read and scan the damn thing. Some designers think big lists scare users, so they scrunch lists into small boxes, requiring the user to scroll a lot and manually resize columns. This is silly: if something is too scary for users to deal with, don’t present it at all. You aren’t helping by making the information hard to view.4

So what’s the solution? Well let’s start by flattening Ubuntu’s Applications menu out into one big list, and while we’re at it, let’s throw in the shutdown and settings items:

[Image: ubuntu menu 2]

Too long, right? Maybe, but it seems pretty decent to me. It’s only about twice the height of the typical screen resolution, and as long as the most frequently used stuff is in the top half, is it really going to kill the user to occasionally scroll down? Besides, most users who care about efficiency will pick up the habit of opening most applications by filtering:

[Image: filtered menu]

Here, the user types ‘w’, and so the list only shows the items matching a term starting with ‘w’; the word processor is listed first because that’s the item which the user has most frequently selected in the past when they type ‘w’. Users can also filter on terms that describe a program but aren’t necessarily in its title:

[Image: filtered menu 2]

Here, the user’s query ‘ga’ matches the tag ‘game’, so the user sees all items with that tag.5
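The two filtering behaviors above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Launcher` class, the application names, and their tags are hypothetical, and "most frequently selected for this query" is modeled as a simple per-query pick counter. The matched term is returned alongside each item so the interface can show why the item matched.

```python
from collections import defaultdict


class Launcher:
    def __init__(self, apps):
        # apps maps a display name to a set of descriptive tags (hypothetical data)
        self.apps = apps
        # picks[query][name] counts how often `name` was chosen after typing `query`
        self.picks = defaultdict(lambda: defaultdict(int))

    def _match(self, name, query):
        """Return the title word or tag that starts with the query, or None."""
        q = query.lower()
        for word in name.lower().split():
            if word.startswith(q):
                return word
        for tag in sorted(self.apps[name]):
            if tag.startswith(q):
                return tag
        return None

    def filter(self, query):
        """List (name, matched_term) pairs, most often picked for this query first."""
        hits = [(name, self._match(name, query)) for name in self.apps]
        hits = [(name, term) for name, term in hits if term is not None]
        history = self.picks[query.lower()]
        # Ties fall back to alphabetical order so the list stays stable.
        return sorted(hits, key=lambda hit: (-history[hit[0]], hit[0]))

    def choose(self, query, name):
        """Record that the user launched `name` after typing `query`."""
        self.picks[query.lower()][name] += 1


apps = {
    "Word Processor": {"office", "writer"},
    "Web Browser": {"internet"},
    "Mines": {"game"},
    "Sudoku": {"game"},
    "Calculator": {"accessory"},
}
launcher = Launcher(apps)
launcher.choose("w", "Word Processor")
launcher.choose("w", "Word Processor")
launcher.choose("w", "Web Browser")

print(launcher.filter("w"))   # the word processor leads: it was picked most often for 'w'
print(launcher.filter("ga"))  # matches the tag 'game' rather than any title
```

Keeping the pick counts per query, rather than per application, is what lets ‘w’ mean the word processor for one user and the web browser for another.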

So that’s basically it. With a big filtered list, I don’t see a need for shortcuts in a QuickLaunch menu, shortcuts on the desktop, shortcuts pinned to the start menu, shortcuts to recently opened programs in the start menu, or shortcuts pinned to a dock/taskbar.6

I should note that this isn’t terribly radical, and in fact, it isn’t all that different from the direction Gnome and KDE have been heading. The Gnome shell prototype, for instance, introduces textual filtering. What I find odd, though, is that both projects seem very attached to the idea of categorized menus. Here, for instance, is a recent KDE screenshot:

[Image: kde menu]

In this design, the categories slide into view rather than pop out. Sadly, this makes navigation among the categories no less annoying, just annoying in a different way.

Application menus

If we can reduce program launching to just a big filtered list, could we do the same to the traditional menu bars in applications? Well, here’s what you get if you stuff everything from the menu bar of a moderately complicated program, Paint.NET, into one big list:

[Image: paint menu long]

This is about the same length as our program menu, but for application controls, it doesn’t seem as acceptable. The fix is to pack things horizontally7:

[Image: paint menu wide]

The question, then, is how to add textual filtering. We could simply have matching items show up in a one-dimensional list, as usual:

[Image: paint menu filtered simple]

Here the user types ‘b’, and so items beginning with ‘b’ show up, with the most frequently used items showing up first. Alternatively, we could simply highlight all items that match the query:

[Image: paint menu filtered highlight]

The solution I like best, though, is to combine these two such that we highlight the matching items but filter out sections without any matching items:

[Image: paint menu filtered combo]
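As a rough sketch of that combined behavior: every section keeps its full set of items (so nothing shifts around inside it), matching items are flagged for highlighting, and sections with no match are dropped. The `filter_menu` function and the sample sections are assumptions for illustration, not the real Paint.NET menu layout.

```python
def filter_menu(menu, query):
    """menu is a list of (section_title, [item, ...]) pairs.

    Returns only the sections containing at least one match; each item is
    paired with a flag saying whether it should be highlighted.
    An empty query shows every section with nothing highlighted.
    """
    q = query.lower()
    result = []
    for section, items in menu:
        flagged = [
            (item, bool(q) and any(word.startswith(q) for word in item.lower().split()))
            for item in items
        ]
        if not q or any(hit for _, hit in flagged):
            result.append((section, flagged))
    return result


# Illustrative sections only; a real application would supply its own.
menu = [
    ("Adjustments", ["Brightness / Contrast", "Black and White", "Invert Colors"]),
    ("Effects", ["Blur", "Sharpen"]),
    ("Window", ["Next Tab", "Previous Tab"]),
]

for section, items in filter_menu(menu, "b"):
    print(section)
    for item, hit in items:
        print("  " + ("*" if hit else " ") + " " + item)
```

Typing ‘b’ here keeps the first two sections, stars the items with a word starting with ‘b’, and hides the Window section entirely.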

It may have occurred to you that this idea bears some resemblance to the Microsoft Office 2007 “ribbon” interface: just take the individual ribbon tabs, array them vertically, and add a text field on top:

[Image: word menu]

(For our purposes, ignore that this is an offensively complicated array of controls. Obviously you wouldn’t want to bombard a user with something like this.)

The thing I really like about the ribbon is that, unlike the traditional menu bar, the ribbon directly contains complex controls, so a lot of stuff which would otherwise get punted into annoying dialog boxes can be done directly in the ribbon (or at least in little pop-out overlays, which aren’t nearly as annoying as dialogs). This is something menus going forward should imitate.

On the other hand, the most annoying part of the ribbon is that it’s modal: the user often has to switch the currently-viewed tab to get at a control. In contrast, with a pull-down menu, the user is always oriented at the same place (the top) every time it’s opened. I also believe that a big scroll is easier to scan and better facilitates the user’s spatial memory: more is visible at once, your eyes can track as you scroll, and everything is in clear spatial relation to everything else.

A pull-down menu obviously has a disadvantage, though. In the ribbon, related functionality tends to live together on the same tab, and the last-used tab stays visible; consequently, a lot of tab switching is avoided that otherwise would be required. In a pull-down, while it’s nice that the menu is hidden when not needed, quickly repeated actions annoyingly require opening the menu (and potentially scrolling) for each action. The solution to this—without resorting to toolbars—I’ll discuss in a later installment.

Command filtering

Ubiquity is a Firefox add-on which adds a command line. Unlike a traditional command line, Ubiquity effectively guesses what the user is trying to say rather than requiring the user to precisely recall the full names of commands and their precise syntax, and it does this basically by treating the user-entered text as a query to filter the set of commands. In the next installment, I’ll describe how something very much like Ubiquity would work at the desktop level rather than just confined to the browser.8

One text field to rule them all

So it looks like we’re going to have a bunch of text fields in our desktop for doing different things:

  • entering urls and searching bookmarks and history
  • searching the web
  • searching our filesystems
  • launching programs
  • searching application menus
  • executing commands

Ideally, we could combine these all into just one universal text field such that I can just reflexively hit a keyboard shortcut, start typing, and then decide what kind of action to perform—whether a web search, a command, or whatever. I’ll discuss how this is managed in the next installment, which will primarily cover window management.
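Purely as a thought experiment ahead of that installment, one way such a field might work is to hand the same typed text to several sources and label each candidate with the kind of action it represents. Everything here is an assumption for illustration: the `universal_query` and `prefix_source` helpers, the source names, and the sample items.

```python
def prefix_source(items):
    """Build a lookup returning the items that have a word starting with the text."""
    def lookup(text):
        t = text.lower()
        return [item for item in items
                if any(word.startswith(t) for word in item.lower().split())]
    return lookup


def universal_query(text, sources):
    """Ask every source for candidates and tag each one with its kind of action."""
    results = []
    for kind, lookup in sources.items():
        results.extend((kind, candidate) for candidate in lookup(text))
    return results


# Hypothetical sources; a real desktop would plug in history, files, commands, etc.
sources = {
    "launch program": prefix_source(["Firefox", "File Manager", "Terminal"]),
    "open bookmark": prefix_source(["Slashdot", "Firefox Add-ons"]),
    "search the web": lambda text: [text],  # always offers a web search for the raw text
}

for kind, candidate in universal_query("fi", sources):
    print(f"{kind}: {candidate}")
```

The user types first and chooses the kind of action afterward, which is the ordering described above; how the merged candidates should be ranked against one another is left open here.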

Continued in part 3 (coming soon).

  1. And notably, Slate has moved in recent years towards a more blog-like front page.
  2. The system tray, of course, is supposed to be for status indicators, but many programs end up abusing it.
  3. Found in the start menu since Windows XP.
  4. Be clear that there’s a distinction between hiding controls and hiding information: a bunch of controls, obviously, can intimidate and overwhelm a user, so it makes sense to be careful about how many controls the user sees at once.
  5. The user needs some sort of indication of how an item matches a query, so here, perhaps the tag ‘game’ should appear highlighted next to each item.
  6. A lot of people like how the OS X dock keeps an application’s icon always in the same place, allowing for reflexive program switching. As I’ll describe in the next installment, my design retains this affordance in a different way.
  7. In a list where the set of items changes, arraying things in two dimensions is generally bad because it means things tend to shift around in a confusing way; when the set is fixed, things aren’t going to move around.
  8. This isn’t original, of course: Ubiquity actually derives from Enso and Quicksilver, which are basically command lines for the desktop.

38 Responses to “Reinventing the desktop (part 2): I heard you like lists… [text version]”

  1. Nathan August 3, 2009 at 5:07 am #

    Emacs’ M-x command (executes the command you type in, with some tab completion of varying quality depending on what else you’ve got loaded) is something similar that I value a lot.

    The big problem with “chuck it all in a list and search it” is when you _do_ have some useful hierarchies you can use. You’re throwing that away, and what’s bad about that is that searchable interfaces are terrible at discoverability (i.e, displaying what affordances are on offer) compared to well designed menus.

  2. Mike McNally August 3, 2009 at 6:27 am #

    Welcome back to the world of the command line. We’ve missed you.

  3. pstradomski August 3, 2009 at 6:50 am #

    In KDE I seldom use menu at all. For launching programs it’s better to press ALT+F2 which pops up krunner, where you can type any command. In KDE 4 it got a massive upgrade, you can not only type a command, but also part of program description, name of a contact from address book (launches “Compose mail” window with “To” field filled) or many more (file names, simple formulas). This is based on a plugin system, so programmers can add their own engines that present other options in response to user input.

  4. Jon August 3, 2009 at 7:35 am #

    Using a list instead of a hierarchy is different on the web and on the desktop. A hierarchical search on the web will lead you through 4 or 5 different categories before you get where you want. This isn’t problematic because it’s 4 or 5 different clicks but because it’s 4 or 5 page loads. A menu hierarchy on the desktop wouldn’t have page loads.

    Also, when you search on google, you’re providing a search term before you even see a list. That’s not the case in a start menu. If I’m using the mouse, I’d like to stick with the mouse. If I’m going to use the keyboard to launch programs, I’m going to do it with a terminal or gnome-do. Switching back and forth is suboptimal.

    Finally, I disagree with iTunes as an example of a list. Winamp had a giant list first. So did WMP. iTunes started the music library thing. They give you a list, but you can filter it by the genre>artist>album hierarchy.

  5. Raziel August 3, 2009 at 8:16 am #

    tl;dr: alt+f2/krunner

  6. p47 August 3, 2009 at 8:18 am #

    One thing to solve it all: gnome-do with deck. I literally cannot change os bc I’m so addicted to this solution.

  7. Programmer August 3, 2009 at 8:20 am #

    Ugh, the worst of all worlds.

  8. webmaster screenshotscores August 3, 2009 at 8:26 am #

    cmdline, there is no better :D

  9. Brad August 3, 2009 at 9:02 am #

    While you’re busy redesigning the desktop you might want to take a few seconds to redesign your blog. Your slightly tweaked Kubrick theme is even more hideous than the original and destroys any credibility this blog might have had otherwise.

  10. monkey ninja August 3, 2009 at 9:33 am #

    like anything.el, wait a second, could we just make emacs on OS?

  11. SomeGuy August 3, 2009 at 9:42 am #

    I really don’t think this is a solution, this is another problem. A case of engineering that no-one wants. As much as it pains me to say it, most users still drive with the mouse and requiring keyboard interaction is not only frustrating for them, it’s slower for everyone. If I want this type of behaviour I’ll use my “Run” prompt or something like Katapult/Gnome-Do. I’m quite happy to type commands, even if only partial but it is not intuitive.

    Remember the GUI is a teaching tool, it does not match command line tools for productivity except for WYSIWYG style applications, e.g. Office, Gimp/Photoshop etc. In these situations you work with a mouse and a few modifier keys at most. This in between situation means you have to know already what you want, which simply is not the case with novice-to-intermediate users and for everyone else there are launcher tools. These are not two areas that will be easily combined and you end up with a new situation where everyone is displeased with the interface.

  12. Penguin Pete August 3, 2009 at 10:30 am #

    Ha! reading this post plus the above comment reminded me of my own post on 10 Reasons Why the Command Line is More User-Friendly than the Desktop. Ain’t I the dickens?

  13. Colin LeMahieu August 3, 2009 at 10:50 am #

    So you like OSX Spotlight search?

  14. MacLover August 3, 2009 at 11:00 am #

    “The one addition I really like, though, is the text search/filtering added to Vista’s start menu.”

    You don’t have a Mac, do you?

    That is called SPOTLIGHT!
    …and it’s been there for years.

    I guess M$ still has their photocopiers greased.

  15. anon August 3, 2009 at 11:18 am #

    You left out hot keys, and a quickly accessible app-launch.

    alt+f2 and type application (my favorite launcher is katapult and the keys for that are alt+space)
    and in general, hotkeys it up… win+w is web browser, win+f is file manager, win+e is text editor, and win+d is doom3 (always a pleasure to accidentally open it between the two and get side tracked for an hour :)

  16. Nathan Spears August 3, 2009 at 11:34 am #

    Another great collection of observations and ideas.

    I have been thinking about the idea that we should be able to access any item from this one “magic bar”. My initial thought was that it would be so much work to tag every item in such a way that this could possibly be useful, that it couldn’t be useful. For instance, I want to pull up a program called LDAP Browser that I used to navigate Active Directory via LDAP (after someone gave me instructions for how to set it up) one time, except I’m not a power user and I don’t know what Active Directory or LDAP are. So I type something like “look at tree” or “see domain” into my Magic Bar. If I haven’t gone to the trouble to tag the program beforehand, I’m probably not going to get the results I want. But, as my friend Greg pointed out, someone else (like the developer) may have prepopulated the tags for me. Even better than that (Greg also said), if tags can be refreshed from a central source keeping track of what other people are tagging their applications with, then the whole tagging concept gains googleish value. At which point I had to concede that he was right. Where do you think that work of tagging will get taken care of?

    I would, however, like to see some possibility of differentiation between Magic Bar functionality. I’m sure you will address this in your Window Management talk, but I would like to see this Magic Bar present on whatever would replace my “Desktop” – except that I would want at least two Magic Bars, one for local content and one for online content. When I’m looking for a program I don’t want to see items in my browse history that match those tags or words, and vice versa. For me, there’s still a hard line between what’s local and what isn’t. Does that seem intuitive or am I just stuck in a pre-singularity mindset?

    These are just some things that you made me think about. Looking forward to the next one.

  17. Wolter August 3, 2009 at 11:34 am #

    You’re kidding, right?

    One big huge list? I have enough trouble already trying to help non-hackers with their computer without 10 billion functions jumping in their face.

    Simplicity is good, but a huge list is NOT simplicity. Same goes for a “ribbon” with 150 things on it. How are you supposed to find anything in that jumble of icons and words?

    Categories are a fool’s errand ONLY because UI developers have traditionally used hierarchical categories instead of tags. With tags, you can assign key words to your applications (and files and bookmarks, really), and then have a “hierarchical” browse mode (for people who aren’t really sure what to search for) where the same application will come up in any context where its tags match. This simplifies “awesome bar” style searching, too, as you can type “paint” and then get a list of graphical tools like gimp and ImageMagick and XV and all that stuff, then choose which one you want to use (or discover that your favorite paint program is not installed on that particular computer).

    In fact, tags themselves should be taggable, which would allow a richer tagging system that’s easier to manage. The “DVD” tag could itself be tagged “CD” and “Disk” and such, so that any application you tag “DVD” would also respond to “CD” and “Disk”.

    The same tag system should be applied to menus, since many menu functions can fall under multiple categories.

    Also, a user should be able to make his own custom menus/toolbars/ribbons with whatever he uses most often. This should be done via drag and drop (including dragging a menu item and dropping it in a toolbar or ribbon… that sort of thing).

    You can also add the concept of “perspectives”, which change the more commonly presented tools and functions based on how you want to look at what you’re working on.

    Menus are difficult to navigate because there’s no way to lock onto a category before moving the mouse to a subcategory. If you move too far up or down before reaching the submenu, you lose your context. The obvious solution is to make it so that the submenu DOESN’T come up until you CLICK while hovering over the category you want, at which point the context locks unless you CLICK on another category.

    Also, a caution on a unified search box:
    Searching takes time. If you have it do a simultaneous search of applications and files, for example, you’ll get a lot of hard drive thrashing. Add network shares in there and you get a lot of network traffic plus the increased CPU usage from the operating system’s caching mechanism. Anything that can potentially unleash a heavy load on the computer needs to include the ability to do a lighter weight option.

  18. Josh August 3, 2009 at 11:37 am #

    Your position is great for a power user, but your typical high school kid or grandmother doesn’t want to use the keyboard at all when opening applications. The hierarchies, and desktops, work great for these individuals.

  19. Coal August 3, 2009 at 11:58 am #

    You should check out launchy.

  20. Paul Keeble August 3, 2009 at 12:18 pm #

    This suffers from all the same problems as the command line does, ultimately it’s not necessarily easy to know what the menu command might be, or what the tag it’s associated with is.

    Discovering what functionality is on offer works best with a hierarchy. If the goal is to help beginning users this makes it more difficult, but for experienced users it has legs (as vi, emacs etc show just how effective the approach can be).

  21. Coen August 3, 2009 at 2:07 pm #

    You’re describing Quicksilver – http://www.blacktree.com/

  22. Coen August 3, 2009 at 2:08 pm #

    Okay, I should’ve read to the end before commenting, my bad

  23. Dave F August 3, 2009 at 2:46 pm #

    Categories provide one thing that long lists of items do not provide: that is context. If I want to know all graphics programs (a novice who probably does not even know the name of the application) then categorization provides a shortcut. Lists can do that too if additional attributes are displayed on the same line as the list item. Then it becomes a table. Nothing wrong, but just saying. But categories are useless in UI after the initial learning curve is crossed, which is nearly everyone.

  24. Matt August 3, 2009 at 3:07 pm #

    I recently came up with an idea for a taskbar/application menu/file/edit/view menu replacement (however, execution is entirely different than an idea). I also have entirely too many other projects to start on it, however, I’m convinced it would be usable. The idea is that you have a vertical menu, similar to OSX’s dock when on the right or left side. Instead of icons, you have a giant list (like you described). At the top of the list are the File/Edit/View options, clicking on them expands it downward. Under that is currently open applications (grouped/filtered however appropriate). Under that is applications by category (non-running applications). Each application can live in multiple categories. One of the categories is “All Applications”. Then at the bottom, the magic “search through everything” text box.

    The idea was that screen real estate is wasted for persistent visual feedback that is useless 98% of the time. The idea was to provide as much or as little information as the user chooses, with the information being displayed simply and easily (single click show/hide categories).

  25. Sam Thursfield August 3, 2009 at 5:38 pm #

    You seem to have just described Gnome Do.

  26. The Gnome August 3, 2009 at 7:19 pm #

    Gmail sucks. Gmail’s “search, don’t sort” philosophy sucks. Gmail is broken. Such a turd in need of polishing.

    Want IMAP compatibility? Put a slash in the label name. OOps, search breaks entirely if labels have any punctuation!

    Want to find all emails that contain a phrase? Type in the phrase. OOps, if you search for the phrase within a specific label, you might not get any results, but searching for the same phrase without specifying the label will find matches within that same label!

    Want to sort through emails by date, because you remember that you received an important email earlier but can’t remember enough details to do a search? TOO BAD if you’ve already archived it! Gmail’s ridiculous conversation “feature” will take an email and stuff it 5,000 threads down because it happened to be in response to an email thread that was started five months ago. Oh, and how can you fix this? Simply move every email within that label to an entirely different label, and then move it back! Gmail’s disaster of a threading “feature” will fix the sort order!

    In short, “search > sort” is an insult to users who know how to use the power of sort. Gmail needs to get its act together.

  27. David August 3, 2009 at 7:22 pm #

    And Enso itself derives from Archy, which derives from the ideas for ‘the Humane Interface’ in Jef Raskin’s The Humane Interface, which derives from Raskin’s Canon Cat. Yes, the way Ubiquity handles commands is quite similar to how the Cat worked back in 1987. Good interface ideas really do take decades to spread.

  28. John August 3, 2009 at 7:28 pm #

    Reading this article I quickly realized how much I think QuickSilver for Mac is like the AwesomeBar in that it is a frequency/relevancy list for launching programs you use (or lesser ones you may have just installed or never used).

    http://quicksilver.en.softonic.com/

  29. Callie August 4, 2009 at 9:59 am #

    The extremities of these replies are examples to me of what I have learned in my UI studies this year – that different users learn and execute their work flows in different ways. It looks like your searching/filtering/tagging mechanism might work well for a significant percentage of people (especially the following: advanced users, programmers, people who spend a lot of time on their computer, people who think primarily textually) and not as well for another large percentage of people (beginning to intermediate users, people who do not spend much time on a computer, people who think visually).

    Having multiple routes to discovery enables users to find the path they feel most comfortable with, and more quickly discover what they can do in the OS/application. As some have mentioned, some users prefer the mouse, or thinking categorically or visually. My favorite thing about items like the OSX spotlight vs. finder/dock/shortcuts or Vista’s co-compatibility with searching or sorting into folders is that users can choose what they feel comfortable with. Even though I have had the spotlight search for years on my mac (and it is beautiful and sometimes I use it), my spatial recognition is stronger and I often access applications or items by remembering where they are on the desktop or dock. Accommodation of spatial/visual thinking is just as important as intelligent “keyword” searching/filtering.

    I agree that needless categorization sometimes plagues our electronic efficiency and ease of use. But even an intelligent command line is more of a programmer’s solution. I can teach my mother to use Google in only a few seconds, but how long will it take for her to search by keyword? Multiple avenues of discovery are valuable. Please don’t leave the spatial thinkers out in the cold. :)

    Loved the article!

  30. Callie August 4, 2009 at 10:00 am #

    *Addendum: how long will it take for her to search for everything that she wants to do on her computer by keyword, I meant! :)

  31. maik August 4, 2009 at 4:20 pm #

    Isn’t the next logical and natural extension to this telling your computer what to do via voice commands?

  32. Nathan Spears August 5, 2009 at 8:38 am #

    It seems worth mentioning that even the commenters who don’t like any of your ideas aren’t bothering to defend the broken aspects of Windows that you are criticizing. The point I think most of them are missing is that some adaptation to a new system is inevitable. It feels easy to click on things on the desktop now after decades of doing so, but that “discoverable” behavior isn’t any more natural than typing in a command line. A well designed command line would be discoverable, and easier to use. Just because gmail’s label and search features aren’t all you want them to be yet doesn’t mean the ideas of label and search are bunk.

  33. Nicholas Harris August 5, 2009 at 1:05 pm #

    I worry that most users are too lazy to tag.

    Are we expected to upload all our documents to “the cloud” where a computer can compare their topics based on their summaries and tag them for us? Would we not worry about the security of our private data?

    Do we need hierarchical ontologies after all?

    In your linked Clay Shirky video he talked about Flickr users tagging photographs which could be retrieved by the filter: “cats in sinks”. You implied that this was superior to a hierarchical filing system, in which the photographs would have to be categorized (no pun intended) as either cats/sinks or sinks/cats. Yet, as soon as I read this I wondered about:

    “kittens in sinks”

    Surely the system should know that a kitten is a kind of cat and whilst placing a priority on Flickr photographs tagged with “kitten” and “sink” it should also include “cat” (which is a conceptual generality of the specific ‘kitten’ class, in other words a “superclass”, ontologically-speaking) as it may otherwise yield zero hits for being too specific.

    ‘Yes, we particularly want kittens, but we would accept cats, some of which may turn out to be kittens in sinks tagged incorrectly as cats in basins.’
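
    The fallback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PARENT map, the PHOTOS data and the function names are hypothetical and have nothing to do with Flickr’s real tagging system, and treating “basin” as a kind of “sink” is an assumption made purely for the example. Exact tag matches rank first; each step up the “is a kind of” chain adds to a photo’s cost.

```python
# Hypothetical "is a kind of" map: each tag points to its more general tag.
PARENT = {"kitten": "cat", "basin": "sink"}

# Hypothetical tagged photos (illustrative data only).
PHOTOS = {
    "photo1": {"kitten", "sink"},
    "photo2": {"cat", "sink"},
    "photo3": {"cat", "basin"},
    "photo4": {"dog", "sink"},
}


def ancestors(tag):
    """Return the tag followed by its superclasses, most specific first."""
    chain = [tag]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def tag_cost(query_tag, photo_tags):
    """Fewest generalisation steps needed to match a photo, or None."""
    for steps, candidate in enumerate(ancestors(query_tag)):
        # A photo tag matches if it is the candidate or a kind of it.
        if any(candidate in ancestors(t) for t in photo_tags):
            return steps
    return None


def search(query_tags, photos):
    """Return photo names matching every query tag, best matches first."""
    results = []
    for name, tags in photos.items():
        costs = [tag_cost(q, tags) for q in query_tags]
        if None not in costs:
            results.append((sum(costs), name))
    return [name for _, name in sorted(results)]


print(search(["kitten", "sink"], PHOTOS))
```

    With this sample data, a query for “kitten” and “sink” puts the photo tagged exactly that first, followed by the cat-in-sink and cat-in-basin photos, and leaves the dog photo out. Who maintains the PARENT map is exactly the open question.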

    Who gets to decide upon the structure of this ontology? What are its political ramifications? Perhaps the imprecision of any such interface could be helped by making it conversational, approximating iteratively to a point of common understanding between man and machine, rather than expecting it to know what we mean on the first attempt.

  34. Forrest Landry August 10, 2009 at 12:37 pm #

    Hi,

    Perhaps some of these questions can be clarified by thinking about the interaction between user and computer as a sort of ‘communication event’ embodying a ‘language’. Implicit in a lot of discussion is the idea that there are two kinds of language of interaction: textual as implemented via the keyboard, and visual as implemented via the mouse. Something similar can be said about the information/files stored on the computer: that there are two basic kinds of stored data: document files of various kinds (readable text, source, etc.) and image files (everything else: applications, pictures, videos, music, etc.).

    As such, we can reformulate our questions/thoughts about effective UI design as a question about language: what is the most natural language for human interactions? Is it a visual language or a textual one?
    Note that these last questions are not about computers per se — we could be considering a mutual dialog between two people or people and community as much as with any sort of hardware.

    However, in observing actual interactions between real people, we notice immediately that while there are lots and lots of very well established textual languages (English, Spanish, French, Hindi, etc.), there are very, very few visual languages in common usage. Examples of visual languages include traffic lights, some types of road sign, symbols indicating bathrooms, danger, and various sorts of mapping symbols.
    Beyond that, it is fairly much admitted among professional linguists that a true visual or “symbolic” language is something of an oxymoron at best. What visual language elements do exist in common usage tend to be brief, sparse in meaning content, lacking in connotative richness. Moreover, even the most well developed symbolic/visual languages tend to have vocabularies several orders of magnitude smaller than the average vocabulary of most textual languages.

    The simple fact of the matter is that human bodies do not have an image projector built in at birth. While we can easily receive visual images at high bandwidth, we cannot express images with similar bandwidth. We do, however, have a voice (an auditory projector that can generate information) as well as ears to hear with. Hence, it is *much* more natural for human beings to communicate complex information via voice (textual) than it is to communicate (interactively) similar information via any sort of image media. The expressive bandwidth of bodily physical motion (i.e., to use a mouse pointer) is always going to be much lower than the expressive bandwidth of text (voice commands and/or keyboard usage). The linguistic perspective about actual languages in common usage merely confirms this.

    Visual languages find their niche in situations where the complexity of the information to be communicated is low. Selecting single items from small sets of available/potential items is basically the *only* situation where visual representation is preferred. Otherwise, we see visual language used as indicators (traffic lights, stop signs, bathrooms, etc).
    However, this format of communication does not scale. As applications become more complex (as we use computers to work with ever larger amounts of data in more and more complex ways) the disparity between the carrying capacity of a visual language and a textual language grows ever more evident. A visual language works well when there is a fairly limited set of contexts (or modes) that need to be identified. Yet when the number of possible contexts (application states) grows without bound, the meaning of any single visual element quickly becomes washed out (meaningless). Attempting to identify commonalities that span all possible applications becomes an ever more difficult task as the variety of applications continues to increase, and is best avoided from the outset.

    Therefore, while there will always be a niche for visual languages, it is evident that textual interactions are here to stay and are ultimately more natural for _ongoing_interactive_communication_. In situations where there is little need for back and forth dialog/communication, visual elements work well (think presentation mediums: billboards, printed magazines, television). For situations that require user interaction and participation (social organizations, CAD software, etc.) textual representations will always dominate (either voice or keyboard).

    The reality of the above paragraph is evident when you consider that the people who most vehemently promote visual models of interaction (icons, flash, 3D effects, etc.) are generally business types interested in promoting (selling) a product. Their interest is to generate only one type of interactive user response/expression: dollars spent. The use of flashy visual elements is to entice new users into spending money, not to generate lots of continuing and ongoing dialog with the user (communication is expensive for a business). Optimizing a UI design to get a new user to start using an application is a far different activity than optimizing a design to facilitate ongoing user interaction. Presentation-only things will be visual; interaction-oriented things will ultimately be textual at their root, regardless of whatever else gets integrated with them.

    Would you rather teach your children a “pidgin” language that they can ‘get up and running with’ right away, or a hard language (English) that will last them their whole lifetime? What sort of job opportunities would you really expect to get if your entire working vocabulary consisted of only a few hundred words (an illegal alien laborer) vs. someone who had real mastery of a language (doctorate students typically have vocabularies of at least 50K words)? How obvious does it need to be that people who can type are going to be far more effective with a computer (and more employable) than people who can only operate a mouse?

    Where complexity is increasing and interactions are ongoing and long (computers are here to stay) it is clear that textual interactions with a computer are going to be ultimately better (voice commands/dialog or keyboard). We should design with the end in mind (re “7 habits”) rather than continuing to even worry about the illusory short term gains of purely visual computing. The evidence is in and the verdict is clear: prefer textual interactions to visual ones in GUI design. Arguments/discussion of the “grandmother newbie user” largely miss the point that the effort to hide complexity is itself complex and, in the long run, makes the situation worse for everyone. This sort of mindset does not scale, and this becomes more and more of a problem as things DO actually scale larger.

    As far as the ideas of this Blog are concerned, I largely support them and have even done some work/experimentation along similar lines with varying levels of success. I encourage further principled dialog along these lines and look forward to seeing these ideas developed further.

    Forrest Landry,
    San Diego, CA.

  35. Kevin Cannon September 20, 2009 at 3:34 pm #

    Have you seen the way the help in OSX Leopard works? It’s very nice – I think you’d like it.

    http://www.youtube.com/watch?v=ggoTQqNXOQs

  36. Candy Randy October 8, 2009 at 4:25 pm #

    “Before iTunes and its imitators, users would play their music by navigating into folders, e.g. ‘music\artist\album\’. Today, iTunes simply presents everything in one big list that is textually filtered.”

    Using folders for your music is a GOOD idea, unless you’re one of the masses of idiots who pretend to ‘like’ music, and have to pretend to ‘want’ to listen to ‘rock’ today, or ‘jazz’ tomorrow, thus needing iTunes and its bloody irritating ‘filing’ system to find them what they pretend to ‘want’ to listen to.

    I know where every single song in my 4,000 track MP3 collection is, all filed in folders, either by artist, or about 10% in ‘genre’ folders, such as ’80s’, ‘Love’, ‘Dance’, etc. I don’t want a long list of 4,000 tracks, thank you, that’s for idiots.

    Your ‘long list’ idea doesn’t work when you are dealing with thousands of files. I remember the position of some of the 4,000 songs SPATIALLY, i.e. I know it’s in the first, top level folder,
    A-F
    not in
    G-M
    or whatever, and then within
    A-F
    it’s within
    Dance
    etc.

    Your ‘huge list’ idea is stupid. Microsoft’s ‘Ribbon’ is equally stupid.

    Spatial memory is where it’s at. I will reveal all very soon…

  37. Candy Randy October 8, 2009 at 4:27 pm #

    “You left out hot keys, and a quickly accessible app-launch.

    alt+f2 and type application (my favorite launcher is katapult and the keys for that are alt+space)
    and in general, hotkeys it up… win+w is web browser, win+f is file manager, win+e is text editor, and win+d is doom3 (always a pleasure to accidentally open it between the two and get side tracked for an hour :)”

    That is nowhere near as efficient and easy as it could be. There is a much better way, but I will reveal all soon…

    I’m just amazed nobody has thought of it already – it shows how unable to think outside the box the entire world of computer users is…
