Archive | March, 2007

Law of Demeter

31 Mar

From Wikipedia:

When applied to object-oriented programs, the Law of Demeter can be more precisely called the “Law of Demeter for Functions/Methods” (LoD-F). In this case, an object A can request a service (call a method) of an object instance B, but object A cannot “reach through” object B to access yet another object to request its services. Doing so would mean that object A implicitly requires greater knowledge of object B’s internal structure. Instead, B’s class should be modified if necessary so that object A can simply make the request directly of object B, and then let object B propagate the request to any relevant subcomponents. If the law is followed, only object B knows its internal structure.

More formally, the Law of Demeter for functions requires that a method M of an object O may only invoke the methods of the following kinds of objects:

  1. O itself
  2. M’s parameters
  3. any objects created/instantiated within M
  4. O’s direct component objects

In particular, an object should avoid invoking methods of a member object returned by another method.

I’m not sure how valuable this advice is. If you follow it strictly, you’ll spend a lot of time adding (and modifying) wrapper methods, which is probably just as bad, complexity-wise, because what you gain in direct decoupling you lose to indirect coupling (not to mention interface pollution). Perhaps the better advice is to be conscious of when you break the Law, i.e., when you write:

x.y().z();

…you should pause and consider whether the class of ‘x’ would tolerate having a method of its own that accomplishes the same work, thus simplifying things for its users. Often, however, the type of object returned by a class’s methods serves too many disparate purposes, making wrapping its methods quite a heavy business and, in fact, more confusing than just leaving the class’s clients to deal with that type directly.

So a good rule is to ask ‘how broad is the scope of the returned type’s uses?’ If the functionality needed of the returned type is significantly narrower than that type’s public interface, consider wrapping it; otherwise, you’re just likely to do more harm than good.
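To make the trade-off concrete, here’s a minimal hypothetical sketch (the Customer/Wallet names are invented for illustration): the chained call reaches through a returned object, while the wrapper keeps Wallet an internal detail of Customer.

```java
// Hypothetical illustration: Customer offers a wrapper (pay) so callers
// need not reach through to its internal Wallet.
class Wallet {
    private int cents;
    Wallet(int cents) { this.cents = cents; }
    int getCents() { return cents; }
    void deduct(int amount) { cents -= amount; }
}

class Customer {
    private final Wallet wallet = new Wallet(500);

    // Violates the Law of Demeter when used as c.getWallet().deduct(...):
    Wallet getWallet() { return wallet; }

    // Demeter-friendly wrapper: only Customer knows about Wallet.
    void pay(int cents) { wallet.deduct(cents); }
    int balance() { return wallet.getCents(); }
}

public class DemeterDemo {
    public static void main(String[] args) {
        Customer c = new Customer();
        c.getWallet().deduct(100); // the x.y().z() shape: reaches through
        c.pay(100);                // the wrapped alternative
        System.out.println(c.balance()); // 300
    }
}
```

Whether pay() pulls its weight depends on the rule above: if clients need only a sliver of Wallet’s interface, the wrapper simplifies things; if they need most of it, you’re just duplicating Wallet’s interface onto Customer.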

HOWTO Decompose your code into functions

28 Mar

For the sake of keeping code readable and as comprehensible as possible, good functions conform to a few simple rules:

  1. Give it one purpose. And the counting of the purposes shall be one. Not two purposes, not three purposes. Five is right out.
  2. Continuing with the theme of having one purpose, subtasks should be split off into their own functions. How do you identify subtasks? A subtask is a contiguous section of code which uses its own set of variables (give or take a few) and for which you can come up with a name for what that section does. An even better hint that a chunk of code is a subtask is if it is found repeated in multiple places (either within the same function or across different functions). Of course, going too far with splitting off subtasks into their own functions would produce an infinite regress, so we have to draw the line at some point: as long as a subtask can be confined to a few simple enough lines at a single place in code, its presence inside the function may be just fine.
  3. Give it a solid name—one which is strictly a verb phrase (or at least an ‘action’ phrase, e.g. ‘fileToString’)—that describes clearly the one thing that the function does. Too many programmers shy away from long names, which is silly with today’s pervasive code completion editors. Better a name be long than cryptic. (The exception to this is when you know the function will be used often in complex expressions, as say, math functions often are; in this case, you may want to keep the name pretty short.)
  4. Keep the logic simple. Whenever you see about 4 or 5 levels of nested logic, you should consider splitting some of the deeper logic off into its own function. (Rather than just counting nesting, a more accurate measure is cyclomatic complexity.)
  5. Keep it short. Once a function gets longer than will fit entirely on your screen, it becomes much harder to understand and work with. The optimum function length for code comprehensibility is probably around one-third screen height (go shorter than that and you hit diminishing returns, trading greater function simplicity for a greater preponderance of functions).
  6. Keep down the number of local variables. Good functions rarely have more than a handful of local variables. Exceeding that count is rarely sufficient reason by itself to split a function up into multiple functions, but this symptom rarely arises in isolation, so it is something to watch for. Besides splitting the function up, there’s not much else you can do to treat it.
  7. Keep down the number of parameters. Functions taking more than several parameters are ugly. To avoid parameter preponderance, you can do a few things: a) package the information the function needs into one single package (in the form of an array or record of some kind, depending upon your language); b) put the information in variables external to the function; or c) split the function into multiple functions, as it may simply be doing too much.

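As a minimal sketch of rules 1–3 (the names and the toy task are invented for illustration), here is a line-summing routine decomposed so that each subtask gets its own verb-phrase name:

```java
// Hypothetical illustration: one purpose per function, subtasks split off,
// each with a verb-phrase name.
public class ReportDemo {
    // Subtask 1: turn a comma-separated line into numbers.
    static int[] parseNumbers(String line) {
        String[] parts = line.split(",");
        int[] nums = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            nums[i] = Integer.parseInt(parts[i].trim());
        }
        return nums;
    }

    // Subtask 2: total the numbers.
    static int sumNumbers(int[] nums) {
        int total = 0;
        for (int n : nums) total += n;
        return total;
    }

    // Subtask 3: render the result.
    static String formatTotal(int total) {
        return "Total: " + total;
    }

    // The top-level function now reads like an outline of its one purpose.
    static String summarizeLine(String line) {
        return formatTotal(sumNumbers(parseNumbers(line)));
    }

    public static void main(String[] args) {
        System.out.println(summarizeLine("1, 2, 3")); // Total: 6
    }
}
```

Each helper is short, uses its own variables, and could be named without resorting to ‘and’, which is the telltale sign the split is at a natural joint.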
These last two principles are the hardest to follow: in our structured languages, decomposing a problem into functions means not just dividing up the labor but also segregating the data, so the trick becomes to maintain access to variables where you need them (without resorting to making every variable global: the whole point of structured programming is for data access to be highly regimented).
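One way to keep both the parameter count and the temptation toward globals in check is rule 7a’s ‘single package’: bundle the related values into one record-like class. A hypothetical sketch (all names invented):

```java
// Hypothetical illustration of rule 7a: a parameter object bundling
// values that always travel together.
class MailSettings {
    final String host;
    final int port;
    final String user;
    final boolean useTls;

    MailSettings(String host, int port, String user, boolean useTls) {
        this.host = host;
        this.port = port;
        this.user = user;
        this.useTls = useTls;
    }
}

class Mailer {
    // Before: describeConnection(String host, int port, String user, boolean useTls)
    static String describeConnection(MailSettings s) {
        return s.user + "@" + s.host + ":" + s.port + (s.useTls ? " (TLS)" : "");
    }
}
```

Beyond shortening signatures, the bundle gives the group of values a name, which often reveals a latent concept the design was missing.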

The practical solution to this problem—and the problem of decomposition in general—is pretty obvious, but too embarrassing for many programmers to admit: rather than write perfectly decomposed code on the first attempt, it’s much simpler (and more common) to just go with whatever first occurs to you and work from there. Indeed, the pseudocode decomposition process commonly taught in schools is unrealistic, for in practice, code is very rarely any good when you first type it out, no matter how well you plan; exceptions certainly occur, but they typically do so only when you’ve solved essentially the same problem(s) before; when it comes to tackling a problem you haven’t solved before, you’re destined to get things quite wrong on your first attempt 98% of the time.

So it’s important to embrace refinement as your most basic coding activity. Rather than writing good functions, the real important skill is to rewrite bad functions.

Nobody in here but us chickens!

27 Mar

Dmiessler makes a good point I’d been meaning to make myself: the axiom against ‘security through obscurity’ is often taken too far. First of all, any kind of cryptography, whether the algorithms are publicly known or not, always ultimately relies upon ‘obscurity’ in the form of an unrevealed piece of information, the key. Second, well, just ask Osama bin Laden.

Arguably this comes down to a semantic disagreement: ‘obscure’, here, surely means ‘unrevealed information’, but the authors of the phrase ‘security through obscurity’ also seem to assume ‘obscure’ implies ‘reliance upon unrevealed information that is guessable by brute-force means (no matter how arduous, as long as it’s doable) or derivable by logic (no matter how byzantine)’. I don’t think everyone shares that definition, hence a lot of unnecessary back-and-forth around this slogan. (Not that there isn’t a real substantive disagreement here—just that much of the argument is probably unnecessary.)

Thine desktop runneth over

26 Mar

I don’t have to be Aunt Tillie to crave a simpler desktop computing experience. Whether I’m using Windows, OS X, GNOME, or KDE, my current workflow gets tangled as I juggle several open folder windows, half a dozen instances of Firefox with 30 tabs between them, a text editor, Google Talk, Eclipse, terminal windows, a media player, and sometimes more; on top of this is the ever-expanding mess that is my hard drive. This basically sums up my two main computing problems: my desktop is an unstructured mess of windows, and my hard drive is (between major cleaning jaunts) a mess of files.

The second problem, the files problem, is being partially addressed by desktop search (Google Desktop, Beagle, etc.), but that solution doesn’t really help keep my drives clean—it just lets me cope better with the mess I create. I think the real solution will be in adding a tag-based directory structure onto our current hierarchical one (yes, we would have to meld the two), but that notion will have to wait for a later post.

As I argued in the previous post, the root of the first problem (window management) lies in the desktop metaphor itself and the dominant GUI conventions of the last twenty years. To deal with this problem, we should first step back and analyze what purpose the desktop serves. The GUI desktop is, at a minimum:

  • An interface for starting applications, for switching between open applications, and for allotting screen space between open applications.
  • A common set of interface elements for applications, often including guidelines for the use thereof to achieve a cross-application standard look-and-feel.
  • A data sharing mechanism between apps (copy and paste).
  • A common mechanism for application associations—what applications should be used to open such-and-such a file or send a new email, etc.
  • A set of system-wide keys, e.g. ctrl-alt-delete on Windows.

And because most users don’t/can’t/won’t use a command-line, desktops include a minimum set of apps:

  • File management.
  • System/hardware configuration and system utilities (e.g., a GUI non-destructive partition resizer—which has been too long in coming to Linux live-CD’s, frankly).
  • Program installation/removal.

So what should the desktop not do? The standard Linux distros all come with an additional assortment of basic apps (web browser, office suite, mail, etc.), which is great, but I think integrating such things into the desktop (or giving naive users the illusion of integration) is very risky and of dubious benefit. The problem with such notions is that they smell of ‘ad hoc-ist design’, i.e. design which tries to meet needs by making exceptions to the rules rather than by applying the rules. Ad hoc-ist design evinces an unfortunate fact about design: it’s easy to come up with features; the hard part is devising features which make other features unnecessary. At a minimum, ad hoc-ism fails to minimize complexity; at its worst, ad hoc-ism introduces complexity, and that goes for all parties: designers, implementers, and users. If your design exhibits ad hoc-ism, it’s likely because your core mechanisms don’t fit your needs well enough, so rather than piling on more features, it may be time to rethink those mechanisms. Any proposal to expand the desktop should heed this advice, but most such proposals I’ve seen don’t (e.g. see here).

For instance, writing two years ago on O’Reilly Net, Jono Bacon (of LugRadio fame) suggested that project management should be promoted to first-class status as part of the desktop [link]. Jono’s suggestion basically requires three things: 1) integrating features of a personal information management (he calls it ‘project’ management) application into the desktop; 2) implementing an automated personal information data sharing mechanism for applications; 3) making applications work with this mechanism.

Now, Jono may actually be on to something here: his core complaint is that some applications should be able to automatically share certain kinds of data. Problem is that automated data sharing between desktop apps in general is the proper problem to tackle—after all, what’s so special about PIM? Now, lack of a general desktop data-sharing mechanism—let alone an automated one—is actually a long standing problem. Arguably the time for a solution has come, but narrowing the problem and tying the solution to programs of a specific domain would be hurtful in the long run, both for the sake of solving the general problem and the domain-specific problem: the mechanism that would result would likely be too tightly coupled to particular apps, discouraging the formation of an ecosystem of competing experimentation and natural selection.

I can understand Jono’s frustration: the Unix people tend to see every data sharing problem in terms of the mechanisms already existing in Unix (there are quite a few of them, after all), and so they don’t exactly rush to fill in gaps that hinder certain problem domains. Who knows, maybe something already exists to meet desktop data sharing: copy and paste via web services, anyone?

Anyway, I’ll finally present my desktop UI design later this week.

I hate Macs

19 Mar

Continuing a discussion of desktop UI from the previous post. Be clear that most of what follows applies equally to Windows and the Linux desktops; my point is that Apple popularized these ideas and the others, misguidedly, still follow Apple’s lead.

Compared to Microsoft, I don’t especially begrudge Macs their existence, and I do recommend them over Windows to naive users who don’t have anyone to set up and maintain their boxes for them. Still, it annoys me how many people have drunk the Apple koolaid and believe that OS X has anything on the basic Windows experience beyond gloss (and an ‘it just works’ quality earned only by a tightly controlled, minuscule hardware and software landscape); just about everything beyond a few features of the OS X interface is merely arbitrarily different from other desktops. Macs wouldn’t annoy me so, except that their fashionable influence perpetuates stupid graphical interface design ideas which both Microsoft and the Unix desktops have slavishly followed. Contrary to popular opinion, Macs are not the end-all/be-all of usability; in fact, Macs have long perpetuated some erroneous thinking about usability. There are several things seriously wrong with the desktop/windows metaphor, and Apple is responsible for most of them.

[...]

Fuck Aunt Tillie

18 Mar

The Linux community is often accused of being poor at catering to non-expert users, but this is a misdiagnosis. Contrary to myth, the community is not full of old Unix greybeards demanding the rest of us master perl before they give us the time of day. Surely, many Linux users lament that more people don’t see the virtuous ways of the command line, but very few of them still refuse to accept that most people just have better things to fill their heads with than cryptic Unix utility names, parameters, and Bash syntax. And this acceptance is not just reluctant: there are large projects (GNOME and Ubuntu first among them) whose primary mantra is to make Linux simple enough to offer naive and non-programmer users a painless Linux experience.

The real problem with the Linux community is not that it disregards non-expert users but that it fails to account for the sliding scale of expertise between ‘competent Linux installer/admin’ and ‘Aunt Tillie who thinks that the blue E is the Internet’. (“Aunt Tillie” is the name of the naive Linux user in Eric Raymond’s essay about Linux usability, The Luxury of Ignorance.)

[...]

A brief explanation of Java versions

3 Mar

How does one make sense of Java’s version history for learners? The full story is at http://en.wikipedia.org/wiki/Java_version_history, but here’s the brief version:

First, be clear that there is only one Java language—one set of syntactical rules for how to write Java code. This set of rules has grown a few times since Java’s introduction in 1996, but valid Java code written in 1996 will still compile with today’s Java. However, the standard Java libraries have also evolved over the years, and some parts of the library have been deprecated (meaning you shouldn’t use them in new code you write because they are flawed and may be completely removed in future versions), so old code may need to be reworked to use modern libraries in place of deprecated ones.

While there is only one Java language, there are several implementations of the language, meaning there are different compilers, different JVM’s (Java Virtual Machines), and different implementations of the libraries. These varying implementations (mostly) all conform to the Java standards, but some have extra features and some have performance advantages over others. The most widely used Java implementations are from Sun, the originator of Java. Recently, Sun has begun releasing its Java implementation under the GPL (GNU General Public License), a free / open source license.

Java is, in truth, a standard and not just a particular line of software downloads released from Sun. For simplicity, though, we’ll only consider the Sun releases and their terminology, as Sun’s Java is the most popular implementation and the one learners should use.

Standard Edition (SE) vs. Enterprise Edition (EE) vs. Micro Edition (ME)

Java comes in three “editions”, which, strictly speaking, are specifications, not actual implementations of a JVM (Java Virtual Machine) and Java libraries. In practice, though, the terms are most commonly used to distinguish between the three Java implementations freely downloadable from Sun.

  • SE is the baseline JVM and libraries. This is the edition used by most PC end-users.
  • The JVM you get with EE is the same as the one in SE, but EE adds a large number of libraries for server-oriented programming. It generally doesn’t hurt to download and run Sun’s EE Java distribution, but as a learner, you’ll likely never use the EE libraries. I prefer to browse the SE documentation so I don’t have to wade through EE stuff I never use.
  • ME is Java adapted for resource-constrained (i.e. low memory, low power) devices, e.g. cellphones and PDA’s. Because of their typically low storage and memory capacities, such devices can’t afford to have all the libraries found in SE (and certainly not in EE). Small devices differ significantly from one another and so Sun leaves it up to device makers to offer proper implementations of ME for their particular platform. Sun has an ME reference implementation that runs on PC’s which is used for software development (just because you’re writing software for a cellphone doesn’t mean you want to write it on a cellphone!).

Java Development Kit (JDK) vs. Java Runtime Environment (JRE)

On PC’s, you have the choice of using Sun’s JDK or the JRE. The JRE, meant for end-users, comes in only the SE flavor and contains the JVM (Java Virtual Machine) and the SE standard libraries. The JDK comes in the three editions and includes not just the JVM and libraries, but also development tools, including javac (Sun’s java compiler), so this is what you’ll want as a programmer. If you have the JDK installed, you don’t need to install the JRE separately to run any Java software (though it’s likely your browser will have an embedded JRE of its own). (The term ‘JDK’ is an echo of the more general term ‘SDK’ (Software Development Kit)).

Version numbers

Without getting into the features added by the versions, here at least is the naming history:

  • The original release of Java was 1.0, released in 1996.
  • Then came 1.1 in 1997.
  • 1.2 in 1998 is when Sun split Java into the three editions. Sun began calling this version and subsequent versions Java 2, and so you will see references to J2SE, J2EE, and J2ME (Java 2 Standard Edition, Enterprise Edition, and Micro Edition, respectively). Sun began giving the releases codenames while they were in development, calling 1.2 “Playground”.
  • 1.3 in 2000 (confusingly still called Java 2). Codenamed Kestrel.
  • 1.4 in 2002 (confusingly still called Java 2). Codenamed Merlin.
  • Sun then decided the version numbers weren’t getting big enough fast enough, so they decided to call the next version, released in 2004, variously 1.5, 5.0, Java 5, or (most egregiously) J2SE 5.0, J2EE 5.0, or J2ME 5.0. Codenamed Tiger.
  • Sun finally drops the Java 2 business, deciding that, from now on, the public name of releases will be Java n while the internal development name will be 1.n.0. In late 2006, Java 6 is released, known internally as 1.6.0. Codenamed Mustang.
  • Planned for release in 2008 is Java 7 (1.7.0), codenamed Dolphin.

So, in summary, the sequence of most used designations goes: 1.0, 1.1, 1.2, 1.3, 1.4, Java 5, Java 6, Java 7.

As for the feature improvements these versions represent, in general, each release saw bug fixes and performance improvements for the development tools and JVM along with additions to and refinement of the libraries. Aside from these behind-the-scenes changes, Java 5 is the one release which significantly added new features to the language itself, including generics, annotations, autoboxing, enumerations, “varargs”, and the “enhanced for” loop. (None of these features have analogs in PygeonJava, as they are all in one way or another inessential conveniences.)