HOWTO Decompose your code into functions

28 Mar

To keep code readable and as comprehensible as possible, good functions conform to a few simple rules:

  1. Give it one purpose. And the counting of the purposes shall be one. Not two purposes, not three purposes. Five is right out.
  2. Continuing with the theme of having one purpose, subtasks should be split off into their own functions. How do you identify subtasks? A subtask is a contiguous section of code which uses its own set of variables (give or take a few) and for which you can come up with a name for what that section does. An even better hint that a chunk of code is a subtask is if it is repeated in multiple places in code (either within the same function or in different functions). Of course, going too far with splitting off subtasks into their own functions would produce an infinite regress, so we have to draw the line at some point: as long as a subtask can be confined to a few simple enough lines at a single place in code, its presence inside the function may be just fine.
  3. Give it a solid name (one which is strictly a verb phrase, or at least an ‘action’ phrase, e.g. ‘fileToString’) that describes clearly the one thing that the function does. Too many programmers shy away from long names, which is silly with today’s pervasive code completion editors. Better a name be long than cryptic. (The exception to this is when you know the function will be used often in complex expressions, as, say, math functions often are; in this case, you may want to keep the name pretty short.)
  4. Keep the logic simple. Whenever you see about 4 or 5 levels of nested logic, you should consider splitting some of the deeper logic off into its own function. (Rather than just counting nesting, a more accurate measure is cyclomatic complexity.)
  5. Keep it short. Once a function gets longer than will fit entirely on your screen, it becomes much harder to understand and work with. The optimum function length for code comprehensibility is probably around one-third of screen height (shorter than that and you get diminishing returns, as you trade greater function simplicity for a greater preponderance of functions).
  6. Keep down the number of local variables. Good functions rarely have more than several local variables. Now, exceeding this number is rarely a sufficient reason by itself to split a function up into multiple functions, but this symptom rarely arises in isolation, so it is something to watch for. Besides splitting the function up into multiple functions, there’s not much else you can do to treat this symptom.
  7. Keep down the number of parameters. Functions taking more than several parameters are ugly. To avoid parameter preponderance, you can do a few things: a) package the information the function needs into one single package (in the form of an array or record of some kind, depending upon your language); b) put the information in variables external to the function; c) the function simply may need to be split into multiple functions.
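
Rules 1 through 4 can be sketched with a small before-and-after. This is only an illustration, not a prescribed method: the word-counting task and every function name below are hypothetical, chosen so that each subtask has its own few variables and an obvious verb-phrase name.

```python
from collections import Counter


def report_top_words_monolithic(text, limit):
    # First draft: normalising, counting, ranking, and formatting
    # all live in one body, with nested logic in the loop.
    counts = {}
    for raw in text.split():
        word = raw.strip(".,;:!?\"'").lower()
        if word:
            if word in counts:
                counts[word] += 1
            else:
                counts[word] = 1
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    lines = []
    for word, count in ranked[:limit]:
        lines.append(f"{word}: {count}")
    return "\n".join(lines)


# After: each subtask gets one purpose and a name that says what it does.

def normalize_word(raw):
    """Strip surrounding punctuation and lowercase a single token."""
    return raw.strip(".,;:!?\"'").lower()


def count_words(text):
    """Count occurrences of each normalised, non-empty word."""
    words = (normalize_word(raw) for raw in text.split())
    return Counter(word for word in words if word)


def rank_by_frequency(counts):
    """Order (word, count) pairs by descending count, then alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def format_word_counts(ranked):
    """Render (word, count) pairs as one 'word: count' line each."""
    return "\n".join(f"{word}: {count}" for word, count in ranked)


def report_top_words(text, limit):
    """Top-level function: reads as a list of the named subtasks."""
    ranked = rank_by_frequency(count_words(text))
    return format_word_counts(ranked[:limit])
```

Whether four helpers is too many for a task this small is exactly the line-drawing judgment rule 2 describes; the point is only that each piece can be named, read, and reused on its own.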

These last two principles are the hardest to follow: in our structured languages, decomposing a problem into functions means not just dividing up the labor but also segregating the data, so the trick becomes to maintain access to variables where you need them (without resorting to making every variable global: the whole point of structured programming is for data access to be highly regimented).
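
One way to keep data access regimented without globals is option (a) from rule 7: bundle values that always travel together into a record. The sketch below assumes Python and uses a hypothetical `Rect` type and overlap check purely for illustration.

```python
from dataclasses import dataclass


# Before: eight loose parameters that always travel in groups of four.
def rects_overlap_loose(ax, ay, aw, ah, bx, by, bw, bh):
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


@dataclass(frozen=True)
class Rect:
    """A record packaging the four values that describe one rectangle."""
    x: int
    y: int
    width: int
    height: int


# After: two parameters, each a named package of related data.
def rects_overlap(a: Rect, b: Rect) -> bool:
    """Return True if the two rectangles share any interior area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)
```

The record also cuts down on local variables in callers, since a function can hold one `Rect` instead of four separate coordinates.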

The practical solution to this problem, and to the problem of decomposition in general, is pretty obvious, but too embarrassing for many programmers to admit: rather than write perfectly decomposed code on the first attempt, it’s much simpler (and more common) to just go with whatever first occurs to you and work from there. Indeed, the pseudocode decomposition process commonly taught in schools is unrealistic, for in practice, code is very rarely any good when you first type it out, no matter how well you plan; exceptions certainly occur, but they typically only do so when you’ve solved essentially the same problem(s) before; when it comes to tackling a problem you haven’t solved before, you’re destined to get things quite wrong on your first attempt 98% of the time.

So it’s important to embrace refinement as your most basic coding activity. Rather than writing good functions, the really important skill is rewriting bad functions.
