Monday, October 20, 2008

Abbreviations and Code Readability

James Leigh, in a recent blog post, makes a couple of good comments on the importance of code readability and the presence of redundancy.

The accompanying poll question (which asks if easily readable code is important), however, raises a deeper question. Of course easily readable code is extremely important, but the real question is how to achieve it.

For example, using abbreviations in identifier names is a poor way to make the names shorter and more concise.

Abbreviated names suffer from several problems, including:

1) ambiguity: is 'getReq' short for getRequest, getRequirement, or getRequisition?

2) cognitive burden: abbreviations require much more mental effort, because you have to remember which fragment of a word is being employed. This "ideolexical" design makes the API seem much more complex and daunting than it should.

As an example, is the abbreviation for 'declareDescription' going to be:

declareDescript,
declareDescrip,
declareDescr,
declareDesc,
declDescrip,
declDescr,
OR
declDesc?

3) lack of consistency: even with only one programmer creating the abbreviated identifier names, it seems highly probable that inconsistencies will creep into the naming scheme, making it harder to use.

4) loss of readability and documentation: longer names are often clearer and document the code better than abbreviations (or shorter names).
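
To make the ambiguity in point 1 concrete, here is a minimal Java sketch. The class name AbbreviationAmbiguity, the VOCABULARY list, and the expansions method are all hypothetical names made up for this illustration. The sketch simply assumes that an abbreviation is a leading fragment of a word, and lists every word in a small vocabulary that the fragment could stand for.

```java
import java.util.List;
import java.util.stream.Collectors;

class AbbreviationAmbiguity {

    // Hypothetical domain vocabulary, assumed only for this illustration.
    static final List<String> VOCABULARY =
            List.of("Request", "Requirement", "Requisition", "Description");

    // Returns every vocabulary word that the given fragment could abbreviate,
    // treating an abbreviation as a leading fragment (prefix) of the word.
    static List<String> expansions(String fragment) {
        return VOCABULARY.stream()
                .filter(word -> word.startsWith(fragment))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The fragment from 'getReq' matches three different words.
        System.out.println(expansions("Req"));          // [Request, Requirement, Requisition]
        // A full word matches only itself.
        System.out.println(expansions("Requisition"));  // [Requisition]
    }
}
```

The full name maps to exactly one word, while 'Req' maps to three, and the reader of 'getReq' is left to guess which one the author meant.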

In these days of IDEs there is little reason not to use longer, clearer, self-documenting names: it is trivial to start a name and then hit the appropriate completion key. Even if you program in a non-IDE (as I do; I use Emacs a lot of the time), the importance of good names as documentation cannot be over-emphasized and is well worth a tiny bit of extra typing.

2 comments:

James Leigh said...

Hi Tom,

While I agree that using abbreviations can be a poor way to make the names shorter, I find the best way to choose how to name a method or class is to model the name after a verbal name used in common conversations. For example, things like http://www.blogger.com/ are often referred to by the /abbreviation/ URL, and URL is therefore a perfectly fine variable name. While this is a very common abbreviation, other less common abbreviations are equally good in some domains. For example, the project BigData uses the abbreviation SPO to name classes that have a Subject, Predicate, and Object. Within the BigData community, SPO is a common way to refer to it.

In my post I used the example that a RepositoryConnection class might be a bit long. This stems from the fact that this class acts as a central hub and is often simply referred to as a "connection"; the prefix is therefore redundant, since it is usually omitted during conversation.

Tom Hicks said...

Yea, I agree, that's a good example of redundancy: the Class name is already constrained by the package in which it occurs and doesn't need to repeat it.