Naming Things

Thoughts on one of coding’s most elusive tasks.
By Ka Wai Cheung of DoneDone.

Introduction

Here’s a little insight into what I think about when I write code.

I’m working on—what I’ll call for now—a data migration feature. It’s for the brand new version of DoneDone, the task tracking and customer support app I’ve worked on for the past decade. Developing a product for this long, I’m intimately familiar with its code—like a sculptor chiseling away endlessly at a large piece of stone.

The goal of this migration feature is to give our customers an easy way to bring their existing data over from the old version of DoneDone (which we call Classic) to this new version. I wish I could tell you this is a simple mapping of database tables from Classic over to the new version, but it’s far from it. The new DoneDone is markedly different from its predecessor: some data maps simply, other data requires some massaging, and some stuff simply can’t be mapped at all.

For the next week, I chisel away at this feature from top to bottom. I develop a screen to sign in to the old system, one to let users choose the projects they want to move over, and one to see the progress of their migration requests. On the backend, I work on a number of database updates to store these requests. I then write a separate service that picks up these requests to perform the arduous work of moving this data over “cleanly”. There are other tangential pieces I build along the way, like emailing the requester when the migration is complete and broadcasting error notifications.

It’s intense work. But after a week, I feel confident about this new feature.

Before I’m ready to release it, I give my code another once-over—like re-reading a manuscript from the beginning with a fresh set of eyes. I tend to pick out things I don’t like about my code best this way.

The first thing I look at is naming. I try to use the same terms when I write code as when I talk about a feature. This avoids any unnecessary mental mapping when I transition between the screen and the rest of the world. So naturally, my codebase is littered with derivatives of the word migration.

There’s a ClassicMigrator project in my codebase.
Methods named QueueMigrationRequest() and MigrateClassicProjects().
Class properties like EligibleForMigration and HasMigratableProjects.
There are models, views, and controllers with every derivative of Migrate sprinkled around. The copy on the application uses the words migrate and migration too.

On this re-read, the word is eating at me. Mike (my business partner) and I have been using the word “migration” in reference to this feature the whole time. It’s an important word to get right.

Sometimes you use a word so much that you no longer think about what it actually means; you just know what it’s supposed to mean. Here’s the problem: We aren’t actually migrating data.

Migrate has this connotation that something is leaving one place to go to another, like a flock of birds migrating south for the winter. In our case, data isn’t leaving the Classic version. That data is still there—untouched—after the migration. I wrote it this way so existing customers can try the new system using their existing data, but if they don’t like it, they can stick with Classic.

Migrate is misleading. Using that word in the application copy might make customers apprehensive about their existing data. Using that word in code might confuse future developers about what the feature actually does.

I contemplate replacing migration with copy. It’s clear that copying doesn’t mean removing the original. But, this isn’t quite right either. As I mentioned earlier, this data transfer isn’t a literal copy. There are some things that don’t translate perfectly, or at all. Copying also seems like a fast, mindless operation—a simple CTRL+C CTRL+V exercise. That’s not what this is.

I tell Mike about my conundrum.

After some thought, he suggests we use the word import instead. Ah ha!

There’s a heftiness to the word import that feels right. Whenever I think about importing data, I envision metallic gear icons and the momentary spiking of CPU graphs. Even when you import things in the real world, it has that same feeling of heft—huge cargo ships meandering across the ocean lugging thousands of tons of goods.

The word import, as it’s normally used in technical terms, also doesn’t feel like data is leaving one place and going to another. I think of importing data from a file I’ve uploaded. I know the data on the file doesn’t disappear. I also don’t necessarily expect a one-to-one mapping between my data and the imported data. Import seems like the perfect word to me.

So, I end up substituting migrate, and all its various derivatives, with import. I’m much happier with this change.

I obsess over names. Finding that perfect name gives me the same kind of adrenaline boost I get after I’ve solved a difficult problem or figured out a much cleaner approach to an ugly solution.

My obsession with naming started from a quote I read some time ago. If you’ve written code long enough, there’s a good chance you’ve heard it as well.

“There are only two hard things in Computer Science: cache invalidation and naming things.”

Phil Karlton

Karlton was a software architect at Netscape. There is surprisingly little other information I could find about him, though I’m sure he’s done great work. But it’s this quote for which he will be forever remembered in the programming industry.

The first time I read this quote, I remember chuckling to myself. First, because I can think of many things that are difficult for me in programming. Second, because naming wasn’t initially among those things; I had never thought about naming as a difficult exercise. Yet, we all have written and read names that confuse, misdirect, conflate, or otherwise mystify us.

So, how do you name things well? Unlike so many other things in programming, an incoherent name won’t be caught by the compiler. There are no metrics for naming. A bad name won’t break your code. A good name won’t speed up your build.

Naming is elusive. It has a lot to do with gut, feel, style and even aesthetics. It is, in my humble opinion, the most subjective of technical subjects.

Though there are no metrics for good names, naming deserves as much attention as all the other skills we preach in programming—like good architecture, writing “clean code”, or rigorous testing. While these other practices are critical, they share a common drawback: you cannot see their payoff instantly. It’s only after digesting the codebase and working with it for a while that you reap their benefits.

On the other hand, a codebase with good names pays off immediately. They are the first things a programmer sees when reading new code. They make code more approachable. With modern tooling, you can also change the names of things safely and quickly.

This book is about how focusing on names can drive us toward better code—regardless of the languages, tools, or development environment you use. Many of the examples in this book come directly from, or were inspired by, real code I’ve written for DoneDone over the past decade. Let’s begin.

Imagining Objects

I think the term object-oriented programming is misleading.

Textbooks tend to explain concepts like classes and interfaces by using dogs, cats, and other domesticated cuddly animals. But, these aren’t realistic examples for most coders. Most of the things we call “objects” in code don’t have a direct physical representation in the real world.

We should really call it noun-oriented programming. A noun can be a person, place, thing, or idea. Most classes really are ideas with functionality.

Classes sometimes manifest because we find ourselves with a set of functions and properties that have a common purpose that we want to wrap up in a neat little bundle. But, these classes don’t always have a physical translation. This is where wishy-washy names like UserManager, MessagingHelper, and AppHandler are born.

Working through a codebase littered with these kinds of class names reminds me of working in a bloated organization where everybody is some form of middle-management. When should I direct a question to the Regional Vice President versus the District Assistant Principal?

When I read code like this, I have to dig into these classes to figure out what purpose they serve. When I know there’s some functionality out there I want to leverage, I have a harder time remembering where it lives. Was it in that helper doohickey or in the other manager thingamajig?

There are ways around this. Generic names might be a sign that the guts of the class belong elsewhere. For instance, maybe the methods inside that UserManager class can be moved into the User class itself. It might also be a sign that the class does too many things and needs to be split up into smaller pieces. Perhaps there are natural groupings inside that AppHandler class—one that handles initialization, one that handles routing, one that deals with exception handling, and so forth. More specific names can be derived from there.

If it’s neither of those cases, sometimes I just have to face reality: A class can be hard to name because it does something that doesn’t easily translate in the real world. That’s when a little imagination helps. Even when a class is responsible for something that only makes sense in my code, there’s usually some metaphorical noun out there I can apply to it to make it memorable. This makes it easier to recall when I need to revisit that “object” again.

The other day, I was looking at code I wrote a while back that allows DoneDone users to reset their password. The reset password function works like most other apps:

A user enters their email address in a “Forgot Password” form from the app.
They receive an email with a link containing an encrypted token embedded in the querystring.
When the link is clicked, the request passes the token to the server, where it is decrypted.
The information from the decryption determines whether the reset link is still valid and which user originally requested the reset.

On this re-read, I don’t like how the token concept is implemented. Bits of logic are sprinkled in too many places. The token is encrypted on one layer of the stack and decrypted on a completely different layer. There are also a couple of places where I write repeated code to check whether the token is still valid.

But, what makes me most hungry to clean this code up is that the concept is nearly identical to the process a user takes to complete their initial registration.

It’s clear that a better approach is to wrap up this token into a single object. Let the object handle all of it—the encryption, decryption, and validation of the token. Then, I can re-use it for both password resets and user registrations.

I get to a place I’m quite happy with. The guts of the object look something like this.

public sealed class AuthToken
{
  private readonly DateTime _utc_date_issued;

  public readonly int UserID;
  public readonly string EmailAddress;
  public readonly string EncryptedToken;

  public AuthToken(int user_id, string email)
  {
    UserID = user_id;
    EmailAddress = email.ToLower().Trim();
    _utc_date_issued = DateTime.UtcNow;
    EncryptedToken = // Omitted for simplicity...
  }

  public AuthToken(string encrypted_token)
  {
    try
    {
       EncryptedToken = encrypted_token;
       UserID = // Deduced from the token...
       EmailAddress = // Deduced from the token...
       _utc_date_issued = // Deduced from the token...
    }
    catch
    {
      throw new InvalidInput("Cannot decrypt token!");
    }
  }

  public bool IssuedWithinMinutes(int minutes)
  {
    return (_utc_date_issued.AddMinutes(minutes) > DateTime.UtcNow);
  }
}

Taking a quick walkthrough of this object, you’ll notice that there are two constructors.

One hydrates the properties of the object with a user_id and email of the user when a password reset (or registration completion request) is initiated. The timestamp of the token is set to the moment the instance is created. It also wraps all of this data together into an encrypted token.

The other hydrates the same object with the encrypted token. The token is decrypted and the other properties are deduced from the decryption. This is called during the request when a user clicks the link they received from the email.

The IssuedWithinMinutes public method allows code elsewhere to decide whether to honor the request. For instance, a password reset link might be valid for only ten minutes whereas a user registration link could be valid for a few hours.

I’m giddy with the promises of such an object. I’m able to clean up some duplicate logic used by both the password reset and user registration processes. Whereas the encryption and decryption process once lived in random helper methods on different layers of the stack, they now have a comfortable home.

The last hurdle, however, is a big one. What do I name this thing? This “object” is not a dog or cat. I don’t really know what this thing is.

My initial attempt, AuthToken, was half-hearted. I just wanted to get something down so I could finish the implementation. Reading this name again brings up all sorts of questions and lackluster answers.

Does “Auth” mean “Authorization” or “Authentication”? In this case, it kind of means both. That doesn’t really help.
Does this object represent the encrypted token? Kind of. It really represents the encrypted token in addition to the data the token represents. Calling it AuthToken while also having a property with the name EncryptedToken is confusing. It’s more than just the token. It’s easy to get into the trap of naming an object for only part of its reason for being.
What is this object supposed to be used for? In the lexicon of object naming, AuthToken is about as generic as UserManager.

This clearly isn’t the right name. But, what comparable thing possibly exists in the real world like this?

I begin to think about something that an authority creates for someone, who can later exchange this thing to do whatever the authority said they could do.

A ticket comes to mind. But, that conjures up thoughts of going to a sporting event or movie premiere, as if there’s a specific time and place to redeem it. It also makes me think of a parking ticket. Password resets and user registrations are neither particularly exciting nor dreadful. The metaphor doesn’t feel quite right.

A permit? A permit is valid for a set period of time and it lets someone do something agreed to by an authority until it expires. Plus, a permit is something given to you usually by a governing body, not your local movie theater or sporting venue. This feels spot on.

I end up changing the class name to PermitForUserUpdate. The implementation instantly feels more readable. For instance, the method IssuedWithinMinutes() reads more naturally when used in context. Permits in the real world are normally issued. Here’s how I can validate a password reset permit hasn’t expired yet.

var permit = new PermitForUserUpdate(encrypted_token);

if (!permit.IssuedWithinMinutes(10))
{
  throw new Exception("This permit is expired!");
}
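The idea that one permit can serve both flows, each with its own validity window, can be sketched roughly as follows. This is a hypothetical TypeScript rendering rather than DoneDone's actual code: encryption is left out (as it is in the listing above), the three-hour registration window is an assumed stand-in for "a few hours", and the `now` parameter exists only so the check can be exercised with fixed dates.

```typescript
// Hypothetical sketch: models only the expiry check, not the encrypted token.
class PermitForUserUpdate {
  readonly userId: number;
  readonly emailAddress: string;
  private readonly issuedAt: Date;

  constructor(userId: number, email: string, issuedAt: Date = new Date()) {
    this.userId = userId;
    this.emailAddress = email.trim().toLowerCase();
    this.issuedAt = issuedAt;
  }

  // True while fewer than `minutes` have passed since the permit was issued.
  // `now` is injectable so the check can be tested deterministically.
  issuedWithinMinutes(minutes: number, now: Date = new Date()): boolean {
    const expiresAt = this.issuedAt.getTime() + minutes * 60 * 1000;
    return expiresAt > now.getTime();
  }
}

// Assumed validity windows, based on the examples in the text.
const PASSWORD_RESET_MINUTES = 10;
const REGISTRATION_MINUTES = 3 * 60;

const issued = new Date(Date.UTC(2024, 0, 1, 12, 0, 0));
const thirtyMinutesLater = new Date(Date.UTC(2024, 0, 1, 12, 30, 0));
const permit = new PermitForUserUpdate(42, "  Ada@Example.com ", issued);

// The same permit, judged against two different windows.
const resetStillValid = permit.issuedWithinMinutes(PASSWORD_RESET_MINUTES, thirtyMinutesLater);
const registrationStillValid = permit.issuedWithinMinutes(REGISTRATION_MINUTES, thirtyMinutesLater);
```

Thirty minutes after issue, the permit is too old for a password reset but still good for completing a registration. The caller decides the window; the permit only knows when it was issued.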

When opportunities like these present themselves in your code, stop for a minute and see if you can find a replacement for that really vague class name. Renaming classes might require a little bit of imagination, but done repeatedly, your objects become more memorable and your codebase starts to read more fluidly.

Name Hunting

There never seems to be time to clean up code.

No client wants to pay for it. No product manager wants to see energy spent with no feature improvements. You sometimes have to clean up surreptitiously as you go. Do it behind their backs.

“Of course, many [managers] say they are driven by quality but are more driven by schedule...In these cases I give my more controversial advice: Don’t tell! Subversive? I don’t think so.”

Martin Fowler, Refactoring: Improving the Design of Existing Code

The good news is, you don’t need a long stretch of dedicated time to make positive impacts on your code. You can get a lot done in small spurts. One of my favorite exercises is one of the simplest: Find things to name. I look for bits of overexposed logic, then replace the logic with a method or property that I can define with a meaningful name. Done repeatedly, it can quickly make souring code sweet.

On the new DoneDone, we’re using Vue.js to deliver most of our front-end. I notice a conditional on a Vue element that looks like this:

<div v-if="!['xs', 'sm', 'md'].includes($mq)">
    ...
</div>

If you’re unfamiliar with Vue syntax, no big deal. The v-if attribute works just like a normal if statement, with a TypeScript expression inside of it. Inside an HTML element, it determines if that element should be rendered at all. In this case, I have a <div> that I only want to show if the statement !['xs', 'sm', 'md'].includes($mq) is true.

But, what does this code actually mean? Well, it evaluates to true if the “extra small”, “small”, or “medium” media query breakpoints are not hit based on the size of the browser. Put more meaningfully, it tells us if the current browser width is sized to at least the width of a normal desktop screen.

When I scan this statement, it looks cryptic. Out of place. Too specific in the context of the code around it.

I can quickly fix this line by swapping the logic with a well-named property, like isDesktopWidth.

isDesktopWidth(): boolean {
  return !['xs', 'sm', 'md'].includes(this.$mq)
}

Now, I get the satisfaction of cleaning up my original code with something much more approachable.

<div v-if="isDesktopWidth">
    ...
</div>

Not only does this read better, but I have a property I can reuse again in other parts of the application.
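If the same check is needed outside a component, one option is to pull it into a plain function. The sketch below is only an illustration: the `Breakpoint` type and the 'lg' and 'xl' names are assumptions, since the snippet above only tells us about 'xs', 'sm', and 'md'.

```typescript
// 'xs', 'sm' and 'md' come from the snippet above; 'lg' and 'xl' are assumed
// here purely for illustration.
type Breakpoint = "xs" | "sm" | "md" | "lg" | "xl";

const BELOW_DESKTOP: Breakpoint[] = ["xs", "sm", "md"];

// Says *what* is being asked (is this a desktop-sized viewport?) and hides
// *how* the answer is computed.
function isDesktopWidth(current: Breakpoint): boolean {
  return BELOW_DESKTOP.indexOf(current) === -1;
}

const onTablet = isDesktopWidth("md");
const onLaptop = isDesktopWidth("lg");
```

A component's computed property could then simply delegate to this function with its current breakpoint value.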

You can spot an opportunity like this from a mile away—an overly technical bit of code lying around without a proper home. I usually write code like this on the first pass, when I’m just trying to get a feature to work right and I don’t care about where all the pieces fit. But, if I never make that second pass, then things quickly turn ugly.

In my earlier days, it would be easy to forget to do that second pass because I got lost in the relief of simply getting code to work right—or I was already past the deadline I had set for myself to do so.

I don’t skip that second pass anymore. It’s this pass where I focus heavily on naming. Where I look to say what rather than how. Where the readability of my code improves dramatically. It’s as critical a step as the first.

Take a moment to look at your own code—regardless of where you are in your stack. You might be surprised how many bits of messy logic are sprinkled about that you could wrap up into a meaningful name.

Empowering your objects

Logic bits don’t always have to look overly technical to benefit from being replaced with a named method or property.

Even in a case where I might see business logic already using the well-named properties of an object, there’s usually a way I can name that piece of logic and push it back into the class definition. Here’s an example.

I have a Person class that houses some basic information used throughout my codebase.

public class Person
{
    public string FirstName;
    public string LastName;
    public DateTime LastAccessTimestamp;
    public AccountRole Role;
    ...
}

Instances of Person naturally spring up all over the place.

For example, on a person’s profile screen, I have an instance of Person named authedPerson used to display a user’s full name and a few links to other sections of the application, but only if they are an admin or owner in the account.

<div>
  <h2>@authedPerson.FirstName @authedPerson.LastName</h2>
  @if (authedPerson.Role == AccountRole.ADMIN || 
       authedPerson.Role == AccountRole.OWNER)
  {
      <a href="...">Edit</a> | <a href="...">Cancel</a>
  }
</div>

In another part of the application, I use a person’s first name and last initial to prep notification messages when they update a task.

var subject = person.FirstName + " " +  person.LastName.Substring(0,1) + ". updated the task.";

I also have a method inside of a security class that checks if a person has accessed the application within an hour. If not, I require them to log in again.

if ((DateTime.UtcNow - person.LastAccessTimestamp).TotalMinutes > 60)
{
    // Log out and send to the login screen.
}

There are a handful of other occurrences like the examples above, where little bits of business logic against a Person’s properties are sprinkled about. Most of these bits feel so inconsequentially minor—simple, one-line constructions and statements—you might not even consider them to be “business logic” at all.

For instance, in the profile screen example, displaying a person’s first and last name might not seem like logic, but it is—it represents a person’s full name. I can push this bit of logic back to the Person class itself and name it something meaningful.

public string FullName
{
  get
  {
    return FirstName + " " + LastName;
  }
}

The same can be done for the special format of the person’s name in the subject line of the notification message. I could call this an abbreviated name.

public string AbbreviatedName
{
  get
  {
    return FirstName + " " + LastName.Substring(0,1) + ".";
  }
}

Back on the profile screen, I can move the check for whether a person is an admin or owner as a property of the Person instead. In this case, the check determines whether this person has administrative access. HasAdminAccess is a sound name for this new property.

public bool HasAdminAccess
{
  get
  {
    return Role == AccountRole.ADMIN || Role == AccountRole.OWNER;
  }
}

The application access logic against the Person object within the security class can be pushed back to the class in a couple of ways. Here’s that conditional statement again.

if ((DateTime.UtcNow - person.LastAccessTimestamp).TotalMinutes > 60)...

There are a few ways I could go about moving this. The entire statement is asking if the person has been idle for more than 60 minutes, so I could take this entire statement and turn it into a boolean property off Person like so:

public bool HasPersonBeenIdleForMoreThan60Minutes
{
  get
  {
    return (DateTime.UtcNow - LastAccessTimestamp).TotalMinutes > 60;
  }
}

This cleans up all of the logic from the security method. But, the name feels way too specific. If someone were just inspecting the Person class, they might ask why such a specific property exists. In addition, if I change the requirements around the idle time, I might easily forget to change the name of the property.

I don’t like these tradeoffs. In this case, I’d rather pull back on the specificity of the property so it has a better chance of being reused and maintained well over time.

For instance, I could push just the calculation of the idle minutes into the object and call this property MinutesIdle.

public int MinutesIdle
{
  get
  {
    // TotalMinutes is a double, so truncate to whole minutes.
    return (int)(DateTime.UtcNow - LastAccessTimestamp).TotalMinutes;
  }
}

Or, I could convert the logic to a method and let the caller pass in the idle minutes to compare.

public bool IdleLongerThanMinutes(int minutes)
{
  return (DateTime.UtcNow - LastAccessTimestamp).TotalMinutes > minutes;
}

These two examples are both decent options. But, I like the first option—it feels more straightforward and reads more coherently.

With these updates, the Person class now develops into something a lot more powerful. Here’s what the full class now looks like with these additional, well-named properties.

public class Person
{
    public string FirstName;
    public string LastName;
    public DateTime LastAccessTimestamp;
    public AccountRole Role;
    
    public string FullName
    {
      get
      {
        return FirstName + " " + LastName;
      }
    }  
    
    public string AbbreviatedName
    {
      get
      {
        return FirstName + " " + LastName.Substring(0,1) + ".";
      }
    } 
    
    public bool HasAdminAccess
    {
      get
      {
        return Role == AccountRole.ADMIN || 
               Role == AccountRole.OWNER;
      }
    } 
    
    public int MinutesIdle
    {
      get
      {
        return (int)(DateTime.UtcNow - LastAccessTimestamp).TotalMinutes;
      }
    } 
    ...
}

By moving this logic into the Person class, it’s now easier to DRY up my codebase. There will likely be other places that require displaying a person’s full name or knowing whether they have administrative privileges. Those answers are already baked into the object itself.

Besides reuse, the biggest gain comes from the improved readability of my code. Here’s what the improved implementations look like. In my HTML markup, the business logic visually competes far less with the HTML around it.

<div>
  <h2>@authedPerson.FullName</h2>
  @if (authedPerson.HasAdminAccess)
  {
      <a href="...">Edit</a> | <a href="...">Cancel</a>
  }
</div>

The subject of the email notification can also be interpreted with one glance. You don’t spend time focusing on the details of how the person’s name is being displayed anymore.

var subject = person.AbbreviatedName + " updated the task.";

Finally, the conditional check on the person’s login date can be understood instantly, instead of having to parse (even if for a brief moment) through the date math.

if (person.MinutesIdle > 60)
{
  // Log out and send to the login screen.
}
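The same move works on the front end. Below is a rough TypeScript counterpart of the Person class, offered as a sketch rather than as code from DoneDone. The `MEMBER` role and the sample values are assumptions, and `minutesIdle` is written as a method that accepts the current time so it can be checked against fixed dates.

```typescript
// ADMIN and OWNER mirror the roles in the text; MEMBER is an assumed extra.
enum AccountRole {
  MEMBER,
  ADMIN,
  OWNER,
}

class Person {
  constructor(
    public firstName: string,
    public lastName: string,
    public lastAccessTimestamp: Date,
    public role: AccountRole
  ) {}

  get fullName(): string {
    return this.firstName + " " + this.lastName;
  }

  // First name plus last initial, e.g. "Grace H."
  get abbreviatedName(): string {
    return this.firstName + " " + this.lastName.charAt(0) + ".";
  }

  get hasAdminAccess(): boolean {
    return this.role === AccountRole.ADMIN || this.role === AccountRole.OWNER;
  }

  // Whole minutes elapsed since the last access (current time minus last access).
  minutesIdle(now: Date = new Date()): number {
    const elapsedMs = now.getTime() - this.lastAccessTimestamp.getTime();
    return Math.floor(elapsedMs / 60000);
  }
}

const lastSeen = new Date(Date.UTC(2024, 0, 1, 9, 0, 0));
const checkTime = new Date(Date.UTC(2024, 0, 1, 10, 30, 0));
const person = new Person("Grace", "Hopper", lastSeen, AccountRole.OWNER);

const subject = person.abbreviatedName + " updated the task.";
const mustLogInAgain = person.minutesIdle(checkTime) > 60;
```

The calling code reads the same way as the C# version: it asks for an abbreviated name or an idle time and never touches the string or date math.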

The best part of this work is that, once you get accustomed to the game, it’s not a heavy effort. It actually becomes a bit addictive. You can make these simple refactorings quickly and stop whenever the time you’ve devoted is up (or your manager comes back from lunch).

Keep hunting for places where you can corral bits of logic into meaningful names, whether as standalone properties or back into the objects they derived from. It will do wonders for the clarity of your code.

Breaking Methods Apart

As you add more parameters to a method, two problems tend to occur.

First, the method starts doing too much. There might be tricky conditional logic that would be better separated into its own methods. Second, the name of the method becomes increasingly vague or misleading. Whenever I’ve augmented a method to support a new feature, I revisit it to see if breaking it apart can help alleviate both problems—even if the updates seem minuscule.

For years, DoneDone has only allowed customers the option to cancel an account immediately—it was instantaneous and irreversible.

I have a method off of a billing repository class that’s responsible for invoking the cancellation when requested. The guts of the method are involved, but the method signature is about as simple a read as you can imagine.

public class BillingRepository
{
  public void CancelAccount(int account_id) { ... }
  ...
}

Over the years, we’ve had customers who’ve wanted to cancel their account at the end of their term, which could be several months out. Rather than remembering to cancel their account in a few months, they wanted the account to automatically cancel on the last day of their term.

I start implementing this by tacking on a parameter to CancelAccount(). Since there are now two cancellation options, canceling immediately or at the end of their current billing period, I choose the simplest parameter type that fills the need, the trusty boolean.

I decide to name it cancel_at_period_end. Now, I can pass in true to handle this new special case, and false to handle the original case.

public void CancelAccount(int account_id, bool cancel_at_period_end);

After I’ve implemented the new code to handle the update, I then update all existing references to this method that handle the immediate cancellations.

_billing.CancelAccount(account_id, false);

But something doesn’t feel right about this method signature.

At a glance, it’s hard to tell what the false parameter means. Having just written the updates, it makes sense to me now. But it might not to someone else (or to myself in a few days). They’ll have to look at the method signature and perhaps even drill into the method to be sure.

I add a comment above each call to CancelAccount() for clarity. It also helps differentiate between the new code I’ll be adding later to handle the new option of canceling at the end of the period.

// Cancel the account immediately...
_billing.CancelAccount(account_id, false);

Better. But comments never age well. The method call still feels strange. The standard cancellation case (canceling immediately) accepts the false parameter. Passing in false as the base case just feels odd—it’s as if I have to suppress something to perform the default action.

I can get around this pretty quickly though. Since I’m working with a boolean parameter, I can change its meaning so that the standard case passes in true and update my code accordingly.

I swap the cancel_at_period_end parameter with cancel_now and modify the implementation. Now, the default case passes in true.

public void CancelAccount(int account_id, bool cancel_now) { ... }

...

// Cancel the account immediately...
_billing.CancelAccount(account_id, true);

But, I’ve introduced a more onerous problem. By simply reading the CancelAccount() method signature, I can’t quite tell what passing in false would do. Would it cancel in a day? In a month? At the end of the period?

At this point, I’ve exhausted my options with the boolean parameter. While it allows for both options, the options aren’t clear from the method signature.

I often find this is the case with booleans when the concept they describe has two cases but the options aren’t really opposites. That’s the case here—the natural opposite of cancel_now isn’t cancel_at_period_end in the way the natural opposite of open is closed.

I could try introducing an enumerated value instead.

enum CancelType 
{
  NOW,
  AT_PERIOD_END
}
...

public void CancelAccount(int account_id, CancelType cancel_type);

...

_billing.CancelAccount(account_id, CancelType.NOW);

This is an improvement. On the plus side, I’ve gotten rid of the ambiguity issues I had with the boolean parameter. It also leaves me better positioned to introduce additional cancelation types in the future.

However, this just doesn’t feel like one of those features we’d continually augment in the near future. There just aren’t that many variations of canceling DoneDone that would make sense. And, now I’ve introduced a new type as well as a new parameter.

I ultimately decide to simplify things. I create two distinct cancelation methods and convey the type of cancelation in the method names.

public void CancelAccountNow(int account_id);
public void CancelAccountAtPeriodEnd(int account_id);

With this update, the standard and unique cancelation implementations both read clearly. There’s zero ambiguity in either what the method does or what the parameters mean.

I also get an additional benefit. Breaking the method out into two methods allows me to separate the implementations of each. In the original approach, I’d have to do something like this.

void CancelAccount(int account_id, bool cancel_at_period_end)
{
  if (cancel_at_period_end)
  {
    // Implementation for canceling at period end
  }
  else
  {
    // Implementation for canceling immediately
  }
}

The method body would not only be much longer, but it would have more than one responsibility. Breaking the methods apart not only clarifies their use, but will make finding and updating their implementations easier down the road.
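As a rough illustration of the split, here is a minimal TypeScript sketch with one method per cancellation type. The in-memory repository, the `Account` shape, and its field names are all assumptions made for the example; the real billing code is certainly more involved.

```typescript
// Hypothetical account shape for this sketch only.
interface Account {
  id: number;
  active: boolean;
  periodEnd: Date;
  cancelOn: Date | null; // scheduled cancellation date, if any
}

class BillingRepository {
  constructor(private accounts: Account[]) {}

  // Cancels right away: the account stops being active immediately.
  cancelAccountNow(accountId: number): void {
    const account = this.find(accountId);
    account.active = false;
    account.cancelOn = null;
  }

  // Leaves the account active and schedules cancellation for the end of the term.
  cancelAccountAtPeriodEnd(accountId: number): void {
    const account = this.find(accountId);
    account.cancelOn = account.periodEnd;
  }

  private find(accountId: number): Account {
    for (const account of this.accounts) {
      if (account.id === accountId) return account;
    }
    throw new Error("Unknown account: " + accountId);
  }
}

const termEnd = new Date(Date.UTC(2024, 11, 31));
const accounts: Account[] = [
  { id: 1, active: true, periodEnd: termEnd, cancelOn: null },
  { id: 2, active: true, periodEnd: termEnd, cancelOn: null },
];
const billing = new BillingRepository(accounts);

billing.cancelAccountNow(1);
billing.cancelAccountAtPeriodEnd(2);
```

Neither call site needs a comment or a flag to explain itself, and each method body holds only the logic for its own case.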

I find a similar opportunity arises when null values are passed to a method. I can usually create a new method to handle the null case with a much clearer name.

In DoneDone, I have a series of “bulk edit” methods that live in a services layer. Each accepts a list of task_ids and performs some action on the task they represent. One of these is a function to bulk update the tasks’ due dates.

Here’s the abbreviated signature. (Due dates are optional—hence the nullable DateTime? object representing the due date in the parameter list.)

public void UpdateDueDates(List<long> task_ids, DateTime? due_date, ...);

Crawling up to the application layer, here’s where I call the bulk edit due date method based on the user’s input:

switch (input.ActionChangeType)
{
  case BulkActions.UPDATE_DUE_DATE:
    _service.UpdateDueDates(input.TaskIDs, DateTime.Parse(input.Value),...);
    break;
    ... 
}

In a recent feature update, we wanted to explicitly add an option to remove due dates from all tasks. Because the UpdateDueDates() method already gives the option to pass in a null value, the update is straightforward:

switch (input.ActionChangeType)
{
  case BulkActions.UPDATE_DUE_DATE:
    _service.UpdateDueDates(input.TaskIDs, DateTime.Parse(input.Value), ...);
    break;

  case BulkActions.REMOVE_DUE_DATE:
    _service.UpdateDueDates(input.TaskIDs, null, ...);
    break;
    ... 
}

But, explicitly passing null makes the code less readable. When I revisit this line of code down the road, I have to trickle into the method to be certain of what null refers to. So why not eliminate the need altogether?

Back on the service layer, I create a new method specifically for removing due dates that wraps the original UpdateDueDates() method. And once again, I can give it a more meaningful name: RemoveDueDates().

public void RemoveDueDates(List<long> task_ids, ...)
{
  UpdateDueDates(task_ids, null, ...);
}

With this small addition, I can tidy up the application layer code. In context, the two actions around due dates are now much easier to parse.

switch (input.ActionChangeType)
{
  case BulkActions.UPDATE_DUE_DATE:
    _service.UpdateDueDates(input.TaskIDs, DateTime.Parse(input.Value), ...);
    break;
        
  case BulkActions.REMOVE_DUE_DATE:
    _service.RemoveDueDates(input.TaskIDs, ...);
    break;
    ... 
}

In both examples, adding a new method rather than relying on the parameters of an existing method was a fairly easy decision. In both cases, there was only one variant to the parameters, requiring one additional method.

If the number of variations is small (say, three or fewer), and these variations are unlikely to change over time, creating new methods that take the place of a parameter makes a whole lot of sense to improve your code readability.

The Shape of Code

Code has a certain shape to it. Its spacing. The indentations. Where line breaks are made. All of this stuff matters.

Good shape makes code easier—and more enjoyable—to read. So, at times, my reason for naming something is unapologetically superficial: It might have more to do with the code’s shape than the meaning of the name.

Consider this simple for loop.

for (int i=0; i < tokens.Length; i++)
{
  if (tokens[i].Used)
  {
    usedTokens.Add(tokens[i]);
  }
}

To me, this code has good shape. The importance of each variable is roughly proportional to its size. When I read this code, it doesn’t take me long to figure out that there is a tokens array, and used tokens in that array are added to a usedTokens list.

Here’s that same code block again with one name change. Instead of using the generic variable name i, I substitute it with a much more precise name, curIndexOfTokenArray.

for (int curIndexOfTokenArray=0; curIndexOfTokenArray < tokens.Length; curIndexOfTokenArray++)
{
  if (tokens[curIndexOfTokenArray].Used)
  {
    usedTokens.Add(tokens[curIndexOfTokenArray]);
  }
}

If I think only in terms of clarity, then the code should be an improvement. Certainly, curIndexOfTokenArray is clearer than i. But, is it better?

While the current index is a critical anchor of a for loop, representing it in a verbally meaningful way isn’t. For one thing, an index is common to every kind of for loop. In addition, its scope is small—it only exists within the small loop. If someone were really confused about the variable name, they’d only need to look around a small visual radius to get re-familiarized.

The stars of the show here ought to be the tokens array and usedTokens list. Adding more description to i brings a peripheral stage crew member into the spotlight. The lines are altogether more difficult to digest—a whole lot of verbosity to explain something quite simple.

The original version values the shape of the entire statement over the specificity of the variable name. There are certain times where this is a better trade-off. This is one of them.

Here’s another example. In DoneDone, systemTimeZones represents a collection of objects, each describing a time zone. The method below loops through this collection, then extracts time zone information to build a list of DropDownComponents while marking the passed-in time zone as Selected:

public List<DropDownComponent> BuildTimeZonesDropDownList(string selected_time_zone)
{
  var result = new List<DropDownComponent>();
  var systemTimeZones = TimeZoneInfo.GetSystemTimeZones();

  foreach (var systemTimeZone in systemTimeZones)
  {
    var comp = new DropDownComponent(systemTimeZone.Id, systemTimeZone.DisplayName);

    comp.Selected = (systemTimeZone.Id == selected_time_zone);

    result.Add(comp);
  }

  return result;
}

The immediate problem with this code is how repetitive some of the variable names look. There are several names in this method that all look similar at a glance:

The systemTimeZones collection
The systemTimeZone object scoped in the foreach loop
The selected_time_zone parameter
The GetSystemTimeZones() method call.

Scenarios like these are tricky because each name, in isolation, is appropriately descriptive. No single name is egregiously lengthy, misleading, or overly detailed.

The problem, however, is when we step back and read the method as a whole. It reminds me a lot of the lines of a Dr. Seuss story. One of my kids’ favorites is Hop on Pop, which begins:

UP PUP - Pup is up.
CUP PUP - Pup in cup.
PUP CUP - Cup on pup.

Dr. Seuss, Hop on Pop

The lines of Dr. Seuss, of course, are intentionally dizzying. My kids love to get lost in the pattern of the words and giggle at the absurdity of its rhythm. But, reading dizzying code at 5pm isn’t that fun. For this, I need to make some name improvements.

The most immediate and impactful name change I can make is with the name of the systemTimeZone object. It appears four times within the method; no other similarly named construct appears more than twice. Also, because its reach is small (only scoped to the foreach loop), I can get away with a less descriptive name without doing much harm to the reader’s understanding.

I try reducing systemTimeZone to something shorter, like tz:

public List<DropDownComponent> BuildTimeZonesDropDownList(string selected_time_zone)
{
  var result = new List<DropDownComponent>();
  var systemTimeZones = TimeZoneInfo.GetSystemTimeZones();

  foreach (var tz in systemTimeZones)
  {
    var component = new DropDownComponent(tz.Id, tz.DisplayName);

    component.Selected = (tz.Id == selected_time_zone);

    result.Add(component);
  }

  return result;
}

I like this change already. Now, the details of the foreach loop are much easier to scan. In addition, all of the other similarly-named constructs benefit. They’re given more room so that the similarity in their names isn’t as distracting on the eyes as it was before.

Just like in the prior example, the variable tz feels sized appropriately. Conceptually, a small name like tz feels like just one element in a longer-named collection like systemTimeZones. These are the kinds of subtle visual cues that all lend themselves to good code shape.

Next, I decide to pare down the variable systemTimeZones to just timeZones. I originally named this list systemTimeZones to follow how the .NET framework named the method I’m using (GetSystemTimeZones()). But, the word system doesn’t add any helpful meaning here. This change also helps better differentiate it from the selected_time_zone parameter used inside the foreach loop.

public List<DropDownComponent> BuildTimeZonesDropDownList(string selected_time_zone)
{
  var result = new List<DropDownComponent>();
  var timeZones = TimeZoneInfo.GetSystemTimeZones();

  foreach (var tz in timeZones)
  {
    var component = new DropDownComponent(tz.Id, tz.DisplayName);

    component.Selected = (tz.Id == selected_time_zone);

    result.Add(component);
  }

  return result;
}

I could go further and rename the other variables more uniquely, but I don’t feel it’s necessary. When I read the updated method, the dizzying effect is gone. Mission accomplished.

One last example. In DoneDone, there are places where I write code to build up larger segments of text. For instance, I use a StreamWriter to generate documentation for our API and a StringBuilder to write lines of dynamic text into an email object.

Here’s a case where I’ve used a Document object to build up our search catalog. The search catalog library I’m using is a port of the well-known Java search library Lucene, so naturally, a name like lucene_document feels appropriate here. Here’s a bit of the code snippet that builds an instance of Document before it’s added to the search catalog:

var lucene_document = new Document();

lucene_document.Add(new Field(_FIELD_item_event_id, item_event.ItemEventID...);
lucene_document.Add(new Field(_FIELD_item_id, item_event.ItemID....);
lucene_document.Add(new Field(_FIELD_project_id, item_event.ProjectID...);
lucene_document.Add(new Field(_FIELD_creator_id, item_event.CreatorID...);
lucene_document.Add(new Field(_FIELD_created_on, item_event.CreatedOn...);
lucene_document.Add(new Field(_FIELD_is_convo_thread, item_event.IsConvoThread...);
lucene_document.Add(new Field(_FIELD_is_non_convo_creation, item_event.IsNonConvoCreation...);
lucene_document.Add(new Field(_FIELD_item_title, item_event.Title...);
lucene_document.Add(new Field(_FIELD_item_event_desc, item_event.PlainTextDescription...);

Is it specific? Yes. But, at a glance, the name lucene_document is overwhelming. It pushes the interesting part of the code further to the right without adding much information. I decide to trim this name down to something much smaller, like doc.

var doc = new Document();

doc.Add(new Field(_FIELD_item_event_id, item_event.ItemEventID...);
doc.Add(new Field(_FIELD_item_id, item_event.ItemID....);
doc.Add(new Field(_FIELD_project_id, item_event.ProjectID...);
doc.Add(new Field(_FIELD_creator_id, item_event.CreatorID...);
doc.Add(new Field(_FIELD_created_on, item_event.CreatedOn...);
doc.Add(new Field(_FIELD_is_convo_thread, item_event.IsConvoThread...);
doc.Add(new Field(_FIELD_is_non_convo_creation, item_event.IsNonConvoCreation...);
doc.Add(new Field(_FIELD_item_title, item_event.Title...);
doc.Add(new Field(_FIELD_item_event_desc, item_event.PlainTextDescription...);

Now, the code feels less cluttered. The interesting parts are quicker to get to. It’s much easier to consume.

When you edit names, don’t look at them in isolation. That habit can drive naming decisions that actually harm the overall readability of the surrounding code.

Instead, look at the context in which these names live. Find out what makes a section of code difficult to read and solve the larger problem. It might be a more descriptive variable name, but it might also be a terse one. Let the full context drive those decisions.

In the previous examples, I didn’t value the precision of certain variable names as much as I did the shape of the lines around them, particularly because the scope of those variables was small or the extra precision added little context.

When editing prose, you read whole sentences and paragraphs to get a sense of readability and style. I find code writing to be similar.

When Opposites Confuse

The English language has enough breadth that most words have a clear opposite.

For a piece of functionality that allows you to move a file out of the trash, you don’t have to say undelete; restore makes perfect sense. It’s clear that a file that can be restored has already been deleted or removed.

But finding the opposite isn’t always straightforward.

We have a new concept in DoneDone called workflows. A workflow defines a series of statuses available for a task. A typical workflow might have statuses like “Open”, “In Progress”, “Not Reproducible”, “Closed” and so forth. We let users create workflows to tailor them to their particular business processes.

A workflow has its own status too. It starts as unpublished. When you’re ready to use the workflow on a project, you must publish it. Simple enough.

But, once a workflow is published, you can still unpublish the workflow. There are a few caveats to this, so I decide to wrap this logic inside of a convenient little method in the Workflow class. My first attempt at a method name is IsWorkflowUnpublishable().

Quickly, I realize there’s something unsavory about this name. Someone could easily mistake this name to mean that a workflow cannot be published rather than that a workflow can be unpublished. Those are, of course, two very different things.

The problem is that there just isn’t a good adjective that means the opposite of published. Draft might be the best way to describe this concept. It’s something I’ve seen before in blogging tools for instance. But a method name like IsWorkflowDraftable() or IsWorkflowAbleToBeInDraftMode() feels like I’m headed in the wrong direction fast.

In cases like this, I look for a completely different angle to the name. After staring at my options for a few minutes, I realize the obstacle lies with the prefix IsWorkflow. It forces me into having to come up with an adjective to describe the state of the workflow I want to achieve, and there aren’t any good ones that make the method name read clearly.

Instead of starting the name with the adjective-requiring “is”, I start with the verb-requiring “can” and come up with CanUnpublishWorkflow().

Now I might be onto something! Notice I still use the word unpublish, but as a verb the intent is no longer ambiguous. It’s clear we are checking whether this workflow can be unpublished as opposed to checking whether the workflow can’t be published.

If the opposite version of a name makes it ambiguous, instead of pushing harder on finding a different name, see if you can reword it altogether. You might already have all the pieces you need without knowing it.
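To make the difference concrete, here’s a minimal sketch of how such a method might sit inside a Workflow class. The caveats for unpublishing aren’t spelled out above, so the rule used here (a workflow can’t be unpublished while projects still use it) and the ProjectsUsingWorkflow property are purely my assumptions for illustration.

```csharp
using System;

// Hypothetical sketch; the real Workflow class and its rules are not shown in the text.
public class Workflow
{
    public bool IsPublished { get; private set; }

    // Assumed caveat, for illustration only: how many projects rely on this workflow.
    public int ProjectsUsingWorkflow { get; set; }

    public void Publish()
    {
        IsPublished = true;
    }

    // Reads as "can I unpublish this workflow?" rather than
    // "is this workflow un-publishable?"
    public bool CanUnpublishWorkflow()
    {
        return IsPublished && ProjectsUsingWorkflow == 0;
    }

    public void Unpublish()
    {
        if (!CanUnpublishWorkflow())
        {
            throw new InvalidOperationException("This workflow can't be unpublished right now.");
        }

        IsPublished = false;
    }
}

public static class Demo
{
    public static void Main()
    {
        var workflow = new Workflow();
        workflow.Publish();

        // The call site reads as a plain question.
        if (workflow.CanUnpublishWorkflow())
        {
            workflow.Unpublish();
        }

        Console.WriteLine(workflow.IsPublished);
    }
}
```

At the call site, the verb-based name leaves little room for the "cannot be published" misreading.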

The Tautologous Name Trap

The French philosopher René Descartes once said “I think, therefore I am.” What makes for a great philosophical statement to ponder also makes for an unhelpful line of code to read.

I’m working on a piece of code that processes incoming emails for DoneDone. One of its responsibilities is to send an auto-response email back to the original sender if certain conditions are met.

In the first iteration of this feature, I send the auto-response only if the email was received outside of a company’s office hours. I write a method named isCurrentlyOutsideOfficeHours() which queries the company’s work hours and figures out if the current time falls outside of them.

In my incoming email handler, I call this method to determine if DoneDone should send an auto-response.

if (isCurrentlyOutsideOfficeHours()) 
{
  sendAutoResponse();
}

The method name isCurrentlyOutsideOfficeHours() makes the conditional statement read coherently. Even a non-programmer can read the statement above and guess what it does: “If it is currently outside of the office hours, then send an auto-response.”

Suppose sometime later we decide to allow users to configure company holidays as well as office hours. This way, the auto-response will always be triggered during a company holiday regardless of the time of day.

Back under the hood, I add the new functionality. Now, the check for whether to send the auto-response has this new condition.

if (isCurrentlyOutsideOfficeHours() || isCompanyHoliday()) 
{ 
  sendAutoResponse(); 
}

Because I’ve added more complexity to the auto-response logic, I might push the conditional expression into its own method and call the new method in place of the expression. At first, a name like shouldSendAutoResponse() sounds completely sensible. Here’s what that would look like:

private bool shouldSendAutoResponse()
{
  return isCurrentlyOutsideOfficeHours() || isCompanyHoliday();
}

With a new method extracted, I can replace the original conditional so my code is tidy again.

if (shouldSendAutoResponse()) 
{ 
  sendAutoResponse(); 
}

But, do you spot the new problem? Descartes might. The statement, while tidy, has lost all of its meaning. Of course we send the auto-response if we should send the auto-response!

When you read the conditional in isolation, you don’t know why the auto-response is being sent. You need to trickle into the shouldSendAutoResponse() method to find out. The extraction doesn’t give us better comprehension—it just tucks away logic.

I find this happens a lot with these quick extraction exercises. My immediate inclination is to name the newly extracted method after the outcome of the conditions being met rather than the meaning of the conditions. I name the method after the effect rather than the cause.

Not only does this create tautologous statements, but it’s less likely I’ll reuse the new construct somewhere else. At a glance, I wouldn’t think to employ shouldSendAutoResponse() anywhere else besides the place in code where I want to send auto-responses. If I had named the method after what causes the sending of the auto-response, I’d better my chances of reuse later.

So, why are we sending the auto-response? In this case, the cause of sending an auto-response message is that the office is closed. Let’s try that.

bool isOfficeClosed()
{
  return isCurrentlyOutsideOfficeHours() || isCompanyHoliday();
}

...

if (isOfficeClosed()) 
{ 
  sendAutoResponse(); 
}

The conditional now reads much more meaningfully. In addition, isOfficeClosed() is a method that has far more obvious applications than shouldSendAutoResponse() does.

Tautologous conditionals aren’t necessarily bad, though. There are times where describing the effect is the cleanest option. Let’s continue with this example.

Suppose that we introduce a few more conditions to determine whether to send an auto-response. Namely, we let an account toggle auto-responses altogether and we also want to exclude auto-responses to emails that are flagged as spam. My first step is to augment the conditional one more time.

if (isOfficeClosed() && autoResponseEnabled && !_email.IsSpam) 
{ 
  sendAutoResponse(); 
}

The conditional expression bloats up again—it seems ripe for packaging things up. But, I have trouble finding an elegant solution. Is there a meaningful name that could consolidate all (or some) of the expression isOfficeClosed(), autoResponseEnabled, and !_email.IsSpam? There doesn’t appear to be a common relationship other than they all factor into sending an auto-response.

I can either leave things as is, or name the entire expression for what it affects. In this case, I choose the latter.

I live with the tautologous statement to keep the code succinct, even though I’m unlikely to reuse this method elsewhere.

bool isOfficeClosed()
{
  return isCurrentlyOutsideOfficeHours() || isCompanyHoliday();
}

bool shouldSendAutoResponse()
{
  return isOfficeClosed() && autoResponseEnabled && !_email.IsSpam;
}

if (shouldSendAutoResponse()) 
{ 
  sendAutoResponse(); 
}

I make tradeoffs with names depending on how a section of code evolves. When I can find an appropriate name for the cause, I choose it. When I can’t, I might decide that naming an extraction for its effect is still better than not extracting it at all.

Names are Fickle

Keeping a codebase with well-intentioned names is often just about remembering to do so.

Every time I make a change to a codebase, I reconsider the names of all the pieces I touched. It’s easy to leave working code as-is without considering the debt I’ve just handed over to the next person reading this code.

A while back, I had a method that updated various pieces of account data, which I aptly named UpdateAccountInfo:

public void UpdateAccountInfo(int account_id, string account_name, int owner_id, byte[] logo) { ... }

As you can probably tell by the signature, this method lets you change the name, owner, and logo tied to the account with the passed-in account_id.

At some point, it became advantageous to handle the uploading of the logo somewhere else. As part of this update, I pulled the logo parameter out of this method.

public void UpdateAccountInfo(int account_id, string account_name, int owner_id) { ... }

Now, we want to allow for accounts to have multiple owners. Since the change is fairly large, I decide it best to manage owners in an entirely separate part of the application. Part of this update naturally requires removing the owner_id from this method’s signature.

public void UpdateAccountInfo(int account_id, string account_name) { ... }

In the flurry of updating code, I might leave UpdateAccountInfo() named as is. But, stripped of most of its original responsibilities, the name doesn’t feel right anymore. It would be much more precise to rename it UpdateAccountName(), since that’s all it does anymore. It also doesn’t hurt to shorten the parameters account_id and account_name to just id and name, since it’s obvious at this point what those parameters refer to.

public void UpdateAccountName(int id, string name) { ... }

This change sounds obvious to make, but it’s only because I’ve isolated the discussion of what changed to just this lone method—not the myriad of method additions, refactorings, and adjustments that come as a natural part of every kind of change we make to our code.

After you move code around your application to get the pieces fitting just right, revisit how you’ve named the methods, properties, and classes that have undergone the facelift. Do these names still make sense? Do the comments around these methods still apply?

“One of the biggest sins you can commit is to stop programming when it works.”

Brandon Rhodes, PyCon 2013

When you’re in the same code daily, you might not even notice that the name of a variable or method is antiquated because you’re so familiar with it. But, to someone coming into the codebase fresh (or, if you happen to take a few weeks off and come back to it later), misleading names become huge hurdles to their understanding of the system.

Nothing will automatically remind you to do this but your own habits. There won’t be a failed unit test or compiler warning telling you that a construct’s name is no longer relevant. That’s why even well-tested code atrophies over time. Good semantics are not something you get for free.

Abstracting Too Soon

Changing software architecture is a bit like driving a stick shift.

When the codebase is small and the features are light, I can write concretely. But, as features evolve, I begin to introduce abstractions as a way of absorbing the complexities of the system better.

Deciding when to create more abstraction is the difficult part. Like shifting gears on a car, if I abstract too slowly, the system starts to break down. Abstract too quickly, however, and it takes forever to accelerate even to the smallest degree. Knowing when the right time is requires remembering what landmarks I’ve already passed and a little foreknowledge of what’s coming down the road. It’s a delicate art.

Naming should also undergo the same kind of scrutiny.

A crutch of mine is trying to broaden the meaning of a name too quickly. Perhaps it’s my desperate attempt to get a name right, once and forever. But abstracting a name too soon usually inflicts more pain than comfort, just like abstracting code too soon.

I’m going through a refactoring in DoneDone’s codebase to consolidate a few objects that hold very similar kinds of information, but in slightly different ways. They all revolve around the history of events that happen to a given task.

Ultimately, the properties of these objects manifest in a few different places. On a task’s detail page, the collection of events for a single task is written out from a TaskDetailEvents list. On the activity calendar, a list of events on a given day is extracted from a list of EventHistory. When someone updates a task, yet another event object, TaskMessagingEvent, is packaged up. A mailing service unpacks the data and formats the right bits of information to send email updates.

On the surface, each of these objects looks very similar. My guess is that they could all be consolidated into one class.

After a few hours of passing the code through the refactoring press, I’ve reduced these classes into a single TaskEvent class that can support the needs of each of these three distinct use cases and more going forward. This also gives me an opportunity to consolidate similar names in the old objects like CreatedOn, CreatedDate, or CreateDate into one consistent name. I’m feeling pretty good.

In the process of this consolidation, however, I discover one particular string off of TaskMessagingEvent that doesn’t have an equivalent in the other objects. It’s currently called an EmailSubjectAction—a short description used as the beginning of the email subject, like "New fixer" or "Priority Update" or "Closed".

The dilemma I face here is what exactly to call this string. At the present moment, it’s only used by the mailing service and not by either the detail page or activity calendar. But, since I’ve done all the pruning to get toward a single object, I’m compelled to come up with a name that can satisfy all future implementers if they ever need this property. EmailSubjectAction feels way too specific for this consolidated object.

My natural inclination is to call this something like AbbreviatedAction, especially because it juxtaposes nicely with the Action attribute that stores a more verbose description of the action (“John Doe was assigned to fix the task” or “The priority was updated to Critical”). I go with this name.

But, after a few days, I find myself coming back to this property a couple of times because I can’t remember exactly what AbbreviatedAction refers to, or what it’s used for. I decide to comment the property to remind me it’s only being used by the mailing service as part of the subject of the email.

/* 
   Only being used by the mailing service right now as part of the email subject,
   but I’m sure it will have other uses one day. 
*/
public string AbbreviatedAction
{
  get { ... }
}

Whenever I have to remind myself what a name means, and eventually resort to a comment, there’s a good chance I’ve prematurely abstracted the name.

Instead of finding a name that simultaneously fits anything but nothing in particular, I decide to go back to what it was originally named. If I’m honest with myself, EmailSubjectAction works a lot better.

public string EmailSubjectAction
{
  get { ... }
}

You might argue that I should move this particular property to its own subclass that inherits the TaskEvent class. This way, I can use this subclass specifically for the emailing case. But, the overhead of managing another class just for this one unique property exposed isn’t worth the tradeoff right now.
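For comparison, here’s a rough sketch of what that subclass alternative might look like. The class name EmailTaskEvent is my own invention for illustration, and TaskEvent is trimmed down to a couple of properties mentioned above. The real class surely carries more.

```csharp
using System;

// The consolidated event class, reduced to a few illustrative properties.
public class TaskEvent
{
    public DateTime CreatedOn { get; set; }

    // Verbose description, e.g. "The priority was updated to Critical"
    public string Action { get; set; } = "";
}

// Hypothetical subclass that only the mailing service would use.
public class EmailTaskEvent : TaskEvent
{
    // Short description that begins the email subject, e.g. "Priority Update"
    public string EmailSubjectAction { get; set; } = "";
}

public static class Demo
{
    public static void Main()
    {
        var taskEvent = new EmailTaskEvent
        {
            CreatedOn = DateTime.UtcNow,
            Action = "The priority was updated to Critical",
            EmailSubjectAction = "Priority Update"
        };

        Console.WriteLine(taskEvent.EmailSubjectAction + ": " + taskEvent.Action);
    }
}
```

It’s tidy on paper, but it means a second class to construct, map, and pass around for the sake of a single string.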

You might also argue that such a name might hide the full capabilities of the property. One day, I’ll need the very thing that EmailSubjectAction brings to the table (a terse description of a task action) for something entirely unrelated to emails, and I will introduce an identical version of that construct and name it something else.

In my experience, I find this to be the exception rather than the rule. New features are rarely developed completely isolated from anything else going on with the existing application. For instance, when we introduced in-app notifications a few months later, much of the content was inspired by what was already being displayed in our emails, including the EmailSubjectAction. It was far easier for me to find the property because its name hadn’t been abstracted.

At that point, because the construct was being used in multiple ways, I was able to justify the broader name change to AbbreviatedAction.

Speaking the Native Tongue

People who write code today are often the same ones making product decisions or collaborating directly with clients. This makes it even more critical that programmers understand why they’re building what they’re building.

One of the best ways to encourage this is to infuse the language of the business directly into your code.

If I’m building software to manage a law firm’s organization, I want the names of my constructs to align as closely as possible with the words law professionals use to describe their own concepts. Practice Areas instead of Topics. Practice Groups instead of Teams. Attorneys and Paralegals instead of Employees. This way, I don’t have to make the mental mapping between what the client calls something versus what I do.

Business concepts should also drive the kind of structures you introduce into a codebase.

If I were building a patient management system for a medical clinic, I might define a doctor’s patient list with a structure like List<Patient>. It seems reasonable enough. I get all the normal functions associated with a List out-of-the-box, like Add(), Remove(), Count() and so forth.

But, doctors typically do not call these things patient lists, they call them patient panels. Doctors speak to how their panels are full or when new spots in their panel will open up.

Because medical professionals have defined a specific concept called panels already, it’s almost certain there are specific attributes tied to the panel itself—not just a convenient shorthand to a list of patients.

As it turns out, panels aren’t just assigned to doctors, but to medical assistants and health educators. Panels also have a certain optimal number of patients (so that all clinicians generally see the same number of patients). Panels can have openings or be full. There are a lot of other attributes tied to the concept of a panel.

What may have started out, in my mind, as a list of Patients has now become something a whole lot richer.

public class Panel
{
  private List<Patient> patients;  
  
  private Doctor doctor;
  private HealthEducator healthEducator;
  private List<MedicalAssistant> medicalAssistants;
  
  private bool hasOpening;
  ...
}

To add a patient to a panel, my first sketch at a method signature might be something like:

public class Panel
{
  public void Add(Patient p)
  {
    patients.Add(p);
  }
  ...
}

Certainly, Add is clear, and it follows directly from the Add() method associated with the List. But, doctors usually talk about patients subscribing to a panel. So, Subscribe is a far more fluent approach.

public class Panel
{
  public void Subscribe(Patient p)
  {
    patients.Add(p);
  }
  ...
}
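Pulling these pieces together, here’s a minimal sketch of how a Panel might start to carry its own business rules once it stops being a bare list. The optimalSize field, the computed HasOpening property, and the rule that a full panel rejects new subscriptions are all my assumptions for illustration. A real clinic would have its own policies.

```csharp
using System;
using System.Collections.Generic;

public class Patient
{
    public string Name { get; set; } = "";
}

public class Panel
{
    private readonly List<Patient> patients = new List<Patient>();

    // Assumed: the optimal number of patients for this panel.
    private readonly int optimalSize;

    public Panel(int optimalSize)
    {
        this.optimalSize = optimalSize;
    }

    // A panel has an opening until it reaches its optimal size.
    public bool HasOpening
    {
        get { return patients.Count < optimalSize; }
    }

    public void Subscribe(Patient p)
    {
        // Assumed rule: a full panel can't take new patients.
        if (!HasOpening)
        {
            throw new InvalidOperationException("This panel is full.");
        }

        patients.Add(p);
    }
}

public static class Demo
{
    public static void Main()
    {
        var panel = new Panel(1);
        panel.Subscribe(new Patient { Name = "Nora" });

        Console.WriteLine(panel.HasOpening);
    }
}
```

None of this behavior would have had an obvious home on a plain List<Patient>.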

These name and construct choices seem inconsequential at first. But, as the codebase grows and the functionality gets more complex, the benefits become more significant. By naming concepts in code the way your clients and customers talk about them…

You remove an unnecessary layer of interpretation.
You can have more productive discussions with your clients.
You might even be able to show your client working code to better explain complexities and hurdles in logic they may not have thought of.
You have a better grasp of the business concepts you have to maintain.

Naming the relationships between concepts is a particular area that can be easily overlooked. For instance, if you’re familiar with relational data modeling, you know one of the key concepts is the associative table (or associative entity).

An associative table’s core characteristic is that it contains two or more foreign keys to tables that have a “many-to-many” relationship. For instance, a student enrolls in multiple classes and a class can have multiple students. You might represent students in a Students table and another in a Classes table. The associative table would hold the relationship between those two tables alongside any other data relevant to that relationship (like a student’s current grade in that class).

It’s customary to name such an associative table by gluing the adjacent table names together. This approach usually creates a palatable table name like StudentsClasses. It reads decently.

Let’s take another example. I have an associative table that links a Users table with an Accounts table. A user can belong to many accounts and an account can have many users. If I blindly name the associative table that links these two tables together, I’d call it a UsersAccounts table.

Down the road, I might track other pieces of data, like the date someone was added to the account or the particular role a user has in a specific account. Both those columns would fit naturally in the UsersAccounts table.

But saying “Nora’s user account began on March 1st, 2017” is much less natural than saying “Nora’s membership began on March 1st, 2017” or “Nora’s membership has admin privileges.” The associative table is far clearer with a name like Memberships.

A UsersPublications table might be better named Subscriptions. A CustomersProducts table could be PurchaseOrders. A PassengersFlights table could be Bookings.

And this isn’t limited to table naming. Object-relational mappers usually map directly from tables, so it’s common to see the same name patterns appear in our code, as data transfer or business domain objects.
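To make that concrete, here is a minimal C# sketch of how a Memberships row might surface in code. The Membership and Account classes, their members, and the "admin" role value are all assumptions made up for illustration. They are not DoneDone’s actual model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain object mapped from a Memberships table
// (the table a mechanical naming scheme would call UsersAccounts).
public class Membership
{
    public int UserId { get; }
    public int AccountId { get; }
    public DateTime StartedOn { get; }  // date the user was added to the account
    public string Role { get; }         // e.g. "admin" (assumed role value)

    public Membership(int userId, int accountId, DateTime startedOn, string role)
    {
        UserId = userId;
        AccountId = accountId;
        StartedOn = startedOn;
        Role = role;
    }

    public bool HasAdminPrivileges => Role == "admin";
}

// Hypothetical owner of the relationship: an account has many memberships.
public class Account
{
    private readonly List<Membership> memberships = new List<Membership>();

    public int Id { get; }

    public Account(int id)
    {
        Id = id;
    }

    public void AddMember(int userId, DateTime startedOn, string role)
    {
        memberships.Add(new Membership(userId, Id, startedOn, role));
    }

    // Returns null when the user has no membership in this account.
    public Membership MembershipFor(int userId)
    {
        return memberships.FirstOrDefault(m => m.UserId == userId);
    }
}
```

With names like these, a call such as account.MembershipFor(noraId).HasAdminPrivileges reads much like the sentence “Nora’s membership has admin privileges.” A UserAccount class would carry the same data, but the code would no longer sound like the conversation.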

The trick with good names is getting your head out of the technical details of your work and thinking about what real-world thing you’re actually modeling. You just might find a more appropriate name that way.

Naming and Teaching

Imagine you’re teaching a new programmer about one of the most fundamental concepts in object-oriented programming—object instantiation. How would you describe an example?

My first introduction to the concept was in my college “Intro to C++” class. Even decades later, I still recall my textbook using an example that looked similar to this:

Object myObject = new Object();
myObject.DoSomething();

For an experienced programmer, this example feels like a good starting point. The names are abstract enough so you can focus on an academic discussion of class constructors and method invocation without being sidetracked by the specificity of the example.

As a student, that was not my experience. I remember staring at those two lines of code until the end of the class period, unable to decipher their meaning, still struggling with what each appearance of the word Object meant, or what possible something this object would want to do. It was maddening.

After all, the word Object appears four times in two lines. It creates that dizzying effect I discuss in The Shape of Code. For a programming newbie like me, this example was as clear as this syntactically correct snippet:

Xy8lksp0 myXy8lksp0 = new Xy8lksp0();
myXy8lksp0.yXlx8x3();

If I couldn’t grasp one of the most basic concepts in an introductory-level course, what hope for me was there in this industry? I was never going to be a programmer.

This is the hopelessness I felt, and the hopelessness I’m sure many others have felt when they first learned about code. I got lucky. I would learn a lot more about code over the years because I had specific needs that required me to learn concepts organically. But what about the many others who quit after that introductory class?

One of the easiest ways to teach code better is simple: Always use relatable names in your examples.

I understand the impulse to shy away from specifics. By showing an actual application of a concept, we fear misleading the student into some myopic understanding of the Big Idea. The reach of the concept will never be given its due justice.

It’s this kind of false belief that has also led many of us to use metasyntactic variable names like foo, bar, baz, and qux when teaching concepts.

Metasyntactic variables are used to name entities...whose exact identity is unimportant and serve only to demonstrate a concept, which is useful for teaching programming.

Wikipedia entry for metasyntactic variables

Here’s the application of these variables to describe method overloading.

void foo(int bar);
void foo(int bar, string baz);
void foo(int bar, string baz, string qux);

...

foo(0);
foo(1, "arg0");
foo(2, "arg0", "arg1");

I don’t get the sense that the meaninglessness of the variables helps teach this concept. The student, if lucky, might learn the technicalities of the concept but will certainly struggle with sensing when an occasion arises to actually use the concept.

What does help immensely is concreteness. Object instantiation, method overloading, and most other programming concepts are only understandable when you apply them to something real. Their reason for existing only makes sense after you understand why they’re necessary in the first place. No one will ever grasp the benefits of method overloading because passing in bar, baz, and qux to foo() produces different useful results.

As I discussed in Imagining Objects, another teaching anti-pattern I notice is that we use unrealistic scenarios as examples: dogs, cats, birds, cars, and planes. The very best examples, to me, are the ones that aren’t only specific, but relevant to something you might actually do in code. Compare the original example of object instantiation above to an example like this.

User requester = new User(username, password);
requester.Login();

If I showed this to someone learning the concept, they would still have many questions about the concept. But it would be much easier for me to begin explaining things like:

The difference between the User type and the User() constructor method. And instead of using an empty constructor signature, I pass in variables with relatable names to better show what a constructor could do to initialize the instance.
The difference between a class (User) and an instance of that class (requester).
An example of an appropriate method that a User instance might own.
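For a student who wants to see what sits behind those two lines, a User class might be sketched in C# like this. Everything inside the class is an assumption for teaching purposes: the IsLoggedIn property and the placeholder login rule are hypothetical, and a real system would check credentials against a store and never keep a raw password around.

```csharp
using System;

public class User
{
    // State that belongs to one particular instance (e.g. requester).
    private readonly string username;
    private readonly string password;

    public bool IsLoggedIn { get; private set; }

    // The constructor: runs once for each `new User(...)` and
    // initializes that instance with the values passed in.
    public User(string username, string password)
    {
        this.username = username;
        this.password = password;
    }

    // A behavior a User instance plausibly owns.
    public void Login()
    {
        // Placeholder rule (assumption): any non-empty credentials succeed.
        IsLoggedIn = !string.IsNullOrEmpty(username)
                     && !string.IsNullOrEmpty(password);
    }
}
```

Each call to new User(...) produces a separate instance with its own username, password, and IsLoggedIn state, which is the class-versus-instance distinction in miniature.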

Similarly, I can improve the method overloading example:

void addPerson(int age);
void addPerson(int age, string first_name);
void addPerson(int age, string first_name, string last_name);

...

addPerson(27);
addPerson(34, "John");
addPerson(41, "Jane", "Smith");

Again, there will still be many questions about the concept of method overloading, but the student would have a much better grasp of the intention right out of the gate. Even a newbie could tell you what happened when each method version was invoked.

A 27-year-old was added.
A 34-year-old named John was added.
A 41-year-old named Jane Smith was added.

Discussions about why method overloading is useful, what alternative approaches could be taken, or what trade-offs using each specific implementation entails would be far more productive with an example like this.
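One way to start that discussion is with a runnable version. The C# sketch below is only one possible implementation: the Roster class, its People list, and the description strings are hypothetical, and the names follow C# conventions (AddPerson, firstName) rather than the signatures above. The two shorter overloads simply delegate to the fullest one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container, invented here so the overloads have somewhere to add to.
public class Roster
{
    private readonly List<string> people = new List<string>();

    public IReadOnlyList<string> People => people;

    // Shorter overloads fill in the missing details and delegate.
    public void AddPerson(int age)
    {
        AddPerson(age, null, null);
    }

    public void AddPerson(int age, string firstName)
    {
        AddPerson(age, firstName, null);
    }

    // The fullest overload does the real work.
    public void AddPerson(int age, string firstName, string lastName)
    {
        var description = "A " + age + "-year-old";
        if (!string.IsNullOrEmpty(firstName))
        {
            description += " named " + firstName;
            if (!string.IsNullOrEmpty(lastName))
            {
                description += " " + lastName;
            }
        }
        people.Add(description);
    }
}
```

Calling AddPerson(27), AddPerson(34, "John"), and AddPerson(41, "Jane", "Smith") on a Roster leaves three entries in People, one per description in the list above. From here it is natural to ask whether optional parameters or a Person object would serve better, which is the kind of trade-off conversation a concrete example invites.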

The Big Idea should be the final stop in the learning process, not the first stop. Once the student can anchor their understanding to something real, they can more quickly grasp other real examples. After that, the final step to fully understanding something is a much smaller one.

Even after coding for more than two decades, I still get lost when looking at examples that use metasyntactic or abstract variables. Never have I learned a concept and thought to myself, in hindsight, “Boy, I really wish they used foo and bar more so I would understand the concept better!”

The Big Idea behind this entire book is just this. Your code—no matter how elegantly you believe you’ve written it—will always feel foreign and mysterious to someone else at first. That somebody else might even be you reacquainting yourself with your own code down the road. It’s your job to invite the reader into your space with open arms.

It’s like putting the welcome mat down before the front doorstep of your code. So long as human beings are still writing code, the practice of naming things well will never go out of fashion.

Naming things well can be the difference between a home people want to keep alive and thriving versus one we’d sooner leave behind.

Author’s Note

I’ve had the unique opportunity of working on the same piece of software, DoneDone, for more than a decade. And yet, even after all these years, every time I dig back in, I still spot some place where I think a name could be expressed better. It’s with this perspective that I hope this book can bring something unique to the topic of naming things.

I am sure there are many thoughts and strategies you may have that I did not cover, and I’m equally sure my viewpoints are up for debate. If you’d like to connect about this book in any way, message me on Twitter @developerscode or email me directly at [email protected].

If you’re interested in other things I’ve written, read some of my past thoughts on Life Imitates Code or grab a copy of my book on software development life, The Developer’s Code.

Thanks for reading!
Ka Wai Cheung
February 2020
