Origins of “The Rule of Three”

30 Jan 2023

When you come across guidance in books and blog posts you might naturally gravitate to those that have references or links to other works or even scientific papers, which lend them a sense of legitimacy. Or maybe a senior colleague or someone you respect in your community makes a pronouncement about “the way things should be done” and you take them at their word. Over time these lines end up repeated over and over until they are accepted without question, spreading among practitioners as hallowed mantras. Devoid of their original context, and in the absence of any qualifying nuance, they have become myths – containing a nugget of truth perhaps, but distorted by space and time.

The Leprechauns of Software Engineering takes aim at a few of the foundational myths of my field – “The Cone of Uncertainty”, “The 10x programmer” and “Waterfall development”, for example – and shows them to be either misreadings of, or based entirely on, flawed research. The author, Laurent Bossavit, is no enemy of the practice of software engineering research, but he is frustrated at how bad some of that research is and how quickly we as practitioners rush to swallow perceived wisdom: “we should raise our expectations of rigor in software engineering writing, especially writing that popularizes research results”, he argues.

To that end, I wanted to try my hand at digging into a piece of software folklore that I’ve come across from time to time: “The Rule of Three”. Wikipedia refers to it as

a code refactoring rule of thumb to decide when similar pieces of code should be refactored to avoid duplication. It states that two instances of similar code do not require refactoring, but when similar code is used three times, it should be extracted into a new procedure.

Maybe you’ve heard of it? If you haven’t then a cursory search will show you how widely popularised it is, and I must admit that it certainly has an appealing “surface plausibility”, to borrow a phrase from Bossavit. I’ve personally encountered it in pull requests, where a reviewer might notice that what’s being added is similar to some existing code and might suggest that a single, new abstraction replace the redundant instances. The rule itself isn’t always invoked explicitly, but I’ve seen that happen too. I don’t necessarily disagree with the approach, depending on the situation, but I wanted to try to understand where this piece of gospel truth came from.
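To make the rule concrete, here’s a minimal sketch of the kind of change a reviewer might ask for (the pricing example and the function names are my own invention, not something taken from any of the sources discussed here): two copies of the same logic are tolerated, but the third occurrence is the cue to extract it into a single, shared procedure.

```python
# Before (sketched in the comment): the same discount logic had appeared
# three times, once in each price_* function, e.g.
#
#   def price_book(base):      return base * 0.9 if base > 100 else base
#   def price_ebook(base):     return base * 0.9 if base > 100 else base
#   def price_audiobook(base): return base * 0.9 if base > 100 else base

# After: the three copies are replaced by one named abstraction.
def apply_bulk_discount(base: float, threshold: float = 100.0, rate: float = 0.1) -> float:
    """Discount a price once it passes the threshold."""
    return base * (1 - rate) if base > threshold else base


def price_book(base: float) -> float:
    return apply_bulk_discount(base)


def price_ebook(base: float) -> float:
    return apply_bulk_discount(base)


def price_audiobook(base: float) -> float:
    return apply_bulk_discount(base)


if __name__ == "__main__":
    print(price_book(120.0))   # 108.0 – over the threshold, so discounted
    print(price_ebook(80.0))   # 80.0  – under the threshold, unchanged
```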

Don Roberts

The article on Wikipedia attributes it to Martin Fowler’s book Refactoring (1999; if you sign up for a free trial of O’Reilly, my former employer, you can find the relevant text near the top of chapter two), where he himself attributes it to Don Roberts:

Here’s a guideline Don Roberts gave me: The first time you do something, you just do it. The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway. The third time you do something similar, you refactor.

Or for those who like baseball: Three strikes, then you refactor.

I am struck by how the Wikipedia article has taken quite generic advice from Roberts and transformed it into a recommendation about when to write “a new procedure”, which my brain translates (perhaps mistakenly?) as “a new function”. The implicit suggestion being that the rule should apply at a pretty granular level, whereas Roberts seems deliberately vague on this point. It’s not too far from the original quote, but you can already imagine how something can incrementally become distorted in the retelling, especially with Wikipedia as a source of the distortion.

The Refactoring bibliography doesn’t make mention of any works by Roberts, unfortunately, and I could find no other instance where Roberts repeated the advice until I emailed Refactory, where Roberts consults. They very kindly replied, pointing me to ‘Evolving Frameworks: A Pattern Language for Developing Object-Oriented Frameworks’ (1996), authored by Roberts and Ralph Johnson (of Design Patterns fame).

The paper provides general advice about how to develop reusable frameworks, giving the reader three patterns as inspiration. The first pattern, “Three Examples”, is the most direct source of the “Rule of Three” as we see it today:

The general rule is: build an application, build a second application that is slightly different from the first, and finally build a third application that is even more different than the first two. Provided that all of the applications fall within the problem domain, common abstractions will become apparent.

This further reinforces my feeling that the Wikipedia article is mistaken when it narrowly frames the rule as talking about “procedure”-level code artifacts; it’s clear that Roberts is interested in refactoring across entire systems, code abstractions that are only apparent after the creation of multiple related applications.
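To give a feel for the scale Roberts and Johnson are working at, here’s a toy sketch (the reporting domain, class names and hook methods are all invented for illustration, not taken from their paper): only after three concrete applications have been written does a shared skeleton become apparent and get promoted into a small reusable framework.

```python
from abc import ABC, abstractmethod


class ReportApp(ABC):
    """The common shape noticed only after three report tools had been written."""

    def run(self) -> str:
        # The shared skeleton: every application loads records and renders them.
        return self.render(self.load())

    @abstractmethod
    def load(self) -> list[dict]: ...

    @abstractmethod
    def render(self, records: list[dict]) -> str: ...


class SalesReport(ReportApp):  # first application
    def load(self) -> list[dict]:
        return [{"region": "EMEA", "total": 42}]

    def render(self, records: list[dict]) -> str:
        return "\n".join(f"{r['region']}: {r['total']}" for r in records)


class InventoryReport(ReportApp):  # second application, slightly different
    def load(self) -> list[dict]:
        return [{"sku": "A-1", "count": 7}]

    def render(self, records: list[dict]) -> str:
        return "\n".join(f"{r['sku']} x{r['count']}" for r in records)


class PayrollReport(ReportApp):  # third application: the abstraction is now earned
    def load(self) -> list[dict]:
        return [{"employee": "Ada", "gross": 5000}]

    def render(self, records: list[dict]) -> str:
        return "\n".join(f"{r['employee']}: {r['gross']}" for r in records)


if __name__ == "__main__":
    for app in (SalesReport(), InventoryReport(), PayrollReport()):
        print(app.run())
```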

Tracz and Biggerstaff

Roberts and Johnson describe their pattern as “a special example of the Rule of Three”, referencing a summary by Will Tracz of an RMISE Workshop on Software Reuse held October 14–16, 1988. In it there are a couple of quotes attributed to Ted Biggerstaff that appear to support Roberts’ and Johnson’s approach:

Biggerstaff’s 3-system rule: If you have not built three real systems in a particular domain, you are unlikely to be able to derive the necessary details of the domain required for successful reuse in that domain. In other words, the expertise needed for reuse arises out of ‘doing’ and the ‘doing’ must come first.

Like Roberts and Johnson, Biggerstaff is talking about systems within a given domain: you need to build at least three payment processing applications, say, before you can determine what might be reusable across them.

I also liked this other quote from Biggerstaff:

Managers should note: there is no silver bullet, no free lunch. Reuse is like a savings account, you have to put a lot in before you get anything out.

There is always a cost to refactoring, an important caveat to the rule that hasn’t made it to the present day (though you might argue it goes without saying).

Grasso, Lanergan & Poynton

Unfortunately, neither of the Biggerstaff works referenced in the RMISE workshop summary repeats the “3-system rule”, but in ‘Confessions of a Used-Program Salesman: Lessons Learned’ (1995) Tracz says that Biggerstaff based his rule “on Bob Lanergan’s observations at Raytheon”. Biggerstaff’s ‘Reusability Framework, Assessment and Directions’ (1987), referenced in the RMISE workshop summary, in turn references ‘Software Engineering with Reusable Design and Code’ (1984) by Robert Lanergan and Charles Grasso.

In this paper the authors discuss the benefits of reusable code in the COBOL programs at Raytheon and how they achieved them using shared “logic structures”. The paper doesn’t phrase anything as a rule, per se, and the closest thing I could find to Biggerstaff’s guideline was this:

after a programmer uses a structure more than three times (learning curve time) a 50 percent increase in productivity occurs.

This is similar to Biggerstaff’s warning of “no free lunch”, but the emphasis is more on the “learning curve” for individual programmers than on the opportunity cost of refactoring the code in the first place: the cost of reuse has to be paid by every person who has to use the new abstraction, which is another interesting nuance we’ve since lost.

There’s no empirical evidence provided for this finding, though. The authors reference another paper by Lanergan and Brian Poynton—‘Reusable code – The application development technique of the future’ (1979, very kindly scanned for me by the Science Reference Section at the Library of Congress)—but this only reiterates the “three times” learning curve without further elaboration.

I was able to find ‘Software Engineering with Standard Assemblies’ (1978), also authored by Lanergan and Poynton, which again repeats the “learning curve” assertion, but also tantalisingly leads with “Three major applications using this technique averaged 60 percent re-used code”. Frustratingly, this is never expounded on in the rest of the paper.

The paper’s methodology doesn’t seem like it would stand up to modern-day scrutiny either, at least to this layperson. Supervisors self-selected the programs they thought would be worth assessing for potential reusability, working closely with the people conducting the study, rather than there being a random, blind selection of programs across the company. Some of the numbers being thrown around also seem suspiciously round and broad: “15 to 85 percent reusable code was attained”; “40 to 60 percent reusable code can easily be attained for an average program”; “this results in a 30 to 50 percent increase in productivity in new programs”. There’s clearly some quantifiable benefit to reusing these logic structures, but rigorously defining it wasn’t the aim of this paper.

All in all this paper was a disappointing end to my search, but it does contain some great 70s illustrations:

[Image: an early 20th Century automobile in a state of construction, with a tagline reading ‘We still build software the way we built automobiles in 1902’]

[Image: a modular car factory with vehicles being constructed as they move along an assembly line, with a tagline reading ‘In this day and age we know better’]

Conclusion

There seems to me to be a clear path from the “Rule of Three” that we know today back to software reuse experiments conducted at Raytheon in the 70s, but the write-ups of those experiments are sadly light on details. From my cursory reading of software reusability papers written during this period I don’t think this is particularly unusual; a lot of them seem more interested in building consensus in the community and sharing what worked for various organisations than in designing experiments intended to be reproduced by others. We can see, though, that over the years these well-meaning efforts have become canonicalised as footnotes and citations, to be accepted unquestioningly.

I do not mean to criticise Don Roberts in this; his work seems to be in the same vein as Biggerstaff and Lanergan: trying to get people to write better software more efficiently. I don’t get the impression he ever set out to create or popularise a refactoring “rule”, and it would be unreasonable to hold him accountable for others’ unthinking treatment of his writings. I agree wholeheartedly with all of the authors referenced above, in fact: we should strive to reuse code where appropriate, and doing so can realise great productivity benefits. The Open Source movement itself is probably the most obvious example of this.

However, the widely popularised “Rule of Three” that we see today, unmoored from the context that occasioned it, is not really a rule, and is barely a rule of thumb. It has an attractive air of “surface plausibility”, and its core tenet—that you cannot generalise from a single example—is common sense, but there’s no reason that it shouldn’t be the “Rule of Five” or the “Rule of Seven” instead. None of the authors above have provided any evidence for “three” being the magic number beyond a couple of isolated examples, and indeed it would likely be impossible to conduct an experiment to give an accurate number.

So by all means we should continue to look for opportunities to refactor and reuse code, but we needn’t treat three examples as some special inflection point. We should also bear in mind the corollary advice that seems to have gotten lost over the years: reuse is never free. There is a cost to doing the refactoring in the first place, and a learning-curve cost for every person who has to pick up the new abstraction afterwards.

If this has inspired you to do something similar for a piece of software folklore that you’ve encountered then I’d again recommend Laurent Bossavit’s book for further examples and approaches. Otherwise stay vigilant, and keep an eye out for leprechauns.

For any papers that I cite whose links no longer work, I’ve collected PDF scans of them in a GitHub repository for posterity.