Cost of Testing

October 1st, 2009 · 31 Comments ·

Lately a lot of people have been asking me what the cost of testing is, so I decided to try to measure it and dispel the myth that testing takes twice as long.

For the last two weeks I have been keeping track of the amount of time I spend writing tests versus the time I spend writing production code. The number surprised even me, but after I thought about it, it makes a lot of sense. The magic number is about 10% of time spent on writing tests. Now, before you think I am nuts, let me back it up with some real numbers from a personal project I have been working on.

                  Total   Production  Test    Ratio (Test/Total)
Commits           1,347   1,347       1,347
LOC               14,709  8,711       5,998   40.78%
JavaScript LOC    10,077  6,819       3,258   32.33%
Ruby LOC          4,632   1,892       2,740   59.15%
Lines/Commit      10.92   6.47        4.45    40.78%
Hours (estimate)  1,200   1,080       120     10.00%
Hours/Commit      0.89    0.80        0.09
Mins/Commit       53      48          5

Commits refers to the number of commits I have made to the repository. LOC is lines of code, broken down by language. The ratio shows the typical breakdown between production and test code when you test drive: it is about half, give or take depending on the language. It is interesting to note that on average I commit about 11 lines, of which 6.5 are production and 4.5 are test. Keep in mind this is an average: many commits are large ones that add a lot of code, but many others only tweak things, so the average is quite low.

The number of hours spent on the project is my best estimate, as I have not kept track of it. The 10% breakdown comes from keeping track of my coding habits over the last two weeks. Both are my best guesses.
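
The derived rows of the table can be reproduced from the raw counts. Here is a minimal sketch, assuming the commit, LOC, and estimated-hours figures from the table above; the variable and function names are illustrative and do not come from the project:

```javascript
// Raw figures taken from the table above; all names are illustrative only.
var commits = 1347;
var loc = {
  javascript: { production: 6819, test: 3258 },
  ruby:       { production: 1892, test: 2740 }
};
var hours = { production: 1080, test: 120 }; // estimates, not measurements

// Share of the total that is test code, as a percentage with two decimals.
function testShare(production, test) {
  return (100 * test / (production + test)).toFixed(2) + "%";
}

var productionLoc = loc.javascript.production + loc.ruby.production;
var testLoc = loc.javascript.test + loc.ruby.test;

console.log("JavaScript test share:", testShare(loc.javascript.production, loc.javascript.test));
console.log("Ruby test share:", testShare(loc.ruby.production, loc.ruby.test));
console.log("Overall test share:", testShare(productionLoc, testLoc));
console.log("Lines per commit:", ((productionLoc + testLoc) / commits).toFixed(2));
console.log("Test share of hours:", testShare(hours.production, hours.test));
console.log("Production minutes per commit:", Math.round(60 * hours.production / commits));
console.log("Test minutes per commit:", Math.round(60 * hours.test / commits));
```

Note that the LOC share and the hours share are computed the same way, which is what makes the gap between roughly 40% of the lines and roughly 10% of the time stand out.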

Now when I test drive, I start by writing a test, which usually takes me a few minutes (about 5 minutes). The test represents my scenario. I then start implementing the code to make the scenario pass, and the implementation usually takes me a lot longer (about 50 minutes). The ratio is highly asymmetrical! Why does it take me so much less time to write the scenario than it does to write the implementation, given that they are about the same length? Well, look at a typical test and implementation:

Here is a typical test for a feature:

ArrayTest.prototype.testFilter = function() {
  var items = ["MIsKO", {name:"john"}, ["mary"], 1234];
  assertEquals(4, items.filter("").length);
  assertEquals(4, items.filter(undefined).length);

  assertEquals(1, items.filter('iSk').length);
  assertEquals("MIsKO", items.filter('isk')[0]);

  assertEquals(1, items.filter('ohn').length);
  assertEquals(items[1], items.filter('ohn')[0]);

  assertEquals(1, items.filter('ar').length);
  assertEquals(items[2], items.filter('ar')[0]);

  assertEquals(1, items.filter('34').length);
  assertEquals(1234, items.filter('34')[0]);

  assertEquals(0, items.filter("I don't exist").length);
};

ArrayTest.prototype.testShouldNotFilterOnSystemData = function() {
  assertEquals("", "".charAt(0)); // assumption
  var items = [{$name:"misko"}];
  assertEquals(0, items.filter("misko").length);
};

ArrayTest.prototype.testFilterOnSpecificProperty = function() {
  var items = [{ignore:"a", name:"a"}, {ignore:"a", name:"abc"}];
  assertEquals(2, items.filter({}).length);

  assertEquals(2, items.filter({name:'a'}).length);

  assertEquals(1, items.filter({name:'b'}).length);
  assertEquals("abc", items.filter({name:'b'})[0].name);
};

ArrayTest.prototype.testFilterOnFunction = function() {
  var items = [{name:"a"}, {name:"abc", done:true}];
  assertEquals(1, items.filter(function(i){return i.done;}).length);
};

ArrayTest.prototype.testFilterIsAndFunction = function() {
  var items = [{first:"misko", last:"hevery"},
               {first:"mike", last:"smith"}];

  assertEquals(2, items.filter({first:'', last:''}).length);
  assertEquals(1, items.filter({first:'', last:'hevery'}).length);
  assertEquals(0, items.filter({first:'mike', last:'hevery'}).length);
  assertEquals(1, items.filter({first:'misko', last:'hevery'}).length);
  assertEquals(items[0], items.filter({first:'misko', last:'hevery'})[0]);
};

ArrayTest.prototype.testFilterNot = function() {
  var items = ["misko", "mike"];

  assertEquals(1, items.filter('!isk').length);
  assertEquals(items[1], items.filter('!isk')[0]);
};

Now here is the code which implements the scenario tests above:

Array.prototype.filter = function(expression) {
  var predicates = [];
  predicates.check = function(value) {
    for (var j = 0; j < predicates.length; j++) {
      if (!predicates[j](value)) {
        return false;
      }
    }
    return true;
  };
  var getter = Scope.getter;
  var search = function(obj, text) {
    if (text.charAt(0) === '!') {
      return !search(obj, text.substr(1));
    }
    switch (typeof obj) {
    case "boolean":
    case "number":
    case "string":
      return ('' + obj).toLowerCase().indexOf(text) > -1;
    case "object":
      for ( var objKey in obj) {
        if (objKey.charAt(0) !== '$' && search(obj[objKey], text)) {
          return true;
        }
      }
      return false;
    case "array":
      for ( var i = 0; i < obj.length; i++) {
        if (search(obj[i], text)) {
          return true;
        }
      }
      return false;
    default:
      return false;
    }
  };
  switch (typeof expression) {
    case "boolean":
    case "number":
    case "string":
      expression = {$:expression};
    case "object":
      for (var key in expression) {
        if (key == '$') {
          (function(){
            var text = (''+expression[key]).toLowerCase();
            if (!text) return;
            predicates.push(function(value) {
              return search(value, text);
            });
          })();
        } else {
          (function(){
            var path = key;
            var text = (''+expression[key]).toLowerCase();
            if (!text) return;
            predicates.push(function(value) {
              return search(getter(value, path), text);
            });
          })();
        }
      }
      break;
    case "function":
      predicates.push(expression);
      break;
    default:
      return this;
  }
  var filtered = [];
  for ( var j = 0; j < this.length; j++) {
    var value = this[j];
    if (predicates.check(value)) {
      filtered.push(value);
    }
  }
  return filtered;
};

Now, I think that if you look at these two chunks of code, it is easy to see that even though they are about the same length, one is much harder to write. The reason tests take so little time to write is that they are linear in nature: no loops, ifs, or interdependencies with other tests. Production code is a different story. I have to create complex ifs and loops, and I have to make sure that the implementation works not just for one test, but for all tests. This is why it takes so much longer to write production code than test code. In this particular case, I remember rewriting this function three times before I got it to work as expected. :-)

So a naive answer is that writing tests carries a 10% tax. But we pay taxes in order to get something in return. Here is what I get for the 10%:

  • When I implement a feature I don’t have to start up the whole application and click through several pages to reach the one where I can verify that the feature works. In this case it means that I don’t have to refresh the browser, wait for it to load a dataset, type in some test data, and manually assert that I got what I expected. This is immediate payback in time saved!
  • Regression is almost nil. Whenever you add a new feature you run the risk of breaking something other than what you are working on (since you are not working on it, you are not actively testing it). At least once a day I have a what the @#$% moment when a change suddenly breaks a test at the opposite end of the codebase, which I did not expect, and I count my lucky stars. This saves a lot of the time you would otherwise lose when you discover that a feature you thought was working no longer is, and by then you have forgotten how the feature is implemented.
  • Cognitive load is greatly reduced since I don’t have to keep all of the assumptions about the software in my head. This makes it really easy to switch tasks or to come back to a task after a meeting, a good night’s sleep, or a weekend.
  • I can refactor the code at will, keeping it from becoming stagnant and hard to understand. This is a huge problem on large projects, where the code works, but it is really ugly and everyone is afraid to touch it. This is worth money tomorrow to keep you going.
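
The refactoring point can be made concrete. What follows is a hedged sketch, not the code from the project: the recursive text search from the implementation above is pulled out into a standalone function, and a few straight-line checks modeled on testFilter and testFilterNot guard the change. The names containsText and check are hypothetical, and the sketch assumes the caller has already lower-cased the search text, as the original filter does.

```javascript
// Hypothetical refactoring sketch: the recursive text search as a
// standalone function. Names are illustrative, not from the project.
// Assumes `text` was already lower-cased by the caller.
function containsText(value, text) {
  // A leading '!' negates the rest of the search text.
  if (text.charAt(0) === '!') {
    return !containsText(value, text.substr(1));
  }
  switch (typeof value) {
    case 'boolean':
    case 'number':
    case 'string':
      // Primitives match case-insensitively on their string form.
      return ('' + value).toLowerCase().indexOf(text) > -1;
    case 'object':
      // Also covers arrays, since typeof [] is 'object'; null has no keys.
      for (var key in value) {
        // Keys starting with '$' are treated as system data and skipped.
        if (key.charAt(0) !== '$' && containsText(value[key], text)) {
          return true;
        }
      }
      return false;
    default:
      return false;
  }
}

// Minimal stand-in for assertEquals so the sketch runs without a framework.
function check(expected, actual) {
  if (expected !== actual) {
    throw new Error('expected ' + expected + ' but got ' + actual);
  }
}

// Straight-line checks modeled on testFilter and testFilterNot.
check(true, containsText('MIsKO', 'isk'));
check(true, containsText({name: 'john'}, 'ohn'));
check(true, containsText(['mary'], 'ar'));
check(true, containsText(1234, '34'));
check(false, containsText({$name: 'misko'}, 'misko'));
check(false, containsText('misko', '!isk'));
check(true, containsText('mike', '!isk'));
check(true, containsText(true, 'tru')); // boolean item, a case the tests above do not exercise
```

Because the checks pin down behavior rather than structure, the same assertions should keep passing if the body of the function is rewritten again later.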

These benefits translate to real value today as well as tomorrow. I write tests because the benefits I get more than offset the additional cost of 10%. Even if I don’t include the long-term benefits, the value I get from tests today is well worth it. I am faster at developing code with tests. How much faster? Well, that depends on the complexity of the code. The more complex the thing you are trying to build (more ifs/loops/dependencies), the greater the benefit of tests.

So now you understand my puzzled look when people ask me how much slower/costlier the development with tests is.

31 responses so far ↓

  • najcik // Oct 1, 2009 at 11:52 pm

    I really admire your ability to put things so straight and so simple. Respect!
    One or two things to notice:
    People are mostly reluctant to write tests because they find themselves in a situation where their code is very tightly coupled and they don’t have enough skill to break those couplings. Tests are code, so you need to maintain them. So some think: if you can’t keep your code clean, how would you maintain even more code? But without a test suite there’s little if any chance that your project will stay clean throughout its lifetime. As Uncle Bob says, it may be more important than keeping production code clean.

  • najcik // Oct 2, 2009 at 12:00 am

    Just to conclude: TDD is a policy that keeps the investment in tests as low as 10%. Writing tests afterwards will be more expensive and difficult.

  • Franco Lombardo // Oct 2, 2009 at 12:00 am

    Misko, good points in your post, as usual, but I have a few annotations:

    1) The maintenance cost of a software project is a function of LOC (someone says it’s a linear one, others say it’s an exponential one ;-) ). Tests add about 40% of LOC to your codebase, so they add maintenance cost: I think this is the hidden cost of tests.

    2) You spend about 10% of your time writing tests because you are a great programmer. I think that an average programmer will spend no less than 50% of his time writing tests (I’m a bad programmer, so I spend almost 60% of my time writing tests, when I write them ;-) )

    3) You say “The more complex the thing you are trying to build is (more ifs/loops/dependencies) the greater the benefit of tests are.” So, are there some applications, typically CRUD ones, that are not worth the cost of tests?

  • Jonathan Hartley // Oct 2, 2009 at 1:28 am

    We probably spend more than 10% of our time on our tests. I haven’t measured it, although maybe you have inspired me to. I would guess it might be 40%, including the time spent refactoring tests.

    One reason for this higher figure is that we write more than just the unit tests Misko shows, we also write orthogonal system tests, which fire up the whole application and stimulate GUI controls, to test the behaviour of the application as a whole, as a user would see it.

    For the record, we consider this cost well worth paying. We believe that our testing, along with other things like pair-programming, makes us many times more productive than we would otherwise be.

    @Franco Lombardo: Hey there. Regarding your point (1). Maintenance cost as a function of LOC is an oversimplification. As Misko points out, test code is much simpler to write than production code. Also, for the same reasons, it is much simpler to read and maintain. Each test is a simple laundry list of actions, and is decoupled from all other tests, and ideally also decoupled from almost all production code other than the features it is testing. For this reason test code does not, in my experience, contribute greatly towards maintenance costs.

    Point (2) I am not sure the ability of a programmer determines how much time they spend writing tests. I don’t think writing test code is harder than writing production code – instead it is actually easier. However, writing tests certainly is *different* from writing other code, and requires a new mindset. Because of this, I think how much experience a person has at writing tests is more important in determining how much time tests take them.

    Best regards.

  • Dmitry Nikolaev // Oct 2, 2009 at 3:33 am

    Misko, I think the function in the production code must be refactored. Tests are great, but readability matters too. In a hurry? :-)

  • misko // Oct 2, 2009 at 7:31 am

    @najcik,

    I cannot overstate how true your comment is. Writing tests after the fact not only takes longer, it is no fun at all. I personally hate it.

  • misko // Oct 2, 2009 at 7:35 am

    @Dmitry,

    you are probably right, the code should be refactored :-) . However, I am not worried. I have tests, so I can refactor today, or a year from now when I need to add functionality, and the cost will be the same since I have that code tested. As a matter of fact, I think with my tests, most of you should be able to refactor or rewrite the code from scratch, and I could just drop it into my codebase and things would work just fine.

  • misko // Oct 2, 2009 at 7:39 am

    @Jonathan,

    You are right, the 10% refers to unit tests only; end-to-end tests cost more. Different kinds of tests catch different kinds of bugs. I actually need to write some end-to-end tests, as the place where my code keeps breaking is the top-level wiring, exactly the place where end-to-end tests would help out.

    However, I think that if you are new to testing you should start with unit tests, as those will give you the most bang for the buck and will enable you to write the end-to-end tests later.

    – misko

  • Kaleb Pederson // Oct 2, 2009 at 8:26 am

    A couple of minor corrections:

    “t write” => “to write”
    “bolean” => “boolean”

    I’m curious if the latter actually means you forgot to test with a boolean object, since I’d be surprised if a boolean toString’d to “bolean” :) .

    As always, a great post, thank you.

  • David Holbrook // Oct 2, 2009 at 8:59 am

    I think the point you make about application restart time is very important.

    There was recently an article on TheServerSide.com about the amount of time programmers report spending on application server restarts (http://www.theserverside.com/news/thread.tss?thread_id=57978)

    When you code to tests, you get the test running, then move on to the next thing. No need to re-deploy; what you have works! On a large JEE application the restart time can be many minutes. Over the course of the day, if you are “hand testing” each change, the cost can be extraordinary.

    The only times I find myself re-deploying several times a day are for cosmetic UI changes, and those deployments seldom require an app server restart.

    I feel like the resistance I see against test driven development is one of education. Most people take a stab at unit testing after the fact, find it hard, and decide it is a waste of time.

    Thanks for the blog Misko, I try and point everyone I can here.

  • Trevor // Oct 2, 2009 at 9:42 am

    Can we only assume tests are meant for logic / business rules / etc. and not so much for visuals? I ask because most of my time is spent dialing in visual transitions and managing complex hierarchical states when I’m deving as3 flash sites.

  • Michael Tsai - Blog - Cost of Testing // Oct 2, 2009 at 11:26 am

    [...] Miško Hevery: Now, I think that if you look at these two chunks of code, it is easy to see that even though they are about the same length, one is much harder to write. The reason, why tests take so little time to write is that they are linear in nature. No loops, ifs or interdependencies with other tests. Production code is a different story, I have to create complex ifs, loops and have to make sure that the implementation works not just for one test, but all test. This is why it takes you so much longer to write production than test code. In this particular case, I remember rewriting this function three times, before I got it to work as expected. [...]

  • Cost of unit testing « Pragmatic Agile Weblog // Oct 2, 2009 at 12:29 pm

    [...] Cost of unit testing I found an interesting article about the topic of how much unit testing cost: http://misko.hevery.com/2009/10/01/cost-of-testing [...]

  • misko // Oct 2, 2009 at 1:23 pm

    @Trevor,

    yes, tests are for logic; for example, it is hard to write a test for CSS or for the coolness of a transition. But you can write a test to prove that the transition logic works properly, i.e. that the animation moves an object from point A to point B.

    – Misko

  • misko // Oct 2, 2009 at 1:26 pm

    @Kaleb,

    Congrats! You found a bug in my code. You are right, I got carried away and wrote code which the tests were not asking for, and hence it does not work. My guess is that I would have discovered it in production at some point, and then I would have gone back to add a test case and fix it, but now I am going to add the test case. Many thanks.

    – Misko

  • 写测试代码的开销 | 开心e点 // Oct 2, 2009 at 7:16 pm

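    [...] There is always a lot of debate about TDD, with claims that writing test code wastes time, and so on. I came across an article discussing the time spent on test code; it is very well written, and the discussion in the replies after the article is also valuable. [...]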

  • Dennis Gorelik // Oct 3, 2009 at 4:48 pm

    Misko — great post.
    I think 10% proportion between writing tests time and writing production code time is a good rule of thumb.
    If tests take longer than that [proportionally], then the production code probably isn’t worth covering with auto-tests.
    For example, it’s much harder to write auto-tests for UI production code and … surprise… it’s not needed as much as covering business-logic code with auto-tests.
    Do you cover UI production code with auto-tests?

    If auto-tests take longer than 10% of time to write, then probably it would take longer to maintain these tests as well.
    So not only such tests would take too much development time up-front, but they would also require significant maintenance time when production functionality changes.

  • Weekly Links #73 | GrantPalin.com // Oct 4, 2009 at 4:20 pm

    [...] Cost of Testing A good explanation of the cost/benefit ratio of doing unit tests. [...]

  • Itay Maman // Oct 5, 2009 at 1:57 am

    I always felt that your bottom line is true but I never bothered to quantify it. Thanks.

    I think programmers tend to overestimate their cognitive abilities and, at the same time, to underestimate how quickly a program grows complex. They think they understand all the assumptions, invariants, etc., when in reality they are long past the point where they stopped understanding them.

    Also, regarding why testing code is simpler than production code, I think it all boils down to the fact that the main thing that testing code does is to associate a sample input with an expected output. This is one of the simplest things that a program can do.

    Production code does much more: it needs to synthesize the output from the input in a generic way (i.e. without hard-coding all possible inputs and all possible outputs).

  • igorbrejc.net » Fresh Catch For October 6th // Oct 6, 2009 at 2:03 am

    [...] Cost of Testing [...]

  • Ionut G. Stan // Oct 7, 2009 at 2:36 am

    Hi Misko,

    I’m wondering whether you’re doing TDD incrementally. I mean, do you write a test case, then go implement it, then another test, then implement it, or do you rather write all tests for a single unit at once and then all the implementation for that unit?

    I found the second choice to be very hard, and can’t actually do TDD that way. It is much easier to grasp TDD when it’s done in small, incremental steps.

  • misko // Oct 7, 2009 at 8:23 am

    @Ionut,

    I think everyone does TDD in the first way you described. We write a little test and then we implement it, we write another and then we implement that. I have not heard of anyone writing all of the tests at once.

    I do sometimes write multiple tests, but they are on different levels. I will write a larger functional test and then write a smaller unit test.

  • Ionut G. Stan // Oct 7, 2009 at 5:47 pm

    Thanks for your response Misko.

    That’s interesting. My first attempts with TDD were by writing tests for the whole class, which obviously didn’t work. I believe there are more people like me, so what I usually advise now is to tackle them method by method. The red, green, refactor thing was a kind of a revelation for me.

  • Mark’s Testblog » Blog Archive » Internals: To test, or not to test? - …for these are testing times, indeed. // Oct 8, 2009 at 6:01 pm

    [...] Hevery recently posted something interesting on his testing breakdown and, while most of us won’t reach his level of testing efficiency, [...]

  • Costs of Testing, Costs of Quality | Test And Try // Oct 20, 2009 at 2:34 am

    [...] Cost of Testing by Miško Hevery Test Driven Development and the costs of writing code base tests. [...]

  • matt harrison // Nov 18, 2009 at 10:23 am

    This research paper suggests overhead of 15-35% for doing TDD
    http://research.microsoft.com/en-us/projects/esm/nagappan_tdd.pdf

    Great blog by the way!

  • venc // Dec 22, 2009 at 8:08 am

    Well…

    14k LOC and 1.2k hours, that’s about 12 lines of code per hour. What did you spend the rest of the time on? I suppose it was somewhat connected to the test process, i.e. it had to do with making the design testable. At least part of the time spent on the design should be included in the time spent on test coverage.

  • misko // Dec 22, 2009 at 8:56 am

    @venc,

    excellent question. I consider myself a quick typist and fast at creating code, so the slowness cannot be attributed to that. Thinking back on how the code got created, there are times when I have produced a lot of code in a short amount of time with tests. So writing tests is not what slows you down. The slowdown comes from realizing that your current design is not the right one, and as a result you start refactoring. My experience is that everyone at some point realizes that their design is suboptimal, but most people say it is not worth the trouble and keep on writing more code. With tests you can step back and say, this is not right, and rearrange/rewrite things and still be sure that you did not break anything. I think that you cannot look at LOC in the raw form. You need to look at Feature / LOC, and that is hard to calculate. So the question should not be why I have only written 12 LOC per hour, but how I can get so many features from so little LOC.

  • Tomasz Jedzierowski (venc) // Dec 22, 2009 at 3:47 pm

    I agree that LOC is a bad metric when it comes to developer productivity. Just 12 LOC/hour is very low, and I think that a lot of the time is spent on making the codebase testable by changing the design. This does not necessarily have a huge impact, but in my opinion it is important to include such considerations in the calculations of the total cost of TDD. Quite often more testable code means better design, therefore surely not all the time spent on improving the design should be added to the overhead of TDD, but it can easily be overdone. The developer who will have to maintain the project will thank you, but the manager who pays the bills might not.

    [OT] Changed to real name as I see that no commenter so far has used a nickname. Great blog by the way :)

  • Oleg // Dec 25, 2009 at 9:17 am

    Hi.

    Well, 12 LOC/hour is very good! This is above industry average. Of course you should measure LOC of project / hours of project, so your hours include analysis, design, testing etc. etc. Under project I include even one release cycle with a couple of change requests.

    We had a project, where we had like 40 LOC/hour during some unhealthy period, but it was cut down to half very quickly when we started actually testing our code :-) )

    Oleg

  • a response to – “What would you say is the average percentage of development time devoted to creating the unit test scripts?” | Down Home Country Coding With Scott Selikoff and Jeanne Boyarsky // Feb 14, 2010 at 11:11 am

    [...] Misko Hevery comes up with a figure of 10% cost.  He calls it a 10% tax and points out the benefits that come a tax.  Note that he is writing tests as an integral part of his process and is fluent in doing so.  He also actively dispels the myth that testing takes twice as long. [...]