Quantifying the cost of Technical Debt



To make sure a cleanup effort produces the maximum positive impact on the code base, in the past I used a heuristic that centers on frequently changed code that also suffers from Technical Debt and poor automated test coverage. The idea is to minimize Technical Interest, which I define as the effort lost due to Technical-Debt-generated resistance to change. In order to quantify Technical Interest accrued for a set of changes over a period, without measuring actual development time against a benchmark, I propose the following rough formula:


Which Technical Debt?

In this article I mention Technical Debt frequently, but it’s far from a well-defined term; moreover, the common understanding of the concept has diverged from what its creator, Ward Cunningham, originally meant.
Cunningham defined Technical Debt as the misalignment between a team’s current best understanding of the problem domain and what is instead expressed in the code: imagine a cab fleet management system written in perfectly clean code, its design expressively describing the concepts of drivers, vehicles and car positions. Let’s then imagine that at some moment the system’s developers discover that by introducing the concept of ‘areas of availability’ they would be able to both enrich and simplify their complicated cab-selection features. If they decide they can’t afford to do the necessary refactoring to introduce this new concept right now and instead keep this insight in their mind, but not in their code, for a while, they are accruing Cunningham’s definition of Technical Debt.

The common understanding of Technical Debt is instead related to code which is just poorly implemented: obscure and convoluted logic that does not express the underlying problem domain at all; poor responsibility distribution; absent, inconsistent or not isolated components and more hallmarks of poor technique. By this definition the cab fleet management system would suffer Technical Debt when the driver module owns and manages the current car’s mileage, the car position is a plain string containing gps coordinates that get passed around in every other entity of the system and slow calls to remote resources are all synchronous with the UI.
If the first definition is concerned about expertly painted portraits not catching facets of a complex personality, the second is about accidental brush strokes and misplaced facial features.

This article is concerned with Technical Debt by the common definition: the thing that is not just preventing software from excelling in the long term, but that can calcify its evolution to the point where, after just a few man-months of work, dozens of man-days are needed to add a new drop-down box, while refactorings are as traumatizing as full rewrites, and even less likely to succeed.

Debt and Interest

For years now people have been talking of Technical Debt: developers wail about it; seniors prove their salt attacking it with sweeping refactorings whenever the stakeholders are looking the other way, often losing themselves and their credibility in the crusade; many a failed project’s corpse has been imputed to a manager letting this pest breed in it ‘until after the release’. Regardless of how frequently it’s blamed, calcifying low quality pervades the industry.
I believe that one of the reasons for stakeholder complacency in generating Technical Debt (a complacency that starts with obviously low-quality-producing staffing practices), and for developers’ ineffectiveness in repaying it, is the fact that, while Technical Debt is quantified, its effects are not.
We know (or at least believe we know) how much we would need to work to ‘fix it all’, but we have no clue how much we are being slowed down by not fixing it right now. This also means that, given a large landscape of debt in a codebase, we don’t know where reducing the Technical Debt will produce the greatest benefit for future efforts. Which part of our indebted code is generating the highest interest, the highest amount of attrition on our limited development resources?

Paying Interest

Unlike monetary debt, whose interest rate is expressed over time, Technical Debt does not generate interest linearly, nor continuously, with time. The most hideous working-mess-of-code will not generate extra effort if it never needs to evolve. Much like a game where the situation stays still until one of the players makes a move, in software development nothing happens unless you have to act on the code.
When you do have to act though, depending on the depth of debt in the area you are working on, you’ll pay more or less interest. This will happen in the form of time spent understanding complicated code, manually testing untested code and bug-fixing regressions.
So, when are we paying Technical Interest? When we change code. For every modified statement there’s a price to pay.
How much do we pay? We could measure it empirically by developing a feature in the system as it is, then refactoring the system until it reaches near-zero Technical Debt and re-developing exactly the same feature. The difference in effort is the Technical Interest. The problem with this approach is obvious: from a commercial point of view it is an exercise in futility. In the absence of such empirical data I propose a formula that I believe might approximate the truth:

    • The effort to ‘understand the code’ is proportional to the Cognitive Complexity of the function containing the changed statement: U * CC
    • U can be considered constant in most cases and set to 1, until the amount of effort actually spent in developing the change is available, at which point it can be set to u
    • The effort to ‘bug-fix’ the change and induced regressions is proportional to the lack of automated test Branch Coverage of the function containing the changed statement: T * (1 – BC)
    • T can be considered constant in most cases and set to 1, until the amount of effort actually spent in developing the change is available, at which point it can be set to t


  • Finally, the overall interest paid for a given period is the sum of all bits of interest paid for each software change applied in the period:

    Technical Interest = Σ over all changes [ U * CC + T * (1 – BC) ]
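The first approximation can be sketched in plain code. Everything below is an illustration: the numbers are made up, and U and T are held constant at 1 as discussed above.

```python
# Sketch of the first-approximation formula. Each change is described
# by the Cognitive Complexity (CC) of the function it touches and that
# function's branch coverage (BC, between 0 and 1). U and T are the
# weighting factors described above, held constant at 1.

def change_interest(cc, bc, u=1.0, t=1.0):
    """Interest paid for one change: U * CC + T * (1 - BC)."""
    return u * cc + t * (1.0 - bc)

def technical_interest(changes, u=1.0, t=1.0):
    """Total interest for a period: the sum over all changes made."""
    return sum(change_interest(cc, bc, u, t) for cc, bc in changes)

# Three hypothetical changes made during the period:
changes = [
    (15, 0.0),  # complex, completely untested function: 15 + 1.0
    (3, 0.9),   # simple, well-covered function:          3 + 0.1
    (25, 0.5),  # very complex, half-covered function:   25 + 0.5
]
print(technical_interest(changes))  # about 44.6 = 16 + 3.1 + 25.5
```

Nothing here depends on a particular tool: CC can come from any analyzer that reports Cognitive (or Cyclomatic) Complexity, and BC from any coverage report.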


    When U and T cannot be assumed constant

    If system module boundaries are fuzzy (signatures are missing, or just wrong…) or entirely absent, a developer who needs to understand the code he has to change will not be able to stop at function calls, but must actually move around reading the implementations of the things called by the target code. Since efferent coupling indicates how many things your code depends on, in the case of weak boundaries it also indicates how many things the developer will have to understand beyond the immediate scope he has to act upon. This is why I suggest a second approximation that defines the U factor as a function of the Efferent Coupling of the function we are changing:

  • U_2


    Similarly, if test isolation is very poor the obvious symptom is that many tests break at every change, and the more tests break, the harder it is to find the source of the breakage. Presuming homogeneously poor test isolation, the more a change is depended upon, the more tests fail (we might avoid making this assumption by using a metric that I call ‘test distance’ and which I’ve not yet documented). I thus suggest a second approximation that defines the T factor as a function of the Afferent Dependencies of the function we are changing:

  • T_2


    Combining everything together this gives the following second approximation formula:
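To illustrate the shape of the second approximation in code: U and T become functions of the Efferent Coupling (EC) and the Afferent Dependencies (AC) of the changed function. The linear forms used for u2 and t2 below are illustrative assumptions of mine, not definitive definitions.

```python
# Sketch of the second approximation. U and T are no longer constants:
# U grows with the Efferent Coupling (EC) of the changed function,
# T with its Afferent Dependencies (AC). The linear forms below are
# ILLUSTRATIVE ASSUMPTIONS chosen for the example.

def u2(ec, base=1.0):
    """Understanding-effort factor as a function of efferent coupling."""
    return base * (1 + ec)  # hypothetical linear form

def t2(ac, base=1.0):
    """Regression-fixing factor as a function of afferent dependencies."""
    return base * (1 + ac)  # hypothetical linear form

def change_interest_2(cc, bc, ec, ac):
    """Interest for one change under the second approximation."""
    return u2(ec) * cc + t2(ac) * (1.0 - bc)

# A change in a function with CC = 10 and 50% branch coverage, which
# depends on 4 other functions and is depended upon by 2:
print(change_interest_2(10, 0.5, 4, 2))  # (1+4)*10 + (1+2)*0.5 = 51.5
```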


    Final Thoughts

      • In this article I’ve used Cognitive Complexity as a way to evaluate the effort to understand code; while it is not perfect, I prefer it over Cyclomatic Complexity. I believe that, by dropping the strict relation to logical branches, Cognitive Complexity better models how a human brain is impacted by code structures; but since Cognitive Complexity has only recently been defined and is supported by only some code quality tools, all my direct experience on the topic of Technical Interest is based on Cyclomatic Complexity.
      • To reiterate, the gist of the second section: what I’m proposing relates to targeting a cleanup effort, raw code quality improvement, not a conceptual refactoring (again, referring to Cunningham’s definition of Debt). Refactoring from one conceptual model to another one should not be conditioned by change-frequency considerations, but rather by how salient the new model will be for the future of the system being developed.
      • While I’ve used variants of the formula proposed above on code-bases I was intimate with, in the absence of a dedicated tool it’s impractical to go through the history of a project to find the hotspots. As a result, I often find myself (and others) taking a shortcut to find the next cleanup target: pick the most highly complex code. The rationale is that, considering that high complexity induces a high change cost and that such a large amount of logic is very likely to attract future changes, high-complexity functions are a safe bet. I’ve recently developed a set of scripts to find out exactly this: are high-complexity functions a good target for refactoring if we don’t have the luxury of a full historical analysis of Technical Interest? I hope to report the results soon.
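A minimal sketch of that shortcut, assuming we have per-function change counts and complexity scores (all names and numbers below are invented for illustration):

```python
# Rank functions by change frequency times complexity to guess where
# a cleanup would repay the most Technical Interest.

def hotspots(functions):
    """functions: iterable of (name, changes_in_period, complexity)."""
    return sorted(functions, key=lambda f: f[1] * f[2], reverse=True)

history = [
    ("parse_order", 12, 30),  # changed often, very complex: score 360
    ("format_date", 20, 2),   # changed often, trivial:      score  40
    ("init_cache",   1, 45),  # complex but stable:          score  45
]
for name, change_count, complexity in hotspots(history):
    print(name, change_count * complexity)
# parse_order 360
# init_cache 45
# format_date 40
```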

Things I have learnt as the software engineering lead of a multinational

A surprisingly long cycle has just closed for me and I think it’s a good time to share some lessons learned.

I have been collecting these points over the last six months, but none of them popped up recently: they have taken shape over a few years, and they include both things I did and things I failed to do.

Most of these points act as personal reminders as well as suggestions to others; don’t be surprised if some of them read as cryptic.

So, here it is. A summary of what I learnt in the last six years:

The productive organization

1.  When the plan is foggy, that’s the moment to communicate as broadly as possible. In fact you should not frame it as a plan, you frame it as the situation and the final state you want to achieve. When you have a detailed plan, that’s when you DON’T need to communicate broadly. So, clearly state the situation and clearly state the foggy goal to everyone who will listen.

2.  Don’t be prudish. If you fear that people will lose faith in you because of the foggy goals and dire situation you are describing, you are painting yourself into a heroic corner. People just need to hear and share the situation they are in. Having a common understanding will act as a bond among them and with you, which is actually all you need for them to work out the right answers.

3.  Don’t assume that a specific communication medium can’t be effective. Mass emails and top-down communication are not taboo: just because most such communications are irrelevant it doesn’t mean yours will be perceived as such.

4.  Teams don’t self-organize unless you organize them to do so.

5.   Fostering personal initiative in every developer requires showing the vital need for personal initiative. These are birds that mostly don’t fly out of the cage just because you open the door. You must find a way to show them that the cage is burning.

6.  People sitting side-by-side can communicate less than people sitting a continent away. Communication is a chemical reaction that requires catalysers; what you get by co-locating people is a lower cost of the catalysers, but no setup creates automatic communication.

7.  Within a development organization both good and bad communication exist, but they are not a function of politeness or rudeness, it’s much more a matter of clarity and goals. You need to learn what the good kind of communication looks like, find some examples and use them as a reference for everyone.

8.  Fire people whenever you can. There’s often someone to fire, but not many opportunities to do so. When you are given a lot of opportunities to fire people, it is often due to a crisis situation and you’ll likely fire or otherwise lose the wrong people. People appreciate when you fire the right people, so don’t worry about morale. Also, the average quality of people tends to grow more when dismissing than when recruiting.

9.  Hire only for good reasons. Being overworked is not a good reason to hire. Instead, hire to be ready to catch opportunities, not to survive the current battles.

10.  It’s often better to lose battles than to staff desperately and win desperate battles at all costs (World War I anyone?).

11.  Don’t export recruitment, recruitment must be the direct responsibility of everyone in the organization.

12. People must select their future colleagues, there are infinite benefits in this, but it must not become a conclave. Keep the process in the hands of the people who do the work, but make it as transparent as possible.

13.  Always favor actual skill testing in recruitment. When you don’t feel that you are directly testing the candidate’s skills, you are either not competent enough in that skill or you have switched to just playing a set piece (I call this The Interview Theatre) and you will ultimately decide on a whim. Not good.

14.  Build some of your teams as training and testing grounds for freshmen. Put some of your best people there.

15.  Lack of vision is not agile, it is not data-driven, it is not about ‘taking decisions as late as possible’, it is not something that you should paint out in a positive light at all. It’s just lack of vision, and it’s not good.

16.  Construction work is not a good metaphor for software/product development. Neither is factory work. Allied junior-officer initiative during the first week after D-Day in WWII is probably a good guideline, but it is still not a good metaphor overall and, anyway, not well known enough to base your communication on.


17.  Train people to do all of the previous points. Including this one.

18.  Don’t shy away from leading without doing, it is unavoidable, so just do it. Then do some work to stay pertinent.

19.  If you are not able to hire and fire people, leave. Or stay for the retirement fund if you can stomach it.

20.  The Sith are right, rage propels. But the Jedi are right, you must not let it control yourself. What nobody tells you is that the rage game is intrinsically tiring and rage will take control as soon as you get too tired, so stop well before.

21.  Write down the situation, for your own understanding just as much as for the others’.

22.  If you feel like you don’t know what you are doing it’s probably because you don’t know what you are doing and that’s bad. Anyway, until you learn you don’t really have much of an alternative. Just don’t let that feeling of desperation numb your ability to learn. It does.

23.  There’s more and more good content to read and absorb on effective organizations. Don’t despair and don’t stop reading.

24.  Don’t let entropy get at your daily routine. Avoid entropy-driven work.

25.  Ask questions to people in order to make sure they understand. Trust people who do the same to you. “Do you understand?” is NOT a valid question.

26.  Avoid having people waiting on you. Don’t create direct dependencies on your work or decisions, make sure people feel that they can take decisions and still stay true to the vision without referring to you (hence the importance of point 1).

27.  Take the time to coach people in depth. Really, spend time with the people who are or have the potential to be great professionals in the organisation.

28.  The time you spend with the people you see most potential in is endorsement enough. Avoid any other kind of endorsement of individuals. Unless you are leaving.

The Entropic Organization

29.  An organization populated by a majority of incompetents has less than zero net worth: it is able to destroy other adjacent organizations that are not similarly populated.

30.  Incompetence is fiercely gregarious while knowledge is often fractious; the reason for this is that raw ideas transfer more easily through untrained minds than refined ideas transfer through trained minds. There’s a reason why large organisations focus so much on simple messages; a pity that difficult problems often have simple solutions that don’t work.

31.  Entropy self-selects. Hierarchical and other kinds of entropic organizations always favor solutions that survive within entropic organizations. Thus they will favor easy over simple, complex over difficult, responsibility-dilution over empowerment, accountability over learning, shock-therapy over culture-nurturing. This is the reinforcing loop that brings ever-increasing entropy into the system: entropy generates easy decisions with complex and broken implementations, which in turn generate more entropy. An example of an easy decision with a complexity-inducing implementation: the scenario “our company does not have a coherent strategy, so many projects tend to deliver results that are not coherent, hampering the organic growth of our capabilities” will be answered with the most classic knee-jerk decision-making pattern, “we don’t know how to do X, so let’s overlay a new Y to enforce X”, in this case: “group together strategic projects into a big strategic program that will ensure coherence”. The difficult but simple option will not even be entertained: “let’s discover our real strategy and shape the organization around it.”

32.  Delivery dates have often irrelevant but very simple to understand impacts. Good and bad solutions have dramatic but very difficult to understand impacts. The Entropic Organization will thus tend to make date-based decisions. The Entropic Organization will always worry about the development organization’s ability to deliver by a given date, never about its ability to find the right solution. There are some very rare cases where the delivery date is more important than what you are delivering, but modern management seems to delight in generalizing this unusual occurrence to every situation. People do get promoted for having delivered completely broken, useless and damaging solutions on time. If that’s the measure of project success, you can expect dates to rule (even when they continuously slide). After all, if you are not a trained surgeon, and the only thing you are told is that a given surgery should last no more than X hours, guess what will be the one criterion for all your actions during the operation. This showcases the direct link between the constituents’ incompetence and the establishment of classic Entropic Organization decision-making.

33.  Having a strategy will only go so far when you face the Entropic Organization, since it will only be able to appropriate that strategy at the level of energy (understanding) it can attain, which, being entropic, is very low. This results in something that does not look like a strategy at all: ever seen a two-year-old play air-traffic control? He gets the basic idea of “talking to planes”, but that’s it.

34.  Partially isolating the Development Organization to stay effective does not work. Adapting your organization to be accepted by an incompetent background does not work either. What is left in the scope of alternatives is radical isolation supported by the attempt at radical results and crossed fingers for top-management recognition (also known as ‘Deus ex machina for the worthy’), and the top-down sales-pitch (or POC) to the CEO (also known as “He who has the ear of the King…”). But don’t forget: Nemo propheta in patria, so act and look like an outsider as long as you can.

35.  Growth-shrink symmetry. When an organization grows unhealthily (too fast, for bad reasons or through bad recruitment) it will also shrink unhealthily. When it grows it’s bold and confused, when it shrinks it’s scared and nasty.

36.  Most of the ideas that will pop up naturally from the Entropic Organization are bad in the context of modern knowledge-based work, but possess a superficial layer of common-sense to slide through. Exercise extreme prejudice.


A quick thought.

Today I started reading Roy Fielding’s PhD thesis, Architectural Styles and the Design of Network-based Software Architectures, and the first chapter begins with a priceless sentence:

In spite of the interest in software architecture as a field of research, there is little agreement among researchers as to what exactly should be included in the definition of architecture.

Of course he moves on to define it, and it is a reasonably good definition, with encapsulation at the core of it and a clear explanation that every level of abstraction manifests an architecture of its own.

Yet, there’s something fishy about a topic that has professionals, books and courses named after it, whole hierarchies of people working in it, and still… what exactly should be included?

Test Driving a Unity Game


It’s been so long that I feel like a newcomer to WordPress’ user interface.

A few years ago, just after the last Italian Agile Day I attended (2013), I was thinking of writing something about using TDD when developing in Unity.

Recently I saw an email passing by the tdd mailing list asking about exactly this topic and, as it happens, I’m on holiday right now, so I finally got around to writing the article I should have written years ago. What a serendipitous accident.

First, some notes:

  1. This is not about testing pure “unity-neutral” C# code. In Unity, at some point, you start using objects which have no relationship to Unity types (like MonoBehaviour, Transform etc.), but this happens quite low in the call stack and, at times, not at all. Besides, you can test drive those with normal tooling; describing that would be redundant with any good tdd book.
  2. It is unlikely that many of your interesting game features will be described entirely within the context of unity-neutral code. Unity is pervasive and it is not built for technological isolation (probably my biggest issue with a product I otherwise love). After all, if you are writing a 3d game, most of the stuff you need to code is touching concepts like a transform, a collision, an animation; it is possible to express them neutrally, but Unity has decided to sacrifice isolation for immediacy. I’ve tried reintroducing that isolation and it’s not nice, I thus do it very selectively.
  3. A lot of the design questions you need to answer when developing in Unity are related to the distribution of responsibilities among the MonoBehaviours you attach to GameObjects (Unity’s game logic is structured around the Extension Object Pattern, see Gamma’s paper in PLoP 96). Skipping those parts of your logic just because they are unity-dependent pauperises tdd in Unity into irrelevance.
  4. Since a lot revolves around which MonoBehaviour of which GameObject does what, the collaborations between those behaviours are equally critical to your design; those collaborations are wired into life by Unity’s declarative dependency-injection mechanism: the Scene. The Scene is thus the seed of all your fixtures; trying to bypass it, while possible (factories, resource load, stubs), is often not worth it.

Now, all of the above is an admission that, if I want to apply my usual holistic approach to tdd, most of my tests will not look like unit tests and will need to cope with Unity’s environment.

It took me some time to accept this fact, but when I did, I started to see that there were interesting advantages in accepting a Unity scene as my test runtime; the rest of this post will be about how exploiting those advantages shaped my approach to doing tdd in Unity.

You can and should simulate unit test isolation by exploiting physical space locality

This means that, if the Scene is your runtime and you take care to build your test in such a way that its effects stay within a well-defined volume, your Scene will behave similarly to a suite of properly code-isolated tests in classic unit testing.

This has the side effect of forcing me to avoid as much as possible world-spanning searches of other objects through tags or names: all of my GameObject-to-GameObject interaction is defined by colliders and explicitly injected dependencies. The effort to keep the tests isolated in space is already influencing my design.

After a while I came to actually materialise the bounded volume of a test with a cage. This makes boundary enforcement more natural while building and running the test (you can’t ignore that something is leaving the test volume when that volume is graphically represented) and has the nice benefit of giving your test scene the look of a well-managed zoo:


If I feel really hardcore I can use a more advanced kind of cage whose walls are colliders that destroy anything they touch and throw an exception, failing the test. Frankly, while cool, this is overkill if you run your test Scene in Unity and glance at what is happening, but I believe that if the tests are meant to run in a headless runtime (I’m not even sure that is possible), say within a CI build, those exception-throwing cage walls become necessary: they are the only way to spot an abusive test early on, before it pollutes other tests.

A final note on physical space isolation: if you look at what Unity has provided as automated testing tools, they have taken a different approach. Every test is run sequentially and, while it runs, only the GameObject representing the test (and its hierarchy) exists; the test can thus play with the whole empty scene, without limitations. I was already using my “caged” approach before Unity published the testing tools, so I am biased towards my solution, but I can articulate a bit why I prefer it. First, the limitation of having everything present in the scene at the same time while running the tests informs my design, as I explained above: it ensures that my logic is intrinsically bounded in space and not world-spanning. Second, my approach runs all tests at the same time, which is critical when most of your tests need a second or two to pass while objects move around: I can run dozens of tests within a few seconds, with everything happening at the same time.

Here you can see what happens on the ground floor of my test zoo within a few seconds, a dozen tests doing their thing at the same time :


The cage must contain only the (Game)Objects you want to test

The problem with not limiting the test to pure, unity-neutral logic is that the test can grow to become a monster. Just like a classic unit test should not set up the whole system with dozens of components only to test a specific case, I always make sure that I can set up a minimum number of components, all of them neatly bounded by the cage, and still test what I need to test. If my design is correct, I will be able to demonstrate the feature I want with only the objects that will contain the feature.

This quality is core. Failing this, the tests are not declaring a unit of expected behaviour and everything devolves into automated smoke tests. Interesting, but big, clunky and of little value as a design tool.

Here’s how I built the test that brought the “Planter” and the “ConstructionZone” concepts into my code. This is the very first cage I built for this game (it’s about city construction, in case you were wondering) :


The test goal is to declare that the planter tool, when triggered by the user’s finger touching a construction zone, builds terrain and a building. I created a construction zone as a collider (the bottom, green square) and a dummy finger (the yellow line), replacing the user mouse or touch, that “clicks” on the square at Start. See below.


Then I created a second collider, the top, green cube highlighted below, which contains an assertion that succeeds if a “building” collides with it.


The solution to this test has been to implement two MonoBehaviours, one attached to the “finger”, the Planter tool, the other attached to the bottom collider, the ConstructionZone.

Once everything is working the construction zone and the planter tool spawn a building as soon as the finger “clicks”, the building collides with the assertion collider and the test passes. If something goes wrong, no building, no collision, failure exception in the Unity console.

Below is the result that appears when everything is fine. The building is the white cylinder, admittedly ugly, but that’s not the point.


Below is the setup of the finger and the zone, showing how few and simple the objects involved are (the Dummy Finger and Tool User are the classes I developed to act on behalf of the user, the TestBuilding referenced in the Planter is the white cylinder).

After this first test succeeded I moved on to refine the behaviour of the Planter by, for instance, stating that it is not affected by obstacles, which is a characteristic I need in every tool available to the user: if it touches an action zone, it doesn’t matter whether it first intersects a cloud or another piece of landscape, it must still trigger it. Below you see the second cage: the flat, solid white panel is the obstacle which the Planter must ignore in order to touch the zone below and spawn the building. The rest of the test logic is the same as in the first cage, except that by this time I had also created the pavement mesh, with the nice grass & ground texture that you can see below the building in the previous test result. I could therefore place the assertion collider below, where the pavement spawns, so I don’t need to spawn an ugly cylinder: the pavement collides and passes the test.


Much later on, after I had completed most of the building logic, I moved to develop artillery, with tests that look like the one below. Here you can see the cannon ball flying towards a rotated cube, where I attached my “Structure” MonoBehaviour that will get damaged (depending on the angle) by the ball colliding with it.

The assertion is also sitting on the cube, waiting for a collision and checking that the Structure’s health is lower than the initial health.


You must write your own assertion and dummy player logic to simulate every interaction you need

What I did start to use out of Unity’s testing tools are some of the assertion components, but they are far from sufficient, and anyway I always need to write custom scripts to generate the actions and transient behaviours that simulate the events which happen in the game and which my logic needs to react to.

For instance, in the movie below I’m testing that, even if the user wants to fire, the cannons on the wall will actually fire only when a target is in firing range.

The piece of pavement that moves on the left is the target; it contains a small script (part of my testing utilities for this game) that moves it at specific intervals by a specific amount. I called it the KinematicMover (since it does not use physics to move the object). “Using” statements edited out.

public class KinematicMover : MonoBehaviour {

        // Moves the object by 'movement' every 'pause' seconds, 'steps' times.
        // Delay/SystemDelay are utility classes of my own.
        public Vector3 movement;
        public int steps;
        public float pause = 1;
        private Delay delay;

        void Start() {
                delay = gameObject.AddComponent<SystemDelay>();
                delay.repeat(steps, pause, () => this.transform.Translate(movement));
        }
}

Some of the test harness logic you create will likely remain just that, like the assertion checking that some cannon balls were indeed flying within a collider during a specific time window. It is attached to the middle collider, to have the test pass or fail depending on the timing of the cannons firing from the wall.

public class ProjectileChecker : MonoBehaviour {

        public float windowStart = 0f;
        public float windowEnd = 10f;
        private bool complete = false;

        void OnTriggerEnter(Collider other) {
                if(complete) return;
                if(other.gameObject.GetComponent<Projectile>()) {
                        this.complete = true;
                        if(Time.fixedTime > windowStart && Time.fixedTime < windowEnd) {
                                // projectile arrived within the window: the test passes
                        } else {
                                // too early or too late: fail the test (the original
                                // listing was truncated here; in my harness a failure
                                // surfaces as an exception in the Unity console)
                                throw new UnityException("Projectile outside time window");
                        }
                }
        }

        void Update() {
                if(complete) return;
                if(Time.fixedTime > windowEnd + 1) {
                        this.complete = true;
                        // the window has passed with no projectile at all: fail
                        throw new UnityException("No projectile within time window");
                }
        }
}

On the other hand I’ve found that, even more frequently than in classic non-Unity TDD, some of the logic driving the events for your tests turns out to be very useful game logic in its own right; for instance, the gunner logic that tries to fire all the time almost instantly became part of the game’s basic opponent AI. Meet the AggressiveGunner:

public class AggressiveGunner : MonoBehaviour {

        public City city;

        void Update() {
                // ... orders the city's cannons to fire, every frame
        }
}

In short, you must not be scared to create quite a few test stubs, custom assertions, movers and shakers. They are key to well-isolated and focused tests, while at the same time being potentially reusable in the main code itself. And they should be easy to write, if your design is ok.


My TDD approach in Unity ends up being what many people would call very granular, very isolated integration testing, which only later, for very specific logic, gets down to pure C# tests. It works pretty well and produces almost all of the design feedback I need, along with a nice test Scene (or Scenes) that makes me feel safe and in control as the game logic grows more complex.

Web Apps in TDD, Appendix, the User

Here’s the User class and its collaborators as they stand right now. The User is a bit more evolved than its original form: when I first wrote it, all of the logic lived in the User itself, as I had no need to evaluate Javascript outside of html; later I separated the two responsibilities (parsing xml/html and evaluating Javascript) once I needed to evaluate Javascript no matter where it came from.

public class User {
    private final JavaScript javaScript = new JavaScript();
    private final Result result = new Result();

    public User() {
        javaScript.evaluateFile("browser.js");
    }

    public User lookAt(String htmlPage) {
        JavaScriptSource source = new XomJavaScriptSource(htmlPage);
        source.evaluateWith(javaScript);
        triggerOnLoad();
        result.readOutput(javaScript);
        return this;
    }

    public String currentSight() {
        return result.nextValue();
    }

    private void triggerOnLoad() {
        javaScript.evaluateScript("window.onload();", "onload");
    }
}


And here’s “JavaScript”, which manages everything Rhino-related.

public class JavaScript {
    private final Context context;
    private final ScriptableObject scope;

    public JavaScript() {
        context = Context.enter();
        scope = context.initStandardObjects();
    }

    public Object valueOf(String variableName) {
        return scope.get(variableName);
    }

    public void evaluateScript(String script, String scriptName) {
        context.evaluateString(scope, script, scriptName, 1, null);
    }

    public void evaluateScript(String script) {
        evaluateScript(script, "script");
    }

    public void evaluateFile(String sourceFileName) {
        try {
            context.evaluateReader(scope, read(sourceFileName), sourceFileName, 1, null);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    private InputStreamReader read(String sourceFileName) {
        return new InputStreamReader(getClass().getClassLoader().getResourceAsStream(sourceFileName));
    }
}

This is “Result”, which extracts values from the “output” array populated by the Javascript.

public class Result {
    private NativeArray output = new NativeArray(0);
    private int current = 0;

    public void readOutput(JavaScript javaScript) {
        output = (NativeArray) javaScript.valueOf("output");
    }

    public String nextValue() {
        return (String) output.get(current++);
    }
}

Finally, this is the class that hides the fact that scripts are mixed within html.

public class XomJavaScriptSource implements JavaScriptSource {

    private final Document document;

    public XomJavaScriptSource(String htmlPage) {
        document = parsePage(htmlPage);
    }

    public void evaluateWith(JavaScript javaScript) {
        Nodes scriptNodes = document.query("//script");
        for (int i = 0; i < scriptNodes.size(); i++) {
            evaluateNode(scriptNodes.get(i), javaScript);
        }
    }

    private Document parsePage(String htmlPage) {
        try {
            return new Builder().build(htmlPage, null);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private void evaluateNode(Node scriptNode, JavaScript javaScript) {
        if (scriptNode instanceof Element) {
            Attribute sourceAttribute = ((Element) scriptNode).getAttribute("src");
            if (sourceAttribute != null) {
                javaScript.evaluateFile(sourceAttribute.getValue());
            } else {
                javaScript.evaluateScript(scriptNode.getValue());
            }
        }
    }
}

Web Apps in TDD, Part 4

Multiple buildings!

I bet you can tell where this is going: I have one building; now I want multiple buildings on my page, and then finally to put the player in the middle.
So, I add two more rectangles to my investigation.html.

            var map = new Raphael(0,0,600,400);
            var building = map.rect(10,10,50,40);
            var secondBuilding = map.rect(80,10,30,40);
            var thirdBuilding = map.rect(10,70,100,40);

Which produces :

Now, for the test that will lead me to implementing this…

    public void shouldRenderMultipleBuildings() throws Exception {
        HtmlScreen htmlScreen = new HtmlScreen();
        htmlScreen.addBuilding(10, 10, 50, 40);
        htmlScreen.addBuilding(80, 10, 30, 40);
        htmlScreen.addBuilding(10, 70, 100, 40);
        User user = new User().lookAt(htmlScreen.render());
        assertThat(user.currentSight(), is("A Rectangle at [10,10], 40px high and 50px wide"));
        assertThat(user.currentSight(), is("A Rectangle at [80,10], 40px high and 30px wide"));
        assertThat(user.currentSight(), is("A Rectangle at [10,70], 40px high and 100px wide"));
    }

The current result is :

Expected: is "A Rectangle at [10,10], 40px high and 50px wide"
     got: "A Rectangle at [10,70], 40px high and 100px wide"

That’s the last building only. It’s due to my implementation of the html screen, which stores a single building and overwrites the previous one. Easily fixed.

    private final List<Building> buildings = new ArrayList<Building>();

    private String renderBuildings() {
        String renderedBuildings = "";
        for (Building building : buildings) {
            renderedBuildings += building.render(vectorGraphics);
        }
        return renderedBuildings;
    }

    public Screen addBuilding(int x, int y, int width, int height) {
        buildings.add(new Building(x, y, width, height));
        return this;
    }

function Raphael(x, y, width, height){

    this.rect = function(x, y, width, height) {
        output[invocations++] = "A Rectangle at [" + x + "," + y + "], " +
                    height + "px high and " + width + "px wide";
    };
}

var output = [];
var invocations = 0;

    public String currentSight() {
        return (String) output.get(current++);
    }

Now, for the real-life test

    public static void main(String... args) throws Exception {
        HtmlScreen htmlScreen = new HtmlScreen();
        new Boss(11111, htmlScreen);
    }

But the very first HtmlScreenBehavior test is not happy: it expected a rectangle, and now that rectangle needs to be added explicitly.

    public void shouldRenderABuildingAsARectangle() throws Exception {
        User user = new User().lookAt(new HtmlScreen().addBuilding(10, 10, 50, 40).render());
        assertThat(user.currentSight(), is("A Rectangle at [10,10], 40px high and 50px wide"));
    }

Now it passes. All tests are green.

I’m so glad it’s time to refactor, because my tests are looking very bad. For instance, have a look at the test just after the one I just modified :

    public void shouldRenderTheBuildingWithTheRightPositionAndDimensions() {
        User user = new User().lookAt(
        		new HtmlScreen().addBuilding(50, 30, 80, 40).render());
        assertThat(user.currentSight(),
        		is("A Rectangle at [50,30], 40px high and 80px wide"));
    }

Yes, they are the same; the only difference is in the values. This is pretty much the only case where I consider erasing a test without a change in features: when it says exactly the same thing as another test.

So, adieu! I delete the second one, as I like the first one’s name better.

What else? Well, I’m growing bored of typing all of these “A Rectangle…”.

    public void shouldRenderABuildingAsARectangle() throws Exception {
        User user = new User().lookAt(new HtmlScreen().addBuilding(10, 10, 50, 40).render());
        assertThat(user.currentSight(), is(aRectangle(10, 10, 50, 40)));
    }

    public void shouldRenderMultipleBuildings() throws Exception {
        HtmlScreen htmlScreen = new HtmlScreen();
        htmlScreen.addBuilding(10, 10, 50, 40);
        htmlScreen.addBuilding(80, 10, 30, 40);
        htmlScreen.addBuilding(10, 70, 100, 40);
        User user = new User().lookAt(htmlScreen.render());
        assertThat(user.currentSight(), is(aRectangle(10, 10, 50, 40)));
        assertThat(user.currentSight(), is(aRectangle(80, 10, 30, 40)));
        assertThat(user.currentSight(), is(aRectangle(10, 70, 100, 40)));
    }

    private String aRectangle(int x, int y, int width, int height) {
        return "A Rectangle at [" + x + "," + y + "], " + 
        			height + "px high and " + width + "px wide";
    }

Web Apps in TDD, Part 3

Rendering a building

What’s next on the todo list? I want a rectangle to appear on the screen to show a building.
Google tells me there’s a nice javascript library to draw vector graphics in a browser: it’s called Raphael, and I think I’ll try it.
So I write an html file that uses it to draw a rectangle:

    <script type="text/javascript" src="raphael-min.js"></script>
    <script type="text/javascript" charset="utf-8">
        window.onload = function() {
            var map = new Raphael(0,0,600,400);
            var building = map.rect(10,10,50,40);
        };
    </script>

And with this I get :

A real beauty. The building is the small rectangle top left.

Now I would like my HtmlScreen to return that.

Testing the behavior

I could write a test like this :

    public void shouldRenderABuildingAsARectangle() {
        assertThat(new HtmlScreen().render(), is("<html>\n" +
                "<head>\n" +
                "    <script type=\"text/javascript\" src=\"raphael-min.js\"></script>\n" +
                "    <script type=\"text/javascript\" charset=\"utf-8\">\n" +
                "        window.onload = function() {\n" +
                "            var map = new Raphael(0,0,600,400);\n" +
                "            var building = map.rect(10,10,50,40);\n" +
                "        };\n" +
                "    </script>\n" +
                "</head>\n" +
                "<body>\n" +
                "</body>\n" +
                "</html>"));
    }
But it would be awfully fragile. More to the point: it would not declare what I’m interested in.
And what is it I’m interested in? If this were a test for some internal functionality it would be easier: you just have to think about what the clients of an object expect from it. Here the client of the HtmlScreen’s rendering is the user: the biped looking at the computer screen.

What I would really like to write in my test is :

    public void shouldRenderABuildingAsARectangle() throws Exception {
        User user = new User().lookAt(new HtmlScreen().render());
        assertThat(user.currentSight(), is("A Rectangle at [10,10], 40px high and 50px wide"));
    }

Then I think I’ll write exactly this and I’ll find a way to make it work. A solid, meaningful test is definitely worth the effort.

There are three pretty separate issues here:

  • The user should be able to look at the rendering and extract the parts that are meaningful to it; right now that means the javascript.
  • The user should be able to evaluate those meaningful parts and give me a short description of the final result.
  • HtmlScreen should return the correct html, with the script needed to satisfy the user.

Before attacking any of this, I put my html inside a string within the test itself and pass it to the user: I want to be sure I’ll know when my user starts working.

    public void shouldRenderABuildingAsARectangle() throws Exception {
        String html = "<html>" +
                "<head>" +
                "    <script type=\"text/javascript\" src=\"raphael-min.js\"></script>" +
                "    <script type=\"text/javascript\" charset=\"utf-8\">" +
                "        window.onload = function() {" +
                "            var map = new Raphael(0,0,600,400);" +
                "            map.rect(10,10,50,40);" +
                "        }" +
                "    </script>" +
                "</head>" +
                "<body>" +
                "</body>" +
                "</html>";
        User user = new User().lookAt(html);
        assertThat(user.currentSight(),
        	is("A Rectangle at [10,10], 40px high and 50px wide"));
    }

Now, on with the implementation of my user.
The first problem is quickly solved: I’ll have the user parse the received html and extract all the scripts. The second is partially solved by using Rhino to evaluate them.
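As a sketch of that first step, script extraction can be as simple as this (for illustration only: a naive regex-based version, with class and method names of my own invention, while the real implementation uses a proper DOM parser):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScriptExtractionSketch {

    // naive illustration: collect the body of every inline <script> element
    static List<String> extractScripts(String html) {
        List<String> scripts = new ArrayList<String>();
        Matcher matcher =
                Pattern.compile("<script[^>]*>(.*?)</script>", Pattern.DOTALL).matcher(html);
        while (matcher.find()) {
            scripts.add(matcher.group(1).trim());
        }
        return scripts;
    }

    public static void main(String[] args) {
        String html = "<html><head><script>var x = 1;</script></head><body></body></html>";
        System.out.println(extractScripts(html)); // prints "[var x = 1;]"
    }
}
```

Regexes over html are famously brittle, of course, which is exactly why a real parser is the better home for this responsibility.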

One thing is still missing: even assuming I can run the javascript in my user, that javascript does not produce a handy (and easy-to-check) text description of what the user sees; it calls Raphael, which performs some vector-graphics magic.

Luckily, stubbing objects and whole libraries in javascript is trivial. In my tests I’ll replace the Raphael library with this file:

    function Raphael(x, y, width, height){
        this.rect = function(x, y, width, height) {
           output += "A Rectangle at [" + x + "," + y + "], " + height + "px high and " + width + "px wide";
        };
    }

    var output = "";

I run this and the test is still red: Rhino says there’s no window object available. Indeed, the window object is provided by the browser. I’ll just make sure that my user executes a browser.js file before everything else when evaluating the HtmlScreen:

    function Window() {}
    var window = new Window();

The test is green.

I move the html to HtmlScreen. The new test is green, but the old one is red. Remember the first test for HtmlScreen?

    public void shouldRenderAnHtmlDocument() {
        assertThat(new HtmlScreen().render(), is("<html><body></body></html>"));
    }

That’s definitely not true any longer. Is reconciling the small divergence between these two tests going to help my design? Maybe, but I’ve added a lot for this last test, and I don’t want to follow up with further changes in the production code just to satisfy that old, boring test. Moreover, the reason that test is failing is that it is extremely dependent on the actual html string.

That’s not really good, especially since I’ve done so much to avoid depending on the html string in my latest test. Finally, I’ve just added a DOM library for use inside the user, so it’s really quick to change that test to declare the same thing without being this fragile:

    public void shouldRenderAnHtmlDocument() throws Exception {
        Document document = new Builder().build(new HtmlScreen().render(), null);
        Element root = document.getRootElement();
        assertThat(root.getLocalName(), is("html"));
        assertThat(root.getChildElements("body").size(), is(1));
    }

Is it me, or is there still no really simple XML library for Java? This is XOM; I decided to try it after years of JDOM, dom4j and the basic Java DOM (ugh!). It has a simple builder, good; poor navigation, bad; and it forces me to pass a null to the builder since I have no base url, very bad.
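For comparison, here’s the same pair of checks written against the JDK’s built-in DOM (this snippet is mine, not from the project; it shows the kind of ceremony that sent me shopping for alternatives):

```java
import java.io.ByteArrayInputStream;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class PlainDomSketch {
    public static void main(String[] args) throws Exception {
        String html = "<html><body></body></html>";
        // the JDK's DOM needs a factory, a builder and a stream just to parse a string
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(html.getBytes("UTF-8")));
        Element root = document.getDocumentElement();
        System.out.println(root.getTagName());                             // prints "html"
        System.out.println(root.getElementsByTagName("body").getLength()); // prints "1"
    }
}
```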

Anyway, the HtmlScreenBehavior tests are all green. I have one red test, BossBehavior: it complains that Raphael does not exist.
In fact the WebClient performing the HTTP call parses the html from the server response and then calls the server to obtain raphael-min.js, and my server does not currently serve static resources.
Even though I already have a red test, I add a new, far more focused one to BossBehavior; it will give clearer feedback and also prove my theory about the origin of the error.

    public void shouldReturnAStaticFileWhenAPathIsProvided() throws Exception {
        boss = new Boss(PORT, new BlankScreen());
        assertThat(Http.callOn(PORT, "sample.content.txt"),
		is(HttpAnswer.with("foo content bar").type("text/plain")));
    }

Inside the sample.content.txt file I put the string “foo content bar”.

Indeed the test fails, as expected.

Grizzly has a nifty Adapter ready for this, the StaticResourcesAdapter. But how to choose between my original adapter and this one? Well, when a resource is requested I’ll use the StaticResourcesAdapter; when no resource is requested, it will be my original adapter.
Doesn’t this sound a lot like mapping?

public class AlwaysReturn implements Adapter {
        public void service(Request request, Response response) throws Exception {
            String path = request.unparsedURI().toString();
            if(path != null && !path.equals("/")) {
                filesRetriever.service(request, response);
            } else {
                // ... the original behaviour: render the screen into the response
            }
        }
}

The “filesRetriever” is the static resources adapter from Grizzly.

I run all tests and they are all green!

I perform a reality check: I launch Boss from its main and call it with a browser.

And here it is, the first building of my city.

Yet, looking back at the last piece of code I wrote, I don’t like what I see. Simply put, the service method is a mess: it mixes multiple levels of abstraction, meddling with the static adapter while also performing actions specific to the screen. Finally, it belongs to a class, “AlwaysReturn”, whose name has lost all meaning.
I’ll clean this in a pretty direct way, for now.

    private static class RootAdapter implements Adapter {
        private final StaticResourcesAdapter resourcesAdapter;
        private final ScreenAdapter screenAdapter;

        public RootAdapter(StaticResourcesAdapter resourcesAdapter,
        				ScreenAdapter screenAdapter) {
            this.screenAdapter = screenAdapter;
            this.resourcesAdapter = resourcesAdapter;
        }

        public void service(Request request, Response response) throws Exception {
            if(pathIsPresentIn(request)) {
                resourcesAdapter.service(request, response);
            } else {
                screenAdapter.service(request, response);
            }
        }
    }
Tests are still green.

Controlling the building

I now have the rendering of a building, but that rendering is static; I can’t really control it in any way.
Thus I’ll add a new test:

    public void shouldRenderTheBuildingWithTheRightPositionAndDimensions() {
        User user = new User().lookAt(new HtmlScreen().addBuilding(50,30,40,80).render());
        assertThat(user.currentSight(), is("A Rectangle at [50,30], 40px high and 80px wide"));
    }

The test does not compile; I need the method “addBuilding”, easily added (empty, of course). The test now fails: I always get the old rectangle in the old position. Fine.

This code makes all tests pass :

public class HtmlScreen implements Screen {
    private Building building = new Building(10, 10, 40, 50);

    public String render() {
        String start = "<html><head>" +
                "    <script type=\"text/javascript\" src=\"raphael-min.js\"></script>" +
                "    <script type=\"text/javascript\" charset=\"utf-8\">" +
                "        window.onload = function() {" +
                "            var map = new Raphael(0,0,600,400);";
        String end = "}</script></head><body></body></html>";
        return start + building.render() + end;
    }

    public Screen addBuilding(int x, int y, int height, int width) {
        building = new Building(x, y, height, width);
        return this;
    }
}

This code is ugly. The HtmlScreen is assembling the overall html, importing Raphael and composing the script. Building.render() is also strongly coupled with Raphael, even with the fact that the Raphael canvas is named “map”. See for yourself:

    public String render() {
        return "map.rect(" + x + ","+ y + "," + width + "," + height + ");";
    }

This won’t do. I want everything related to Raphael to stay isolated, and I want the HtmlScreen to focus on assembling the overall html, not on the details of the script. I think I need a VectorGraphics object.

public class HtmlScreen {
    public String render() {
        return header() +
                vectorGraphics.include() +
                openScript() +
                openFunction() +
                vectorGraphics.init() +
                building.render(vectorGraphics) +
                closeFunction() +
                closeScript() +
                footer();
    }
}

public class VectorGraphics {

    public String include() {
        return "<script type=\"text/javascript\" src=\"raphael-min.js\"></script>";
    }

    public String init() {
        return "var map = new Raphael(0,0,600,400);";
    }

    public String rect(int x, int y, int width, int height) {
        return "map.rect(" + x + ","+ y + "," + width + "," + height + ");";
    }
}

public class Building {
    public String render(VectorGraphics vectorGraphics) {
        return vectorGraphics.rect(x, y, width, height);
    }
}

I don’t know whether this idea of a vector graphics renderer, used by both the Building and the HtmlScreen, represents a new and lower level of abstraction. In theory it does, as it manages every detail of using javascript for graphics, but strong abstractions aren’t always the most obvious ones; only tests will tell.
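For what it’s worth, the seam already pays off in tests: Building can now be checked in pure Java, with no html or browser anywhere in sight. A minimal, self-contained sketch (the stand-in classes mirror the ones above; the Building constructor is my own guess for illustration):

```java
public class BuildingSketch {

    // stand-in for the VectorGraphics shown above
    static class VectorGraphics {
        public String rect(int x, int y, int width, int height) {
            return "map.rect(" + x + "," + y + "," + width + "," + height + ");";
        }
    }

    // stand-in for Building; the (x, y, width, height) constructor is assumed
    static class Building {
        private final int x, y, width, height;

        Building(int x, int y, int width, int height) {
            this.x = x; this.y = y; this.width = width; this.height = height;
        }

        public String render(VectorGraphics vectorGraphics) {
            return vectorGraphics.rect(x, y, width, height);
        }
    }

    public static void main(String[] args) {
        String rendered = new Building(10, 10, 50, 40).render(new VectorGraphics());
        System.out.println(rendered); // prints "map.rect(10,10,50,40);"
    }
}
```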