Thursday, October 31, 2013

Mockist TDD vs Classic TDD


If you want to practice TDD, there are two main approaches to choose from: mockist TDD or classic TDD (Martin Fowler). In this post I would like to compare the two.
First, I'll describe the two approaches, and then I'll list the pros and cons of the mockist approach.

Mockist TDD
With this approach, you work at a high granularity, meaning every class has its own test fixture. As a result, each test fixture should test only one CUT (Class Under Test) and none of the classes the CUT depends on, assuming they all have their own test fixtures.

Suppose we have a class A that uses class B. To achieve the high granularity we’ve talked about, TestFixtureA must use a Test Double of B, as shown in figure 1:

Figure 1

Of course, our design must support dependency injection to achieve that: class A must work against an interface of B, and there must be a way to inject a concrete instance of B into class A (via constructor, setters, an IoC container, etc.).
That's why this approach is called Mockist TDD: it makes extensive use of Mocks (Test Doubles). It is also called Isolated Testing, since each class is tested in isolation.

NOTE: we isolate class A from class B even if class B is a regular business class that has no interaction with external resources such as a DB, web service, file system, etc.
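
To make this concrete, here is a minimal C# sketch of the mockist setup, assuming NUnit and Moq; the names IB, B, A, Calc and DoWork are hypothetical stand-ins for the classes in figure 1:

```csharp
using Moq;
using NUnit.Framework;

public interface IB
{
    int Calc(int input);
}

public class B : IB
{
    public int Calc(int input) => input * 2;
}

public class A
{
    private readonly IB _b;

    // B is injected through an interface, so a test can substitute a Test Double.
    public A(IB b) => _b = b;

    public int DoWork(int input) => _b.Calc(input) + 1;
}

[TestFixture]
public class TestFixtureA
{
    [Test]
    public void DoWork_DelegatesToBAndAddsOne()
    {
        // A is tested in isolation: the real B is never instantiated.
        var bDouble = new Mock<IB>();
        bDouble.Setup(b => b.Calc(3)).Returns(6);

        var a = new A(bDouble.Object);

        Assert.AreEqual(7, a.DoWork(3));
        bDouble.Verify(b => b.Calc(3), Times.Once());
    }
}
```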

Classic TDD
With this approach, you work at a low granularity, meaning every graph of classes has its own test fixture. As a result, each test fixture covers a whole graph of classes implicitly by testing the graph's root, as shown in figure 2:

Figure 2

Usually you don't test the inner classes of the graph explicitly, since they are already tested implicitly through the tests of their root; thus, you avoid coverage duplication. This also lets you keep the inner classes with an internal access modifier, unless they are used by other projects.
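
For contrast, here is a minimal classic-style fixture over the same hypothetical A and B from the sketch above: no Test Doubles, so the single test of the root implicitly covers B as well.

```csharp
using NUnit.Framework;

[TestFixture]
public class TestFixtureA
{
    [Test]
    public void DoWork_ComputesTheResultThroughTheRealGraph()
    {
        // No Test Doubles: the real B participates in the test, so this
        // one test of the root A exercises B too.
        var a = new A(new B());

        Assert.AreEqual(7, a.DoWork(3));
    }
}
```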

Pros & Cons
Let’s describe the pros and cons of the mockist approach.
Pros
1. More TDD'ish – since all the classes the CUT depends on are mocked, you can start testing the CUT without implementing the classes it depends on. Think about the classic approach: when you come to test some CUT, you must first implement its dependencies, but before that you must implement their dependencies, and so forth.
2. High granularity – this means:
   a. Smaller test fixtures – one per class, unlike one per graph of classes in the classic approach.
   b. Smaller test setups – take a look at TestFixtureA in figure 2: the Arrange phase (of Arrange-Act-Assert) in tests like that is quite large, since it has to set up state for many classes in the graph. This is quite a crucial issue – the bigger the test, the bigger the risk of having bugs in the test itself.
   c. Frequent check-ins/commits – think about it: with the classic approach, your tests won't pass until all the classes the CUT depends on are implemented correctly; thus, the frequency of your check-ins (commits) drops dramatically (you don't want to commit red tests).
3. More alternatives for doing DI – take a look at figure 3: suppose you need to inject different concrete implementations of interface I into class C. With the mockist approach, which relies heavily on injection, the code that initializes class A also initializes class B and injects it into class A, and likewise initializes class C and injects it into class B. Therefore, it can easily inject a concrete implementation of interface I into class C (see figure 4, and the sketch after it). With the classic approach, on the other hand, the code that initializes class A has no access to the inner classes of the graph (B, C and I), so its only way to inject a concrete implementation of interface I into class C is through an IoC framework.
Figure 3

Figure 4

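To illustrate point 3, here is a rough sketch of the wiring described above (all names are hypothetical, following figures 3 and 4; ConcreteI stands for whichever implementation of I you want to inject):

```csharp
public interface I { }
public class ConcreteI : I { }

public class C
{
    private readonly I _i;
    public C(I i) => _i = i;   // I is injected into C
}

public class B
{
    private readonly C _c;
    public B(C c) => _c = c;   // C is injected into B
}

public class A
{
    private readonly B _b;
    public A(B b) => _b = b;   // B is injected into A
}

public static class CompositionRoot
{
    // The code that initializes A builds the whole chain, so swapping
    // the concrete implementation of I is a local, one-line change.
    public static A BuildA() => new A(new B(new C(new ConcreteI())));
}
```
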
Cons
1. Many more interfaces and injections – with the mockist approach, almost every class has at least one interface. In addition, there is a kind of injection inflation.
2. Weaker encapsulation – each class exposes its relations with the classes it depends on, both so that they can be injected into it and to allow behavior verification; this partly weakens encapsulation.
3. High vulnerability to refactoring – with the mockist approach, every change in the interaction between two classes requires changes in some tests, since the tests are usually aware of the interactions between classes (see behavior verification). With the classic approach, on the other hand, you usually do state verification, so the tests are not aware of the interactions between classes.

Conclusions:
I personally prefer the mockist approach, for one main reason: I cannot see how true TDD is possible without it.


Monday, October 28, 2013

Red-Green-Refactor

The most recommended way to implement TDD is to follow the Red-Green-Refactor path. In this post I would like to talk about the importance of the Red-Green steps.

Good tests
A good test is green whenever the UUT (Unit Under Test) is correct and is red whenever the UUT is incorrect.

Bad tests
There are 3 types of bad tests:
  1. Tests that are red when the UUT is correct and also red when the UUT is incorrect. Obviously, tests of this type will be discovered immediately, since they never pass.
  2. Tests that are red when the UUT is correct and green when the UUT is incorrect. This type is worse than the previous one, since it's not always detectable; if the UUT is incorrect (and thus the test is green...), it gives you false confidence that everything is OK.
  3. Tests that are green when the UUT is correct and also green when the UUT is incorrect. This type is at least as bad as the previous one. Tests like this are worthless.

The Red-Green-Refactor (RGR) path will most probably lead you to good tests. Why? If you follow the path, the first step is the Red step: you write your test before the UUT is implemented (and thus while it is still incorrect). This almost guarantees that your test is red when the UUT is incorrect. The second step is the Green step, in which you implement the UUT correctly and expect your test to turn green. This almost guarantees that your test is green when the UUT is correct.
Eventually this leads you, with a high degree of certainty, to a 'good test' as defined above (red when the UUT is incorrect and green when the UUT is correct).
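
As a tiny illustration, here is what the two steps look like with a hypothetical Calculator class and NUnit:

```csharp
using System;
using NUnit.Framework;

public class Calculator
{
    // Red step: the test below is written first, against this deliberately
    // unimplemented method, so the test starts out red.
    public int Add(int x, int y) => throw new NotImplementedException();

    // Green step: replace the body with the correct implementation,
    // e.g. public int Add(int x, int y) => x + y; and the test turns green.
}

[TestFixture]
public class CalculatorTests
{
    [Test]
    public void Add_ReturnsTheSumOfItsOperands()
    {
        Assert.AreEqual(5, new Calculator().Add(2, 3));
    }
}
```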


Remember: we're not talking about pure mathematics here; there will be times when you follow the RGR path and still end up with 'bad tests'. Yet, following this path will dramatically enhance the robustness of your tests.

Conclusions:
Many times, people tend to write their tests only after they have completed the UUT, thus skipping the Red step. This might lead them to bad tests of types 2 and 3, as described above.
My conclusion: follow the RGR path.

Monday, April 22, 2013

Referencing the internal members of an aggregate.

There is a lot of confusion around the Aggregate pattern [DDD], especially around the question of whether or not it's OK for an external object to reference an internal member of an aggregate.

First of all, let's see what the DDD book has to say about it:

"Choose one Entity to be the root of each Aggregate, and control all access to the objects inside the boundary through the root. Allow external objects to hold references to the root only. Transient references to internal members can be passed out for use within a single operation only. Because the root controls access, it can not be blindsided by changes to the internals."

This is a little confusing. On the one hand, the root controls all access to the internal members and cannot be blindsided by changes to the internals; on the other hand, transient references to internal members may be passed out. This means that an external object can mess with the state of the internals and thus blindside the AR (Aggregate Root)...
Sounds like a paradox, or is it?

Consider the following example: suppose we have a travel agency web site in which users can make flight reservations. A reasonable invariant would be: the total number of reserved seats for a flight cannot exceed the total number of seats on the plane. This is a true invariant, since breaking it will cause total chaos on the day of the flight. Imagine 60 people claiming their seats on a plane with only 50 seats...

To enforce this invariant we will probably have the following objects in one aggregate:

Here Flight is the AR

Our mission is to make sure that the state in the DB NEVER(!) violates this invariant.
There are several techniques to achieve that.

Tell, don't ask.

The first technique is to let the AR encapsulate all the internal members, so that every change to their state is done through it. In our example, Flight (which is the AR) will encapsulate Reservations. Flight will expose methods like ReserveSeats, UpdateReservation and CancelReservation, and thus will be able to enforce the invariant. This technique might work, but only if all the internal members of the aggregate are fully encapsulated. Unfortunately, it breaks the rule that transient references to internal members can be passed out for use within a single operation.

It used to be my favorite technique, but it's quite a pain in the ***. What if the aggregate consists of a deeper graph of objects? Eventually you will end up with an AR that has an endless list of methods whose sole purpose is to encapsulate every action on the internal members.
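
A minimal sketch of the technique, using the flight example (the C# shapes here are hypothetical, not from the DDD book):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Reservation
{
    public Reservation(string customer, int seats)
    {
        Customer = customer;
        Seats = seats;
    }

    public string Customer { get; }
    public int Seats { get; private set; }
}

public class Flight
{
    private readonly int _totalSeats;
    private readonly List<Reservation> _reservations = new List<Reservation>();

    public Flight(int totalSeats) => _totalSeats = totalSeats;

    private int ReservedSeats => _reservations.Sum(r => r.Seats);

    // Every change goes through the AR, so the invariant is enforced here.
    public void ReserveSeats(string customer, int seats)
    {
        if (ReservedSeats + seats > _totalSeats)
            throw new InvalidOperationException("Not enough seats on the plane.");
        _reservations.Add(new Reservation(customer, seats));
    }

    // ...plus UpdateReservation, CancelReservation, and one method per
    // operation on the internals. This is where the method list explodes.
}
```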

Brute force validation.

We need a different technique: one that allows external objects to hold transient references to internal members and, at the same time, does not blindside the AR. The one I prefer is what I call "brute force validation" (BFV). With this technique, you ask the AR to check all its invariants each time you are about to change its state in the DB. You will probably have a method on the AR called CheckInvariants, or something like that.

There are a few issues to consider with BFV. First of all, you MUST NOT forget to call CheckInvariants each time you save the AR to the DB. This means you need to find every place in the code where the AR is saved and invoke this method there. Ouch...
And what if some developer adds a new place in the code that saves the AR to the DB? If that developer forgets to call the CheckInvariants method, your DB will be corrupted...

Fortunately, the Repository pattern comes to the rescue. According to this pattern, each aggregate should have its own repository (usually named after the AR, e.g. FlightRepository). Each AR repository should have a method that saves the AR to the DB along with all its internals, and only them. The AR repository should also be the only place that saves the AR to the DB. This sounds like a good place to call the CheckInvariants method: inside the repository itself, right before the saving action.
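
A minimal sketch of that idea (the names are hypothetical, and the actual DB write is just a placeholder):

```csharp
public class FlightRepository
{
    // The one and only place where a Flight aggregate is persisted.
    public void Save(Flight flight)
    {
        // Brute force validation: the AR re-checks all of its invariants
        // right before its state is written to the DB.
        flight.CheckInvariants();   // assumed to throw if an invariant is violated

        PersistFlightWithInternals(flight);   // placeholder for the actual DB write
    }

    private void PersistFlightWithInternals(Flight flight)
    {
        // Save the Flight and all of its Reservations, and only them.
    }
}
```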

But there is another issue: what happens if an external object modifies one of the internals and then tries to save that internal directly to the DB? This would bypass the CheckInvariants method, which is located only on the AR. Actually, if you are following the Repository pattern correctly, you don't have to worry about it: repositories should only expose methods that save ARs, not regular entities, so this scenario is not possible.
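
In code, this constraint is simply the shape of the repository's public surface (again, a hypothetical sketch):

```csharp
using System;

public interface IFlightRepository
{
    Flight GetById(Guid flightId);
    void Save(Flight flight);   // saves the Flight together with its Reservations
}

// Deliberately, there is no IReservationRepository: the only path a
// Reservation has to the DB is through its Flight, so the CheckInvariants
// call inside Save cannot be bypassed.
```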

One last issue to consider. Imagine the following scenario: some AR holds a reference to an internal member of another AR. In our example, let's say that another AR, the User class, holds a list of all the user's Reservations. Those Reservations are also internal members of some Flights.


Is this possible?

What if some User object modifies a Reservation, and then that User object is sent to the UserRepository to be saved? According to the Repository pattern, this is all perfectly legal: a User is an AR and hence should have a repository that saves it.
But we still have a problem here: the modification to the Reservation object may violate some of the invariants of a Flight, and that Flight won't even know about it. Do not worry: if you're following the Aggregate pattern correctly, this scenario is also not possible. True, at some point a User object may hold a reference to some Reservation of some Flight, but this reference is transient, meaning the Reservation is not a member of User. Therefore, even if some User object modifies a Reservation and that User object is then saved to the DB, the Reservation will not be saved along with it.

An entity can be a member of only one AR

Conclusion

Brute force validation allows you to expose the AR's internal members (if needed) and still be confident that, even if some of the invariants are violated, the violations will be discovered before the AR is saved to the DB.