Introduction
You know you have to write unit tests. You have the tools, and your team is committed to the idea. Somehow, though, despite your best intentions, the tests just never get written. Or you write them, but never run them. They always fail anyway. For one reason or another, your project is resisting the will of the unit test. Usually, this means that your code needs a good solid refactoring, to make it more accepting of the way of the test.
Many people have observed that unit testing fits the mold of Heisenberg’s Uncertainty Principle, which is popularly taken to mean that, at the quantum level, the act of observing an event changes the nature of that event (strictly speaking, that is the related observer effect). Unit tests often have the same effect, but it extends beyond the runtime event being tested. It affects the design of the code that is to be tested, as well.
In this column, we’ll examine ten different ways you can change your code to make it ready to be tested. By doing so, you will be creating a more loosely coupled, flexible and transparent architecture, which will benefit you not only in testing but in documentation, maintenance and, eventually, modification. With a little forethought and some careful implementation, the unit tests can drive more than the QA process; they can help you design a more robust application in the first place.
Use Interfaces
Those of us who came through the COM days with our registries intact remember that most strident of admonitions: “it’s the interfaces, stupid”. To write COM code, you had to employ interfaces as the primary coupling mechanism. For the un- (or under-)initiated, interfaces are lightweight constructs which define the publicly accessible method signatures exposed by a class; they contain no implementation details whatsoever. Instantiated objects can be referred to by references of the interface type, but the interface can never be instantiated itself.
Here is a simple interface example: we’ll define an interface for interacting with different kinds of agents. The interface is called IAgent:
public interface IAgent
{
    string AgentName { get; set; }
    bool accomplishMission(TimeSpan timeLimit);
}
Agents all have a name, and can attempt to accomplish a mission in a given amount of time.
We’ll implement two different kinds of agents, SecretAgents and AirlineAgents. Airline agents perform their mission by getting people on planes, and secret agents have clandestine rendezvous. In addition, the SecretAgent, if in undercover mode, won’t reveal her name.
public class AirlineAgent : IAgent
{
    private string _name;

    public string AgentName
    {
        get { return _name; }
        set { _name = value; }
    }

    public bool accomplishMission(TimeSpan timeLimit)
    {
        // check in passengers
        // board passengers
        // prepare for departure
        // calculate time it took to complete mission
        // in this case, let's say 20 minutes
        TimeSpan totalTime = new TimeSpan(0, 0, 20, 0, 0);
        return totalTime < timeLimit;
    }
}

public class SecretAgent : IAgent
{
    private string _name;
    private bool _underCover = false;

    public string AgentName
    {
        get
        {
            if (!_underCover) return _name;
            return "";
        }
        set { _name = value; }
    }

    public bool accomplishMission(TimeSpan timeLimit)
    {
        // go to rendezvous
        // swap cash for package
        // return to base
        // calculate time it took to complete mission
        // in this case, let's say 2 days
        TimeSpan totalTime = new TimeSpan(2, 0, 0, 0, 0);
        return totalTime < timeLimit;
    }

    public void GoUndercover()
    {
        _underCover = true;
    }
}
In projects based entirely on concrete types, you may often hear complaints about the amount of time it takes to write the unit tests. This is due to the fact that test cases are meant to exercise the methods of a type; as long as each class is defined only by its own instantiable self, then unit tests are not reusable. There is a one-to-one correlation between the classes and the test cases.
Using interfaces, though, can allow you to reuse your unit tests. When you employ this architectural strategy, you are defining a common subset of available functionality that your concrete classes can implement. Unit tests, since they target methods on a type, can be written to exercise an implementation of a given interface. For every concrete type that implements that interface, the same test case can be used to exercise and verify the methods.
To test these different agents, we will want to run tests against the public interface IAgent. Tests for accomplishMission will be nearly identical across agent types: pass known values for timeLimit and check for true or false on the return. We need to create a specific unit test for each concrete type of agent in our application, but we don’t want to rewrite the test for accomplishMission each time. Instead, start by defining a base class for your unit test that can adequately test accomplishMission:
public class TestAgents
{
    protected IAgent agent;
    protected TimeSpan achievableLimit;
    protected TimeSpan impossibleLimit;

    [Test]
    public void testAccomplishMission()
    {
        Assert.IsTrue(agent.accomplishMission(achievableLimit));
        Assert.IsFalse(agent.accomplishMission(impossibleLimit));
    }
}
Notice that the class is not marked with [TestFixture] though it does contain at least one [Test]. TestAgents will never be instantiated directly by the NUnit test runner because you have not marked it as a TestFixture, which is precisely the behavior you want. (Later NUnit releases also pick up unmarked classes that contain [Test] methods; there, declaring the base class abstract should give the same effect.) The test method itself relies on protected fields that will be filled in by the derived tests.
[TestFixture]
public class TestAirlineAgent : TestAgents
{
    [SetUp]
    public void setUp()
    {
        agent = new AirlineAgent();
        // 45 minutes
        achievableLimit = new TimeSpan(0, 0, 45, 0, 0);
        // 10 minutes
        impossibleLimit = new TimeSpan(0, 0, 10, 0, 0);
    }
}

[TestFixture]
public class TestSecretAgent : TestAgents
{
    [SetUp]
    public void setUp()
    {
        agent = new SecretAgent();
        agent.AgentName = "Jeffrey Smithers";
        // five days
        achievableLimit = new TimeSpan(5, 0, 0, 0, 0);
        // one day
        impossibleLimit = new TimeSpan(1, 0, 0, 0, 0);
    }

    [Test]
    public void TestSecretAgentName()
    {
        Assert.AreEqual("Jeffrey Smithers", agent.AgentName);
        SecretAgent secretAgent = (SecretAgent)agent;
        secretAgent.GoUndercover();
        Assert.AreEqual("", agent.AgentName);
    }
}
The test for AirlineAgent is only testing the accomplishMission method, since AirlineAgent currently has no other functionality. SecretAgent, on the other hand, has the extra ability to not return its name if it is undercover. The TestSecretAgent class therefore has an extra [Test] method to cover this. The code that actually exercises accomplishMission, though, is written only once, and can be maintained and upgraded in a single place.
Define a base test class
To create a test case for NUnit, you need only define a class and mark it with the TestFixture attribute from the NUnit Framework assembly. This designation provides a marker for the test runner to identify your test cases and run them. It provides no other functionality for your class, and since it is only metadata, your class does not inherit any meaningful functionality or data from it.
However, you will find very quickly that there are techniques and strategies you employ again and again for your unit tests. It might be that you have to perform a task to set up a test data store, and run it often to clean out cruft from earlier tests, or that you have to create or destroy session information to enable proper test values. These types of repetitive activities cry out to be centrally located so that you can eliminate the extra work of typing them again and again. In keeping with the principle of “don’t repeat yourself”, centralizing them is, in fact, imperative.
The solution is to create a base test class that your specific tests can derive from. This base class will host a series of utility methods and data structures that you can call on from any individual test. In addition, if all (or even most) of your tests require some common code in setUp and tearDown, you can implement those methods on the new base class. When your individual tests require the functionality, they can call up the chain and invoke it.
For example, with our agents application, perhaps we want our unit tests to be able to validate that the agent’s name matches a certain pattern (no numbers, or if numbers, starts with “00”, or something). We would make a base class for our tests that exposes an isValidNameString method:
public class AgentTestBase
{
    public bool isValidNameString(string name)
    {
        // implement regular expression
        // or other logic to test for name validity
        throw new System.NotImplementedException();
    }
}
Simply have your other tests derive from this one. In the case of our example above, we’ll have TestAgents derive from AgentTestBase, thus providing our common testing functionality to all our current tests.
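As a rough illustration, the name check might be filled in with a regular expression. The exact rule here is an assumption on our part (the text above only gestures at one): a name is either made up of letters, spaces, periods, apostrophes and hyphens, or it is a numeric code name that starts with “00”. Adjust the pattern to whatever your application actually requires.

```csharp
using System.Text.RegularExpressions;

public class AgentTestBase
{
    // Assumed rule, for illustration only:
    //   - a code name: "00" followed by one or more digits, or
    //   - an ordinary name: starts with a letter, then letters,
    //     spaces, periods, apostrophes or hyphens (no digits).
    private static readonly Regex ValidName =
        new Regex(@"^(00\d+|[A-Za-z][A-Za-z .'\-]*)$");

    public bool isValidNameString(string name)
    {
        // a missing name is never valid
        return name != null && ValidName.IsMatch(name);
    }
}
```

Any test deriving from AgentTestBase can then write Assert.IsTrue(isValidNameString(agent.AgentName)) without repeating the pattern.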
Another favorite technique of mine is to make my mock objects be internal classes in the base test class. This is especially useful when I only have a few mock objects, and most of my tests can share the same versions. See “Mock the data layer to test the business logic” below for an example of a mock object.
Do not decorate the base class as a TestFixture. This just clutters your test runs with ignored tests (since it won’t have any actual [Test] methods on it).
As much as feasible, make everything return a value
This obviously isn’t a hard-and-fast, all-or-nothing rule, but as much as possible, avoid void methods and sub procedures. Unit tests are easier to write when the primary value to test is the return from a method. When a unit test has to scurry off to the database or another object or a text file to verify a simple method, then not only have you created more work for your unit test but you are also testing more than just your business logic; you are testing the infrastructure that leads from your business logic to that external data store. You are additionally adding the exact same infrastructure into your unit test, and if there is a problem, it will be a problem in the test as well as the thing being tested.
A method that returns the logical outcome of its processing can clearly enunciate its view of the world to its client, whether that be a user, other process or unit test. It may or may not reflect the true state of things, since obviously the full state of the application will probably be based on more than the code in this single method, but it will reflect what the method thinks is the state of things, which is what you should be unit testing. Testing the full application state is a job for integration testing. (For more on integration testing, see last month’s column on FitNesse).
Say that AirlineAgents have to be able to submit a timecard at the end of the day. The business rules for the application state that the agent should create the timecard, then save it to the database. Such a method might normally be written as:
public void submitTimeCard()
{
    // create timecard
    string timecard = string.Format("Agent Name: {0}, Date: {1}, Hours: {2}",
        _name, DateTime.Today.ToShortDateString(), "8");

    // save to db
    SqlConnection conn = new SqlConnection("MY_CONNECTION_STRING");
    SqlCommand comm = new SqlCommand("SOME_SQL");
    // etc.
    // ...
}
To test that, you would have to go to the database and look up the timecard and verify its format. This is open to a number of problems, and isn’t really testing the code in your method; it is testing a complex collection of activity, most of which won’t have anything to do with the agent’s ability to create the timecard.
A better way to write this method is:
public string submitTimeCard()
{
    // create timecard
    string timecard = string.Format("Agent Name: {0}, Date: {1}, Hours: {2}",
        _name, DateTime.Today.ToShortDateString(), "8");

    // save to db
    SqlConnection conn = new SqlConnection("MY_CONNECTION_STRING");
    SqlCommand comm = new SqlCommand("SOME_SQL");
    // etc.
    // ...

    return timecard;
}
Now, you can write a unit test that calls the method then verifies that the string returned matches the known state of the agent and the expected format of the document. You are not testing the database code, since that is orthogonal to the problem at hand.
So what happens if your method has more than one logical return value? If your architecture already calls for this, you will already have devised an answer to the problem. But what if you are trying to convert a void method to one with a return value (or values) just for the benefit of your unit tests? In this case, you really have three options:
- pick the most relevant value and return it and nothing else
- pick the most relevant value and return it as the return value of the method, but make the other values REF parameters
- package all the values into a strongly typed data transfer object, which is just a fancy description of a struct
Avoid option #2 as much as possible unless your architecture specifically calls for it. Using REF parameters adds complexity to your application, and your unit tests should not force you to add complexity to your code. All of the other suggestions given in this article actually reduce complexity, which is the key to testable code.
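The third option might look something like the sketch below. The names TimeCardResult and TimeCardBuilder are hypothetical, invented only for this illustration, and the choice of which values to bundle is an assumption; the point is simply that one strongly typed return value can carry everything the test wants to inspect.

```csharp
using System;

// Hypothetical data transfer object: bundles every value the caller
// (or the unit test) might want to check after building a timecard.
public struct TimeCardResult
{
    public readonly string Text;
    public readonly int Hours;
    public readonly DateTime Date;

    public TimeCardResult(string text, int hours, DateTime date)
    {
        Text = text;
        Hours = hours;
        Date = date;
    }
}

// Hypothetical helper that produces the DTO instead of a bare string.
public class TimeCardBuilder
{
    public static TimeCardResult Build(string agentName, int hours, DateTime date)
    {
        string text = string.Format("Agent Name: {0}, Date: {1}, Hours: {2}",
            agentName, date.ToShortDateString(), hours);
        return new TimeCardResult(text, hours, date);
    }
}
```

A test can now assert on result.Hours or result.Date directly rather than parsing them back out of the formatted string, and no ref parameters are needed.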
Separate data access from business logic
Following from the last suggestion, make sure that your data access code is separated into its own layer. You might be using direct ODBC calls, or typed datasets, or an O/R mapping tool like NHibernate, but whichever method you use, make sure it lives in its own namespace and object model.
If your business objects are focused on the actual problem domain (rather than the common, and already solved, problem of storing data) then it is much easier to test whether or not you are solving the real problem. In addition, you can now implement a series of unit tests that exercise the data storage infrastructure for your application without jumping through hoops to ensure that the data you are storing is correctly formatted.
Instead of writing the timecard directly to the database inside of AirlineAgent, as in our example above, you would write a new data access class that handles saving and loading timecards.
public class TimeCardDA
{
    // virtual so that a test double can override it later
    public virtual int saveTimeCard(string timecard)
    {
        // write timecard to database
        // return new row identifier for new timecard
        // throw exception if there was an error
    }

    // virtual so that a test double can override it later
    public virtual string loadTimeCard(int id)
    {
        // read timecard from database
        // return timecard as string
    }
}
The method submitTimeCard now becomes much cleaner:
public string submitTimeCard()
{
    // create timecard
    string timecard = string.Format("Agent Name: {0}, Date: {1}, Hours: {2}",
        _name, DateTime.Today.ToShortDateString(), "8");

    // save to db
    TimeCardDA tcda = new TimeCardDA();
    tcda.saveTimeCard(timecard);
    // ...

    return timecard;
}
In addition, now you can write unit tests that exercise just the data access layer.
[TestFixture]
public class TestTimeCardDA : AgentTestBase
{
    private TimeCardDA tcda;
    private const int goodTCID = 59;
    private const int badTCID = -1;

    [SetUp]
    public void setup()
    {
        tcda = new TimeCardDA();
    }

    [Test]
    public void testLoadTC()
    {
        string timecard = tcda.loadTimeCard(goodTCID);
        Assert.IsTrue(timecard.Length > 0);
    }

    [Test]
    [ExpectedException(typeof(SqlException))]
    public void testLoadBadTC()
    {
        string timecard = tcda.loadTimeCard(badTCID);
    }

    // etc.
}
Mock the data layer to test the business logic
When you are writing the unit tests that target the domain logic, mock the data access layer to ensure the separation of concerns. Separating the data access code into its own namespace is not enough: if the business logic is still inseparably dependent on it, you have the same problem as before, because testing the business logic depends on the data access logic succeeding.
You can either choose to create your own mock data access classes, or employ a mock object framework like NMock. We’ll save mock-object frameworks for another column. Today, let’s just create a mock version of TimeCardDA that throws exceptions when it is given a well-known input.
public class MockTimeCardDA : TimeCardDA
{
    public const string badtimecard = "BADTIMECARD";
    public const int newTCID = 60;

    // note: overriding requires saveTimeCard and loadTimeCard
    // to be declared virtual on TimeCardDA
    public override int saveTimeCard(string timecard)
    {
        if (timecard.Equals(badtimecard))
            throw new Exception("BAD TIME CARD");
        return newTCID;
    }

    public override string loadTimeCard(int id)
    {
        if (id < 0)
            throw new Exception("BAD TIME CARD ID");
        return string.Format("Agent Name: {0}, Date: {1}, Hours: {2}",
            "John Doe", DateTime.Today.ToShortDateString(), "8");
    }
}
This mock version of the data access defines some well-known error-inducing inputs. If the input is anything other than one of those defined values, the methods will return something resembling a real value. When a well-known bad input is given, the method will throw.
Make use of configuration
If you have followed the advice given so far, then you have an architecture based on interface implementation, with a variety of concrete implementations of your code: multiple domain objects implementing the same interface, separated domain and persistence layers, and a variety of mock or test objects that can be used in place of the real thing. In order to make full use of all of this, you need to take advantage of the configuration abilities of .NET.
Let’s examine the case of the persistence layer. For normal use, your domain model will rely on the real data access layer to perform its data storage. During testing, though, you will want to use the mock persistence layer. If your domain model is tightly coupled to the real persistence layer, then you will not be able to easily replace it at test-time.
Instead of making the domain model directly coupled to the persistence model, employ the indirection pattern. Create some kind of broker or router that your domain model relies on to provide concrete implementations of the persistence layer. This broker will look up the classes that are needed for the storage operations and create instances of those classes reflectively, based on the values in the configuration file. When the application is deployed, it will have a configuration file pointing to all of your real persistence classes. The test environment, though, should have its own copy of the configuration that points to your mock data object layer.
For our agents application, we’ll want to have a way to get the real version of the TimeCardDA in a deployment scenario, but the mock version for unit testing. First, let’s add a configuration element to the .config file for our application:
<appSettings>
    <add key="TimeCardDA" value="tensteps.TimeCardDA"/>
</appSettings>
Next, we’ll modify TimeCardDA so that you cannot instantiate it directly. Instead, we’ll use the factory pattern (calling a static method on the class to return a new instance). The static method will look in the configuration file to determine whether to return the real or mock object.
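using System.Reflection;

public class TimeCardDA
{
    protected TimeCardDA() { }

    public static TimeCardDA newInstance()
    {
        // the type name comes from the config file:
        // the real class in deployment, the mock under test
        string version = System.Configuration.ConfigurationSettings.AppSettings["TimeCardDA"];

        // NonPublic is needed so the protected constructor above can be
        // invoked; this assumes both classes live in the same assembly
        return (TimeCardDA)typeof(TimeCardDA).Assembly.CreateInstance(
            version, false,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
            null, null, null, null);
    }

    // etc.
}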
Finally, we need only change our code to use the new factory method instead of the direct constructor.
public string submitTimeCard()
{
    // create timecard
    string timecard = string.Format("Agent Name: {0}, Date: {1}, Hours: {2}",
        _name, DateTime.Today.ToShortDateString(), "8");

    // save to db
    TimeCardDA tcda = TimeCardDA.newInstance();
    tcda.saveTimeCard(timecard);
    // etc.

    return timecard;
}
Now, whenever we run the unit test suite, we make sure that the value of the TimeCardDA key in the config file is set to the mock object instead of the real, and our unit tests will test only the logic of the business model.
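With the mock configured, a test of the business logic might look like the sketch below. It assumes the AirlineAgent class from this column (with the submitTimeCard method that returns the timecard string), a test-time config file whose TimeCardDA key names the mock class, and a reference to the NUnit framework assembly. The fixture name and the agent name “Pat Jones” are made up for the example.

```csharp
using System;
using NUnit.Framework;

[TestFixture]
public class TestSubmitTimeCard
{
    [Test]
    public void testSubmitTimeCardFormat()
    {
        // AirlineAgent is the class defined earlier in this column
        AirlineAgent agent = new AirlineAgent();
        agent.AgentName = "Pat Jones";

        // build the expected document from the known state of the agent
        string expected = string.Format("Agent Name: {0}, Date: {1}, Hours: {2}",
            "Pat Jones", DateTime.Today.ToShortDateString(), "8");

        // only the returned string is checked; the save goes to the mock
        Assert.AreEqual(expected, agent.submitTimeCard());
    }
}
```

Nothing in the test touches a database, so a failure here points at the timecard logic itself rather than at the infrastructure behind it.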
You could, instead, implement a method or property on the domain objects that tells them whether to use the real or the mock object.
[TestFixture]
public class TestAirlineAgent : TestAgents
{
    [SetUp]
    public void setUp()
    {
        agent = new AirlineAgent();
        agent.useMockDA();
    }

    // etc.
}
This makes it very explicit from the test code’s perspective what is being tested; nobody reading your tests will miss the fact that the data generated during the test will be sent off to some mock objects instead of the real thing. However, it means that your domain objects have to implement some code that is ONLY useful during testing, and this is usually a bad idea. Your domain objects should be devoted to the single purpose of solving your business problem; extraneous testing-specific code at best clutters the interface, and at worst, can have unintended ripple-effects throughout the class and possibly the rest of the code.
Make Your Classes Do Only One Thing
This is a fairly common design principle that deserves its own point. When you write a class, it is the domain model representation of some idea. Too often, we clutter these classes with code that is only tangentially related to the idea. In “Separate data access from business logic”, we examined one of the many ways that programmers clutter their code with extraneous functionality.
The rule of thumb is that there should only ever be one reason for you to modify a class. If you find yourself going back to the class to make changes again and again as different requirements change, then the class is probably overcrowded. When this happens, determine what the class was intended to do, and refactor everything else into one or more other classes. If our SecretAgent class was spending too much time in file i/o, or the AirlineAgent kept looking things up via LDAP, then we would want to remove that non-core logic and put it in another class dedicated to that kind of task.
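To make the idea concrete, here is a minimal sketch of pulling file output out of an agent class. MissionReportWriter and its semicolon-separated format are hypothetical, invented for this illustration; the agent would hand its results to the writer rather than opening files itself.

```csharp
using System;
using System.IO;

// Hypothetical helper: owns the report format and the output stream,
// so the agent classes no longer need to know anything about file i/o.
public class MissionReportWriter
{
    private readonly TextWriter _output;

    public MissionReportWriter(TextWriter output)
    {
        _output = output;
    }

    // Writes one line per mission: name;OK|FAILED;minutes elapsed
    public void WriteReport(string agentName, bool succeeded, TimeSpan elapsed)
    {
        _output.WriteLine("{0};{1};{2}",
            agentName, succeeded ? "OK" : "FAILED", elapsed.TotalMinutes);
    }
}
```

Because the writer accepts any TextWriter, a test can pass in a StringWriter and inspect the output in memory, while production code passes a StreamWriter over a real file. A change to the report format now touches only this class.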
Have Domain Object Factories
Often, the unit of functionality that you are testing is dependent on other parts of your domain model. In general, you will not want to use mock objects to impersonate the objects your functionality relies on. Mock objects are generally used to impersonate external objects from third-party libraries whose internal state you can neither control nor directly observe.
Your domain objects might have very convoluted construction requirements. If the object needs more than a single call to a simple constructor to be initialized into a state that is useful for your test, you should think about creating a test factory for that class. The test factory should provide a static method for returning an instance of the class in a ready state for use in your tests. On top of that, it should provide some static constants defining known property values that can be used to verify the state during your tests.
For example, we might need to create a series of SecretAgents for testing. We will want to create them in known states with known values. Here is a sample factory for SecretAgents:
public class TestSecretAgentFactory
{
    // const members are implicitly static in C#
    public const string GOOD_AGENT_NAME = "Jeffrey Smithers";
    public const string BAD_AGENT_NAME = "!@#$";

    public static IAgent CreateGoodAgent()
    {
        SecretAgent agent = new SecretAgent();
        agent.AgentName = GOOD_AGENT_NAME;
        return agent;
    }

    public static IAgent CreateBadAgent()
    {
        SecretAgent agent = new SecretAgent();
        agent.AgentName = BAD_AGENT_NAME;
        return agent;
    }

    public static IAgent CreateUndercoverAgent()
    {
        SecretAgent agent = new SecretAgent();
        agent.AgentName = GOOD_AGENT_NAME;
        agent.GoUndercover();
        return agent;
    }
}
Whenever one of your tests needs to introduce a SecretAgent into the test, perhaps passing one or more into a method that operates on a collection of IAgents, then you can use the factory to create them. Moreover, when you are handed back a reference to an IAgent in the course of a test, you can compare its data values against known values on the test factory.
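A test that leans on the factory might be sketched like this. It assumes the IAgent interface and the TestSecretAgentFactory class from this column, plus a reference to the NUnit framework assembly; the fixture and method names are made up for the example.

```csharp
using NUnit.Framework;

[TestFixture]
public class TestFactoryAgents
{
    [Test]
    public void testAgentsMatchKnownValues()
    {
        // agents arrive in known states, with no setup code in the test
        IAgent good = TestSecretAgentFactory.CreateGoodAgent();
        IAgent bad = TestSecretAgentFactory.CreateBadAgent();
        IAgent hidden = TestSecretAgentFactory.CreateUndercoverAgent();

        // compare against the constants published by the factory
        Assert.AreEqual(TestSecretAgentFactory.GOOD_AGENT_NAME, good.AgentName);
        Assert.AreEqual(TestSecretAgentFactory.BAD_AGENT_NAME, bad.AgentName);

        // an undercover agent withholds her name
        Assert.AreEqual("", hidden.AgentName);
    }
}
```

If the construction rules for SecretAgent change, only the factory needs updating; tests written this way keep working unchanged.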
Think carefully about packaging, assemblies and namespaces
Eventually, almost all “enterprise level” applications will grow to the point that running the entire suite of unit tests becomes a severe burden on the development team. Since running unit tests often is one of the central tenets of agile development (and it really is a good idea) this problem can quickly lead to fewer runs of the tests (and at worst, fewer tests written).
One way to avoid this problem is to plan your application for natural divisions of the codebase. Instead of allowing everything to be part of one monolithic assembly, the application should be built around the idea of multiple, interrelated assemblies. Your tests should follow the same architectural separation. When you work on a specific piece of the application, you can focus on running the tests associated with the assembly you are knee-deep in, and ignore the rest.
Don’t be afraid of multiply-nested namespaces. If your application naturally breaks down in a nested tree structure, then let it. The only strong rule for breaking your application up into multiple assemblies and namespaces is to be careful not to overly entwine them. You generally only want dependencies running one direction between any two packages. If classes in one assembly are dependent on classes in a second, the reverse should not be true. This will allow you to replace entire assemblies more easily (for instance, if all your persistence code lives in its own assembly, you could swap it out for a new assembly that targets a different database, or one composed entirely of mock objects).
Pick a logging strategy early
Even given the earlier recommendation to have your methods return values, sometimes you can’t test everything you need to just by examining the return from a method. Since unit tests are usually run from outside your development environment, step-through debugging during testing is difficult. In truth, most unit tests are run as batches anyway, so even if you could step into the code, you won’t be sitting there to do it.
What you need is an external store of application state that you can examine and correlate back to the test results. You need a logging strategy. It is vital to implement your chosen logging strategy as early as possible in the development effort, so that you don’t have to layer it back into existing code later.
Beyond that, logging will be enormously beneficial to your application post-deployment. Since bugs are difficult enough to trace when you have a full development environment available to you, tracing bugs at the customer site is next to impossible. Detailed logging allows you, and your customer, to get a detailed look at application state in both a real-time and historical perspective.
You may choose to use the built-in Trace mechanism in the FCL, or move to an external tool like log4net. Regardless, make sure you learn the basics of categorizing your messages, sorting by priority and, most importantly, routing them to different output stores. The console is a great place to get realtime logging information, but what happens when your unit tests run as a batch overnight? Where is the console tomorrow? You need to be able to store the log information in files or data tables for retrieval and examination later.
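As a small sketch of routing with the built-in mechanism, the helper below sends Trace output to a file. Trace and TextWriterTraceListener are part of System.Diagnostics; the TestLogging class and its method names are hypothetical, invented for this illustration. Note that calls to Trace are compiled out unless the TRACE symbol is defined, which is why the example defines it explicitly.

```csharp
#define TRACE // Trace calls are dropped by the compiler unless TRACE is defined

using System.Diagnostics;

public class TestLogging
{
    // Hypothetical helper: routes Trace output to a file so that an
    // overnight batch run leaves a record behind.
    public static TextWriterTraceListener AttachFileLog(string path)
    {
        TextWriterTraceListener listener = new TextWriterTraceListener(path);
        Trace.Listeners.Add(listener);
        Trace.AutoFlush = true; // flush after every write
        return listener;
    }

    // Hypothetical helper: tags each message with a category
    // so the log can be filtered or sorted later.
    public static void Log(string category, string message)
    {
        Trace.WriteLine(message, category);
    }

    // Detaches and closes the listener so the file can be read or archived.
    public static void DetachFileLog(TextWriterTraceListener listener)
    {
        Trace.Listeners.Remove(listener);
        listener.Close();
    }
}
```

A test fixture could attach the file log in its setup, detach it in its teardown, and leave the console listener in place for interactive runs, so the same messages reach both destinations.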
Summary
This is by no means an exhaustive treatment of the testability of applications. Entire articles can be written just about the testability of user interfaces and data access layers. Instead, these ten suggestions form a good starting point for re-examining the assumptions inherent in your design, and thinking through the decisions that will affect your code’s testability. Since testability determines how well you can verify your application, it should follow that a testable application is a better application.
Authors
Justin Gehtland is a founding member of Relevance, LLC, a consultant group dedicated to elevating the practice of software development. He is the co-author of Windows Forms Programming in Visual Basic .NET (Addison Wesley, 2003) and Effective Visual Basic (Addison Wesley, 2001). Justin is an industry speaker, and instructor with DevelopMentor in the .NET curriculum.