Monday, December 19, 2005

TDDing the sales report

Download: Sources

You're starting a new project. You are talking with the customer, defining a few user stories, to make a release plan, when he says the most important thing for him would be a weekly sales report. Your immediate reaction is to say: you can't have the sales report before, you actually can put data into the system, the data entry forms should be developed first. Then he says, no, I need that first, and the data needs to come from our old system, and I will get one of my IT employees to write a data transformation code that transfers the data into your system, so that I can have the report after the first iteration.

Ok, then...so lets do that first. After a few more discussions, you and the customer write down a few acceptance tests for the sales reports as:

I have, 3 orders from 2 customers (one makes two orders as in the first forgets something).

|| order id || product || qty || unit price || customer || order total ||
|| order #123 || xxy mobile phone || 2 || 200 || company #1 limited || 400 ||
|| order #123 || xxy mobile charger || 2 || 15 || company #1 limited || 430 ||
|| order #124 || xxy mobile charger || 1 || 15 || company #1 limited || 15 ||
|| order #125 || bluetooth adapter || 1 || 20 || company #2 limited || 20 ||
|| order #125 || xxy mobile charger || 3 || 15 || company #2 limited || 65 ||

so, seeing my sales report, I would like to see how much has been sold of each product I have (sorted by product):

|| product name || qty || total ||
|| bluetooth adapter || 1 || 20 ||
|| product #no name || 0 || 0 ||
|| xxy mobile charger || 6 || 90 ||
|| xxy mobile phone || 2 || 400 ||

If I also want to include the customer I would get:

|| product name || qty || total || customer ||
|| bluetooth adapter || 1 || 20 || company #2 limited ||
|| product #no name || 0 || 0 || - ||
|| xxy mobile charger || 3 || 45 || company #1 limited ||
|| xxy mobile charger || 3 || 45 || company #2 limited ||
|| xxy mobile phone || 2 || 200 || company #1 limited ||

Ok, now let's make it work. TDD with a database. Testing with a database is not considered unit testing (Michael Feathers [1] ), but there are situations when this kind of tests could prove very good to have.

Over the years I discovered that there are three techniques that can be applied when testing code that interacts with a database:
  • mock testing

  • rollback changes technique

  • reset database technique.

The best technique is to avoid direct interaction with the database, by using mock objects. That can be a very good technique in many cases, as mock objects are very fast, and quite easy to use, but in this case using mock objects, might mean that out tests could cover very little of our actual functionality, so we can leave mock objects for a better situation.

Now if we really need to touch the database, and the automated tests need to be independent from one another, it is important that the state (data) of a database is the same before running each tests. For this there are two techniques: either you recreate the database in a specific state before each test is run, or any eventual changes made by a test when being run are rolled back after its execution. For the latter technique, you could use database or application transactions that can facilitate it. See Roy Osherove's: ....

The reset database technique, can be done in several ways, however speed in TDD is essential as the faster you get feedback from you tests the faster you learn what decisions to take next, so the most common technique is an in memory database, like HypersonicSQL or Cloudscape for Java, or SqlLite or Firebird for everyone. However this technique can be just as misleading as the mocking of the database in some cases, as database engines are different, and you could end up with your tests working but deployed on the real server the code to fail. In order to avoid this, you can write your tests directly against the database engine you will deploy, sacrificing speed for safety.

In order to recreate a database's state, you can clean/re-create the database and populate it from code (using an ORM/ DataSets etc) or using a sql script that does all this. The second approach might prove faster but, it is even less maintainable then the first so great caution must be imposed. However, there is a tool, that can really help us with this latter technique, who's usage in this context will be proven today.

Back to work

In TDD the first step is to write a test, make it fail then make it work (red/green factor - see: ....). Let's write the test:

[TestFixture]
public class TestSalesReport
{
SalesReportDAO dao = new SalesReportDAO();

[SetUp]
public void ResetDatabase()
{
}

[Test]
public void TestProductsAndQtySold()
{
DataSet reportDataSet = dao.GetSalesReport();
DataTable report = reportDataSet.Tables[0];

Assert.AreEqual(4, report.Rows.Count, "we should have 4 rows");
}
}
As you can see we will need to add the code to the test that resets the database state. Now let's create the database and put data the test data into it, then using the amazing Tychotic Database Scripter [2] we will export the whole database state as a stored procedure that can be run in our test's setup.



and put the values in the acceptance test into it (OrderLines table):



Now let's export the script with the database scripter and create the stored procedure. The stored procedure's code is:


if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[SETUP]') and OBJECTPROPERTY(id, N'IsProcedure') = 1)
drop procedure [dbo].[SETUP]
GO

CREATE PROCEDURE [dbo].[SETUP]
AS

BEGIN

-- DELETES
DELETE FROM [dbo].[OrderLines];
DELETE FROM [dbo].[Orders];
DELETE FROM [dbo].[Customers];
DELETE FROM [dbo].[Products];

-- Products
SET IDENTITY_INSERT [dbo].[Products] ON;
INSERT INTO [dbo].[Products]
(
ID
,Name
,Price
)
SELECT 1,'xxy mobile phone',200.0000 UNION
SELECT 2,'xxy mobile charger',15.0000 UNION
SELECT 3,'bluetooth adapter ',20.0000 UNION
SELECT 4,'product #no name ',50.0000
SET IDENTITY_INSERT [dbo].[Products] OFF;

-- Customers
SET IDENTITY_INSERT [dbo].[Customers] ON;
INSERT INTO [dbo].[Customers]
(
ID
,Name
,Address
)
SELECT 1,'company #1 limited',NULL UNION
SELECT 2,'company #2 limited',NULL
SET IDENTITY_INSERT [dbo].[Customers] OFF;

-- Orders
SET IDENTITY_INSERT [dbo].[Orders] ON;
INSERT INTO [dbo].[Orders]
(
ID
,CustomerID
,OrderNo
,Total
)
SELECT 1,1,'order #123',430.0000 UNION
SELECT 2,1,'order #124',15.0000 UNION
SELECT 3,2,'order #125',65.0000
SET IDENTITY_INSERT [dbo].[Orders] OFF;

-- OrderLines
SET IDENTITY_INSERT [dbo].[OrderLines] ON;
INSERT INTO [dbo].[OrderLines]
(
ID
,OrderID
,ProductID
,UnitPrice
,Quantity
,Total
)
SELECT 1,1,1,200.0000,2,400.0000 UNION
SELECT 2,1,2,15.0000,2,30.0000 UNION
SELECT 3,2,2,15.0000,1,15.0000 UNION
SELECT 4,3,3,20.0000,1,20.0000 UNION
SELECT 5,3,2,15.0000,3,45.0000
SET IDENTITY_INSERT [dbo].[OrderLines] OFF;

END;
GO

As you can see the scripter preserves the original primary key values in the database, thus making testing easier. If you run it manually you will notice that it is very fast. Now it is time to add the code to our test that runs this stored procedure and resets the database state.

[TestFixture]
public class TestSalesReport
{
SalesReportDAO dao = new SalesReportDAO();


[SetUp]
public void ResetDatabase()
{
//we run the clean and populate with data, stored procedure
using (SqlConnection conn = new SqlConnection(ConfigurationSettings.AppSettings["connection_string"]))
{
SqlCommand command = new SqlCommand("SETUP", conn);
command.CommandType = CommandType.StoredProcedure;

conn.Open();
command.ExecuteNonQuery();
}
}


[Test]
public void TestProductsAndQtySold()
{
DataSet reportDataSet = dao.GetSalesReport();
DataTable report = reportDataSet.Tables[0];

Assert.AreEqual(4, report.Rows.Count, "we should have 4 rows");
Assert.AreEqual("bluetooth adapter", report.Rows[0]["product_name"].ToString().Trim(), "the first product should be a bluetooth adapter");
Assert.AreEqual(DBNull.Value, report.Rows[1]["qty"], "no product #no name has been sold");
Assert.AreEqual("90.0000", report.Rows[2]["total"].ToString(), "6 xxy mobile charger sold results in 90");
}
}

To make it compile we need:

public class SalesReportDAO
{
public DataSet GetSalesReport()
{
return null;
}
}

Now we need to see if our code fails as it should since the data is not extracted from the database at this time:

------ Test started: Assembly: TDDSalesReport.dll ------

TestCase 'TDDSalesReport.Tests.TestSalesReport.TestProductsAndQtySold' failed: System.NullReferenceException : Object reference not set to an instance of an object.
D:\Projects\Dan Bunea\TDDSalesReport\TDDSalesReport\TDDSalesReport\Tests\TestSalesReport.cs(37,0): at TDDSalesReport.Tests.TestSalesReport.TestProductsAndQtySold()


0 succeeded, 1 failed, 0 skipped, took 1.02 seconds.



---------------------- Done ----------------------

Running the code will make it fail, but if we alter the database's state then rerun the test we notice that it is in the previous state, so our stored procedure works. Now all we need to do is write the code to pass the test:

public class SalesReportDAO
{
public DataSet GetSalesReport()
{
DataSet results = new DataSet();
results.Tables.Add("SalesReport");
using (SqlConnection conn = new SqlConnection(ConfigurationSettings.AppSettings["connection_string"]))
{
SqlDataAdapter adapter = new SqlDataAdapter(GetCommandByProduct(), conn);
adapter.Fill(results.Tables[0]);
}
return results;
}

private string GetCommandByProduct()
{
StringBuilder builder = new StringBuilder();
builder.Append("select ");
builder.Append(" Products.Name as product_name,");
builder.Append(" Sum(OrderLines.Quantity) as qty,");
builder.Append(" Sum(OrderLines.Total) as total ");
builder.Append("from Products ");
builder.Append("left outer join OrderLines ");
builder.Append(" on Products.Id = OrderLines.ProductID ");
builder.Append("group by Products.Name");
return builder.ToString();
}
}

And running it:

------ Test started: Assembly: TDDSalesReport.dll ------


1 succeeded, 0 failed, 0 skipped, took 1.19 seconds.



---------------------- Done ----------------------

Excellent, we have written a report that is accompanied by an automated test that is independent and fast enough (unfortunately my computer is rather degenerated, but on a more powerful computer the times might be decreased dramatically).

Now we could improve o by adding a few more asserts but in my opinion asserts should be handled with care, as if their number increases, they become harder to maintain so there should always be a balance between the number of asserts and the coverage a test should do, as I tried to show above.

For the second test, exactly the same approach will be used, writing the test, making it compile, making it fail, and then developing the code to make it work as in any TDD scenario. So we start with this test:

[TestFixture]
public class TestSalesReport
{
SalesReportDAO dao = new SalesReportDAO();


[SetUp]
public void ResetDatabase()
{
//we run the clean and populate with data stored procedure
using (SqlConnection conn = new SqlConnection(ConfigurationSettings.AppSettings["connection_string"]))
{
SqlCommand command = new SqlCommand("SETUP", conn);
command.CommandType = CommandType.StoredProcedure;

conn.Open();
command.ExecuteNonQuery();
}
}


[Test]
public void TestProductsAndQtySold()
{
DataSet reportDataSet = dao.GetSalesReportByProduct();
DataTable report = reportDataSet.Tables[0];

Assert.AreEqual(4, report.Rows.Count, "we should have 4 rows");
Assert.AreEqual("bluetooth adapter", report.Rows[0]["product_name"].ToString().Trim(), "the first product should be a bluetooth adapter");
Assert.AreEqual(DBNull.Value, report.Rows[1]["qty"], "no product #no name has been sold");
Assert.AreEqual("90.0000", report.Rows[2]["total"].ToString(), "6 xxy mobile charger sold results in 90");
}

[Test]
public void TestProductsQtySoldAndCustomer()
{
DataSet reportDataSet = dao.GetSalesReportByProductAndCustomer();
DataTable report = reportDataSet.Tables[0];

Assert.AreEqual(5, report.Rows.Count, "we should have 5 rows");
Assert.AreEqual("bluetooth adapter", report.Rows[0]["product_name"].ToString().Trim(), "the first product should be a bluetooth adapter");
Assert.AreEqual(DBNull.Value, report.Rows[1]["qty"], "no product #no name has been sold");
Assert.AreEqual("45.0000", report.Rows[2]["total"].ToString(), "3 xxy mobile charger sold to company #1 limited results in 45");
Assert.AreEqual("company #2 limited", report.Rows[3]["customer"].ToString().Trim(), "company #2 limited bought these");
}
}

Doing a small refactoring to the names of the methods that extract the report data from the database. Of course it doesn’t compile, but we fix that:

public class SalesReportDAO
{
public DataSet GetSalesReportByProduct()
{
DataSet results = new DataSet();
results.Tables.Add("SalesReport");
using (SqlConnection conn = new SqlConnection(ConfigurationSettings.AppSettings["connection_string"]))
{
SqlDataAdapter adapter = new SqlDataAdapter(GetCommandByProduct(), conn);
adapter.Fill(results.Tables[0]);
}
return results;
}

public DataSet GetSalesReportByProductAndCustomer()
{
throw new Exception("The method or operation is not implemented.");
}

private string GetCommandByProduct()
{
StringBuilder builder = new StringBuilder();
builder.Append("select ");
builder.Append(" Products.Name as product_name,");
builder.Append(" Sum(OrderLines.Quantity) as qty,");
builder.Append(" Sum(OrderLines.Total) as total ");
builder.Append("from Products ");
builder.Append("left outer join OrderLines ");
builder.Append(" on Products.Id = OrderLines.ProductID ");
builder.Append("group by Products.Name");
return builder.ToString();
}

}

then we start by running all the tests and making sure the second test fails:

------ Test started: Assembly: TDDSalesReport.dll ------

TestCase 'TDDSalesReport.Tests.TestSalesReport.TestProductsQtySoldAndCustomer' failed: System.Exception : The method or operation is not implemented.
D:\Projects\Dan Bunea\TDDSalesReport\TDDSalesReport\TDDSalesReport\DataLayer\SalesReportDAO.cs(26,0): at TDDSalesReport.DataLayer.SalesReportDAO.GetSalesReportByProductAndCustomer()
D:\Projects\Dan Bunea\TDDSalesReport\TDDSalesReport\TDDSalesReport\Tests\TestSalesReport.cs(48,0): at TDDSalesReport.Tests.TestSalesReport.TestProductsQtySoldAndCustomer()


1 succeeded, 1 failed, 0 skipped, took 1.85 seconds.



---------------------- Done ----------------------

Now let’s write the proper code to make it run:

public class SalesReportDAO
{
public DataSet GetSalesReportByProduct()
{
return GetReportResults(GetCommandByProduct());
}



public DataSet GetSalesReportByProductAndCustomer()
{
return GetReportResults(GetCommandByProductAndCustomer());
}

private DataSet GetReportResults(string sql)
{
DataSet results = new DataSet();
results.Tables.Add("SalesReport");
using (SqlConnection conn = new SqlConnection(ConfigurationSettings.AppSettings["connection_string"]))
{
SqlDataAdapter adapter = new SqlDataAdapter(sql, conn);
adapter.Fill(results.Tables[0]);
}
return results;
}

private string GetCommandByProduct()
{
StringBuilder builder = new StringBuilder();
builder.Append("select ");
builder.Append(" Products.Name as product_name,");
builder.Append(" Sum(OrderLines.Quantity) as qty,");
builder.Append(" Sum(OrderLines.Total) as total ");
builder.Append("from Products ");
builder.Append("left outer join OrderLines ");
builder.Append(" on Products.Id = OrderLines.ProductID ");
builder.Append("group by Products.Name");
return builder.ToString();
}


private string GetCommandByProductAndCustomer()
{
StringBuilder builder = new StringBuilder();
builder.Append("select ");
builder.Append(" Products.Name as product_name, ");
builder.Append(" Sum(OrderLines.Quantity) as qty, ");
builder.Append(" Sum(OrderLines.Total) as total, ");
builder.Append(" Customers.Name as customer ");
builder.Append("from Products ");
builder.Append("left outer join OrderLines ");
builder.Append(" on Products.Id = OrderLines.ProductID ");
builder.Append("left outer join Orders ");
builder.Append(" on Orders.Id = OrderLines.OrderID ");
builder.Append("left outer join Customers ");
builder.Append(" on Orders.CustomerId = Customers.Id ");
builder.Append("group by Products.Name,Customers.Name ");
builder.Append("order by Products.Name ");
return builder.ToString();
}

As you may see several refactorings have been made but let’s see what happens when we run it:

------ Test started: Assembly: TDDSalesReport.dll ------


2 succeeded, 0 failed, 0 skipped, took 1.32 seconds.



---------------------- Done ----------------------


I guess, we can call our customer now (. Although it might not be the most optimized code, the code does what the customer wants, backed up by automated tests that prove it and if there is need for optimizations and refactorings it is always easier to do them protected by the automated tests then without.

Conclusion

By writing just a few lines of code and with the help of a brilliant tool we managed to TDD our sales report as requested by the client, fast and in a convenient way. However, there should be caution in using this technique, as the database script might become rapidly a test maintenance nightmare if the database structure changes, so having as few scripts like this as possible as well as combining this technique with other database testing techniques, will make this maintenance work easier.

[1] Michael Feathers – A Set of Unit Testing Rules http://www.artima.com/weblogs/viewpost.jsp?thread=126923
[2] Thycotic DatabaseScripter - http://www.thycotic.com/dotnet_dbscripter.html

2 comments:

Steve said...

For myself, I find a lot of advantage in having a database backend. However, I think that a relational technology is inappropriate when trying to establish a good DAO, and would begin by using an OODB such as db4o. This reduces cognitive dissonance, is pretty easy to get working, and allows the developer to focus on good design. Whether an OODB is eventually used by the system is an entirely separate concern.

Dan Bunea said...

Hi,

Usually we use an ORM, such as NHibernate or ActiveRecord, which as you say, allows us to focus on good design, not being all cluttered by database code. An example at: http://danbunea.blogspot.com/2007/01/test-first-web-applications-tdding.html

Thanks,
Dan