Using the DbFit Framework for Data Warehouse Regression Testing

It is ironic that the users of database application need to rely on the very technologists that created the system to then devise and run their acceptance tests. Surely someone has devised a test system for databases that is simple enough for ordinary tech-savvy people to use and for them to create the tests? Yes they have. Fitnesse DbFit is a mature product that can, and does, test SQL Server databases, and Nat Sundar explains how to set it up and do it.

How do you, or should you, automate database testing? If you have a technique for doing so, how can you extend this for testing a data warehouse? How easy is it to use? There are some established tools that are designed for testing databases, and the FitNesse framework is one of the best for user-acceptance tests. In this article I aim to demonstrate enough to get you started with using DbFit for regression-testing a SQL Server Data Warehouse.

DbFit for the Job?

DbFit is a member from FitNesse framework. FitNesse is an implementation of the Framework for Integrated Test (FIT) testing framework. DbFit is specially designed and developed for database testing.

In DbFit, the test cases are written in a simple text format. The DbFit test cases are designed to work with simple SQL queries and stored procedures.

There were six major reasons why we implemented DbFit within our DevOps environment. It was easy to use with our existing CI tools, the test cases were easy to write, we could extend it for our specialized requirements, DbFit worked smoothly with Excel, It was easy for the database developers to understand, and there was a good web-based GUI that everyone can use.

    1. Support for XCopy Deployment
      DbFit test-cases can be executed from a web interface or from a command line. Because of the command line support, it is very easy to integrate with the continuous integration tools for XCopy deployment.

 

    1. Wiki Test cases
      All the test cases are written in simple wiki format. It is easy to understand and aid for better communication between technical and non-technical stakeholders.

 

    1. Highly extensible framework
      The DbFit framework can be extended using java or .Net languages.

 

    1. Export and Import test cases using Excel
      The DbFit has good interoperability with Excel. The test cases can be written in Excel and imported into DbFit. Also DbFit does support to export the test cases as Excel sheets.

 

    1. Developer’s friend
      It enable the database developers to access the database objects in tabular format. This helps the database developers to adopt the framework easily.

 

  1. Web interface to manage test cases
    DbFit has a fantastic GUI (frontend) to manage the test cases. Test cases can be created and edited in the front end. The test cases can be executed as a single test case or a suite (collection of test cases).This GUI helps a tester or anyone to create test cases without the knowledge of scripting.

DbFit for regression test:

The DbFit can be used to regression-test both the database metadata and the data itself. Testing the metadata will disclose the unexpected changes in database objects such as tables, views or stored procedures, whereas testing the data will disclose the unexpected changes to the data coming from the ETL packages.

In this article we will concentrate about testing the data in the Data warehouse, and cover these areas in detail.

Installation & Configuration:

In this section I will explain how to install DbFit, and set up the necessary configuration on your local machine.

Pre-requisite:

We need Java Runtime Environment (JRE) 7 or higher to set-up DbFit. The JRE can be downloaded from Oracle website here.

Installation:

You can download DbFit from the web site here. Once the zip file has been downloaded then you can unzip the contents to a folder.

Contents of the zip file can be seen as per the image below

Starting DbFit Service:

The DbFit framework can be started by running the “startFitnesse.bat” file. It may take few minutes to start for the first time.

If, as a result, you can see this, it confirms that the DbFit service has been started successfully.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\DbFit_Startup.png

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\DbFit_Startup.png

Once the DbFit package is up and running, the service can be accessed using the URL http://localhost:8085/ in a browser as shown below.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\DbFit_Homepage.png

Stopping the service:

The DbFit service can be stopped by pressing Ctrl+C on the MSDOS command prompt. The prompt ‘Terminate Batch Job (Y/N)’, shown below, means that the service is about to be stopped.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\Stopping.png

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\Stopping.png

Connecting to SQL Server:

Because of licensing restrictions, the database drivers are not available with DbFit by default. Instead, we need to download them manually and then copy the files across to the DbFit folder.

Unfortunately, DbFit does not connect to SQL Server by default. We need to install the JDBC driver and configure it to use in the DbFit framework. These steps will help you to install the driver and connect to SQL server.

Installing JDBC driver:

The JDBC driver for SQL Server can be downloaded from Microsoft website here.

It is best to download the zip file “sqljdbc_6.0.8112.100_enu.tar.gz”. Once downloaded, unzip the contents to a folder.

You will be able to locate the JDBC driver file “sqljdbc42.jar” in the folder:

“..\sqljdbc_6.0.8112.100_enu.tar\sqljdbc_6.0.8112.100_enu\sqljdbc_6.0\enu\jre8”

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\JDBC_Driver.png

Now copy the JDBC driver file into the DbFit lib folder (..\DbFit-complete-3.2.0\lib), as shown in the image below.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\JDBC_Driver_Deployment.png

This has enabled us to connect the SQL server from DbFit, but if you would prefer to connect using the more secure Windows authentication, then you will need to follow one more additional configuration step, which I’ll now explain.

Setting up Windows authentication:

The assembly (sqljdbc_auth.dll) can be found on the folder “..\sqljdbc_6.0.8112.100_enu.tar\sqljdbc_6.0.8112.100_enu\sqljdbc_6.0\enu\auth\x64”.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\SQLAuth.png

…so copy this file to the JRE folder C:\Program Files (x86)\Java\jre1.8.0_121\bin as shown in the next screenshot.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\SQlAuth_Deployment.png

Setting up a sample test case:

Now that the DbFit package has been set up successfully, we will now create a sample test case.

Open http://localhost:8085/SampleSQLTest in your browser. You should see an editor where you can create and run your test case (page).

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\SampletestCase.png

This will create a blank test case as shown in the image below.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\SampletestCaseEdit.png

Test cases are created in a wiki markup which is a simple text format. The representation is very similar to the relational model. Hence you can create DbFit test cases without the help of programming language.

Test cases can be grouped logically in a test suite. DbFit expect you to follow Camel Casing when you give the name to a test case or a suit.

In addition, the test cases can be created in Excel and imported from Excel to DbFit interface using the button “Spreadsheet to FitNesse”.

Test cases are actually stored on the file system in a simple text format. You can create and edit them using a text editor such as Notepad or Notepad++.

The additional drivers for DbFit need to be loaded into FitNesse to support the execution of tests. This can be achieved by loading the relevant libraries in the test case using this command

The line below indicates that DbFit will try to connect to MSSQL Server database. This helps DbFit to use the correct driver to execute the test case.

More information can be found here for other RDBMS.

This line indicates the connection settings for the SQLServer

In DbFit, a SQL query can be executed using the Query command. The SQL query has to be defined after the command “!|Query” . We need to define the expected result set structure in the next line. Here, the result set will have only one column “X”. This is represented below.

The subsequent rows define the expected results. It is enough to specify just the columns that you’re interested within the result set. In this example, we are expecting only one value (“test”) that is shown below.

“|test|”

Now all we now need to do is to copy the complete test case shown below, and paste it in the text box to create a test case. I will be using the JDBC driver to connect to SQL server and use Windows authentication to connect.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\Sampletestcasewithcode.png

Once done, click the ‘save’ button to create a test case. Now the test page will look like this.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\TestCaseAfterCompletion.png

Now the test case can be executed by clicking the ‘Test’ button the header. The test case will, hopefully, be successful as shown in the image below.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\TestResultExecution1.png

Failure Case:

Now let’s update the test case with an invalid SQL expression (“SELECT GetDate1()”) So that the test case will fail. Here we can see a failure case.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\InvalidTestcase.png

This represents the completion of the test case, and the test has failed.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\TestResultCaseFailure.png

Command line execution:

The DbFit test cases can be run from the command line. This is great for any DevOps environment because this will help us to integrate the DbFit with the Continuous Integration (CI) tools.

This command line will help you to run the test page “SampleSQLTest” in DbFit

Here we can see the successful execution of the test case from the Command line.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\CommandLineExecution.png

 

Naming Convention for Test Case:

There are two simple rules to follow while creating a test page name in DbFit

  • The test page name has to start with an uppercase letter and should have one more uppercase letter in the name
  • All the uppercase letters have to be separated by at least one lowercase letter.

Sample Test case Hierarchy

The DbFit package supports the hierarchical classification of the test cases. This can be achieved by creating a test-suite page. A suite is a collection of related test cases (pages) in DbFit. A test suite will run all the test pages defined in the hierarchy and we can see the results of all test cases in a single page.

This will help us to access the overall test results of a specific system.

Creating a Suite:

A suite can be created by typing the name with the suffix “suite” in the browser as follows.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\CreatingSuit.png

Now we can see a suite “AdventureWorks” page has been created. Under this suite base-page, we can create the hierarchy of all the test cases.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\CreatingSuit1.png

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\CreatingSuit2.png

As I would like to set-up a hierarchy of the test cases, I am planning to create further child suites for Staging, Dimension and Fact. These suites will hold the relevant test cases. A child suite page can be created by clicking on the Add-> Suite page link.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\ChildSuitePage.png

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\CreatingDimensionTestSuite.png

Now, the dimension test suite has been created successfully. In addition, I have added few suites to create a hierarchy within the AdventureWorks suite as shown below.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\MultipleChildSuites.png

Setting up connection details in a file:

In the previous test case, we have mentioned the name of the server and the database in the test page itself. However the connection details will change over time and will vary from server to server. So it is a much better idea to maintain the connection details in a separate configuration file.

Now I have added the connection configuration file in the root folder.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\ConfigFileLocation.png

The contents of this connection configuration file is this.

Connection-string=jdbc:sqlserver://localhost\sql2012;integratedSecurity=true;

You can now use these stored connection configurations within a test script by using the Connect using file command with the URL of the configuration file, in this case …

I show how to reference this within a test case in the next section.

So far we have learned about setting up a test hierarchy in DbFit. Now we can apply this methodology to build a regression test for a Data Warehouse.

Regression Test for Data Warehouse:

Let’s assume we have developed a SSIS package to load the data from staging to the Data warehouse table “DimCurrency”. To enable us to regression test the data load to this table, we need to have another test table with the expected data. Let us assume that we have prepared the data in the test table “DimCurrency_Test”. Now we need to compare both the “DimCurrency” and the “DimCurrency_Test” table for each test iteration.

Test table with expected data : dbo.DimCurreny_Test

Target table (for SSIS load) : dbo.DimCurreny

After the successful completion of Data warehouse load, you would expect that the number of records between the DimCurrency and the DimCurrency_Test tables are the same.

The table structure and the data of the DimCurrency table can be seen below.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\DimCurrency.png

The table structure and the data of the DimCurrency_Test table can be seen below.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\DimCurrencyTestTable.png

Now we’ll create the “DimCurrencyTest” test case under the “Dimension” suite to compare the data between the target and the test table.

Test Case content:

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\ValidateDimCurrencyTestCase.png

Now we can see how DbFit can validate for different test cases.

Test Case 1: Both the target table and test table has got same data (and same number of records)

The screenshot below shows how DbFit confirms that we are able to successfully regression-test the DimCurrency table because we are able to successfully compare the data against DimCurrency_Test table.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\ValidateDimCurrencyTestCaseSuccessful.png

Test Case 2: Missing record in DimCurrency

Now let’s assume that the SSIS package has a defect and has missed a record to load in the DimCurreny table. To replicate this test case in the DimCurrency table, I will be deleting a record for the currency “CZK”.

I have deleted the currency using the SQL below.

Now let’s re run the test case to validate the results. The test case has correctly identified the missing record.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\MissingRecordsInDimCurrency.png

Test Case 3: Record with an incorrect column value

Now I have inserted a record to add the Czech currency. However I have deliberately mistyped the currency name as “Czech Dollar” instead of “Czech Koruna” using this SQL.

Now I would like the DbFit framework to compare the values and notify me the difference between the expected and actual columns values. Let’s run the test case to access the results.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\IncorrectColumnValue.png

The DbFit framework has correctly identified the incorrect column value and highlighted the expected and actual column values for comparison.

So far, we have created a test case for a dimension table. Now we’ll create a test case for the staging table and concentrate on creating a test hierarchy.

Testing the suite hierarchy:

Now I have created a “Staging” suite under the AdventureWorks suite to group all staging-related test cases.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\SuiteHierarchy.png

I have added two test cases to validate records in the Sales.Currency and Production.UnitMeasure staging tables. Now these test cases can be tested alone or these can be executed from the suite.

I can see what test cases are available under the “Staging” suite.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\StagingSuiteHierarchy.png

Now I can execute the staging suite to validate the results of all these test cases, by clicking on the “Suite” link.

The result of executing all of the tests in the staging suite with all the test cases will be shown as in this screenshot.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\StagingSuiteHierarchyExecution.png

In addition we have defined a parent test suite “AdventureWorksSuite” in the root.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\AdventureWorksSuiteHierarchy.png

If we then execute the suite, we will see the following test summaries.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\AdventureWorksSuiteHierarchyExecution.png

Executing a Suite from command line:

This command line will help you to run the suite “AdventureWorksSuite” in DbFit

Here we can see the successful execution of the suite from the Command line.

C:\WorkArea\Nat\SQL\Blog\SimpleTalk\DbFit\Images\TestingSuite.png

DbFit limitations

There are few noticeable limitations of DbFit . It is designed to test the accuracy of a system, but it is not suitable for performance or load testing because it is unable to compare more than 10000 rows: Also, user-defined database types are not supported.

Conclusion

One of the key aspect of building a regression test framework is to prepare good test data, and the experts in judging what the data should be are in the business rather than IT. It is impractical to have to use systems for tests that have to be devised by IT people if it is for the purposes of verifying a business process. This means providing a verification system that can be created and used directly by anyone without IT training, beyond using Excel and a Wiki. Preparing business-oriented test data is a challenging task, because the business requirements are highly volatile and change substantially over time. Once you have a reasonable supply of test data that is provided directly by the business, audit and accountancy experts, you can use DbFit to improve the quality, design and maintainability of your Data Warehousing application.

Reference links: