Some NHibernate Best-Practices

NHibernate can help to build a project more quickly if the database it connects to is sound. Otherwise, there are bound to be problems, but they are not problems of NHibernate's making. Nick Harrison gives some advice, and suggests some 'best-practices' for using it, fresh from the reality of developing an application based on NHibernate.

Background

Building line-of-business applications is complicated. You have to keep track of shifting business requirements, aggressive deadlines, and internal politics. Once you’ve added security, performance, and scalability to the list, you will be able to see why so many projects fail and why, even where they are delivered, important items such as usability are side-stepped.

To succeed, we need to grasp anything we can find that promises an advantage. We often look hopefully to new abstractions. Moving from assembler to compiled code brought substantial gains. Object-oriented languages, RAD tools, and 4GLs all brought their own snakes and ladders to the game of application development.

SQL itself is a wonderful abstraction and a 4GL in its own right. ORM tools, such as NHibernate, seek to build on these abstractions to simplify the work of putting together a line-of-business application.

NHibernate promises to shield us from the complexity and tedium of writing SQL statements, in much the same way that C was designed to shield us from the complexity and tedium of manipulating registers.

There are tradeoffs with any abstraction, and sometimes they leak. Sometimes the abstractions create more problems than they solve. Sticking to best practices helps us to avoid the areas that are most prone to problems. It also puts us in a better position to judge whether the abstraction actually helps to deliver the application at all.

Helping to Keep Murphy at Bay

Sometimes bad things happen to even the best software, but we can take steps when using NHibernate to prevent, or at least avoid, the worst catastrophes. We need to follow a few ‘best practices’ to help improve the odds of success. These are:

  • Remember that NHibernate is not a silver bullet
  • Do not use HQL, use ICriteria or LINQ syntax
  • Avoid implicit transactions
  • Avoid Select N+1
  • Periodically check the performance with a good profiler
  • Use Fluent Mapping

There are no doubt additional best practices, but this makes a good place to start.

Not a Silver Bullet

NHibernate is not a silver bullet that can slaughter the bugbears that lurk in your database. It will not make fundamental problems in the database magically expire. It isn’t an alternative to refactoring the database. If anything, these problems will be exacerbated with a tool like NHibernate. If you work on solving these fundamental problems first, before using NHibernate, then you will simplify the transition to an ORM.

The following best practices hold true whether you are using an ORM or not. They are mostly database best practices that should be followed regardless of your development environment, and a tool like NHibernate does not diminish the need for them.

Create Foreign Key Constraints for every Foreign Key relationship used

If you do not have foreign key constraints defined, then you must define them before attempting to implement an ORM tool. Chances are that the lack of foreign key constraints is masking underlying issues with the integrity of the data. While NHibernate will work without every foreign key being defined with a constraint, leaving them out makes your database vulnerable to integrity violations, and it can slow the database down because the query optimizer uses this information when determining the best query strategy. If your data model cannot survive the process of adding foreign keys and constraints, you have more fundamental problems that NHibernate will not help with. On the contrary, these problems will only serve to complicate the process of implementing NHibernate.

Ban Composite Keys

If you have composite primary keys, replace them with generated keys before implementing an ORM tool. While NHibernate can handle composite keys, and their use is perfectly correct if it is a ‘natural’ key, they are an extra complication for an ORM that you can, and should, eliminate before moving forward. Queries will not be as well-optimized for lazy loading properties as they should be. This is not an indictment against the performance of composite keys in general, but rather an observation about how ORM tools will interact with them.

There will be more setup involved to configure the mapping classes, and the resulting objects are harder to understand. If you want, you can keep enforcing what was the composite primary key as a unique constraint, but provide a surrogate primary key as well.

Ban Multi-Use Tables

If you have tables that are serving multiple purposes, you must normalize the data model. Otherwise, this will create needless complications and cause confusion. This problem shows itself in several ways. Perhaps you have a table where a large number of fields will always be blank, or where certain combinations of fields will never be set at the same time. One example may be storing Employee information along with Customer information. These need to be separate tables.

Ban “Smart” Columns

“Smart columns” are columns that have to be parsed to get at the details they hold. Storing XML in a text field creates a “smart column”. There is nothing to ensure that it is XML, that it is valid XML, or that it conforms to the expected grammar. SQL Server provides a data type for XML that should be used if you have to store XML in the database. In almost every circumstance, you can and should store the components that go into the XML document as ordinary columns, and treat the XML document itself like a calculated field rather than storing it directly.

You may have a ProductId that is parsed to reveal the Category or other relevant details. I have also seen this show up in CSZ fields where the City, State, and ZipCode are stored in one field.

Parsing logic is not centralized and the parsed values cannot easily be validated or properly constrained. Information that has to be parsed out of a single field needs to be stored in separate fields.

Ban Multi-Use Columns

Whenever you have a single column with multiple meanings, you have confusion. The various meanings need to be separated into separate columns. Otherwise, you will almost always need extra code to ensure that the column is being used in the “right” way.

Using a date column to track when something happened as well as verifying that the event did take place is not likely to cause confusion, but expecting everyone to know that a single date field may mean the last time a user logged in or the last time a customer was contacted or the last time a vendor’s approval was validated is guaranteed to cause problems. This really happened in a production application.

Reusing columns for different purposes in your database is even more dangerous than reusing variables in your code.

Normalize Your Data Model

Problems with normalization should also be sorted out before implementing an ORM strategy. The data model should also be reviewed periodically to ensure that it is still properly normalized. We are often plagued with problems where the same details are stored in multiple locations, and problems where irrelevant data is mashed together to form a new table or simply tacked onto the first table that the developer thought of. This often happens because of shifting requirements or last-minute requests being added without being properly thought out. It is easy to add a new column in a hurry and miss that the column already exists in another table. Over time, individual “one-off” requests may leave unrelated fields in tables when they now have a better home elsewhere, or leave multiple tables hiding inside one.

Follow these rules of thumb:

  • The same piece of data should not be stored in multiple places
  • All of the columns in a table should be related to one another
  • Be wary of tables with too many columns or too many rows. Both indicate potential problems that need to be investigated. Unfortunately, there are no hard and fast rules for defining “too many”.

Following these rules of thumb will take care of most questions of normalizing the database.

Forget about HQL

If you have never heard of HQL, you can skip over this section. If you are thinking about using HQL but are not sure, read on and learn how there are better alternatives available.

There are three ways to represent queries for NHibernate: HQL, ICriteria, and LINQ. LINQ is the new kid on the block, and any reason that you may have had for wanting to use HQL goes away with LINQ on the scene. The main problem with HQL is strings. An HQL query looks similar to this:
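As a minimal sketch, assuming a hypothetical `Customer` entity with `FirstName` and `LastName` properties and an already-open `ISession` (the entity, class, and method names here are illustrative, not taken from a real application):

```csharp
using System.Collections.Generic;
using NHibernate;

// Hypothetical entity; members are virtual so that NHibernate can proxy them.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
}

public static class CustomerHqlQueries
{
    public static IList<Customer> FindByLastName(ISession session, string lastName)
    {
        // The whole query, including the property names, is just a string.
        return session
            .CreateQuery("from Customer c where c.LastName = :lastName order by c.FirstName")
            .SetParameter("lastName", lastName)
            .List<Customer>();
    }
}
```

If `LastName` is renamed on the entity, this code still compiles and only fails when the query runs.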

The disadvantage of HQL is that there is no IntelliSense to help you to form the query. There is also no compile-time support to check that you have a valid query, and there is no way that a refactoring tool can help you when you start renaming properties. The advantage of HQL is that it looks like SQL and the conditions are easy to see. This made the transition from hand-written SQL statements easier.

The same query with ICriteria syntax will look similar to this:
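A hedged sketch, again assuming a hypothetical `Customer` entity and an open `ISession`; the class and method names are illustrative only:

```csharp
using System.Collections.Generic;
using NHibernate;
using NHibernate.Criterion;

// Hypothetical entity used for illustration.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
}

public static class CustomerCriteriaQueries
{
    public static IList<Customer> FindByLastName(ISession session, string lastName)
    {
        // The entity type is checked by the compiler,
        // but the property names are still plain strings.
        return session.CreateCriteria<Customer>()
            .Add(Restrictions.Eq("LastName", lastName))
            .AddOrder(Order.Asc("FirstName"))
            .List<Customer>();
    }
}
```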

Here we get some compile-time support for verifying that we have a valid query, but we are still dealing with strings for the property names, so the compile-time support is limited and there is no refactoring support.

With LINQ, this query will look similar to this:
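A minimal sketch under the same assumptions (a hypothetical `Customer` entity and an open `ISession`), using the `Query<T>()` entry point of NHibernate's LINQ provider:

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

// Hypothetical entity used for illustration.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
}

public static class CustomerLinqQueries
{
    public static IList<Customer> FindByLastName(ISession session, string lastName)
    {
        // Property access is ordinary C#, so the compiler and
        // refactoring tools can see every name used in the query.
        var query = from c in session.Query<Customer>()
                    where c.LastName == lastName
                    orderby c.FirstName
                    select c;

        return query.ToList();
    }
}
```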

This looks a great deal like SQL. The restrictions are easy to spot and intuitive to understand. There are no strings involved. A refactoring tool will clean up the query when you rename properties.

You need to know both the ICriteria syntax and the LINQ syntax. As your queries get more complex, you will find that there are things that you may not be able to do with the LINQ syntax. When you need to conditionally include constraints or you want to reuse constraints across multiple queries, you may find that the ICriteria gives you more flexibility.
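The conditional and reusable constraints mentioned above can be sketched with ICriteria as follows. The `Customer` entity, the `CustomerSearch` class, and its method names are hypothetical and only illustrate the idea:

```csharp
using System.Collections.Generic;
using NHibernate;
using NHibernate.Criterion;

// Hypothetical entity used for illustration.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
}

public static class CustomerSearch
{
    // A constraint packaged as a method so that several queries can share it.
    public static ICriterion LastNameIs(string lastName)
    {
        return Restrictions.Eq("LastName", lastName);
    }

    public static IList<Customer> Search(ISession session, string lastName, string firstNamePrefix)
    {
        ICriteria criteria = session.CreateCriteria<Customer>();

        // Each constraint is added only when the caller supplied a value.
        if (!string.IsNullOrEmpty(lastName))
        {
            criteria.Add(LastNameIs(lastName));
        }
        if (!string.IsNullOrEmpty(firstNamePrefix))
        {
            criteria.Add(Restrictions.Like("FirstName", firstNamePrefix, MatchMode.Start));
        }

        return criteria.List<Customer>();
    }
}
```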

Avoid Implicit Transactions

The folks that bring us the NHibernate Profiler warn about implicit transactions. Implicit, or ‘auto-commit’, transactions will cause problems with NHibernate’s internal caching, and they put extra load on the database because it will create a transaction for every statement.

The preferred code pattern for dealing with transactions is similar to this:
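A minimal sketch, assuming an `ISessionFactory` has already been configured elsewhere; the `Customer` entity and the `CustomerWriter` class are hypothetical:

```csharp
using NHibernate;

// Hypothetical entity used for illustration.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string LastName { get; set; }
}

public static class CustomerWriter
{
    public static void Save(ISessionFactory sessionFactory, Customer customer)
    {
        using (ISession session = sessionFactory.OpenSession())
        using (ITransaction transaction = session.BeginTransaction())
        {
            try
            {
                session.SaveOrUpdate(customer);

                // All of the work inside the using block is committed as one unit.
                transaction.Commit();
            }
            catch
            {
                // Undo the work explicitly, then let the caller see the failure.
                transaction.Rollback();
                throw;
            }
        }
    }
}
```

The same pattern applies to code that only reads data: wrapping the queries in an explicit transaction avoids a separate implicit transaction per statement.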

Avoid Select N+1

I have written previously about the Select N+1 problem. This can be a substantial drain on your application but can easily go unnoticed. Fortunately the problem is relatively easy to fix, and the fix will generally be isolated to minor tweaks to your repository classes. The important thing is that none of your business logic needs to be modified to avoid this potential performance problem. This is one of the rare items that can easily be taken care of after the fact without disrupting the rest of your application.
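To illustrate the kind of repository tweak involved, here is a hedged sketch. It assumes hypothetical `Customer` and `Order` entities, with `Orders` mapped as a lazily loaded collection, and uses the `FetchMany` extension method from NHibernate's LINQ provider; the repository class itself is invented for the example:

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

// Hypothetical entities used for illustration.
public class Order
{
    public virtual int Id { get; set; }
    public virtual decimal Total { get; set; }
}

public class Customer
{
    public virtual int Id { get; set; }
    public virtual string LastName { get; set; }
    public virtual IList<Order> Orders { get; set; }
}

public class CustomerRepository
{
    private readonly ISession _session;

    public CustomerRepository(ISession session)
    {
        _session = session;
    }

    // One query loads the customers. If the caller then loops over them and
    // touches Orders, each lazy collection triggers a further query: N+1 in total.
    public IList<Customer> GetAll()
    {
        return _session.Query<Customer>().ToList();
    }

    // Asks NHibernate to load the orders together with the customers,
    // so looping over Orders afterwards does not issue a query per customer.
    public IList<Customer> GetAllWithOrders()
    {
        return _session.Query<Customer>()
            .FetchMany(c => c.Orders)
            .ToList();
    }
}
```

Callers switch from `GetAll` to `GetAllWithOrders` without any change to the business logic that consumes the entities.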

Profilers Are Our Friends

Red Gate’s ANTS Performance Profiler will provide a wealth of information about your application. You will be able to track memory usage, disk I/O, network performance, even what is going on in the database. The wonderful thing here is that you can track all of this detail together in one context.

The NHibernate Profiler provides great insight into how the application interacts with the database. The profiler also provides “alerts” that will clue you in when common pitfalls are found. The profiler will tell you when your code is not following many of the best practices outlined here.

You can work out the best practices, and you can even document them, but this alone will not ensure that they are being followed. Monitoring your application with these profilers will show you when your best practices are not being followed or are not working as intended.

Mapping

There are two main approaches for mapping the columns in the database to properties in your entities. HBM files are XML based and Fluent provides a framework where you define a class to provide mapping details.

An HBM file will look similar to this:
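A minimal sketch of such a mapping file, assuming a hypothetical `Customer` class in a `MyApp.Domain` namespace inside a `MyApp` assembly, mapped to a `Customer` table with an identity key column called `CustomerId` (all of these names are illustrative):

```xml
<?xml version="1.0" encoding="utf-8"?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   assembly="MyApp"
                   namespace="MyApp.Domain">
  <!-- Class, property, and column names are all plain strings -->
  <class name="Customer" table="Customer">
    <id name="Id" column="CustomerId">
      <generator class="identity" />
    </id>
    <property name="FirstName" column="FirstName" />
    <property name="LastName" column="LastName" not-null="true" />
  </class>
</hibernate-mapping>
```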

The same mapping defined with the Fluent syntax will look like this:
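A hedged sketch using Fluent NHibernate's `ClassMap<T>`, assuming a hypothetical `Customer` entity mapped to a `Customer` table with an identity key column called `CustomerId`:

```csharp
using FluentNHibernate.Mapping;

// Hypothetical entity used for illustration.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
}

public class CustomerMap : ClassMap<Customer>
{
    public CustomerMap()
    {
        Table("Customer");

        // Properties are referenced through lambdas, so renaming one
        // in the entity is caught by the compiler here.
        Id(x => x.Id).Column("CustomerId").GeneratedBy.Identity();
        Map(x => x.FirstName);
        Map(x => x.LastName).Not.Nullable();
    }
}
```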

The Fluent interface has several advantages over the HBM syntax. The most notable advantage is the lack of strings. As we saw with the LINQ syntax for defining the queries, we will get compile time support for ensuring that the mapping definition is correct. Refactoring tools will also help us by updating the mapping definitions when we rename properties.

The HBM files are just XML files full of strings. Fluent classes are classes that can benefit from inheritance. Suppose you wanted to include audit fields such as LastUpdatedBy and LastUpdatedDate in each of your tables and entities. It is easy to define the entities using inheritance defining these common properties in a base class. With HBM files, we cannot take advantage of inheritance and each mapping file will have to define the mapping details separately. With Fluent, the mapping classes can follow a similar inheritance hierarchy to our entities and the redundant mapping details do not need to be duplicated.
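The inheritance idea can be sketched as follows. The `AuditedEntity` base class, the generic `AuditedMap<T>` base mapping, and the `Customer` entity are hypothetical names chosen for the example; only `ClassMap<T>` and its `Id` and `Map` methods come from Fluent NHibernate:

```csharp
using System;
using FluentNHibernate.Mapping;

// Hypothetical base entity holding the shared audit fields.
public abstract class AuditedEntity
{
    public virtual string LastUpdatedBy { get; set; }
    public virtual DateTime LastUpdatedDate { get; set; }
}

public class Customer : AuditedEntity
{
    public virtual int Id { get; set; }
    public virtual string LastName { get; set; }
}

// Base mapping class: the audit columns are mapped once, here.
public abstract class AuditedMap<T> : ClassMap<T> where T : AuditedEntity
{
    protected AuditedMap()
    {
        Map(x => x.LastUpdatedBy);
        Map(x => x.LastUpdatedDate);
    }
}

// Each concrete mapping inherits the audit columns and adds its own details.
public class CustomerMap : AuditedMap<Customer>
{
    public CustomerMap()
    {
        Id(x => x.Id);
        Map(x => x.LastName);
    }
}
```

Any other audited entity gets the same two columns mapped simply by deriving its mapping class from `AuditedMap<T>`.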

Summary

NHibernate is a wonderful tool that makes it easier to build line-of-business applications. Like any tool, it is important to know how to use it properly. We need to follow some best practices to ensure that we get the most out of the tool. Perhaps the most important best practice is to remember that the tool will not mask over fundamental problems with your database.

Conform to best practices in your data model before you attempt to implement NHibernate. These best practices are still relevant and, if they are not followed, will only complicate using NHibernate. Avoid strings wherever possible by using LINQ-style queries and the Fluent interface for mapping. Profile your code often, looking for problems with Select N+1 queries and implicit transactions.

Follow these best practices and you are well on your way to enjoying life with NHibernate.