One of the most interesting throw-away lines at the recent TechEd show was that Entity Framework was about four times slower than LINQ to SQL, because Entity Framework was a generic solution that could be used with a variety of data stores, and wasn’t optimized for SQL Server.
This set us thinking about the basis on which such statements are made.
LINQ to SQL generated a lot of interest from users but “isn’t getting the love” at Microsoft because it was always seen as a stop-gap. By contrast, Entity Framework is the apple of the Microsoft eye, but has never received much affection from the users. What most programmers wanted was a simple abstraction layer to allow them to focus on the ‘application domain’ and not have to immerse themselves too deeply in the arcane workings of relational databases. The project to provide this simple layer has been fumbled, mainly because there isn’t a clear solution for all common application architectures.
The trouble is that abstraction layers come at a cost: they make applications run slower, but how much slower? The message from Microsoft and the MVPs about the architecture of data-driven applications seems confused, possibly because it isn’t easy to quantify this performance cost.
All attempts at abstraction, including LINQ, seem to result in a solution that is considerably slower than that which can be achieved through intelligent use of SqlDataReader, or a simple interface based on SqlClient and stored procedures. However, performance comparisons between the various architectures seem to be either anecdotal or dependent on too many variables. Comparisons between LINQ to SQL and EF, for example, vary wildly according to whether LINQ queries are pre-compiled, and whether the SQL query plan is cached and reused. Brian Dawson’s benchmarks of Entity Framework against SqlClient and LINQ solutions, which are the most widely known, illustrate the range of factors that can change the results of timings. For example, he was able to show a 28% gain in EF’s performance merely by pre-generating the views!
It is extraordinarily difficult to apply a benchmark to the various ORM solutions in a fair and consistent way, but that doesn’t mean that the community of users shouldn’t be doing it. I’d like to see a production-size application being used, with the full range of timing criteria being decided ‘up-front’. In other words, we need to decide, before we start, which timings are fair measures of performance, and then stick to that definition. After all, you don’t decide where the finishing post is after the horse-race is over, do you?
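To make the idea of fixing the finishing post concrete, here is a minimal sketch in Python of a timing harness whose measurement rules are frozen before any candidate is run. Everything in it is an assumption for illustration: the names `Protocol`, `measure` and `compare` are hypothetical, the choice of the median as the reported statistic is just one reasonable option, and the candidates would be stand-ins for whatever data-access layers are under test rather than calls into any real ORM.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)  # frozen: the rules cannot be changed once the race starts
class Protocol:
    warmup_runs: int  # untimed runs, so caches and compiled queries are primed
    timed_runs: int   # number of timed repetitions per candidate


def measure(candidate: Callable[[], object], protocol: Protocol) -> float:
    """Return the median wall-clock time (seconds) of one candidate."""
    for _ in range(protocol.warmup_runs):
        candidate()  # deliberately not timed
    samples = []
    for _ in range(protocol.timed_runs):
        start = time.perf_counter()
        candidate()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(candidates: Dict[str, Callable[[], object]],
            protocol: Protocol) -> Dict[str, float]:
    """Apply the same, pre-agreed protocol to every candidate."""
    return {name: measure(fn, protocol) for name, fn in candidates.items()}
```

A comparison would then call `compare` once, with a single `Protocol` instance shared by all candidates, so that nobody can quietly add a warm-up run or drop a slow sample for one contender after seeing the results.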
I’d be fascinated to know what you think about this. What are your experiences of NHibernate, Entity Framework and LINQ to SQL? Do you think that the obvious benefit of a full abstraction layer is really worth the performance cost of all the additional overhead?