Using a Profiler to Speed Application Performance


Our company's business intelligence and marketing effectiveness platform (which includes a high-scale contact relationship management application) contains a system that uses SSIS, among other things, to extract, transform and load huge amounts of data from a wide range of sources. Once the data is all in a central SQL Server database, a background C# batch processing application works through it using highly complex grouping and clustering algorithms to analyze the data and make it consistent. This is somewhat similar to SSIS fuzzy grouping, but we had extra considerations and needed more control over the process.

Bringing in a profiler

I started by inserting timing code throughout the application, but this is very tedious and can be inaccurate. Worse, you depend on guesswork to decide where the timing code should go. I quickly realized I needed proper non-intrusive profiling.
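
To illustrate the manual approach described above, here is a minimal sketch of hand-inserted timing code using System.Diagnostics.Stopwatch. The names ManualTiming, Measure and SumTo are hypothetical and stand in for whatever stage of the batch job is being timed; this is not the application's actual code.

```csharp
using System;
using System.Diagnostics;

public static class ManualTiming
{
    // Hypothetical helper: runs a piece of work and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action work)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Hypothetical stand-in for one stage of a batch job.
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total;
    }
}
```

A call such as ManualTiming.Measure(() => ManualTiming.SumTo(1000000)) returns a TimeSpan that can be logged. Every region of interest needs its own wrapper like this, which is where the tedium and the guesswork come from.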

A quick Google search turned up a shortlist: dotTrace from JetBrains and ANTS Profiler from Red Gate. Although I love JetBrains' ReSharper, I found ANTS Performance Profiler more usable. It took me straight down to the methods and lines that were taking the time, and let me look at those lines in the same window without going back and forth to Visual Studio.

The analysis proved to be an incremental process. Each time, I would work down from the method-level timings and find the top bottleneck. At first, the bottlenecks turned out to be in data access, so I used SQL Profiler and Database Tuning Advisor to work out the necessary database and query changes.

5X performance gains (or more)

This still left performance problems in the C# code itself. Assuming that the quickest way to get a major improvement was to change the core algorithms, I tried rewriting some of the fuzzy grouping algorithms – this made the code a lot more complicated, but turned out to make almost no improvement to performance. As ever, before you optimize, always measure. So I reverted the code, and went back to ANTS Profiler again.

Surprisingly, some of the major problems turned out to be the simple things: regular expression and String performance, and basic collection types. After a couple of changes in the string handling*, and a switch from lists to hashtables (for key lookups, Hashtable is faster than SortedDictionary, SortedList or List), we now get through the full 9 million rows in 2-3 days, at least 5 times faster!
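
The collection change can be sketched as follows. This is an illustrative example only, assuming the hot path was a repeated membership test on non-null string keys; LookupSketch and its method names are hypothetical, not taken from the application.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class LookupSketch
{
    // List<T>.Contains scans the items one by one, so each lookup is O(n).
    public static int CountMatchesWithList(List<string> known, IEnumerable<string> candidates)
    {
        int matches = 0;
        foreach (string candidate in candidates)
        {
            if (known.Contains(candidate)) matches++;
        }
        return matches;
    }

    // Build a Hashtable once; each ContainsKey call is then O(1) on average.
    // Assumes keys are non-null (Hashtable rejects null keys).
    public static int CountMatchesWithHashtable(List<string> known, IEnumerable<string> candidates)
    {
        Hashtable index = new Hashtable();
        foreach (string key in known)
        {
            index[key] = true; // the value is unused; only the keys matter
        }

        int matches = 0;
        foreach (string candidate in candidates)
        {
            if (index.ContainsKey(candidate)) matches++;
        }
        return matches;
    }
}
```

Both methods return the same count. The difference only shows up at scale, where the list version performs a full scan per candidate. In newer code the generic Dictionary<TKey, TValue> or HashSet<T> would be the usual choice for the same effect.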

ANTS Profiler results before optimization

ANTS Profiler results after optimization

Lessons learned

In retrospect, the lessons are fairly obvious:

* <string>.Substring(...) does considerable work, because it allocates a new string and copies characters into it on every call. If you want to check whether the first two characters of a string equal some other string, consider using <string>.StartsWith instead, or avoid the Substring call altogether if you can.

Code before optimization

Code after optimization
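
As an illustration of the Substring lesson, here is a minimal sketch of the two styles of prefix check. It is not the application's actual code: PrefixCheck and its method names are hypothetical. Passing StringComparison.Ordinal is my assumption. It makes StartsWith compare the same way as the == operator on strings, and avoids the culture-sensitive comparison that StartsWith(string) performs by default.

```csharp
using System;

public static class PrefixCheck
{
    // Before: Substring allocates a new string on every call just to compare it.
    // The length guard avoids an ArgumentOutOfRangeException on short inputs.
    public static bool HasPrefixWithSubstring(string value, string prefix)
    {
        return value.Length >= prefix.Length
            && value.Substring(0, prefix.Length) == prefix;
    }

    // After: StartsWith compares in place without creating a temporary string.
    public static bool HasPrefixWithStartsWith(string value, string prefix)
    {
        return value.StartsWith(prefix, StringComparison.Ordinal);
    }
}
```

Both methods give the same answer for non-null inputs. The second simply skips the temporary string, which matters when the check runs millions of times.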
