Solving Performance Problems in an ASP.NET application with ANTS Profiler

You have a Data-Access Layer that is generating dynamic SQL, and you haven't the time to understand the implementation of the business layer. Something is firing unexpected volumes of queries and slowing the code down. What do you do? Instead of hunting methodically for the cause from the queries up, you can now work rapidly from the code down. Geoffrey Braaf explains...

A few weeks ago I was faced with a challenge I wasn’t sure how to approach; I was asked to hunt down and solve performance issues in a large e-commerce site. The site had become a real drag ever since the database had been filled to the brim with both CMS and periodically imported data. It looked pretty lean and solid from a distance, but up close it was speckled with nightmarish and untestable code.

The First Attempt

So how did I go about it? First, I reviewed the database and concluded that the appropriate indexes were in place and that the model was solid. After that, I used SQL Server Profiler to learn more about the queries being fired by the application. It was immediately apparent that the site’s caching mechanism had been undermined and modified since I’d implemented it several months before. So, having narrowed the issue down to heavy database load, my problem was now either faulty data-access code or a defective caching layer.

First stop, the caching layer; given my knowledge of the implementation and the tests I already had at hand, I could quickly mark it off as ‘in good working order’.


Now to figure out where in the code the unexpected masses of queries were being fired from. As the site was built on top of a generated DAL which applied dynamic query optimization, the actual queries that were executed were not always apparent. I also lacked a detailed understanding of the implementation of the business layer, which was composed of several class libraries.

This was doing nothing to make my job any easier. After spending many hours wading through trace files and thousands of lines of code, I was seriously starting to worry that I wouldn’t be able to resolve these issues fast enough. Naturally, there was a deadline, and mine was ASAP, yesterday or even earlier if possible.

The ‘Aha’ Moment

That night it hit me: I shouldn’t hunt for the cause from the queries up, but from the code down. If there were methods in the code that were slow, I should be able to detect them by timing their execution. Although I had never worked with a code profiler before, I thought I’d give ANTS Profiler a try, and resort to adding debug traces to all suspect code if my attempt failed.
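
For context, the fallback I had in mind amounts to something like the sketch below: wrap each suspect call in a Stopwatch and write the timing out. The method names here are invented stand-ins, not the actual project code.

using System;
using System.Diagnostics;
using System.Threading;

class TimingSketch
{
    static void Main()
    {
        // Wrap the suspect call in a Stopwatch and print the elapsed time;
        // in the real site this kind of timing would go to the trace log.
        Stopwatch watch = Stopwatch.StartNew();
        SuspectMethod();                        // hypothetical slow method
        watch.Stop();
        Console.WriteLine("SuspectMethod took {0} ms", watch.ElapsedMilliseconds);
    }

    // Stand-in for whatever business-layer call is under suspicion.
    static void SuspectMethod()
    {
        Thread.Sleep(50);                       // simulate work
    }
}

Doing this by hand for every suspect method across several class libraries is exactly the kind of tedium a profiler spares you.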

I downloaded a trial version of the profiler, installed it, launched it from within Visual Studio 2005 and … had no clue how to use it. Being the tech-geek that I am, I obviously rejected the idea of reading the help files and dove right into pushing buttons. To my surprise, I was able to start the profiler up within seconds, and had the web application project under surveillance within minutes! How’s that for an intuitive interface!

[Figure: ANTS Profiler configuration screen]

Because my ‘patient’ made use of a lot of cached data, the start-up was pretty slow (a couple of minutes) and very heavy on the SQL server. The fact that I already knew this didn’t reduce my awe when I saw that ‘event’ represented by a group of big spikes in the trace graph. Although reducing startup time wasn’t my goal at the time, seeing a known fact presented in the ANTS UI helped me to quickly understand how to use the tool in my hunt for bad code.

[Figure: start-up spikes in the trace graph]

From that point on, it was plain sailing. I browsed through all the known slow pages and watched the all-too-spiky trace graph as I went along. After about 10 minutes, I selected the entire trace; it took a little while, but when I was presented with the HOT code, I was as happy as a kid in a candy store.

It turned out there were two problems. The first had to do with a certain footer control, which had been developed without following any project-specific standards. The control not only accessed the DAL directly, it also made some awkward and duplicate calls to the business layer. Rewriting the footer wasn’t as easy as I’d hoped, but after rebuilding the control and redoing the trace, significantly fewer spikes were visible and the site was a lot more responsive.

The second problem was that, after the initial implementation of the caching layer, several entities had been added to the database and DAL without an accompanying caching wrapper class. It is still a complete mystery to me why this hadn’t been done, because once I’d added these caching wrappers, the performance boost was simply immense.
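
For illustration, such a wrapper can be as simple as the sketch below; the entity, the DAL class and the thirty-minute expiry are all invented for the example, not the project’s actual code or policy.

using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

// Invented stand-ins for a generated DAL entity and its data-access class.
public class Category { public int Id; public string Name; }

public static class CategoryDal
{
    // In the real application this would fire a (dynamically generated) SQL query.
    public static List<Category> GetAll() { return new List<Category>(); }
}

// The caching wrapper: callers go through this class instead of the DAL,
// so repeated reads are served from the ASP.NET cache, not SQL Server.
public static class CachedCategoryRepository
{
    const string CacheKey = "AllCategories";

    public static List<Category> GetAll()
    {
        List<Category> categories = HttpRuntime.Cache[CacheKey] as List<Category>;
        if (categories == null)
        {
            categories = CategoryDal.GetAll();
            HttpRuntime.Cache.Insert(
                CacheKey, categories, null,
                DateTime.UtcNow.AddMinutes(30),     // assumed expiry policy
                Cache.NoSlidingExpiration);
        }
        return categories;
    }
}

Every entity that lacks such a wrapper is a straight line from page render to database, which is exactly what the trace graph had been showing.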

Surprise, an added bonus!

After resolving the initial problem that actually got me into this ‘adventure’, I was satisfied with the performance win I’d achieved, and delivered the resulting code to my client. Now I was eager to see more performance traces of the application. Playtime!

To my surprise, ANTS Profiler suggested that there was another bottleneck, which only became apparent after intensive tracing (about 30 minutes). Part of my caching strategy involved storing sitemap-like data in a singleton tree class at application start. Many controls then rendered themselves, or parts of their UI, based on the contents of that tree. For navigational purposes, we had written some helper methods that would resolve a node into pretty, SEO-friendly URLs.
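
Reconstructed from memory, the helper worked along these lines; the node class, the names and the URL format are simplified stand-ins:

using System.Collections.Generic;

public class SiteNode
{
    public string Slug;                         // SEO-friendly segment, e.g. "garden-chairs"
    public List<SiteNode> Children = new List<SiteNode>();
}

public static class UrlResolver
{
    // Resolve a node to its full URL by walking the whole tree from the
    // root on every call, accumulating the path along the way.
    public static string Resolve(SiteNode root, SiteNode target)
    {
        return Walk(root, target, "");
    }

    static string Walk(SiteNode current, SiteNode target, string path)
    {
        string currentPath = path + "/" + current.Slug;
        if (current == target)
            return currentPath;
        foreach (SiteNode child in current.Children)
        {
            string result = Walk(child, target, currentPath);
            if (result != null)
                return result;
        }
        return null;
    }
}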


My (incorrect) assumption had been that producing the URLs on the fly by looping through all tree nodes and their children would be fast and wouldn’t influence performance that much. Strangely, the profiler still marked the helper method as HOT. Evidently, it was not the duration of the method that caused the issue, but the hit-count! I was amazed by the number of times the method was executed. By simply adding the full URL as a property to each node, I only marginally slowed down site initialization, but dramatically improved the UI rendering speed. Obviously this extra bonus was very well-received by my client.
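
In sketch form, the fix was simply to move that walk to application start and store the result on each node; again, the names are invented:

using System.Collections.Generic;

public class SiteNode
{
    public string Slug;
    public string Url;                          // full URL, computed once at start-up
    public List<SiteNode> Children = new List<SiteNode>();
}

public static class SiteMapBuilder
{
    // Called once while the singleton tree is populated: each node gets
    // its full URL, so rendering becomes a cheap property read instead
    // of a tree walk per call.
    public static void AssignUrls(SiteNode node, string parentPath)
    {
        node.Url = parentPath + "/" + node.Slug;
        foreach (SiteNode child in node.Children)
            AssignUrls(child, node.Url);
    }
}

The walk now happens once per application start instead of once per rendered link, which is why initialization slowed only marginally while rendering sped up dramatically.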

Conclusion

I must admit that the concept of a code profiler, and especially ANTS Profiler, really appeals to me after this first experience with it. I can clearly see its purpose, and can think of a host of ways I could have saved myself a lot of time in the past. Although, in this instance, I would eventually have solved my problem by tracing and asserting, using the profiler saved me a huge amount of time. The ‘wasted’ day gave me insight, but the profiler told me the exact cause of my troubles, and gave me the power to resolve the issue fast. Also, seemingly unimportant pieces of code (like a badly written footer control) can ruin your day!

It is also safe to assume that the additional performance gain from my slightly modified caching strategy would not have happened without the profiler, as it was a much more subtle issue. If I had the time, I’m sure I’d always find more ways to optimise my code … the sky is the limit!