We needed to find the source of the memory leak, and fast

At any time, memory leaks in a commercial application have to be fixed; but when your application is being deployed via virtualized desktops using platforms such as Microsoft Application Virtualization for Terminal Services or Citrix XenDesktop, then it becomes vital that memory is used as efficiently as possible. You can imagine what would happen if signs of a large memory leak showed up during the deployment to a key customer. Jeremy Jarrell tells the story.

Introduction

Matrix Solutions is the industry-leading provider of Strategic Account Management for the media industry. Matrix SalesCenter, which is written from the ground up using Microsoft .NET and SQL Server technologies, recently saw its 1.0 release and is now in active development for version 2.0.

The media business is a business built on advertising revenue and many large media companies may have thousands of active advertising accounts at any given time.  Therefore it’s critical for a media company to be able to quickly identify who their key advertisers are to ensure that they’re getting the most attention possible.  SalesCenter is designed to allow an advertising salesperson to easily sift through large amounts of customer data in order to quickly identify not only who their key customers are, but also how well a particular salesperson is meeting his or her revenue goals.

The Problem

As mentioned above, Matrix SalesCenter positions itself primarily as a sales analysis tool for helping media Account Executives comb through reams of data in order to quickly identify their most valuable accounts.

This means that, by its very nature, SalesCenter must be able to handle large amounts of data…much of it held in the application’s memory.  Keeping as much as possible of an Account Executive’s relevant account information in client memory is the best way to guarantee that accounts can be accessed quickly and effectively.

However, many of our customers choose to deploy SalesCenter via virtualized desktops using platforms such as Microsoft Application Virtualization for Terminal Services or Citrix XenDesktop. Many different instances of SalesCenter may therefore be sharing a limited amount of memory on a single server, so although memory may now be cheap, it's still not a resource that we can take for granted. It's incredibly important that we use our memory as efficiently as possible, which leaves little room for leaks.

Unfortunately, early in the deployment of SalesCenter to a key customer, we began to notice signs of a large memory leak during one of our routine operations. This operation involved maintenance that some employees would perform when setting up a new installation of SalesCenter.

Basically, this maintenance involved displaying a list of a user's advertisers and then allowing the user to link similar advertisers together in order to remove duplicate advertisers from view. After each link, the entire list would refresh. However, when the list was refreshed, entirely new copies of the advertisers were brought into memory while the older, unused copies were simply leaked! Since it wasn't uncommon for this list to display tens of thousands of advertisers at a time, and to be refreshed dozens of times in a session, this problem rapidly spiralled out of control.

The problem was further aggravated by the fact that advertiser objects were not lightweight objects and each instance could take up a significant amount of memory.  It was quickly apparent that we needed to find the source of the memory leak, and fast.

The Solution

Enter the ANTS Memory Profiler™ from Red Gate®. First we ran our application under the memory profiler while taking memory snapshots at key points during use. For example, we took a memory snapshot when the advertiser list was first opened, then another after the first advertiser link was made. We continued linking more advertisers, always remembering to immediately follow each link with a memory snapshot. After several iterations of this we had collected a rich set of data to analyze.

Once we had a nice selection of data to work with, we easily isolated our largest objects in memory by examining the ‘Classes with the largest size’ and ‘Classes with the largest growth in size’ graphs on the Summary page of the ANTS Memory Profiler. Since memory leaks can be very time consuming to fix, and since time was one luxury we did not have, focusing our efforts on the largest and fastest growing objects assured us that we would be seeing the maximum return on our time. From these graphs, we could easily tell that the ‘Advertiser’ object was responsible for the majority of our memory growth.

Next we viewed the largest classes on the Class List page in order to learn exactly how many instances of the problem class were in memory, as well as what kinds of trends those objects were exhibiting with each refresh of the screen. For example, were the instances getting larger or smaller? Or were we creating more objects than we were freeing, leading to a net gain in the number of instances with each refresh? Seeing that you are creating more new instances than you're releasing with each action is often a telltale sign of a memory leak. In our case, we noticed that we were always seeing a net gain in instances with each refresh.
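To make the idea of a net gain per refresh concrete, here is a minimal sketch in Python rather than the .NET code SalesCenter is actually written in. It is an illustration under stated assumptions, not a description of how ANTS Memory Profiler works internally: `Advertiser`, `AdvertiserList`, `count_instances` and `instance_diffs` are hypothetical names, and the `_stale` list merely simulates whatever forgotten reference keeps the old copies alive.

```python
import gc


class Advertiser:
    """Hypothetical stand-in for the article's advertiser objects."""

    def __init__(self, name):
        self.name = name


def count_instances(cls):
    """Count live instances of cls, roughly one row of a class list."""
    gc.collect()  # discard anything that is already unreachable
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


class AdvertiserList:
    """Hypothetical list view that reloads all advertisers on refresh."""

    def __init__(self, leaky):
        self.leaky = leaky
        self.items = []
        self._stale = []  # assumption: simulates a forgotten reference

    def refresh(self, size):
        if self.leaky:
            # The bug: old copies stay reachable after the refresh.
            self._stale.extend(self.items)
        self.items = [Advertiser(f"advertiser-{i}") for i in range(size)]


def instance_diffs(view, size, refreshes):
    """Refresh repeatedly, recording the change in instance count each time."""
    diffs = []
    previous = count_instances(Advertiser)
    for _ in range(refreshes):
        view.refresh(size)
        current = count_instances(Advertiser)
        diffs.append(current - previous)
        previous = current
    return diffs


if __name__ == "__main__":
    print("healthy:", instance_diffs(AdvertiserList(leaky=False), 3, 3))
    print("leaky:  ", instance_diffs(AdvertiserList(leaky=True), 3, 3))
```

In the healthy view the count rises once and then stays flat, whereas the leaky view gains a full list's worth of instances on every refresh, which is the same pattern we were seeing between our snapshots.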

In the graphic below, we can tell that the number of Advertisers in memory is trending upwards by the value in the ‘Instance Diff (+/-)’ column.  We can also tell that the total amount of memory that these Advertiser objects are taking up is trending upwards by the value in the ‘Size Diff (bytes +/-)’ column.

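[Figure: Class List showing the ‘Instance Diff (+/-)’ and ‘Size Diff (bytes +/-)’ columns for the Advertiser class]

Here's a tip: once you know the name of the type you're looking for, you can filter the Class List to show only types containing that name by typing it into the filter box.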

Once we had established a trend with our objects, we moved to the Instance List screen for that particular object, where we were greeted with thousands of individual instances. Luckily, the Instance List comes equipped with handy filters that focus your attention directly on instances showing signs of the most common memory leaks. For example, the ‘Kept in memory only by disposed objects’ filter shows only instances that are held onto solely by other objects which have already been disposed of…a sure sign that your instance is no longer in use and should have been disposed of as well. Likewise, the ‘Kept in memory only by event handlers’ filter shows only instances that are kept in memory solely because they subscribe to the events of other objects. These dangling event handlers are among the most common causes of memory leaks in .NET applications. In fact, it appeared that our application had fallen victim to this trap as well, since all of our memory leaks were caused by our unused objects forgetting to unsubscribe from other objects' events.
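The dangling event handler pattern is easy to reproduce in miniature. The sketch below is a Python analogy, not our actual .NET code: `Event`, `AdvertiserCache`, `AdvertiserRow` and `survives_release` are hypothetical names invented for the illustration. The assumption it demonstrates is the general one behind this kind of leak: a long-lived publisher keeps a strong reference to every handler it has been given, and each handler in turn keeps its subscriber alive.

```python
import gc
import weakref


class Event:
    """Minimal event: holds strong references to its handlers."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def unsubscribe(self, handler):
        self._handlers.remove(handler)


class AdvertiserCache:
    """Hypothetical long-lived publisher."""

    def __init__(self):
        self.changed = Event()


class AdvertiserRow:
    """Hypothetical short-lived subscriber."""

    def __init__(self, cache):
        self._cache = cache
        # The bound method references self, so the cache now retains this row.
        cache.changed.subscribe(self.on_changed)

    def on_changed(self):
        pass

    def dispose(self):
        # Unsubscribing removes the only reference the cache holds to us.
        self._cache.changed.unsubscribe(self.on_changed)


def survives_release(dispose_first):
    """Report whether a row is still alive after our own reference is dropped."""
    cache = AdvertiserCache()
    row = AdvertiserRow(cache)
    probe = weakref.ref(row)  # observes the row without keeping it alive
    if dispose_first:
        row.dispose()
    del row
    gc.collect()
    return probe() is not None


if __name__ == "__main__":
    print("without dispose, still alive:", survives_release(False))
    print("with dispose, still alive:   ", survives_release(True))
```

Without the call to `dispose`, the row stays reachable through the cache's handler list even though nothing else uses it; once it unsubscribes, it can be collected as expected.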

Don't be discouraged by the number of instances of your object; you often won't have to view the graph for every instance. In fact, if you sort the instance list by the Distance from GC Root, you'll notice that many of your instances share the same value. If two objects happen to be the same distance from the GC root, then odds are good that they've been leaked by the same path. This means that fixing the leak for one instance will often fix the leak for other instances on that same path.

Once we had identified our problem instances, our next step was to let ANTS Memory Profiler graph the chain of objects still holding onto our object by displaying the Object Retention Graph.

The Object Retention Graph displays exactly what is holding our objects in memory, which makes freeing them a simple matter of finding the corresponding reference in code and releasing it. Note that the Object Retention Graph only displays the nearest retaining object, so you may have to repeat the cycle of capturing a snapshot and updating the code more than once, but eventually you'll find that last object which releases your object into freed memory bliss.

You’ll also find that an intimate knowledge of your code will help you significantly when tracking down your memory leaks.  For example, in the graphic above the Advertiser object is referenced by two paths:  a memory cache and a reference to _SelectedItem (which is highlighted in red).  From our knowledge of the code we know that the memory cache path is still valid but the _SelectedItem path is not since the list has refreshed.  Therefore, we can begin our pursuit of the memory leak by focusing on the trail of _SelectedItem.
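The reasoning about retention paths can be sketched as a small graph search. This is a toy model in Python under stated assumptions, not the profiler's algorithm and not our real object graph: the node names other than `Advertiser` and `_SelectedItem` are hypothetical, as are the helper functions `retention_paths` and `without_edge`.

```python
from collections import deque


def retention_paths(graph, root, target):
    """Return every cycle-free reference path from root to target, shortest first."""
    paths = []
    queue = deque([[root]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        for held in graph.get(node, ()):
            if held not in path:  # skip references that would form a cycle
                queue.append(path + [held])
    return paths


def without_edge(graph, holder, held):
    """Return a copy of the graph with one reference released."""
    return {
        node: [child for child in children if (node, child) != (holder, held)]
        for node, children in graph.items()
    }


# Hypothetical graph, written as "holder -> objects it references".
GRAPH = {
    "GC root": ["MemoryCache", "AdvertiserListView"],
    "MemoryCache": ["Advertiser"],
    "AdvertiserListView": ["_SelectedItem"],
    "_SelectedItem": ["Advertiser"],
}

if __name__ == "__main__":
    for path in retention_paths(GRAPH, "GC root", "Advertiser"):
        print(len(path) - 1, " -> ".join(path))
    fixed = without_edge(GRAPH, "_SelectedItem", "Advertiser")
    print(retention_paths(fixed, "GC root", "Advertiser"))
```

Both paths keep the object alive, so knowing which one is legitimate matters: releasing the stale `_SelectedItem` reference leaves only the cache path, which is the one we want to keep. In a real application, taking a new snapshot after each fix may well reveal a further path that was not visible before.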

Conclusion

One of the promises of the new wave of mainstream managed languages a decade ago was that memory problems would be a thing of the past. Although memory leaks have in fact been greatly reduced, it's fallacious to think that we'll never have to worry about memory problems again. In fact, we may have even been lured into a false sense of security by the idea of “managed memory”. However, by the careful application of good tools, and by learning from our mistakes, we can keep our memory problems well under control.

Don't forget that fixing memory leaks is often an iterative process, and you may have to revisit an object several times in order to find everything that is holding it in memory. Also, if you feel yourself starting to get overwhelmed by the vast amounts of data your profiler will produce, try to reproduce the leak with a much smaller dataset. In the example above, once the major leaks had been identified, we drastically reduced our sample size from over 17,000 advertisers to just 3. This allowed us to more easily track exactly which of our instances were still valid in memory and which had actually been leaked.

Finally, I encourage you to pay attention to the techniques that you apply when fixing your memory leaks.  By slowly building an understanding of what kind of mistakes often lead to your memory leaks, you’ll eventually be able to spot these traps earlier when writing your code and thus you’ll learn to avoid many types of memory leaks altogether.

You can download a free trial of ANTS Memory Profiler from Red Gate’s website and try it on your own application.