Learning to play the symbols

If you are ever asked to figure out what’s happening inside an application, one of the essential bits of information is the Program Database (.PDB) file. The PDB contains symbols: a list of the methods, properties, and variables in an application, together with exactly where each of them appears in the compiled code. Once an application has been compiled into binary or Intermediate Language, all of the ‘friendly’ method and variable names are replaced with machine-friendly numbers. The PDB is a cross-reference between the human-friendly names and the machine-friendly addresses, as well as a record of where these names appear in the source code file.
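To make the cross-reference idea concrete, here is a toy sketch in Python. It is not the real PDB format (a real PDB is a binary file), and the SymbolEntry class, the offsets, and the file paths are all made up for illustration. It only shows the kind of lookup a profiler or debugger needs: given an address in the compiled code, find the method name and the source line.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolEntry:
    start_offset: int  # where the method's compiled code begins (hypothetical)
    method: str        # human-friendly name
    source_file: str   # absolute path recorded at build time
    line: int          # first source line of the method


# Hypothetical symbol table, kept sorted by start offset.
SYMBOLS = sorted(
    [
        SymbolEntry(0x2050, "Program.Main", r"C:\src\App\Program.cs", 12),
        SymbolEntry(0x20A4, "Orders.Load", r"C:\src\App\Orders.cs", 40),
        SymbolEntry(0x2130, "Orders.Save", r"C:\src\App\Orders.cs", 88),
    ],
    key=lambda entry: entry.start_offset,
)
STARTS = [entry.start_offset for entry in SYMBOLS]


def resolve(offset):
    """Return the entry whose code contains the offset, or None if it lies before all entries."""
    index = bisect.bisect_right(STARTS, offset) - 1
    return SYMBOLS[index] if index >= 0 else None


hit = resolve(0x20B0)
print(hit.method, hit.source_file, hit.line)
```

Without the table, all you have is 0x20B0. With it, you have a method name and a place in a source file, which is the whole point of shipping the PDB.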

In addition to debugging, a PDB is also useful for code analysis tasks such as code coverage, performance profiling, and memory usage. ANTS Profiler is a good example of an analysis application that benefits from the knowledge in the PDB file, so it’s important to know how the software uses the PDB and what can happen when the PDB is missing or incorrect.

In ANTS Profiler, the PDB is used for two main functions: to find the source code for the application being profiled and to filter out methods that you’re not interested in or are powerless to fix, such as the assemblies distributed by Microsoft as part of the .NET Framework or third-party assemblies.

Quite a few ANTS Profiler incidents come through support because of a missing or damaged PDB, to the point where I almost pick up the phone and answer ‘Red Gate Software, this is Brian, have you got a PDB?’

The most common solution is simply to ensure that the PDB is in the same folder as the assembly you’re profiling. If your program loads DLL assemblies from different folders, then the PDB for each DLL should be in that DLL’s folder. This gets more complicated if the assembly loads from the Global Assembly Cache at %SYSTEMROOT%\Assembly, because the shfusion shell extension replaces the normal Explorer view of this folder with a special view of the assemblies. The GAC is not physically a single folder but a complex web of subfolders, which is what makes it possible to register multiple versions of the same assembly. The special view lets you register an assembly by simply dragging and dropping the file into the GAC folder, which is a shade easier than running GACUTIL.exe from the command prompt!

To copy the PDB into the GAC, open a command prompt, change directory to %SYSTEMROOT%\Assembly\GAC_MSIL, find the subdirectory for the version of the assembly that you’re going to load, and copy the PDB file there. Alternatively, you can use regsvr32 /u shfusion.dll to disable the shell extension and navigate the GAC folder normally using Windows Explorer. You have my sympathy: this is not easy! What I’d like to see in a future version of Profiler is the ability to have the PDBs copied into the right places automatically, if that’s practical.
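Before copying files around by hand, it can help to find out which assemblies are actually missing a PDB. The following is a minimal sketch, not part of ANTS Profiler: the function name assemblies_missing_pdb is my own invention, and it assumes that ‘has a PDB’ simply means a file with the same base name and a .pdb extension sits in the same folder. It walks the real folder tree on disk, so the special Explorer view doesn’t get in the way.

```python
import os


def assemblies_missing_pdb(root):
    """List .dll/.exe files under root that have no same-named .pdb in their folder."""
    missing = []
    for folder, _subdirs, files in os.walk(root):
        # Compare case-insensitively, as Windows file names are not case-sensitive.
        present = {name.lower() for name in files}
        for name in files:
            stem, ext = os.path.splitext(name)
            if ext.lower() in (".dll", ".exe") and (stem + ".pdb").lower() not in present:
                missing.append(os.path.join(folder, name))
    return sorted(missing)
```

Point it at the application’s folder first, for example assemblies_missing_pdb(r"C:\MyApp\bin") with a path of your own, and only then worry about the GAC.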

If your application’s assemblies exist in the GAC, it’s important to remember that they will be loaded from there first, even if another copy of the assembly exists in the same folder as your application’s executable! If you use ANTS Profiler for performance profiling in the ‘show only methods that have source code’ mode and you don’t get any method hits or source code for your assembly, check the GAC — your application may be loading the assembly from there rather than from the working folder.
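Checking the GAC can be done from a script as well. Here is a rough sketch under the same assumptions as before: find_gac_copies is a hypothetical helper, not a Red Gate or Microsoft tool, and it simply searches the folder tree for a file name rather than asking the .NET loader what it would really load. For each folder that holds a copy of the assembly, it also reports whether a matching PDB is sitting beside it.

```python
import os


def find_gac_copies(gac_root, assembly_file):
    """Return (folder, has_pdb) for every folder under gac_root containing assembly_file."""
    wanted = assembly_file.lower()
    pdb_name = os.path.splitext(wanted)[0] + ".pdb"
    hits = []
    for folder, _subdirs, files in os.walk(gac_root):
        present = {name.lower() for name in files}
        if wanted in present:
            hits.append((folder, pdb_name in present))
    return sorted(hits)
```

On a Windows machine you might call it as find_gac_copies(os.path.expandvars(r"%SYSTEMROOT%\assembly"), "MyLibrary.dll"), where MyLibrary.dll stands in for your own assembly. An empty list suggests the assembly is coming from the working folder after all.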

It’s also worth noting that you can generate a PDB for a RELEASE build just as you can for a DEBUG build. In some versions of Visual Studio, a PDB is produced by default only in the DEBUG configuration, but one can be generated in the RELEASE configuration too by using the ‘generate debugging information’ setting. Because DEBUG builds are less optimized, the PDB for a debug assembly may not be valid for a release build of the same code. Case in point:

‘Hello, Red Gate Software, this is Brian, have you got a PDB?’

‘Yes, you silly man, of course I do. However, the hit counts in the source code view of my application are all wrong!’

‘Wrong? Wrong how?’

‘Well, I have a blank line that got hit five times. Explain that!’

‘Uhhhhhh…’

I guided him through rebuilding the application in DEBUG mode and redeploying the assembly and the PDB, which solved the problem. I suspect that the PDB was either corrupt or had been built for the release rather than the debug version of the assembly. Since the offsets for method information are potentially different between an optimized release build and a non-optimized debug build, a PDB that doesn’t match the assembly could plausibly attribute hits to the wrong lines.

In another case, a programmer had everything set up correctly, but ANTS Profiler continually prompted him for the source code file. What went wrong? Absolutely nothing; it was working as advertised. Since the PDB stores the absolute path to the source code file, the source file must be present in the folder from which the application was originally built. This causes some potential inconvenience when moving the application to the testing environment and profiling it there, because the original source code tree needs to be replicated on that computer.

ASP.NET 2.0 uses a dynamic compilation model where the ‘assembly’ is compiled and cached on-the-fly in the ‘Temporary ASP.NET Files’ folder when the page is requested, so how do you get a PDB? Basically, the same way as from Visual Studio. Simply edit the web application’s web.config file and set debug="true" on the compilation element. The PDB is then created in the temporary folder alongside the compiled assembly, and the web application can be profiled as normal.
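If you have a lot of test sites to switch over, the edit can be scripted. The sketch below is only an illustration: enable_debug is a name I made up, the sample web.config is deliberately minimal, and it assumes the configuration element carries no XML namespace. Be aware that Python’s ElementTree discards comments when it parses, so you would not want to run this blindly over a carefully commented production file.

```python
import xml.etree.ElementTree as ET


def _child(parent, tag):
    """Return the first direct child with this tag, creating it if absent."""
    for element in parent:
        if element.tag == tag:
            return element
    return ET.SubElement(parent, tag)


def enable_debug(web_config_xml):
    """Return web.config text with debug="true" set on system.web/compilation."""
    root = ET.fromstring(web_config_xml)
    compilation = _child(_child(root, "system.web"), "compilation")
    compilation.set("debug", "true")
    return ET.tostring(root, encoding="unicode")


# Minimal, hypothetical web.config used only to demonstrate the change.
SAMPLE = """<configuration>
  <system.web>
    <compilation debug="false" />
  </system.web>
</configuration>"""

print(enable_debug(SAMPLE))
```

Remember to set it back to false afterwards; debug compilation is meant for development and testing, not for a live site.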

Managed code can be extended by dynamically generating assemblies using CodeDom. A program may do this, for example, if it allows users to write their own ‘macros’ or if some information needs to be gathered from the user before generating an assembly. In this case, you control whether symbols are generated through CompilerParameters.IncludeDebugInformation: set it to true to get a PDB. If the source code for the assembly is a string rather than a code file, though, you may be prompted for the location of the source code file (which, of course, doesn’t exist!). If the assembly is generated in-memory, then your only options are to profile all methods or specify the filter manually.

Once you understand what debugging symbol files are for and how they work, you can get more information from performance profilers like ANTS.