Event-Driven Debugging

Most application troubleshooting involves getting an error, analyzing the error message, and, at worst, attaching a debugger to work out the real cause. What is not really covered is how to troubleshoot an application that is not errant but is having a performance issue, more than likely in the middle of the night when you are snug in your bed, sawing logs. What you need is an ever-vigilant cyborg who never sleeps to sit in front of your server all night, but as SkyNet is not live yet, you can settle for the next-best thing.

Windows provides performance counters and alerts that can tell you when an application reaches an unacceptable threshold of naughty behavior, but although it can tattle on your brainchild, it won’t be the child psychiatrist that you need to tell you why he’s pulling your server’s pigtails and pulling faces at the teacher. What you need is to plug a debugger into Performance Monitor and have it tell you what’s going on with your application at the time.

For this purpose, I’ve used Microsoft’s MDbgEngine as the basis for an application that will dump a program’s stacks. I call it Application Slicer Dicer Wonder Dumper Super Cyborg, or StackOMatic for short. StackOMatic can look at a program’s behavior and tell you if the stacks are not moving, but it can also work on the command line to dump all managed methods on the stack at will.
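To make the "stacks are not moving" idea concrete, here is a minimal sketch in Python. It is not StackOMatic's actual code, which is built on MDbgEngine and is not shown here. It only illustrates one plausible approach: take two stack snapshots a short time apart and flag the threads whose frames are identical in both. The snapshot shape and the frame names are made up for illustration.

```python
from typing import Dict, List

# Assumed snapshot shape: thread id -> call stack, innermost frame first.
StackSnapshot = Dict[int, List[str]]


def stalled_threads(before: StackSnapshot, after: StackSnapshot) -> List[int]:
    """Return the ids of threads whose stack is identical in both snapshots."""
    return sorted(
        tid
        for tid, frames in before.items()
        if tid in after and after[tid] == frames
    )


if __name__ == "__main__":
    # Hypothetical frame names, purely for illustration.
    first = {
        1: ["Monitor.Poll", "Program.Main"],
        2: ["Worker.Wait", "Worker.Run"],
    }
    second = {
        1: ["Monitor.Send", "Program.Main"],
        2: ["Worker.Wait", "Worker.Run"],
    }
    print(stalled_threads(first, second))  # thread 2 has not moved
```

A thread that shows the same frames in two snapshots is not necessarily hung (it may simply be waiting for work), so a result like this is a hint about where to look rather than a verdict.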

Now that there is a command you can use to dump the stacks, all you need to do is politely tell Windows to run it when you’re displeased with your creation as it’s trashing the CPU of your server at 3 AM.

The first step is to create a scheduled task to tell StackOMatic to dump your application. Start Task Scheduler, right-click Task Scheduler Library, and then choose Create Task. For this exercise I’m creating a task that will dump the Red Gate SQL Monitor Base Monitor Service. On the Actions tab, I enter the path to StackOMatic and use the arguments to log the stack dump to a file:

/PN:RedGate.Response.Engine.Alerting.Base.Service /OUT:c:\users\administrator\MonitorLog.txt
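If you would rather script this step than click through the Task Scheduler UI, the same task can probably be created with the built-in schtasks command. The sketch below only builds the command line and does not run anything. The StackOMatic location (C:\Tools\StackOMatic.exe) is a hypothetical path, the log path is my reading of the arguments above, and the one-time trigger is a placeholder that is only there because schtasks insists on a schedule; the task is really meant to be started on demand.

```python
from typing import List


def build_create_task_command(
    task_name: str, tool_path: str, process_name: str, log_path: str
) -> List[str]:
    """Build a schtasks argument list that registers the stack-dump task."""
    # The action is the StackOMatic command line with the article's arguments.
    action = f'"{tool_path}" /PN:{process_name} /OUT:{log_path}'
    return [
        "schtasks", "/Create",
        "/TN", task_name,   # task name
        "/TR", action,      # program and arguments to run
        "/SC", "ONCE",      # placeholder schedule (schtasks requires one)
        "/ST", "00:00",     # start time required by /SC ONCE
        "/F",               # overwrite the task if it already exists
    ]


if __name__ == "__main__":
    command = build_create_task_command(
        task_name="StackDumpBaseService",
        tool_path=r"C:\Tools\StackOMatic.exe",  # hypothetical install path
        process_name="RedGate.Response.Engine.Alerting.Base.Service",
        log_path=r"c:\users\administrator\MonitorLog.txt",
    )
    print(command)
```

On a Windows machine, the resulting list could be handed to subprocess.run(command, check=True) from an elevated prompt.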

Next, I go into Windows Server 2008’s Reliability and Performance Monitor and add a new Data Collector Set. This set raises an alert on the % Processor Time counter for the service. When the processor time breaches 50%, it runs StackDumpBaseService, the scheduled task I created in the first step.
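The alert can also be set up from the command line with logman instead of the Performance Monitor UI. The sketch below builds a "logman create alert" command with a threshold (-th), a sample interval (-si) and a task to run when the alert fires (-tn). It is an illustration under assumptions, not the exact configuration used here: the alert name is made up, the 15-second sample interval is an arbitrary choice, and the counter instance is assumed to match the process name passed to StackOMatic.

```python
from typing import List


def build_alert_command(
    alert_name: str,
    process_instance: str,
    threshold_percent: int,
    task_name: str,
    sample_interval_seconds: int = 15,
) -> List[str]:
    """Build a logman argument list for a CPU alert that runs a scheduled task."""
    # Per-process CPU counter; the instance name is assumed, not confirmed.
    counter = f"\\Process({process_instance})\\% Processor Time"
    return [
        "logman", "create", "alert", alert_name,
        "-th", f"{counter}>{threshold_percent}",  # fire above the threshold
        "-si", str(sample_interval_seconds),      # how often to sample
        "-tn", task_name,                         # task to run on alert
    ]


if __name__ == "__main__":
    command = build_alert_command(
        alert_name="BaseServiceCpuAlert",  # hypothetical alert name
        process_instance="RedGate.Response.Engine.Alerting.Base.Service",
        threshold_percent=50,
        task_name="StackDumpBaseService",
    )
    print(command)
```

Once created, the collector set still has to be started (for example with "logman start" followed by the alert name) before it samples anything.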

Whenever the service misbehaves, the task appends a fresh stack dump to the log file. Now when I go to work in the morning, I can see what the service was doing when it overloaded the processor and take action.