“You see, Watson,” he explained in the early hours of the morning as we sat over a cup of tea in Baker Street, “it was perfectly obvious from the first that the explanation for this query’s sudden poor performance was elementary”.
“But how could you guess what the cause was?”
“Firstly, we see that the query returns few rows and yet the execution plan tells us that the query optimizer chose to scan a very large table to return them. Why, we ask ourselves, when clearly a seek operation would have been more efficient? A cursory glance at of the error logs reveals that the DBA’s scheduled “update statistics” job has been failing for over a month. Perhaps when we wake the DBA from his slumber, under this desk, we can teach him how to set an alert”.
“You reasoned it out beautifully,” I exclaimed in unfeigned admiration. “It is so long a chain, and yet every link rings true.”
“It saved me from ennui,” he answered, yawning. “Alas! I already feel it closing in upon me. My life is spent in one long effort to escape from the commonplaces of existence. These little problems help me to do so.”
So ends one of Sir Arthur Conan Doyle’s lesser-known detective stories. The kind of reasoning and logic that Sherlock Holmes applies to a problem is very similar to that required of the DBA, when confronted with a support request and a typically vague description of the problem.
Sherlock’s task was to find the single clue that would unravel, isolate and expose the scarlet thread of murder running through the colorless skein of life. In a million lines of code riddled with the creatures, the DBA must find the particular bug that today brings chaos to their database systems. One thing you never saw Sherlock do was list down all possible causes and motives for the crime and then work through all of them, a-to-z. Likewise, if a DBA adopted this approach to troubleshooting, it might take a year to tune a single query.
When the clue you need is hidden in a million smaller irrelevant details, you cannot simply test every possible cause until you find the villain. Like Sherlock, the DBA needs sound deductive reasoning, based on reasonable situational knowledge.
Given the many possible causes of sub-par query performance, where do you start? If you decide to investigate fractionally higher humidity levels in the server room as a probable cause, then you might not be the right person for the job. On the other hand, if you deduce that it could be the rather large query that the CEO was executing on the same server as the slow query, it doesn’t matter if you are right or wrong. It is a good place to start the process.
Sadly for your ego, all of your deductions will be wrong except the last one, and this failure shouldn’t bother you in the least. As you make each guess, you sift through the available evidence. If your guess is wrong, and it feels like you are spending more time diagnosing the entire mess of an environment than pinpointing the root cause of a specific problem, don’t be discouraged. At each stage, your knowledge of the broader topology of the system deepens, and you start to recognize those parts that are held together more by chance and prayer than good engineering practices. They are likely tomorrow’s villain, even if they aren’t today’s.
The next time a problem does arise, your powers of deduction will be stronger, and your first guess at the root cause will be closer to the mark than it was the last time. Of course, for Sherlock, finding the culprit was the end. For a DBA, it is just the end of the fun part. Actually solving the problem is when the real game begins…