Not long ago, a crack team of SQL Server experts was flamed for a crime against database normalization they didn't commit. These men promptly escaped from the taunts of academic relational theorists. Today, they survive as website editors and software evangelists. If you have a problem, if no one else can help, and if you can find them, maybe you can hire the DBA Team.
Scene 1: A bad morning starts with phone calls
There were ripples in my coffee cup. My hand was shaking. Why? Because I could hear the phone ringing from the end of the corridor as I arrived for work. Ten minutes later, there was enough of a pause to take a tepid sip. The users were revolting. Application slow. Application stalled. Report running slow. "Is the server down?" they asked. Had I, Joe DeeBeeAy, changed something on the server without telling them? No, the server was not down, and no, as far as I knew, nothing had changed, though sometimes it felt like I was the last to know about any changes. In among it all was that strange call from HQ telling me the CTO was on his way over to find out what was going on.
The CTO visiting me was a rare and serious event. I had to act fast, but could my brick of a laptop get any slower? Task Manager said CPU was higher than normal. PerfMon had reported high disk I/O latency. This was crazy. What did it all add up to, and what was causing it? High disk I/O latency…is it a SAN problem?
On cue, the CTO stormed in, apparently looking for a fight.
"What the heck is going on in here today? Are you running maintenance jobs outside the window again?"
"No, I swear. I think it's a SAN problem, but…"
He sighed wearily. "Is there a DBA who doesn't always say that? Maybe you accidentally dropped an index?" He swatted away my plea of innocence. "So what are you doing to fix it?"
I thought quickly. "I'll run a Profiler trace…"
"Are you nuts? One minute you're telling me about high disk I/O, next you want to add to it. Are you trying to bring the whole system to its knees?"
Sweating, wrong-footed, I stood blinking at the CTO. His take-no-prisoners attitude was well known, but this was bullish even for him, and since when did he take to wearing a baseball cap?
Suddenly, the CTO broke into a broad grin. He removed his glasses. Took off his cap. Ripped off prosthetic skin and fake moustache.
Stunned, silent, I stood, blinking dumbly at none other than the renowned SQL Server expert, Jonathan Kehayias.
"Jonathan? What the…?"
"I worry about you, Joe. You may fold under questioning. I hear you're in the frame for heinous crimes against a database server."
"I'm innocent. Minor point, I know."
"I think I can help you prove it, but not alone. We have to get the DBA Team back together."
"The DBA Team? I've heard about them, a crack team of SQL Server administrators forced into hiding after being framed for a crime against database normalization they didn't commit. I assumed that was a myth"
"No, they are real all right and we need their help. But first we need to find them…."
Scene 2: Grant becomes scary once more
We parked my sedan near a forest.
"Serious heat is coming your way from HQ," explained Jonathan, as we strode into the trees. "The developers are whispering in the ear of Eric 'Bloodaxe' Pedersen, one of the business managers. He's telling anyone that will listen that your relational database is the only thing still hindering the progress of the new agile system. It sounds like you're the fall guy."
Out of breath, I trailed Jonathan into a forest glade. Birds were singing, scouts were running around happily, tying knots, putting up tents and lighting fires. A voice boomed out above the general hubbub.
"Hey scouts, great progress; we should have the moose cooked before sundown!"
"Grant," shouted Jonathan "Long time. We've got a DBA here needs your help pretty bad…"
I blinked in surprise. "This…this is Grant Fritchey…the…Scary DBA?"
Grant bore down on us. "No, no! I've put that behind me now. I'm teaching leadership skills to scouts, and I have at last found peace and harmony. I'm no longer the Scary DBA people once knew."
"But this is a real emergency, Grant!" said Jonathan. “The developers are saying there is nothing wrong with the application and that it's the database at fault!"
"No! Peace and harmony!" he shouted, clutching his temples and beginning to color up, "Don't…make… me…angry!"
"…did I mention that they're using the ORM to fix the object-relational impedance mismatch?"
Grant seemed to grow visibly taller, angrier. His fists clenched. "What? It's…it's….clobberin' time!!!"
Birds scattered. The scouts looked up in amazement as we rushed off, back to the car.
Scene 3: Steve hits the trail
30 minutes later, the old sedan was bumping down a dirt track towards an old ranch building. Dust blew as we pulled up outside. A grizzled cowboy in chaps finished tying up the reins of his horse and eyed us nonchalantly. "Jonathan, Grant…long time…what brings you and your friend all the way out west?"
"Hey Steve, this is Joe DeeBeeAy. He needs our help. We need to get the DBA Team back together."
Steve fiddled with the horse's reins. "I don't do that stuff anymore, Jonathan, you know that. I've met me a pretty girl, settled down, got me a farm, kids, dogs, horses….alpacas, sheep…"
"Our main database is dog slow." I pleaded, "I don't know what's causing the problems, but every database call is taking ages. I need your help."
"… pigs, cats, Prius." Steve continued, distracted, before suddenly staring me straight in the eye. "Hey, stranger, that database might be your problem but it's not mine. I've put my past behind me. I've been falsely accused of a database crime for the last time."
"Other DBAs are suffering the same fate, Steve," said Jonathan. “Joe is innocent, but being blamed for the slow database. Bloodaxe Pedersen is telling the CIO how much money they could save by replacing the relational database with a NoSQL database, and Joe with a new coffee machine."
Steve spat in the dust, wheeled around and strode into the ranch. We sat silent, defeated.
Suddenly, he emerged through the billowing dust, sporting a Hawaiian shirt of shocking garishness.
"Maybe a DBA's gotta do what a DBA's gotta do. Lead the way, pardners!"
Scene 4: The hunt begins
Back in my cubicle, we crowded around my laptop. Jonathan got straight down to business, "So, Joe, what have you got for us?"
I told him what I knew.
Physical Disk\Avg. Disk Reads/sec and
Physical Disk\Avg. Disk Writes/sec were showing latency values over 50ms. I thought it could be a SAN issue.
"What else do you know? Have you checked the wait statistics?"
Monitor wait stats without lifting a finger, with the SQL DBA Bundle.
Grant paced the room, a frown like thunder. "What else do you know? Have you checked the wait statistics?" He eyed Steve meaningfully. "Run the
The laptop spewed the results reluctantly onto the screen.
"See here, pardners, that's
PAGEIOLATCH_SH waits, sitting pretty at the head of the line. That's delays in obtaining a latch for a buffer page because of delays getting the page from disk…"
"I told you it smelled like a disk problem!" I cried triumphantly, already heading for the exit.
"Not so fast youngster! Maybe it's the disk that's at fault and maybe it's not…"
Sure, and maybe it was an albino gator in the basement, but right now, I only had a few hours before Bloodaxe Pedersen started taking the air-pressure out of my career, and I was off for a fistfight with some SAN admins. Dammit, how long had I been telling them that we needed to move this server over to SSDs?
Interlude: Grant Fritchey on wait stats
Scene 5: A setback
20 minutes later, I was back, flopped sulkily into my chair, holding a bloody handkerchief to my nose.
"I see the SAN admins agreed with your diagnosis," smiled Steve.
"Thath not thunny. They say there's nothing wrong with the performance of the SAN. They've run tests, the IOPS is right on spec. They asked if I'd tried adding an index. I lost my cool. Big mishtake, but that index barb always cuts deep."
"Joe, pardner, sometimes you need to learn the hard way.
PAGEIOLATCH_SH waits may be top of the list, but we need to know the 'normal' wait profile for this server. Don't suppose you have baselines for the wait statistics?"
I looked back at him, as blank as a starting pistol in a shoot-out. "No, I figured not," he said, shaking his head.
"Steve's right, and even if disk I/O is the bottleneck - meaning the disk subsystem can't return pages fast enough, so sessions are waiting for latches - that doesn't mean disk I/O is the problem."
I stood blinking dumbly at Jonathan. This was becoming a habit today.
Grant was looking more menacing by the second. "Jonathan's right. Think more laterally, Joe. I've seen those
PAGEIOLATCH waits on 100 servers. Maybe on 10 of them the problem is the disk. The other 90 it's something else. Sometimes, it's a memory problem. If the buffer pool is experiencing memory pressure, you'll see cache churn and a lot of disk I/O. Most times, though, it's a rogue application, a query gone bad, doing huge range scans where seeks would do the job."
"Back in Colorado we sing it like this 'Oh, give me a home where the range scans all roam, then the disk I/O metrics will play…'," crooned Steve.
I slapped my forehead, "You know that CRM application the devs are so proud of? They modify it so often I've stopped trying to keep up, but last week's deployment sounded different. I heard talk around the water cooler about finally matching their object persistence layer to a more flexible database structure. I think I even heard terms like 'EAV' and 'fact table'."
Grant's temples seemed to pulsate visibly…"Object persistence layer! No! Don't…make…me…angry…"
Even Jonathan flinched at the EAV part. "Grant, calm down! Joe, you're bleeding all over your expenses form! Steve, grab the laptop! It's time to get us some inspiration…"
Scene 6: Coffee and doughnuts
We sat huddled round the canteen table, clutching mugs of coffee. Jonathan munched ruminatively on a jam doughnut. "I've heard about this CRM team. Young, very agile, up and coming. They have some…challenging…ideas about how to write data applications, but they're the new golden boys with HQ. We'll need some cast-iron proof before we start lobbing grenades in their direction."
"Well then, we better arm ourselves properly before we go picking any more fights," said Steve. “These problems are happening now, right? So let's look at
sys.dm_os_waiting_tasks. This query will show us all the currently blocked and blocking sessions and the associated statements…"
SELECT blocking.session_id AS blocking_session_id , blocked.session_id AS blocked_session_id , waitstats.wait_type AS blocking_resource , waitstats.wait_duration_ms , waitstats.resource_description , blocked_cache.text AS blocked_text , blocking_cache.text AS blocking_text FROM sys.dm_exec_connections AS blocking INNER JOIN sys.dm_exec_requests blocked ON blocking.session_id = blocked.blocking_session_id CROSS APPLY sys.dm_exec_sql_text(blocked.sql_handle) blocked_cache CROSS APPLY sys.dm_exec_sql_text(blocking.most_recent_sql_handle) blocking_cache INNER JOIN sys.dm_os_waiting_tasks waitstats ON waitstats.session_id = blocked.session_id
"We can follow the blocking chain. That'll show us which request is at the head and is causing all the locking waits."
I winced, "That's one nasty looking request that's causing the blocking."
Steve pointed at the screen, "The
resource_description tells us the ID of the page that is the source of blocking contention so we can find out which table it belongs to." Quick as a flash, he was bashing out a
DBCC PAGE query.
"That's a table in the CRM database!" I cried, jumping to my feet. "We've got 'em!"
"It's clobbering time!"
"We better arm ourselves properly before we go picking any more fights"
Know exactly which queries are using the most resources, with the SQL DBA Bundle.
"Whoah, people, we still haven't proved that this is what's causing the disk I/O bottleneck!" Steve shouted, “We've got to be sure. If the app is as wacky as you fear, it could be a heck of a tuning effort…and that means a lot of time and money. You think Bloodaxe Pedersen will take this lying down?"
"Eric Pedersen doesn't take much lying down."
Who said that?!
We looked up to find ourselves face to face with the man himself.
Scene 7: Canteen showdown
Eric towered over the crowded table, chuckling quietly. "Well, well, how many DBAs does it take to change an index? More even than I realized. No wonder you're such a drain on our IT budget…"
"Even you must be tired of that index jibe by now, Eric," I growled “But you won't be laughing when you see what we found out about your beloved new CRM app…"
"Whenever I need a laugh, Joe, I just remember your outdated ideas about what kind of database the modern business needs. Dave here has told me all about how, yet again, it's a roadblock to their new CRM development work."
Developer Dave emerged from behind Bloodaxe, smiling, "Sure is! All we really need is a simple data repository to serialize our object data."
I looked nervously over at Grant, wanting to avoid further bloodshed, but fortunately he was distracted, staring intently at some execution plans on my laptop.
"…anyway, it's all in the latest report to HQ," Dave continued, oblivious to the danger. “Enjoy your little party while you can, but when they read about how you let us down, I'm guessing they'll be mad. We're just off to tell them about it now. And don't forget - a good NoSQL data repository don't need no administrator."
With that, Developer Dave picked up a doughnut from the plate, crushed it and stalked away, jam dripping through his fingers like blood.
"Geez, are all developers such drama queens?"
"Takes one to spot one, Grant," said Jonathan.
"Hey! Don't…make…me…. ANGRY!"
"Pardners, did you not hear what Bloodaxe said?" bellowed Steve “They're heading to HQ now! We've got to beat them to it!"
Scene 8: The chase
We dashed out into the car park and crowded into my sedan.
"But we haven't even finished the analysis," I cried, breathless. "We don't have proof."
"Listen, pardners, name any DBA task…denying table access to developers, killing developers' connections…I've done it at high speed on the back of a horse. A bit of data analysis in the back of a sedan will be a piece of cake."
We pulled onto the highway, tires smoking. I glanced in the rearview mirror. A Jeep was on our tail.
In the back seat, Jonathan tapped away frantically. "The last piece we need to examine is the virtual file statistics. It shows us the disk read and write patterns, how the I/O load is distributed across files and databases on the SQL Server instances. It'll help us nail the files and objects that are eating all our I/O."
I swung a sudden left, tires screeching, mounted the sidewalk and sent a pile of old cardboard boxes scattering.
"WHOAH!" screamed Grant, "Will you look at the size of the I/O stalls on this object!"
Jonathan scanned through the results, "It's associated with…the CRM database! It all correlates! We have the proof we need!"
Grant grabbed the laptop and frantically bashed away at the keyboard. "OK then, just one more thing we need to do."
Jonathan looked nervously through the rear window. "Step on it, Joe! Eric and Dave are right behind us!"
"My foot's on the floor! We're almost there!" I cried.
"It's no good, they're going to overtake!"
We screeched into HQ car park, neck-and-neck with their Jeep.
"All done," shouted Grant. "It's clobbering time!"
With that, he threw the laptop out into the Jeep's path. The Jeep hit it, flew into the air and barrel-rolled, before crashing to the ground upside down and exploding.
We jumped out of our car, looking back in disbelief as Eric and Dave crawled out of the wreckage, dazed but otherwise completely unscathed. We turned and rushed into the building just ahead of them.
Scene 9: HQ
We burst into the office, shouting at each other and gesticulating wildly.
"Phil!" Eric called out, spotting the CIO in the corner. He looked remarkably unfazed. White hair, wild beard, quietly nursing a cup of tea. He had the look of a wise old bird who'd been in the IT trade a good while, and was no longer shocked easily.
"So sorry about all this fuss," said Eric. "But I knew you'd want to discuss our latest report on the CRM project as soon as possible. The application is functioning perfectly, but our antiquated relational database model simply can't adapt to our changing business requirements. We're losing revenue hand over fist as a result. And the best our DBA can do," he gestured toward me, dismissively, "is request an expensive SAN upgrade. Again."
Phil raised a questioning eyebrow in my direction.
"It's true, I did, but I was wrong about that."
The CIO refilled his cup, serenely, and signaled for me to proceed.
"With the help of Jonathan and the DBA Team, I've learned a lot this morning about wait statistics, and about how massive range scans from the new CRM app are bringing the IO subsystem to its knees. All the data is on my laptop, right here...Ah." I made a futile grab for the laptop. Gone, of course, leaving only the hernia from lugging it about.
"A good DBA always backs up his data. You know that, Dave."
Quickly and securely copy your backups to the cloud, with the SQL DBA Bundle.
"And what data would that be then?" cried Dave, triumphant.
Grant looked scary. "A good DBA always backs up his data. You know that, Dave. Even during a life-threatening, high-speed car chase. I uploaded it all to the Windows Azure cloud, via Cerebrata, just before I…err…accidentally dropped the laptop into the path of your Jeep."
"And it proves that the CRM application is the culprit. Not the disks and not the database," I smiled.
"Sure, we could probably find an index or two that would help, but the problems run deeper than that," explained Grant. "For a start, I noticed from the execution plans that we have implicit conversion problems that are forcing table scans for every query the CRM app makes against a
VARCHAR column, due to the ORM parameterizing it as
Suddenly, Phil spoke up.
"It seems to me that we have a lot of reasons for the DBA and development teams to start working together more closely. Eric, Dave, I'm sure you can sit down with Joe, explain the business domain fully, what it means to your applications in terms of the data model. Joe, you've learned a lot recently, and I'm sure, with a bit of help from the DBA Team, you can find a way of doing what they want to do without cramping their style too much."
Grant smiled sweetly. "Of course! They can even use that ORM too, but I'd say you'd need a bigger server for that."
As we left the office, I was surprised see how warmly Phil shook Jonathan's hand. The final piece of this puzzle seemed to slip into place. Had our CIO hired Jonathan, and with him the DBA Team, setting this crazy day in motion?
Seeing my searching look, Phil confessed, "Jonathan and I met many years ago. I knew he could help with this sort of problem. Besides, it was about time the DBA Team got back together. Who knows when we might need them again…?"
I shook my head, smiling.
"Oh, and don't forget to order yourself a new laptop, Joe," said Phil, as he closed the door. "Plus maybe a monitoring tool to help with future analysis. Collecting and maintaining all that baseline data manually is a big job for a solo DBA."
Jonathan put his arm round me and grinned.
"I love it when a plan comes together!"
Want to monitor wait stats and check baselines without lifting a finger?
Try new SQL Monitor 4, which now includes real-time data on wait stats for every single one of your servers, so it’s easy to spot what’s slowing things down.
SQL Monitor 4 also gathers baselining data automatically, and presents it to you as a simple graph, so you can check what’s normal for your servers at a glance.
Try it today as part of the SQL DBA Bundle, which contains SQL Monitor 4 and six other tools that help you keep your data safe:
- Create up to 95% compressed backups and schedule regular restores, including full verification.
- Monitor your servers in real time so you’re always the first to know about problems.
- Protect your data from unauthorized access with 256-bit AES encryption.
- Review full details of schema and data changes going into your servers, including SQL CREATE scripts with highlighted differences, so you can deploy updates with zero errors.
- Save time writing SQL with code-completion and customizable SQL snippets.
Wait Stats whitepaper
- What each wait stat does
- How to diagnose poor performance
- 14 code samples to help you start capturing wait stat data
This 40-page guide is a great way to start working with this powerful but underused SQL Server feature.
Plus, receive emails about new episodes of the DBA Team and other helpful DBA updates.
Here's a direct link to the .zip file, in case your download didn't work: