The Big IT Outage: Where are we going?

As you all know, just a few weeks ago we had the big global CrowdStrike IT outage.

Some lucky IT guys like me got a double dose: Microsoft also suffered an outage in North Europe in the same week.

I have worked in IT for more years than I care to confess. People like me have many stories about interesting situations, some publicly known and some from inside private companies.

A situation like this Big IT Outage brings back many of these memories. It’s common knowledge that we should learn from the past, so maybe we can extract something from them.

The MDAC Challenge

Sometime around 1996 or 1997, Borland was posing a challenge to Microsoft. The BDE (Borland Database Engine) used with Delphi was performing better than ODBC.

As a result, Microsoft released MDAC (Microsoft Data Access Components), a package including ADO and OLE DB. It had to be released and evolved so fast that it became a mess:

  • Windows had one version of MDAC
  • Office had a different version of MDAC
  • Visual Studio had a different MDAC version
  • SQL Server and SSMS had a different version of MDAC

You can imagine the mess this kind of mix creates. Around that time, Microsoft held a big conference in São Paulo called TechEd/BizApps. I attended this conference, and I never forgot what one of the Microsoft speakers told us. I can no longer remember who the keynote speaker was, only that he was someone important enough inside Microsoft to make these statements.

The statements: he acknowledged the mess created by the mixed versions of MDAC spread everywhere and promised Microsoft would never make a similar mess again.

This was around 1997. Compare that with the huge mix of software package versions we use today to build anything, which is now considered normal.

Trustworthy Computing

In 2002, after a series of security issues that called the security of Microsoft software into question, Microsoft stopped the development of new software features for two months.

During this time, all Microsoft developers focused on improving the security of all its software and software development processes. It was a considerable investment (two months of the salary of every employee) focused on security.

During this time, one video I watched caught my attention. The video illustrated how Microsoft implemented automated software testing. In summary, the automated tests ran all night, hundreds of them.

In the morning, each development team collected the test results from the night and worked on the identified issues, fixing the bugs.

Every time I find a bug in released software that no one seems to have noticed before, and every time I need to open a support ticket to get attention for such a bug, I find myself asking how that automated testing process evolved. How is it being applied today? Is it being applied at all?

It was only a field type

A few years ago, when working as a developer, I discovered one team of developers had prepared a change in their application that would affect many related applications. Each of the affected applications was managed by a different team.

After some emails and one quick meeting, all the teams synchronized to update each of the apps so it would be compatible with the new change. The teams coordinated the changes to go online at midnight, 1am, 2am and so on, each team publishing its solution at a specific hour and reporting whether everything was ok. If not, a complete rollback was prepared.

Every time I find a production bug which should have been obvious, I keep thinking how helpful a process like that would be.
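
The coordinated rollout described in this section can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the tooling those teams used: the `Step` class, `run_rollout`, and `make_step` are hypothetical names, the hourly schedule is left out, and each team's deploy, health report, and rollback are reduced to plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One team's slot in the rollout (all names here are hypothetical)."""
    team: str
    deploy: Callable[[], None]        # publishes the team's change
    health_check: Callable[[], bool]  # the team's "everything is ok" report
    rollback: Callable[[], None]      # restores the team's previous version


def run_rollout(steps: List[Step]) -> bool:
    """Deploy the steps in order; on the first failure, roll back every
    step attempted so far and report that the rollout did not complete."""
    attempted: List[Step] = []
    for step in steps:
        attempted.append(step)
        try:
            step.deploy()
            ok = step.health_check()
        except Exception:
            # A crash during deploy or check counts as "not ok".
            ok = False
        if not ok:
            # Complete rollback, newest change first.
            for done in reversed(attempted):
                done.rollback()
            return False
    return True


def make_step(team: str, log: List[str], healthy: bool = True) -> Step:
    """Build a fake step that only records what happened in `log`."""
    return Step(
        team=team,
        deploy=lambda: log.append(f"deploy {team}"),
        health_check=lambda: healthy,
        rollback=lambda: log.append(f"rollback {team}"),
    )


if __name__ == "__main__":
    log: List[str] = []
    steps = [make_step("A", log), make_step("B", log, healthy=False), make_step("C", log)]
    print(run_rollout(steps), log)
```

In this sketch, team C never deploys because team B reported a problem, and both B and A are rolled back in reverse order. The point is the same as in the story: the rollback path is prepared before the first change goes live.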

IT Outage Cause? Solution?

My proposal is that, instead of looking only or mainly at software development practices, quality practices, and more architectural definitions, we need to look at social changes.

We have huge differences between generations, in the technologies available, in the level of globalization, and much more.

Together, these differences affect software development practices. The fact that software quality may be going down instead of up may mean we are not managing social differences and their evolution well.

A hilarious example

There is an old slapstick comedy movie named Idiocracy. It is a bit difficult to find on streaming services, but it became a classic.

It’s not easy for a slapstick comedy to become a classic. This one became famous because of the first 3 minutes of the movie, which make a curious prediction about the future of humanity. The problem is that the prediction makes sense, and sometimes we feel like it’s happening.

My comparison here is about social changes happening in unexpected and unnoticed ways. This comedy proposes one. The social changes that affect the software and data areas in IT are far more numerous and may be more difficult to track, but I believe they are present.

Check the video about the comedy: https://www.youtube.com/watch?v=sP2tUW0HDHA

Summary

Does it make sense, or am I only being nostalgic, like any older IT guy missing the “old times”? I would love to know your thoughts in the comments.

Having a huge global outage because someone deployed a change on a Friday evening, making famous IT jokes become real: is that a social change or not?

About the author

Dennes Torres

Dennes Torres is a Data Platform MVP and Software Architect living in Malta who loves SQL Server and software development and has more than 20 years of experience. Dennes can improve Data Platform Architectures and transform data into knowledge. He moved to Malta after more than 10 years leading the devSQL PASS Chapter in Rio de Janeiro and is now a member of the leadership team of the MMDPUG PASS Chapter in Malta, organizing meetings, events, and webcasts about SQL Server. He is an MCT and an MCSE in Data Platforms and BI, with more titles in software development. You can get in touch through his blog https://dennestorres.com or at his work https://dtowersoftware.com