The IT Architect Interviews #3: The Machine Learning / Data Science Architect
Redgate’s Michaela Murray is on a journey to understand the role of IT Architects in digital transformation initiatives. The third interview in this series sees her talking to Kris Bock, Machine Learning / Data Science Architect at Microsoft, to find out how machine learning and artificial intelligence are changing the way businesses collect, process and use data.
Could we start with what your role is and what you’re doing right now?
Sure thing. I’m a Machine Learning / Data Science Architect in Microsoft’s FastTrack team and our job is to help customers who are looking to get into Azure. It’s almost like a guided walkthrough on everything from how they get started with governance to specific workloads they might be trying to migrate or provision in Azure. That’s across apps and infrastructure, through to data and analytics.
How did you get into this role?
My background was predominantly in infrastructure, originally as an engineer and then moving into more of a pre-sales role. What I brought to the table was the ability to not only look at specific technologies but understand how they work together. When I joined Microsoft, it was to help customers on their path to Azure, but it pivoted towards analytics, artificial intelligence and machine learning. Now my role is about moving customers up the ML and AI maturity model.
What first sparked your interest in ML and AI?
It was this idea that if we can apply a really structured process around ML and AI, it can take away some of the more mundane and rudimentary work and help businesses overcome barriers they face and improve employee satisfaction.
For instance, we’re doing some work looking at detecting anomalies in ocean waves by taking data that comes from buoys placed in the ocean. At the moment, you have highly skilled oceanographers looking at the data, which is a very repetitive and mundane task. Apply a machine learning model instead and they can start using their training and passion to do something more meaningful than just eyeballing data.
How long do projects like this take?
One of the biggest challenges we have is the expectation of the time involved, particularly if customers don’t have a lot of in-house skills. Even if you do have skilled data scientists, it can take weeks or it could take months, but you don’t know until you start the journey – and even when you do, it’s often an iterative journey. We have a lot of success working with customers who are software developers because they understand our job, they understand iterating, and they understand the need to pivot and change direction when you need to.
How do you typically approach projects?
When we start a project, we need the major stakeholders on board from day one because we need to agree what we’re trying to achieve and how it aligns to a business objective. Does it improve the customer experience and how does that work for the business, for example? Is it going to lead to increased revenue or decreased operational expenditure? There has to be a bottom-line objective, and the stakeholders have to be involved, otherwise the ML and AI solution that’s implemented won’t get treated as a fundamental part of the business process.
A good example is an IT organization that wanted to take educational data based on the academic performance of students along with some socio-economic data, and then predict future academic results. We went ahead and built the model and while it was good in terms of the metrics it provided, it turned out that a lot of the predictors came from the socio-economic data rather than the current academic scores. That’s been well researched already so we were just leveraging research that’s out there and validating it.
More importantly, feedback from some of the institutions involved was that they couldn’t change factors like the occupation of parents, household incomes, or geographical location, so the model wasn’t of any use to them.
There are lots of examples where school systems in the US and across the world have taken this kind of data and used it for early intervention programs with vulnerable groups, but in this particular case the institutions weren’t aligned to the outcome, so it didn’t take off.
That’s a shame. Do you see it changing in the future, to a point where machine learning and AI become a natural part of the business process?
It’s happening already with things like predictive texting in mobile phones using machine learning and AI as part of our daily lives. I think we’re going to see a lot more of that, with systems that take machine learning and AI and produce an outcome without too much requirement for businesses to get their hands dirty in code.
We need to make a differentiation, though, between pre-trained models, which have already been built using datasets that address similar problems, and custom models. Often you’ll find the bigger companies that want to stay ahead will look at custom models, but other customers that don’t have the data scientists will look for pre-trained models. From a solution point of view, I think we’ll move from having a separate box being used to train models to having that capability embedded in the data systems we already have, whether it’s a data platform like Synapse or some other location as part of a SQL database.
Do you think AI and machine learning will begin to drive business decisions, once they start to benefit from the knowledge it provides?
You’ll definitely see businesses that are leveraging machine learning and AI start to have a competitive advantage over those that aren’t. It’s in the same vein that BI transformed businesses in their ability to move from real-time reporting to making intelligent decisions based on historical analysis.
It’s coming across that your role as an Architect is a little different to the other Architects I’ve been talking to. So what does being an Architect mean to you?
It’s about understanding where customers are, what they need, and then trying to align a response to that by navigating the various different services Microsoft has. So it might be some education, a guiding hand, recognizing a requirement to get a partner in, or it could just be they want to be told what they need to do. We have this saying that every customer is unique, but they all have similarities or commonalities with other customers.
In general, in big data and analytics it’s the same scenario and the same problem. We have customers that have business intelligence or operational data environments, for example, and they want to start leveraging additional technology such as streaming or IoT or some other capability to help them improve their business performance.
Part of that is navigating what value there is to the business, how it can be aligned to a technology stack, and how to help them make that right decision around what the appropriate stack is for them.
Do you see those core areas staying the same over the next five or ten years?
At the moment, you have Enterprise Architects and their job is to focus more on the strategy of the business and how technology can help support it. At the other end of the spectrum, you have Solution Architects who dive right into the technology details. In a lot of organizations, I’ve seen those roles kind of squashing together and that’s being partly driven by vendors fulfilling some of the technical depth. Solution Architects in the past knew all about infrastructures, servers and hardware and stuff like that, but the change in the pace of technology, particularly in the cloud, means it’s almost impossible to keep ahead. Customers are now often relying on the vendor to provide them with that technical depth. So Architects still need to be technical enough to keep the vendors honest, but their role needs to be broader to fulfil both the Enterprise and Solution role.
One of the things we’ve discussed with a few people we’ve spoken to is the code-first versus off-the-shelf argument with all of the cloud and micro-services solutions now available. How do you think this changes the role of the Solution Architect?
I think a large part of the role of a Solution Architect, bringing disparate technologies together to provide a cohesive solution, has been fulfilled by cloud providers. This is really where DevOps comes in, because we’re seeing developers and infrastructure teams coming together to help out. You may not be building a solution from scratch, but what you will be doing is wiring together different pieces. So you’ll now have someone in a bridging role working with people in the CI/CD pipeline to add customized bits. There will always be a 10% gap between what the native services do and what your organization needs, so there will be that requirement for custom development. Cloud vendors have been talking about agile and DevOps for a while because that’s how they build things, and customers are having to align to that because that’s the way the technology supports it.
On the topic of DevOps, what does doing DevOps mean to you?
It’s the ability to embody all of your development through a code-first deployment methodology, and align activities into sprints. At the very least, DevOps is a great way of getting teams speaking to each other about what they’re doing. It helps them align to a set of tasks for a sprint and track progress.
For a lot of organizations, particularly ones that are adopting modern data warehouses, I often think it’s probably the most important part. In a year’s time, there could be a new piece of technology that’s even better than the old one, so how do organizations take advantage of the new features? If they don’t have a DevOps mindset and they aren’t agile enough to take artifacts and deploy them against a new kind of infrastructure, then they get stuck in this kind of three-year infrastructure cycle. What the cloud enables customers to do is change the frequency of iterations. Often one of the most important outcomes of any cloud project for a customer is getting that part of the business right. It’s actually transforming the way they work.
Where do you see DevOps lining up with machine learning and AI? Is there a crossover?
Absolutely. Machine learning and AI are where software development was 20 years ago in terms of the wide array of tooling, processes and methodologies. Now there’s a big push from companies like Microsoft into MLOps or AIOps, and the idea is that data science is an iterative process too. So agile and DevOps align to it perfectly. If I write a machine learning model, how do I feed that into a deployment? How do I then go back and iterate through that?
Do containers align with machine learning as well?
It’s something Microsoft is heavily invested in, because from our product development perspective containers are underpinning a lot of our services. The machine learning service in Azure is not just a place you go and build a model, it’s a place you can actually deploy it. Maybe you want to deploy it in the cloud, or on-premises, or into an IoT device. The only way you can do that is by containerizing the models, so containers are a fundamental part of it.
What’s the best part of your job?
It’s working on projects that genuinely make the world a better place, but I also like helping customers make sense of machine learning and AI and understand what’s real and what’s not. Some customers still think it’s this magic apple tree that you shake and it tells you how to make your business more profitable, but I like having those conversations and helping customers see where the value can be found.
What about the most challenging part?
The hype around machine learning and AI is one of the biggest challenges, because it’s often a barrier that we need to break through. For instance, customers may come to us and say ‘I’ve got a whole bunch of data, how can I apply machine learning to it?’. Before we can answer that, we have to step back and understand what the objective is.
It’s also keeping pace with the rapid change in development. Three or four years ago, research papers on machine learning and AI topics would have been maybe in the hundreds and now they’re in the tens of thousands. So the rate at which some of this is developing is astronomical and it’s often hard to sort the wheat from the chaff.
How do you manage to stay aware and up-to-date with all the tools and technologies out there?
Fortunately, in the machine learning and AI space there’s a really good podcast called TWIML AI, This Week in Machine Learning and AI. The host has been doing it for a couple of years now and does a fantastic job of covering a cross-section of machine learning and AI at various levels of technology or detail. It’s easy to get caught up in the technology, but there’s huge interest around the ethics, accountability and fairness of machine learning and AI, and we’re starting to see a lot of that rise to the top now.
What’s your one top tip for someone who wants to get into a role like yours?
You need the ability to step back and put some air between you and what needs to be achieved, so you can be as objective as possible. You often have competing desires and requirements from different teams, so being able to be slightly impartial helps. More importantly, though, I think no one comes into a machine learning or AI job as an expert. The idea that you can be a learn-it-all and not a know-it-all is probably fundamental to that. So if you make a start and you keep yourself open to learning from everyone’s experiences, then there’s no reason anyone can’t do the role.
Finally, in your opinion what’s the next big thing technology-wise?
Quantum. I think there’s enough big investment going on and advances happening that will change the way in which we can engage with technology because the way that we do things will be significantly faster. And I think what the computation allows us to do will really ramp up that curve of development.
Sounds good. Thank you so much. It’s been the highlight of my day, Kris. I’ve really enjoyed talking to you. It’s been really, really interesting because it’s definitely an area that I know very little about and hearing all of these real-world examples is amazing. I really do feel like Microsoft as an organization do an amazing job of making their mission happen.
Look out for other episodes in this series, featuring Michaela Murray interviewing:
- Darren Dawson, Pre-Sales Architect at WARDY IT Solutions
- Chris Slee, Solution Architect at C5 / Bank of Queensland
- Lace Lofranco, Senior Software Development Engineer & Architect at Microsoft
This series of interviews concluded with an online panel discussion hosted by Michaela, during which all of the Architects discussed what the future holds for IT Architects when there’s no blueprint. The recording of the lively and informative debate is now available to watch on-demand.