The customer is a global, science-led, patient-focused pharmaceutical company with over 80,000 employees. Its Genomics Research (GR) team specializes in discovering statistically significant variants from population-scale genomic sequencing.
The GR Data Team get data into the hands of scientists through a specialized Analysis Platform. But as the scale of the data and size of the team grew, maintaining consistency became the biggest challenge.
Redgate Flyway has enabled this growth to continue uninterrupted while mitigating risk. For the team, peace of mind is their biggest win, knowing that what they tested in pre-production is the same as production.
The speed to scientific discovery has been notably boosted by Redgate Flyway. Processes that previously took five days can now be done in less than one, and the hands-on hours required have fallen from 20 to less than two.
This global, science-led, patient-focused pharmaceutical company has over 80,000 employees worldwide. It is committed to revolutionizing the future of healthcare by harnessing the potential of science and medicine to benefit people, society, and the planet.
Its Genomics Research (GR) specializes in building global, population-scale genomics to identify and analyze statistically significant genome variants that can be used to advance new drug discoveries or clinical trials. To enable this vital work, the GR Director and his team manage their specialized Analysis Platform, the largest and most comprehensive exome-wide genotype-phenotype dataset currently available.
The GR Director’s tagline for the team is “To get data into the hands of scientists”, which is no easy ask. In just one recent study, the GR examined the exome sequences of 269,171 Biobank participants, alongside the records of 18,780 phenotypes and identified 46,837 variant-level and 1,703 gene-level statistically significant relationships.
“As the team grew it became much harder to keep things consistent. Trying to keep everything in sync was one of the big challenges we faced.”
The Director of the GR Data Team explained that when he started five years ago there were just three people in the team, and he was the only developer. Since then, the team has grown to 10 developers working on their MySQL environment in varying roles across workflow, Python, DevOps, and CI/CD pipelines. However, the team isn’t the only thing that has seen huge growth over the past five years. What started as one RDS cluster with 23 chromosomes is now 290TB of data across 500 schemas in the team’s AWS RDS-hosted Analysis Platform.
Given the niche data required for the GR scientists to analyze, it was the complexity, not the size, of the Analysis Platform that became their biggest challenge. In DNA, there are normally 23 pairs of chromosomes, and at GR they require a schema for every chromosome, each of which has to be exactly the same. There are also some genomes which can have several regions of interest for scientists. So, they have the complexity of every schema needing to be the same, but also scaled for every chromosome, totaling over 500 schemas to provision across their test, pre-production, and production environments.
As the Data Team Director summarized: “What we needed a solution for was not how many billion rows are in the database, but the number of schemas and the fact that they all have to be consistently the same.”
Every time they needed to update the database schemas, the team had to manually create a script and run it against each schema, which was both error-prone and cumbersome to manage. Individuals also had to keep track of which updates needed to be applied to which environments, to ensure that the test, pre-production, and production environments were kept in sync. As the Director explains: “Rather than heading off a problem we were suffering, the decision to introduce Redgate Flyway was to mitigate the risks that growth like ours implies.”
“If Redgate Flyway has run in our pipeline, we’re confident the changes have been implemented in all the environments. This peace of mind is key for us.”
The major catalyst for introducing Redgate Flyway was the growth of their data and of the team managing it. But the GR Data Team also wanted to implement an automated method for database schema management across all environments, from development through to production. The Director knew there was a better way of working, and one of the developers on his team suggested that third-party tools could reduce the friction and operational overhead of using manual scripts.
The team ran their due diligence, looking at a range of vendor and open-source options before leveraging a Redgate Flyway trial license to prove that the software and methodology would work for their AWS CI/CD pipeline.
Given their scale and complexity, the Data Team knew what was required to succeed with the project: “We needed to partner with an organization that could offer Enterprise level software and support. When we found Redgate and Flyway, even from the first meeting, the support and documentation available demonstrated that this was an Enterprise shop, and over the past 12 months we’ve been proven right.”
During their Redgate Flyway trial the team noted how straightforward the solution was, which enabled them to set up their specific use case quickly and easily. When they purchased Redgate Flyway, the Director was met with positivity from the team who knew it was a timesaver that would remove a lot of the hassle and manual tasks from their day-to-day work.
The way they could name files and roll back migrations was simple. It was no more complex than having a naming convention for their schema changes and their sequence, and then running the commands to migrate. The team found those commands easy to remember and, if they wanted to reinitialize everything in development, they could do so simply. Not only did Redgate Flyway provide an automated method for database schema management across environments and remove the need for error-prone manual scripts, it also created a standardized process for reviewing all changes prior to deploying to production.
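To illustrate how a naming convention alone can fix the order of schema changes, here is a minimal Python sketch. It assumes Flyway's default versioned-migration pattern, `V<version>__<description>.sql`; the filenames are hypothetical, and the code is only an illustration of the idea, not Flyway's own implementation or the team's actual setup.

```python
import re
from typing import List, NamedTuple, Tuple

# Assumed pattern: Flyway's default versioned-migration naming,
# V<version>__<description>.sql, where version parts are separated
# by dots or single underscores.
MIGRATION_RE = re.compile(
    r"^V(?P<version>\d+(?:[._]\d+)*)__(?P<description>\w+)\.sql$"
)


class Migration(NamedTuple):
    version: Tuple[int, ...]
    description: str
    filename: str


def parse_migration(filename: str) -> Migration:
    """Split a migration filename into a sortable version and a description."""
    match = MIGRATION_RE.match(filename)
    if match is None:
        raise ValueError(f"Not a versioned migration filename: {filename}")
    # "1_1" or "1.1" becomes (1, 1), so versions compare numerically.
    version = tuple(int(part) for part in re.split(r"[._]", match.group("version")))
    description = match.group("description").replace("_", " ")
    return Migration(version, description, filename)


def order_migrations(filenames: List[str]) -> List[Migration]:
    """Return migrations in the order their versions say they should run."""
    return sorted((parse_migration(name) for name in filenames),
                  key=lambda migration: migration.version)


# Hypothetical filenames, listed out of order on purpose.
files = ["V10__Add_index.sql", "V2__Add_phenotype_table.sql", "V1_1__Create_variant_table.sql"]
ordered = [migration.filename for migration in order_migrations(files)]
```

Because the version is compared as a tuple of integers rather than as text, version 10 sorts after version 2, which is the behavior a plain alphabetical sort of the filenames would get wrong.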
The Data Team Director summarized their experience: “The way Redgate Flyway works is very elegant and simple, and this was a major reason behind our purchase. It was easy to take it and put it into our pipeline, and within a couple of days of getting our hands on the software we had Flyway up and running in production.”
“A new feature request would have taken 5 days, but now with Flyway we’ve reduced that to 1 day and less than 2 hours of hands-on work. Those are massive time savings.”
When a scientist develops an idea to look at new data not yet captured in the Analysis Platform, the team have to go through a feature request process. It could be a new cluster to stand up, a new schema to develop, or a new table to add to an existing schema. Getting all that structured and working across their development, test, pre-production, and production environments used to take five days.
With Redgate Flyway, developers simply open up a pull request to create a new SQL database object. This then goes through a code review process and, once peer-approved, is automatically added into their CI/CD pipeline that same day. As the Director explains: “This is where Redgate Flyway is a workhorse, or an engine. It cranks away with nobody having to do anything, and at the end of the day it’s in production.”
Although the team may only receive one new feature request a month, it used to take a whole working week to action the required database schemas manually and involved around 20 hours of hands-on effort from a developer. Now, being able to deliver that new functionality into the hands of scientists within 24 hours, with less than two hours of hands-on processing, is a huge benefit to the wider business. “We need to be able to support feature requests quickly, because we want to know if this new something is a big idea that the organization should pursue or not.”
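The figures quoted above can be turned into percentage reductions with a few lines of Python. This is simple arithmetic on the numbers in the text; because the hands-on time is stated as "less than two hours", the 90% figure should be read as a lower bound on the saving.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a quantity fell when going from `before` to `after`."""
    return 100 * (before - after) / before


# Elapsed time per feature request: five days down to one day.
elapsed_saving = percent_reduction(5, 1)

# Hands-on effort per feature request: 20 hours down to (at most) two hours.
hands_on_saving = percent_reduction(20, 2)

print(f"Elapsed time reduced by {elapsed_saving:.0f}%")
print(f"Hands-on effort reduced by at least {hands_on_saving:.0f}%")
```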
The quicker turnaround of feature requests is not the only advantage the Redgate Flyway implementation has delivered. It has also removed the operational headache and manual burden of maintaining custom scripts and remembering the order of applying those changes. With Redgate Flyway keeping everything in sync in the pipeline, the team have the confidence that all schemas across their environments are consistent. For the Data Team this is the biggest benefit: “The advantage is peace of mind. I know, guaranteed, that what I tested in the test environment is what is in production with Redgate Flyway. I no longer have to troubleshoot or check what’s different between test and production. All of that uncertainty goes away.”
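The kind of manual cross-environment check the Director no longer has to do can be sketched as follows. This is a hedged illustration only: the schema names and version numbers are hypothetical, the history is held in a plain dictionary rather than read from a database, and versions are simplified to integers.

```python
from typing import Dict, List


def find_missing_migrations(applied: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Map each schema to the migration versions it lacks.

    `applied` maps a schema name to the versions already applied to it.
    A version counts as expected if at least one schema has it.
    """
    expected = set().union(*applied.values())
    missing = {}
    for schema, versions in applied.items():
        gap = expected - set(versions)
        if gap:
            missing[schema] = sorted(gap)
    return missing


# Hypothetical history: production is one migration behind.
history = {
    "chr1_test": [1, 2, 3],
    "chr1_preprod": [1, 2, 3],
    "chr1_prod": [1, 2],
}
drift = find_missing_migrations(history)
```

An empty result means every schema has received the same set of changes; any entry points to the schema and versions that need attention.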
The success that the GR Data Team have seen with increased speed of delivery and consistency across environments since implementing Redgate Flyway has been shared across the wider organization. As a result, multiple teams working on the company-wide Data Platform have also implemented Redgate Flyway and started to see similar results.
When asked about the future of Redgate Flyway, the Director of the Data Team summarized his thoughts: “The data is always going to get bigger, but now with Redgate Flyway as part of our core engine room it’s so easy. Our speed to discovery of scientific merit has been notably boosted by Flyway, that’s the beauty of it”.