Earlier this year, we migrated the entire Stack Overflow for Teams platform to Azure. This was a lengthy endeavour and Wouter de Kort, one of our core engineers, wrote about multiple technical aspects of the project in these posts: Journey to the cloud part I: Migrating Stack Overflow Teams to Azure Journey to the cloud … Read more
PostgreSQL continues to be all the rage in 2023, whether in “vanilla” form of the fully open-source distribution or a variant like Amazon RDS, Neon, Yugabyte, and others. If you’re interested in trying PostgreSQL but only have experience with another database like SQL Server, it can feel a bit daunting to get started. In this … Read more
There are some differences and secrets between the UI of a SQL Endpoint and the UI of a Lakehouse. I believe the lakehouse UI was changed recently, otherwise you can blame me for being distracted to this level. Let’s analyze the differences and pending points between these UI’s. SQL Endpoints When using a SQL Endpoint, … Read more
Recently I spoke at SQL Saturday Denver. The day after the conference we went to visit Pikes Peak Mountain. You have a couple of choices of how you can get to the top of Pikes Peak Summit: hike, drive, or take the Cog railway. Each method has its own advantages and disadvantages. If you take … Read more
There are plenty of applications and tools available that allow for the movement of data in and out of SQL Server. Some tools are built by Microsoft, such as SSIS or Azure Data Factory. Others are created by third parties, such as Databricks or Snowflake. Still other available options make use of SQL Server features … Read more
Organizations of all types and sizes are turning to the cloud for their application and data storage requirements. The cloud makes it possible for them to deploy their workloads more quickly and to scale them up and down as requirements change. Not only does this increase the organization’s flexibility, but it also frees up IT … Read more
I’ve quietly resolved performance issues by re-writing slow queries to avoid DISTINCT. Often, the DISTINCT is there only to serve as a “join-fixer,” and I can explain what that means using an example. Let’s say we have the following grossly simplified schema, representing customers, products, and product categories: And then we have tables for orders … Read more
Protecting and controlling access to your data against unauthorized person(s) is crucial in any organization. Unauthorized entry or modification of your data can lead to severe and sometimes irreversible damage. Just like we mentioned at the beginning of this series, MySQL Replication is a process where data from one MySQL database known as the source … Read more
In the previous blog in this series, we learned how to produce, read and interpret execution plans. We learned that an execution plan provides information about access methods, which PostgreSQL use to select records from a database. Specifically, we observed that in some cases PostgreSQL used sequential scan, and in some cases index-based access. It … Read more
Dr. Codd first described the relational model in a paper in Communications of the ACM (CACM 13 No 6; June 1970). Some more work followed up after that by other people, giving us normal forms and other things we have taken for granted for 50+ years. I was looking at some postings on a SQL … Read more
While this article is specifically geared to SQL Server, the concepts apply to any relational database platform. The Stack Exchange network logs a lot of web traffic – even compressed, we average well over a terabyte per month. And that is just a summarized cross-section of our overall raw log data, which we load into … Read more
When people start learning a new field, for example T-SQL, it’s tempting to spend very little time trying to understand the fundamentals of the field so that you can quickly get to the advanced parts. You might think that you already understand what certain ideas, concepts and terms mean, so you don’t necessarily see the … Read more
Tables in a MySQL database are commonly related to one another, often in multiple ways, and it is only by linking the tables together that you can derive meaningful information from the data. To connect these tables together, you can use the JOIN clause, which you include in your SELECT, DELETE, or UPDATE statements. The … Read more
Over the years, SQL Server Statistics have been discussed in countless blog posts, articles, and presentations, and I believe that they will remain a core topic for a while when speaking about performance. Why is that? Well, if we were to consider the equivalent of Maslow’s hierarchy of needs for database performances, statistics would be … Read more
If you haven’t already heard, SQL Server 2022 introduced a new built-in system function called GREATEST. Simply put, it is to a set of columns, variables, expressions etc. what the MAX function is to a set of values (i.e., rows) of a single column or expression. The opposite of GREATEST function is LEAST function which … Read more
Jenkins is an open-source multi-platform software for continuous integration/continuous delivery and deployment (CI/CD) in DevOps. It is one of the most common CI/CD tools. Jenkins uses CI/CD pipelines to automate the software development and deployment workflows. Jenkins uses plugins to integrate with DevOps tools such as Docker and Kubernetes. Jenkins plugins can also be used … Read more
Over the past 10 years, I have been involved in scheduling speakers at several conferences, most notably the PASS Data Community Summit. Since starting at Red-Gate last September, I have been much less involved in planning the Summit. Still, I have been able to help with the schedule by looking it over and looking for … Read more
MySQL provides a set of constraints that you can include in your table definitions to help ensure the integrity of the data. The constraints let you better control the types of data that can be added to the database. For example, you can use constraints to set a column’s uniqueness or nullability, specify a default … Read more
Microsoft is merging Data Factory and Power BI Dataflows in one single ETL solution. It’s not a simple merging, but still these ETL tools are easier to use then ever. In this article, I will demonstrate how to create a lakehouse and ingest data using a step-by-step walkthrough that you can follow along with. You … Read more
If you’ve been using MySQL for a while and want to learn how to orchestrate MySQL containers, you’ve come to the right place. And while using Docker on its own to manage a single MySQL instance or multiple instances has certain drawbacks, such as lacking the ability to orchestrate multiple instances, scale, and provide services … Read more