There are some differences and secrets between the UI of a SQL Endpoint and the UI of a Lakehouse. I believe the lakehouse UI was changed recently, otherwise you can blame me for being distracted to this level. Let’s analyze the differences and pending points between these UI’s. SQL Endpoints When using a SQL Endpoint, … Read more
There are plenty of applications and tools available that allow for the movement of data in and out of SQL Server. Some tools are built by Microsoft, such as SSIS or Azure Data Factory. Others are created by third parties, such as Databricks or Snowflake. Still other available options make use of SQL Server features … Read more
I’ve quietly resolved performance issues by re-writing slow queries to avoid DISTINCT. Often, the DISTINCT is there only to serve as a “join-fixer,” and I can explain what that means using an example. Let’s say we have the following grossly simplified schema, representing customers, products, and product categories: And then we have tables for orders … Read more
SQL Server security structure, mechanisms and methods are very thoroughly documented in the Microsoft documentation, but it is quite daunting if you don’t already know about the functionality. I recently had a request to explain some security features of SQL Server so that internal audits could be completed. While thinking about the request and preparing … Read more
While this article is specifically geared to SQL Server, the concepts apply to any relational database platform. The Stack Exchange network logs a lot of web traffic – even compressed, we average well over a terabyte per month. And that is just a summarized cross-section of our overall raw log data, which we load into … Read more
When people start learning a new field, for example T-SQL, it’s tempting to spend very little time trying to understand the fundamentals of the field so that you can quickly get to the advanced parts. You might think that you already understand what certain ideas, concepts and terms mean, so you don’t necessarily see the … Read more
Over the years, SQL Server Statistics have been discussed in countless blog posts, articles, and presentations, and I believe that they will remain a core topic for a while when speaking about performance. Why is that? Well, if we were to consider the equivalent of Maslow’s hierarchy of needs for database performances, statistics would be … Read more
If you haven’t already heard, SQL Server 2022 introduced a new built-in system function called GREATEST. Simply put, it is to a set of columns, variables, expressions etc. what the MAX function is to a set of values (i.e., rows) of a single column or expression. The opposite of GREATEST function is LEAST function which … Read more
In 2012, Aaron Bertrand said most everything I knew (and a bit more) about the issues with using the sp_ prefix. Procedures prefixed with sp_ have special powers when placed in the the master database in that it can be executed anywhere on the server after that. Nothing much has change in those suggestions. It … Read more
At Stack Overflow, our environment has multiple implementations of a largely – but not 100% – identical schema. (By “schema,” I mostly mean the set of tables within a database.) I sometimes need to retrieve or update data across a large number of sites. Let’s say, pulling or removing information from the Users table, and … Read more
My initial goal of documenting and exploring SQL Server Row Level Security (RLS) was to show the basic functionality and focus on a few performance and administrative items. I also wanted to confirm my base assumption that it is very secure. This security makes it useful in many situations to segregate data by user groups, … Read more
A challenge that reappears periodically in the world of databases (especially database management) is the need to run code on a subset of databases and to do so in a nuanced manner. Some maintenance or metrics collection processes can be simply run against every database on a server with no ill-effect, but others may be … Read more
Did you know you can include a SELECT statement within another SELECT statement? When a SELECT statement is embedded within another statement it is known as a subquery. There are two types of subqueries: basic subquery and correlated subquery. In this article I will be discussing both types of subqueries and will be providing examples … Read more
One of the new Extended Event available in SQL Server 2022 is the query_antipattern. This extended event allows to identify anti-patterns on the SQL queries sent to the server. An anti-pattern in this case is some code that the SQL Server optimizer can’t do a great job optimizing the code (but cannot correct the … Read more
A student of mine sent me a T-SQL challenge involving gap-filling of missing dates and balances. I found the challenge interesting and common enough and figured that my readers would probably find it interesting to work on as well. I’ll present the challenge and three possible solutions, as well as the results of a performance … Read more
This is the beginning of a series on SQL Server Dynamic Data Masking. Dynamic Data Masking is a concept familiar with all developers and users of sensitive data. It is implemented in SQL Server with simplicity and elegance, requiring minimal changes to front end applications, including reporting, and almost no changes to queries. The series … Read more
When I first saw a bullet item stating SQL Server 2022 would support instant file initialization for log file growth, I was excited. When I later learned it only applies to automatic growths, and only those of 64 MB or less, I was a little less excited. With those limitations, I was skeptical this enhancement … Read more
One of the more challenging technical details of columnstore indexes that regularly gets attention is the need for data to be ordered to allow for segment elimination. In a non-clustered columnstore index, data order is automatically applied based on the order of the underlying rowstore data. In a clustered columnstore index, though, data order is … Read more
I have come across a lot of use cases for manufacturing rows on the fly, aside from the common goal of populating a large data set such as a numbers or calendar table. A few favorites include building sample data, pivoting an unknown number of columns, data extrapolation, and filling gaps in date or time … Read more