The first two articles in this series demonstrated how PostgreSQL is a capable tool for ELT – taking raw input and transforming it into usable data for querying and analyzing. We used sample data from the Advent of Code 2023 to demonstrate some of the ELT techniques in PostgreSQL. In the first article, we discussed … Read more
In the first article in this transforming data series, I discussed how powerful PostgreSQL can be in ingesting and transforming data for analysis. Over the last few decades, this was traditionally done with a methodology called Extract-Transform-Load (ETL) which usually requires external tools. The goal of ETL is to do the transformation work outside of … Read more
In our data-hungry world, knowing how to effectively load and transform data from various sources is a highly valued skill. Over the last couple of years, I’ve learned how many of the data manipulation functions in PostgreSQL can supercharge your data transformation and analysis process, using just PostgreSQL and SQL. For the last … Read more
One of the most useful constructs in SQL Server is the stored procedure. It gives you a way to do several things. First up, you can store code within the database. Next, you can parameterize queries so that you’re not hard coding or generating ad hoc queries every time you want to call them. You … Read more
Aggregation is a widely used way to summarize the content of a database. It is usually expressed with a GROUP BY clause or just by using aggregate functions (like COUNT or SUM). When the database engine executes a query with aggregations, it produces the individual rows needed to compute the required output and then performs the aggregation as … Read more
In this blog, we continue our exploration of PostgreSQL indexes, which we started here. In that article, we learned what an index is, and how exactly indexes can help with query execution. But there is much more to learn about indexes! In this blog, we will keep exploring B-tree indexes. We will learn whether (and … Read more
Understanding how to join the data in one table to another is crucial for any data analyst or database developer working with a relational database. Whether you’re a beginner or an experienced SQL user, this article will help you strengthen your SQL skills and become proficient in SQL joins. With several types of joins available, … Read more
One of the technologies that my new job brought with it was learning about all the various database platforms that are not Microsoft SQL Server. Not that I don’t still spend time learning about SQL Server, as it will happily remain one of our largest topics, but rather that I need to learn about other … Read more
PostgreSQL continues to be all the rage in 2023, whether in the “vanilla” form of the fully open-source distribution or a variant like Amazon RDS, Neon, Yugabyte, and others. If you’re interested in trying PostgreSQL but only have experience with another database like SQL Server, it can feel a bit daunting to get started. In this … Read more
In the previous blog in this series, we learned how to produce, read, and interpret execution plans. We learned that an execution plan provides information about the access methods that PostgreSQL uses to select records from a database. Specifically, we observed that in some cases PostgreSQL used a sequential scan, and in some cases index-based access. It … Read more
While there are many features within PostgreSQL that are really similar to those within SQL Server, there are some that are unique. One of these unique features is called VACUUM. In my head, I compare this with the tempdb in SQL Server. Not because they act in any way the same or serve similar purposes. … Read more
Concurrency control is an essential aspect of database systems that deals with multiple concurrent transactions. PostgreSQL employs various techniques to ensure concurrent access to the database while maintaining data consistency using the atomicity and isolation properties of ACID (which stands for Atomicity, Consistency, Isolation and Durability – https://en.wikipedia.org/wiki/ACID). Concurrency Techniques Broadly, there are three concurrency techniques available … Read more
In the last blog (When PostgreSQL Parameter Tuning is not the Answer), we compared several execution plans for a SQL statement as we made changes to parameters and indexes. Still, there was no mention of what an execution plan is, how one can obtain an execution plan for a query, and how to interpret the … Read more
Writing queries to retrieve the data from a database is probably the single most common task when it comes to working with data. Working with data in PostgreSQL is no exception. Further, PostgreSQL has an incredibly rich, wide, and varied set of mechanisms for retrieving data. From standard SELECT… FROM… WHERE to windowing functions and … Read more
So much about parameter tuning, but does it always help? Welcome to the third and final blog of the “magic of parameters” series. In the two previous blogs, we discussed how tuning PostgreSQL parameters could help improve overall system performance. However, the very first paragraph of the very first blog on this topic stated that: Although some … Read more
So far in the series I’ve shown how to create databases, tables, constraints, indexes and schema. Now, it’s time to put some of that information to work and begin the process of manipulating data within the database. After all, a database is only useful if there’s information stored within. PostgreSQL makes use of standard SQL … Read more
In the first two articles of this series about PostgreSQL privileges, we reviewed how to create roles, grant them privileges to database objects, and how object ownership is an important aspect in managing access and control within the database. When it comes to managing what roles can access or modify an existing object, ownership is … Read more
Welcome to the second blog of the “magic of parameters” series. In the first entry, I covered memory parameters. In this article, I will talk about the PostgreSQL configuration parameters which manage the (auto)vacuum and (auto)analyze background processes. Why is vacuuming necessary? Before we start talking about vacuum and analyze-related parameters, we need … Read more
Having access to the psql command-line tool is essential for any developer or DBA who is actively working with and connecting to PostgreSQL databases. In our first article, we discussed the brief history of psql and demonstrated how to install it on your platform of choice and connect to a PostgreSQL database. In this article … Read more
PostgreSQL has a separate command-line tool that’s been available for decades and is included with any installation of PostgreSQL. Many long-term PostgreSQL users, developers, and administrators rely on psql to help them quickly connect to databases, examine the schema, and execute SQL queries. Knowing how to install and use basic psql commands is an essential … Read more