PostgreSQL Schema: Learning PostgreSQL with Grant

Comments 0

Share to social media

This is part of an ongoing series of post by Grant as he takes his vast knowledge of SQL Server and applies it to adding PostgreSQL and shares it with you so you can skip learn from his triumphs and mistakes. For more you can go to the Learning PostgreSQL with Grant series home page

An important aspect of building and managing a database is organizing the objects within your database. You may have tables that support different functions, for example, a schema for warehousing operations and another for sales. Some logins may need access to some tables, but not others. You might want to isolate one set of objects within a database from other sets of objects. All of this, and more, can be accomplished using schemas within a database and PostgreSQL supports the use of schema for just these types of functions.

In the sample database I’ve created as a part of this ongoing series, I created a couple of schemas and organized my tables within them. You can see the database here in the CreateDatabase.sql script. The rest of the code in this article is in the folder 08_Schema.

Managing Schema

A schema is used first and foremost as an organizing mechanism on your database. From there, you can get into using schema to help design security, manage access, and generally control what users can see and do within your database. When you create a blank database, it comes with a default schema, public.

When you create an object, like a table, it will automatically go into your default schema unless you specify otherwise. By default, all logins to the database have access to the public schema (PostgreSQL 15 has changed this default, so now users do not have rights to create objects in the public schema). Other than these default behaviors, the public schema is just another schema in the database and most of the functions and rules we’ll discuss in the rest of the article apply.

To get stated creating your own schemas, the syntax is very straightforward:

This creates a schema called mytestschema. To create a table within that schema, you simply use a two part name (schema_name.table_name) for the table within the CREATE TABLE command like this:

It’s the same with any queries too:

You can think of the schema as the owner of the table (the owner of the schema is technically the owner of the table). Defining the owner in all your code helps ensure that accidents don’t happen. Because, when you start using schema, you can define the same object names if they’re in different schemas. Keeping object names distinct is a good practice, but sometimes the same name is the best name in different schemas):

This is perfectly valid. If I were to write, what I consider poor code, like this:

This likely results in the following error:

ERROR: relation "testtable" does not exist

LINE 2: select * from testtable;

Initially, is seems like it’s an error because PostgreSQL can’t figure out which of the two testtable tables to pull from. Rather, it’s because logins have a default schema. When I run a query like the one immediately preceding, without a schema identifying where the table lives, PostgreSQL looks in my default search path. If it’s not there, well, that table doesn’t exist. This is true even though, I’ve got two with that name. PostgreSQL doesn’t check other schemas “just in case.”

Later in the article, I will discuss how to manage the schema defaults.

If the schema is empty, you can drop it:

If I do not drop the table first, an error will occur:

SQL Error [2BP01]: ERROR: cannot drop schema mytestschema because other objects depend on it

Detail: table mytestschema.testtable depends on schema mytestschema

Hint: Use DROP ... CASCADE to drop the dependent objects too.

In the error message, there is a hint for how to get around this. I could rewrite my query like this:

The beauty of this is that it will remove all tables, views, etc., within the given schema. That is also the horror of this syntax, that it’s going to remove all the tables, views, etc., without consulting with you in any way.

There is a default schema in every database that gets created, public. However, it is just the default and like most defaults, it can be changed. In fact, you can even drop the public schema if you so choose. I started this section detailing how to create your own schema because, I think it’s a good practice to organize your data storage into a defined schema that you directly control, as opposed to just accepting the defaults.

Controlling the Default Search Path

In addition to helping you organize your database objects; schema helps you control access to those objects. I haven’t delved into security yet in this series, and it’s likely a little way off still. However, I will talk some about how schema helps you manage security on your database. (My teammate Ryan Booz did recently publish an article on the subject, “PostgreSQL Basics: Roles and Privileges”.)

In this section I want to detail some of the ways you can manage the default schema.

In the last example of the section above, I showed how you can have duplicate table names between schema, but, that you must reference the schema name to access those tables. However, this isn’t the whole story.

There’s actually a defined search list for schema that you can see using this query:

If you haven’t changed anything in your server, the default results are:

"$user",public

Each user has a schema of their own, like SQL Server. That’s the $user schema you see above. However, if you don’t specify the schema, it’ll default to the first one in the search list, public in this case. We can add schema to the search list for the current connection:

That will not only add the radio schema to the the search_path, it will rearrange the order on the search_path so that the radio schema is searched before the public schema. If you disconnect and come back to the connection, you will have to reset the path when using the SET command.

If you want to make the changes to the path the default, you can use the ALTER ROLE to set any role to have a specific search path. For example:

If you want to set the default for the server\cluster\database, you can modify the search_path in the postgressql.cnf file, or you can use:

This will not override the individual path’s set but will make every login that does not override the search path need to specify the schema name with referencing any object. (Which as noted, is a best practice.)

Ownership and Basic Privileges

When you create a schema, you can define a schema owner as other than the login that is running the command:

A schema I haven’t created yet, secureschema, would be created with the owner being the radio_admin login role (also not yet defined because I am not digging into security yet). That will ensure that only that the radio_admin login, and any accounts defined as superuser, of course, can work within that schema.

You can also control behaviors on schema. For example, since I have set up independent schema in this database and intend to use it in that manner, I can revoke access for all logins to create objects on the public schema (This is only necessary in Postgres 14 and earlier, in 15 create is not granted by default.):

This is using two different meanings of the word “public”. In the first, ‘public’, we’re referring to the schema of that name. In the second, ‘PUBLIC’, we’re talking about the role that contains all users within the database. This is a shorthand mechanism to make sure no one accidentally puts things into the public schema. I’d say it’s a good practice to follow if you’re going to use other schemas, especially if you’re using them to help properly secure your database.

You can grant various privileges between schema and users such that a given user can read data from tables within a schema, but not modify the data contained in the tables (read only access). In this way, you can combine multiple types of data within one database but isolate them from each other as needed. This is a principal reason for using schema in your database.

If you’re not isolating storage and access between schema, it may make less sense to use a schema other than public in the first place. However, most applications have varying levels of access they would like to manage, and schema will lend themselves to appropriately implementing that type of security. If security isn’t a concern, using schema names instead of putting all objects in the public schema can be advantageous for documentation as well.

Conclusion

Schemas are containers that allow you to segment objects and security at a lower level than a database. Using schemas other than public has good benefits. In PostgreSQL there are several methods of setting the default schema if your users are averse to using two-part names.

If you are familiar with schemas from SQL Server, the core functionality of schema is roughly the same as in SQL Server. However, there is added functionality like the ability to control a search list changes the types of behaviors you can control within PostgreSQL.

 

About the author

Grant Fritchey

See Profile

Grant Fritchey is a Data Platform MVP with over 30 years' experience in IT, including time spent in support and development. He has worked with SQL Server since 6.0 back in 1995. He has also developed in VB, VB.NET, C#, and Java. Grant has written books for Apress and Simple-Talk. Grant presents at conferences and user groups, large and small, all over the world. Grant volunteers for PASS and is on the Board of Directors as the Immediate Past President. He joined Redgate Software as a product advocate January 2011.

Grant Fritchey's contributions