Functions and Procedures: Learning PostgreSQL with Grant

Comments 0

Share to social media

This is part of an ongoing series of post by Grant as he takes his vast knowledge of SQL Server and applies it to adding PostgreSQL and shares it with you so you can skip learn from his triumphs and mistakes. For more you can go to the Learning PostgreSQL with Grant series home page

One of the most useful constructs in SQL Server is the stored procedure. It gives you a way to do several things. First up, you can store code within the database. Next, you can parameterize queries so that you’re not hard coding or generating ad hoc queries every time you want to call them. You can put in transaction handling, multiple result sets, security and more. They truly are a useful tool. So, of course, I went straight to CREATE PROCEDURE in PostgreSQL.

I begin reading about procedures in PostgreSQL and I’m suddenly hit with a revelation. Procedures can’t return result sets. They can’t return anything except INOUT parameter values (more on that in a bit). What the heck do we do to return results? Thankfully, Ryan Booz pointed me in the right direction, CREATE FUNCTION.

We have functions in SQL Server. They are of course scalar, table valued or multi-statement table values (AKA, sort of evil). However, they’re not that similar in form or function to SQL Server’s procedures, so what’s going on in PostgreSQL?

The Differences Between Functions and Procedures

In PostgreSQL, CREATE FUNCTION predates CREATE PROCEDURE, by a considerable margin. Procedures were added in PostgreSQL 11 (we’re on 16 as of this writing, and 17 is on the roadmap for later this year!). If you look at the syntax in the links already provided, you’re going to notice quite a few similarities. This is because these two constructs are similar. However, there are a few differences. Let’s talk about them.

First up, as was already mentioned, a procedure in PostgreSQL cannot return a result set. A function can return a result set. In fact, functions can return multiple result sets. Further, this being PostgreSQL, those result sets are objects, which means you can return, not simply a set of columns and rows (tuples), but a table, a cursor (the good kind, and yes, there’s a good kind of cursor, at least in PostgreSQL, it’ll take another article to explain it), other things. Both procedures and functions do allow for output parameters, what they call INOUT parameters. So, why do we need procedures?

The next big difference between procedures and functions is that functions cannot control transactions. Yeah, you read that correctly, you can’t control transactions within functions. Now, don’t get this wrong. Functions are absolutely subject to transactions. For example, if you write a function that inserts data into a table, and there’s an error, the function will get rolled back. If there were multiple statements within the function, they all get rolled back. The function takes part in the greater transaction that called the function. If that transaction gets committed, so does any side effects that occurred in the function. If that transaction is rolled back, so are all the function calls within it. Procedures on the other hand, you can COMMIT and ROLLBACK transactions from within the procedure.

The final major difference is how they’re called. A function is called from a SELECT statement. 50% of the SQL Server people reading this just groaned out loud. The other 50% just fist-pumped because they’ve always wanted the ability to JOIN procedures. A procedure is called from, are you read, the CALL statement. The biggest difference between these is the implications of using a SELECT versus a CALL, but really that’s about it. Oh, I know, it can be a huge difference in the long run, but for our (remember our motto) introductory purposes, it’s a relatively benign process.

There are several other minor differences, especially in how they get used, but these are the three main differentiators. In short, if you want a result set, beyond just a single row from INOUT parameters, you need to use a function. If you need to control transactions within the code, and you’re not returning a result set, you need to use a procedure.

Note: In the sample database I’ve created as a part of this ongoing series, I created a couple of schemas and organized my tables within them. If you wish to execute the code or look at the data structures, the code is in my ScaryDBA/LearningPostgreSQL repository here. The objects and database you will need can be created/reset using the CreateDatabase.sql script, then adding sample data using the SampleData.sql script. The rest of the code needed for this article is in the 12_FunctionsAndProcedures folder.

Now, let’s see these two things in action.

CREATE FUNCTION

The core syntax for creating a function isn’t all that surprising. First, I’ll create a function that returns a result set. Second, I’m going to create a function that inserts a row into a table. I’m using this as an example, rather than returning a result set, so that I can use the exact same code within the procedure example below. This allows us to compare apples to apples.

The core syntax is of course CREATE OR REPLACE FUNCTION, and from there I’ll break it down a bit. Obviously, I supplied a schema and a name for the function, radio.radiodetails. If you don’t supply a schema, it’ll go to whichever one you’re defaulting to at the moment.

Next, I supplied a parameter, radioid. Within the code, in this example, I didn’t use the name. Instead, I referenced the ordinal position of the parameter using $1. In typical production code that would need to be maintained, I wouldn’t use the ordinal position like this because it’s less clear than using the parameter names.

I could modify the code to look like this:

…r.radio_id = radioid …

It is possible to skip the name completely. I could define just the data type of the parameter like this:

…FUNCTION radio.radiodetails (INT)…

Then, I would be forced to use the ordinal position of the parameter in the code. Again, this makes for unclear code, so it’s certainly not a practice I would follow.

By default, parameters are for input to the function. However, you can have output parameters, you just have to define them as such. You can also have parameters that are both input and output. I could completely rewrite the function above like this to return the same information:

I had to give it a new name because you can’t change the RETURN type of a function. I would have had to drop the function and then create it again from scratch.

In the original function, I defined the RETURN as a TABLE and then supplied the table definition. If the query in question is returning multiple rows, this is the way to get it done.

Finally, I defined the language as SQL. You can also make functions in c, internal, or user defined types. Most of the time, queries being queries, it’s likely to be SQL.

To execute the function, I can just call it in a SELECT statement:

SELECT * FROM radio.radiodetails(2);

This will return the parameter values as a result set:

And if I wanted to modify data in the database, I could create a function like this:

The only thing I did here is have it set up as RETURNS VOID since there is no output when this function is called. However, calling the function does result in, to my SQL Server eyeballs, some odd syntax:

You may also see an output from this query like the following (this is the output when calling this function from DBeaver):

While that is the right way to reference the function, calling SELECT in order to INSERT data just feels weird.

There is a lot more detail to the behaviors and definitions around functions, but that’s a good start. Now, let’s see how procedures work.

CREATE PROCEDURE

Procedures in PostgreSQL are really just a type of function, but with the added ability to deal with transactions internally. As close as they are to functions, there are a few differences. Here’s the syntax to create a copy of the code that inserts into the antenna table:

The only real difference here is the removal of the RETURNS clause. Otherwise, it’s identical code. You can return values from a procedure, but only through output parameters, which are defined the same way as with a function.

To see transactions in action, we’ll make a few modifications to the procedure:

The biggest change is where I redefined the LANGUAGE to plpgsql because I want to use IF statements in the code and they’re not supported in straight SQL. Other than that, you can see that I’m taking charge of the transaction to ROLLBACK if values below one is entered for the two ID parameters. That is the big difference between functions and procedures.

Then, to execute the procedure, we use the CALL statement like this:

That will add one row to the antenna table.

Conclusion

While there are some odd things, like using SELECT with an INSERT function, overall the behavior of functions and procedures is relatively straightforward and easy enough to understand.

Functions return a result set, where procedures do not, except output parameter values. Procedures can control transactions where functions do not. With that in mind, generally, I’m assuming the rule is to use functions except where you do need to control the transaction and then use procedures.

 

About the author

Grant Fritchey

See Profile

Grant Fritchey is a Data Platform MVP with over 30 years' experience in IT, including time spent in support and development. He has worked with SQL Server since 6.0 back in 1995. He has also developed in VB, VB.NET, C#, and Java. Grant has written books for Apress and Simple-Talk. Grant presents at conferences and user groups, large and small, all over the world. Grant volunteers for PASS and is on the Board of Directors as the Immediate Past President. He joined Redgate Software as a product advocate January 2011.

Grant Fritchey's contributions