This is part of an ongoing series of post by Grant as he takes his vast knowledge of SQL Server and applies it to adding PostgreSQL and shares it with you so you can skip learn from his triumphs and mistakes. For more you can go to the Learning PostgreSQL with Grant series home page
One of the most useful constructs in SQL Server is the stored procedure. It gives you a way to do several things. First up, you can store code within the database. Next, you can parameterize queries so that you’re not hard coding or generating ad hoc queries every time you want to call them. You can put in transaction handling, multiple result sets, security and more. They truly are a useful tool. So, of course, I went straight to CREATE PROCEDURE in PostgreSQL.
I begin reading about procedures in PostgreSQL and I’m suddenly hit with a revelation. Procedures can’t return result sets. They can’t return anything except INOUT
parameter values (more on that in a bit). What the heck do we do to return results? Thankfully, Ryan Booz pointed me in the right direction, CREATE FUNCTION.
We have functions in SQL Server. They are of course scalar, table valued or multi-statement table values (AKA, sort of evil). However, they’re not that similar in form or function to SQL Server’s procedures, so what’s going on in PostgreSQL?
The Differences Between Functions and Procedures
In PostgreSQL, CREATE FUNCTION
predates CREATE PROCEDURE
, by a considerable margin. Procedures were added in PostgreSQL 11 (we’re on 16 as of this writing, and 17 is on the roadmap for later this year!). If you look at the syntax in the links already provided, you’re going to notice quite a few similarities. This is because these two constructs are similar. However, there are a few differences. Let’s talk about them.
First up, as was already mentioned, a procedure in PostgreSQL cannot return a result set. A function can return a result set. In fact, functions can return multiple result sets. Further, this being PostgreSQL, those result sets are objects, which means you can return, not simply a set of columns and rows (tuples), but a table, a cursor (the good kind, and yes, there’s a good kind of cursor, at least in PostgreSQL, it’ll take another article to explain it), other things. Both procedures and functions do allow for output parameters, what they call INOUT
parameters. So, why do we need procedures?
The next big difference between procedures and functions is that functions cannot control transactions. Yeah, you read that correctly, you can’t control transactions within functions. Now, don’t get this wrong. Functions are absolutely subject to transactions. For example, if you write a function that inserts data into a table, and there’s an error, the function will get rolled back. If there were multiple statements within the function, they all get rolled back. The function takes part in the greater transaction that called the function. If that transaction gets committed, so does any side effects that occurred in the function. If that transaction is rolled back, so are all the function calls within it. Procedures on the other hand, you can COMMIT
and ROLLBACK
transactions from within the procedure.
The final major difference is how they’re called. A function is called from a SELECT
statement. 50% of the SQL Server people reading this just groaned out loud. The other 50% just fist-pumped because they’ve always wanted the ability to JOIN
procedures. A procedure is called from, are you read, the CALL
statement. The biggest difference between these is the implications of using a SELECT
versus a CALL
, but really that’s about it. Oh, I know, it can be a huge difference in the long run, but for our (remember our motto) introductory purposes, it’s a relatively benign process.
There are several other minor differences, especially in how they get used, but these are the three main differentiators. In short, if you want a result set, beyond just a single row from INOUT
parameters, you need to use a function. If you need to control transactions within the code, and you’re not returning a result set, you need to use a procedure.
Note: In the sample database I’ve created as a part of this ongoing series, I created a couple of schemas and organized my tables within them. If you wish to execute the code or look at the data structures, the code is in my ScaryDBA/LearningPostgreSQL
repository here. The objects and database you will need can be created/reset using the CreateDatabase.sql script, then adding sample data using the SampleData.sql script. The rest of the code needed for this article is in the 12_FunctionsAndProcedures
folder.
Now, let’s see these two things in action.
CREATE FUNCTION
The core syntax for creating a function isn’t all that surprising. First, I’ll create a function that returns a result set. Second, I’m going to create a function that inserts a row into a table. I’m using this as an example, rather than returning a result set, so that I can use the exact same code within the procedure example below. This allows us to compare apples to apples.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
CREATE OR REPLACE FUNCTION radio.radiodetails (radioid INT) RETURNS TABLE(radio_id int, radio_name varchar(100)) AS $$ SELECT r.radio_id, r.radio_name FROM radio.radios AS r WHERE r.radio_id = $1 $$ LANGUAGE SQL; |
The core syntax is of course CREATE OR REPLACE FUNCTION
, and from there I’ll break it down a bit. Obviously, I supplied a schema and a name for the function, radio.radiodetails
. If you don’t supply a schema, it’ll go to whichever one you’re defaulting to at the moment.
Next, I supplied a parameter, radioid
. Within the code, in this example, I didn’t use the name. Instead, I referenced the ordinal position of the parameter using $1. In typical production code that would need to be maintained, I wouldn’t use the ordinal position like this because it’s less clear than using the parameter names.
I could modify the code to look like this:
…r.radio_id = radioid …
It is possible to skip the name completely. I could define just the data type of the parameter like this:
…FUNCTION radio.radiodetails (INT)…
Then, I would be forced to use the ordinal position of the parameter in the code. Again, this makes for unclear code, so it’s certainly not a practice I would follow.
By default, parameters are for input to the function. However, you can have output parameters, you just have to define them as such. You can also have parameters that are both input and output. I could completely rewrite the function above like this to return the same information:
1 2 3 4 5 6 7 8 9 10 11 12 |
CREATE OR REPLACE FUNCTION radio.radiodetails2 (INOUT radioid INT, OUT radioname varchar(100)) AS $$ SELECT r.radio_id, r.radio_name FROM radio.radios AS r WHERE r.radio_id = radioid $$ LANGUAGE SQL; |
I had to give it a new name because you can’t change the RETURN
type of a function. I would have had to drop the function and then create it again from scratch.
In the original function, I defined the RETURN
as a TABLE
and then supplied the table definition. If the query in question is returning multiple rows, this is the way to get it done.
Finally, I defined the language as SQL. You can also make functions in c, internal, or user defined types. Most of the time, queries being queries, it’s likely to be SQL.
To execute the function, I can just call it in a SELECT
statement:
SELECT * FROM radio.radiodetails(2);
This will return the parameter values as a result set:
1 2 3 |
radio_id|radio_name| -------+----------+ 2|FT3D | |
And if I wanted to modify data in the database, I could create a function like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
CREATE OR REPLACE FUNCTION radio.addantenna (antennaname varchar(100), manufacturerid int, connectortypeid int) RETURNS VOID AS $$ INSERT INTO radio.antenna (antenna_name, manufacturer_id, connectortype_id) VALUES (antennaname, manufacturerid, connectortypeid) $$ LANGUAGE SQL; |
The only thing I did here is have it set up as RETURNS VOID
since there is no output when this function is called. However, calling the function does result in, to my SQL Server eyeballs, some odd syntax:
1 2 3 4 5 6 |
SELECT * FROM radio.addantenna('Stubby', 4, 2); |
You may also see an output from this query like the following (this is the output when calling this function from DBeaver):
1 2 3 |
addantenna| ----------+ | |
While that is the right way to reference the function, calling SELECT
in order to INSERT
data just feels weird.
There is a lot more detail to the behaviors and definitions around functions, but that’s a good start. Now, let’s see how procedures work.
CREATE PROCEDURE
Procedures in PostgreSQL are really just a type of function, but with the added ability to deal with transactions internally. As close as they are to functions, there are a few differences. Here’s the syntax to create a copy of the code that inserts into the antenna table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
CREATE OR REPLACE PROCEDURE radio.newantenna ( antennaname varchar(100), manufacturerid int, connectortypeid int) AS $$ INSERT INTO radio.antenna (antenna_name, manufacturer_id, connectortype_id) VALUES (antennaname, manufacturerid, connectortypeid) $$ LANGUAGE SQL; |
The only real difference here is the removal of the RETURNS
clause. Otherwise, it’s identical code. You can return values from a procedure, but only through output parameters, which are defined the same way as with a function.
To see transactions in action, we’ll make a few modifications to the procedure:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
CREATE OR REPLACE PROCEDURE radio.newantenna ( antennaname varchar(100), manufacturerid int, connectortypeid int) AS $$ BEGIN IF manufacturerid < 1 THEN ROLLBACK; ELSEIF connectortypeid < 1 THEN ROLLBACK; ELSE INSERT INTO radio.antenna (antenna_name, manufacturer_id, connectortype_id) VALUES (antennaname, manufacturerid, connectortypeid); END IF; END $$ LANGUAGE plpgsql; |
The biggest change is where I redefined the LANGUAGE
to plpgsql because I want to use IF statements in the code and they’re not supported in straight SQL. Other than that, you can see that I’m taking charge of the transaction to ROLLBACK
if values below one is entered for the two ID parameters. That is the big difference between functions and procedures.
Then, to execute the procedure, we use the CALL statement like this:
1 |
CALL radio.newantenna('2 Meter short', 2, 2); |
That will add one row to the antenna table.
Conclusion
While there are some odd things, like using SELECT
with an INSERT
function, overall the behavior of functions and procedures is relatively straightforward and easy enough to understand.
Functions return a result set, where procedures do not, except output parameter values. Procedures can control transactions where functions do not. With that in mind, generally, I’m assuming the rule is to use functions except where you do need to control the transaction and then use procedures.
Load comments