There are times when you need to join multiple tables to retrieve the result set needed. SQL allows for multiple tables to be included in the FROM
clause along with the criteria for how to join tables. In SQL Server there are a number of different ways to join tables. In this article, I will be discussing the ins and outs of joining two tables in SQL Server using the inner and outer join operators.
The Three Parts of a Join Operations
When two tables are joined using a join operation, the records of the two sets are logically separated into three different buckets. Those different buckets can be represented using a Venn diagram like the one in Figure 1.
Figure 1: Three parts of the JOIN Operation
Each section of the Venn diagram is represented by a different color: yellow, brown, and blue. The members of Set A that don’t match with SET B will be contained in the yellow bucket. Members of SET A and B that match end up in the brown bucket. Lastly, the members in the blue bucket are those members in SET B that don’t match with SET A. The Venn diagram can visually represent the resulting sets created when an inner or outer join operation is performed.
Test Data
To demonstrate how the different inner and outer join operations work, the examples use two test data tables created in the tempdb database. These two tables are SetA and SetB. If you want to follow along and execute the examples in this article, you can create the two test tables using the code in Listing 1.
Listing 1: Creating SetA and SetB
1 2 3 4 5 6 7 8 9 |
USE tempdb; GO CREATE TABLE SetA (Col1 tinyint, Col2 char(1)); INSERT INTO SetA VALUES (1, 'A'), (2, 'B'); CREATE TABLE SetB (ColA tinyint, ColB char(1), ColC char(1)); INSERT INTO SetB VALUES (1, 'A', 'B'), (2, 'C', 'D'), (3, 'E', 'F'); GO |
The two tables created when Listing 1 is run can be represented by the Venn diagram in Figure 2.
Figure 2: Tables SetA and SetB
Table SetA consists of a set with two columns (Col1
and Col2
) and two different rows. Table SetB has three columns (ColA
, ColB
, and ColC
) and contains three different rows.
INNER JOIN operator
The INNER JOIN operator is probably the most used join operation found in T-SQL code. This operator returns rows that match when two sets are joined based on one or more search criteria. The brown section in Figure 3 represents the result set of an INNER JOIN operation between the two test data tables, SetA and SetB, based on the search criterion SetA.Col2 = SetB.ColB
.
Figure 3: Venn Diagram representing an INNER JOIN
The T-SQL code in Listing 2 demonstrates one way to write an INNER
JOIN
operation to return the matching rows found in the brown section of the Venn diagram in Figure 3.
Listing 2: Performing INNER JOIN operation
1 2 3 4 5 |
USE tempdb; GO SELECT * FROM SetA INNER JOIN SetB ON SetA.Col2 = SetB.ColB; GO |
Report 1 shows the results when the code in Listing 2 is executed.
Report 1: Output produced when Listing 2 is run
The code in Listing 2 identifies the two tables, SetA and SetB, to be joined in the FROM clause, with the INNER
JOIN
operator between them, followed by an ON clause. The ON clause identifies the search criteria to be used when joining the two tables. In Listing 2, the ON clause specifies SetA.Col2 = SetB.ColB
as the search criterion. This criterion specifies that the INNER JOIN operation is to find all the rows in SetA that have corresponding matching rows in SetB, where Col2
in SetA is equal to ColB
in SetB. Let’s walk through how this join operation works.
While the technical details of how the engine joins the tables together are beyond the scope of this article, logically, SQL Server has to compare each row in SetA with each row in SetB to determine which rows match the join condition. To perform this comparison, SQL Server first finds the value for Col2
in the first row of SetA, which has a value of A. Next SQL Server scans the rows in SetB to determine if any ColB
values match the value of A. SQL Server only finds a single row in SetB with a value of A in ColB
. SQL server then repeats this compare process between SetA and SetB for each row in SetA, trying to find matches. The second row in SetA doesn’t match with any rows in SetB. Once all the comparisons between SetA and SetB are completed, the column values from matching rows between SetA and SetB are merged, and the final results of the INNER
JOIN
are returned to the calling program.
The matching search criterion used in Listing 2 uses SQL-92 join syntax. This is not the only join syntax supported by SQL Server.
Comparing join syntax between versions of SQL
In the early days of SQL, the various database vendors started implementing SQL code differently. This led to inconsistencies in how SQL code was written. In the 1980s, the American National Standards Institute (ANSI) started working on a SQL language standard. Over time the SQL standards changed the way joins could be written.
The code in Listing 2 above uses SQL-92 join syntax, which requires a join operator to be specified between the joined sets and an ON clause to identify the join criteria. Prior to SQL-92, a different syntax for joining data was used which is shown in Listing 3.
Listing 3: Performing INNER JOIN using SQL-86 join syntax
1 2 3 4 5 |
USE tempdb; GO SELECT * FROM SetA, SetB WHERE SetA.Col2 = SetB.ColB; GO |
The pre-SQL-92 syntax required a comma to be placed between tables being joined and required the join criteria be specified in the WHERE
clause. By moving the join conditions to the ON clause, the SQL-92 standards separated the join conditions from other search conditions a query might have in the WHERE
clause. SQL Server still supports the older method of doing the INNER
JOIN
operation using pre-SQL-92 syntax, but it no longer supports the outer join method using the pre-SQL-92 syntax. SQL-92 standard is now the recommended standard for doing join operations. For the rest of the join examples in this article, only the ANSI SQL-92 syntax will be shown.
The INNER
JOIN
operation returns rows with matching members in both the left and right sides of joined sets. However, there are times when rows might need to be included in result sets even when they don’t match. This is where the outer join operations can help.
LEFT OUTER JOIN
The LEFT
OUTER
JOIN
operation returns all the rows from the table on the left, even when they don’t match any of the rows on the right based on the search criteria. The Venn diagram in Figure 4 logically shows the results of a LEFT
OUTER
JOIN
operation between SetA and SetB based the search criterion of SetA.Col2 = SetB.ColB
.
Figure 4: Venn diagram representing LEFT OUTER JOIN
By comparing Figure 4 to the Venn diagram for the INNER
JOIN
found in Figure 3, the matched row of SetA, in the brown section, and the unmatched members in the yellow section of SetA have been combined into a single yellow circle. The members of this yellow circle are the members of SetA and SetB that will be returned from the LEFT
OUTER
JOIN
operations. Listing 4 contains the T-SQL code to produce the LEFT
OUTER
JOIN
results represented by Figure 4.
Listing 4: LEFT OUTER JOIN
1 2 3 4 5 |
USE tempdb; GO SELECT * FROM SetA LEFT OUTER JOIN SetB ON SetA.Col2 = SetB.ColB; GO |
Report 2 shows the output when Listing 4 is executed.
Report 2: Output from running Listing 4
The results in Report 2 have one row that matched the search criterium and one that didn’t match. The row that didn’t match the column values for the table on the right side of the LEFT
OUTER
JOIN
operation is set to NULL
. It is easy to identify those rows that did not match the search criterium by checking to see if any of the columns from the right table is null, as done in Listing 5.
Listing 5: Finding rows on the left that don’t match rows on the right
1 2 3 4 5 6 |
USE tempdb; GO SELECT * FROM SetA LEFT OUTER JOIN SetB ON SetA.Col2 = SetB.ColB WHERE SetB.ColA is null; GO |
When the code in Listing 5 is run, only the last row in Report 2 will be returned.
RIGHT OUTER JOIN
The RIGHT
OUTER
JOIN
operation is similar to the LEFT
OUTER
JOIN
operation. The difference is that all the rows in the right table are returned, whether or not they match the rows in the left table. To show this consider the code in Listing 6.
Listing 6: Right outer join example
1 2 3 4 5 6 |
USE tempdb; GO SELECT * FROM SetA RIGHT OUTER JOIN SetB ON SetA.Col2 = SetB.ColB; GO |
Report 3 shows the results when Listing 6 is executed.
Report 3: Results when Listing 6 is run
Report 3 shows that all the rows from the right table, SetB, are returned. For those rows in SetB that don’t match SetA, the NULL
value was returned for the SetA columns in the result set.
If you need to find all the rows in SetB that don’t match the search criterion of SetA.Col2 = SetB.ColB
, then the code in Listing 7 can be run.
Listing 7: Finding rows in SetB that don’t match SetA
1 2 3 4 5 6 7 |
USE tempdb; GO SELECT * FROM SetA RIGHT OUTER JOIN SetB ON SetA.Col2 = SetB.ColB WHERE SetA.Col1 is null; GO |
Report 4 shows the results produced when Listing 7 is executed.
Report 4: Results from running Listing 7
FULL OUTER JOIN
The FULL
OUTER
JOIN
operation combines both the LEFT
and RIGHT
OUTER
JOIN and the INNER
JOIN
in a single operation. Listing 8 uses the FULL
OUTER
JOIN
operator to find all the rows that match and don’t match the search criterion SetA.Col2 = SetB.ColB
.
Listing 8: Finding all matching and unmatched rows
1 2 3 4 5 |
USE tempdb; GO SELECT * FROM SetA FULL OUTER JOIN SetB ON SetA.Col2 = SetB.ColB; GO |
Report 5 shows the results when Listing 7 is run.
Report 5: Output from Listing 8
The first row in Report 5 is the only row that meets the search criterion identified in Listing 8, hence why each column has a value. The second row is the left row that doesn’t match any rows on the right; it could be found using the LEFT
OUTER
JOIN
. The third and fourth rows are the rows from the right that don’t match the rows from the left; they could be found using a RIGHT
OUTER
JOIN
operation.
The code in Listing 9 shows how to find only those rows in either the left or right table that don’t match the search criterion.
Listing 9: Using the FULL OUTER JOIN
1 2 3 4 5 6 |
USE tempdb; GO SELECT * FROM SetA FULL OUTER JOIN SetB ON SetA.Col2 = SetB.ColB WHERE SetA.Col1 is null or SetB.ColA is null; GO |
Report 6 shows the output when Listing 9 is executed.
Report 6: Output when Listing 9 is run
The rows from the left table that don’t match rows in the right table will have nulls in the columns from the right table. Inversely, rows from the right table that don’t match rows in the left table will have nulls identified for all the columns from the left table.
Using Multiple comparison operations
The comparison between two sets in a join operation can also have multiple comparison operations, like the code in Listing 10.
Listing 10: Using multiple comparison operators in a join operation
1 2 3 4 5 |
USE tempdb; GO SELECT * FROM SetA INNER JOIN SetB ON SetA.Col1 = SetB.ColA AND SetA.Col2 = SetB.ColB; GO |
When Listing 10 is executed the results in Report 7 is produced.
Report 7: Output when Listing 10 is run
In Listing 10 only the rows where both Col1
from Set1 matches ColA
from SetB and Col2
from SetA matches ColB
from SetB.
The Ins and Outs of Joining tables
Joining tables is a key skill every T-SQL programmer needs to master. In this article, I used Venn diagrams and T-SQL code examples to show on how the INNER
JOIN
, LEFT
OUTER
JOIN
, RIGHT
OUTER
JOIN
and FULL
OUTER
JOIN
operations can be used to find different sets of records when joining two sets. Hopefully, next time you need to find the rows in one of the three parts produced when joining two sets you will know which join operation and search criteria to use.
Load comments