How to Normalize a Database Into 2NF and 3NF (With Examples)
We present a practical example of how to normalize a table that is in the first normal form (1NF) into a set of tables in the third normal form (3NF).
We present a practical example of how to normalize a table that is in the first normal form (1NF) into a set of tables in the third normal form (3NF).
Database normalization is a very important concept in database design. Every database designer should know the rules for different normal forms. In practice, the most important normal form is the third normal form (3NF). If you want to be a database designer, you should know how to check if a table is in 3NF. You should also know how to normalize a table that’s not in 3NF into one in 3NF.
As you recall, a table is in 3NF if:
- It is in second normal form (2NF).
- All non-prime attributes are directly (non-transitively) dependent on the entire candidate key.
Typically, you normalize a table from 1NF to 3NF in two steps: first you normalize it into 2NF, then you normalize that into 3NF. In this article, we’ll show you an example of normalization from 1NF through 2NF into 3NF. We’ll also discuss problems with 1NF tables that are solved when you normalize them into 2NF.
Normalization to 2NF
We’ll work with a table named Courses1, which contains information about courses offered at a certain university: the titles of the courses, the lecturers, the departments, and the recommended textbooks for each course. The table is in the first normal form, since all column values are atomic. Here’s the table:
Courses1
| course | department | lecturer | textbook |
|---|---|---|---|
| Relational Databases | Computer Science | Jeremy Brown | Database Systems |
| Relational Databases | Computer Science | Jeremy Brown | Introduction to Databases |
| Cloud Databases | Computer Science | Jeremy Brown | Database Systems |
| Cloud Databases | Computer Science | Jeremy Brown | BigQuery Manual |
| Algorithms | Computer Science | James Cormen | Intro to Algorithms |
| Data Science | Mathematics | Jane Downing | Database Systems |
| Data Science | Mathematics | Jane Downing | Statistics |
| Calculus | Mathematics | John Smith | Intro to Calculus |
There are several problems with various operations on data in this table:
INSERT: You can’t add a course with no textbooks.UPDATE: To change the lecturer for the courseRelational Databases, you have to change two rows.DELETE: If you remove theAlgorithmscourse, you also remove the lecturerJames Cormen.
These problems come from the fact that this table is not in 2NF. Let’s see why. There are three non-trivial functional dependencies in this table:
course, textbook -> lecturer, department: Thecourseandtextbookdetermine thelecturerand thedepartment.course -> lecturer, department: Thecoursedetermines thelecturerand thedepartment.lecturer -> department: Thelecturerdetermines the department.
The candidate key in this table is the set {course, textbook}.
It is not in 2NF, because we have functional dependencies with only a part of the candidate key on the left hand-side of a dependency. The functional dependency course -> lecturer, department violates the rules for 2NF.
To normalize this, we need to get rid of the functional dependency that violates 2NF: course -> department, lecturer.
The normalization procedure usually involves decomposing a table into two or more tables that contain the same information. In this case, we’ll extract the columns of the functional dependency violating 2NF, that is, course, department, and lecturer, into a separate table, Courses2, with the candidate key {course}. The remaining column in Courses1 is the textbook column. Of course, we can’t just put it into a new table, because we’ll lose the course-textbook relationship. So, we will put both columns, course and textbook, into another table, CourseTextbooks.
To decompose a table into a set of columns, you identify these columns, take the data from those columns only, and remove the duplicates. For example, we create a table named Courses2 by first taking the columns course, department, and lecturer from Courses1. We then take the data from these columns and remove the duplicates. There are two rows with all of the values Relational Databases, Computer Science, and Jeremy Brown in Courses1, but there is only one row with all of these values in Courses2.
Here’s the final result of the split:
Courses2
| course | department | lecturer |
|---|---|---|
| Relational Databases | Computer Science | Jeremy Brown |
| Cloud Databases | Computer Science | Jeremy Brown |
| Algorithms | Computer Science | James Cormen |
| Data Science | Mathematics | Jane Downing |
| Calculus | Mathematics | John Smith |
CourseTextbooks
| course | textbook |
|---|---|
| Relational Databases | Database Systems |
| Relational Databases | Introduction to Databases |
| Cloud Databases | Database Systems |
| Cloud Databases | BigQuery Manual |
| Algorithms | Intro to Algorithms |
| Data Science | Database Systems |
| Data Science | Statistics |
| Calculus | Intro to Calculus |
You can reconstruct the same information in Courses1: simply join the data from Courses2 and CourseTextbooks.
Both of these tables are in 2NF. There are no functional dependencies within CourseTextbooks, and the candidate key is {course, textbook}. The functional dependencies in Courses1 are:
course->department,lecturer.lecturer->department.
The candidate key is {course}, and there are no functional dependencies on only a part of a key, so it is in 2NF.
The following problems have been removed in the new database:
INSERT: You can now add a course with no textbooks. Simply add the course toCourses2without adding rows toCourseTextbooks.UPDATE: You can change the lecturer for the courseRelational Databasesby changing just one row inCourses2.
This problem still persists:
DELETE: If we remove theAlgorithmscourse, we also remove the lecturerJames Cormen.
We’ll address this problem in a little.
The Algorithm for Normalizing to 2NF
Here’s the general algorithm for normalizing a table from 1NF to 2NF.
Suppose you have a table R with scheme S which is in 1NF but not in 2NF. Let A -> B be a functional dependency that violates the rules for 2NF, and suppose that the sets A and B are distinct (A ∩ B = ∅).
- Let C = S – (A U B). In other words:
- A = attributes on the left-hand side of the functional dependency.
- B = attributes on the right-hand side of the functional dependency.
- C = all other attributes.
- We can split
Rinto two parts:R1, with scheme C U A.R2, with scheme A U B.
- The original relation can be recovered as the natural join of
R1andR2: R =R1 NATURAL JOIN R2.
Normalization to 3NF
Let’s look again at Courses2. It is in 2NF but not in 3NF.
Courses2
| course | department | lecturer |
|---|---|---|
| Relational Databases | Computer Science | Jeremy Brown |
| Cloud Databases | Computer Science | Jeremy Brown |
| Algorithms | Computer Science | James Cormen |
| Data Science | Mathematics | Jane Downing |
| Calculus | Mathematics | John Smith |
We have the following functional dependencies in this table:
course->lecturer,department.lecturer->department.
The candidate key is {course}.
The table is not in 3NF, because there is a transitive functional dependency from the candidate key course to department (via course -> lecturer and lecturer -> department). The functional dependency lecturer -> department violates the rules for 3NF.
To convert Course2 into 3NF, we have to remove this functional dependency. The procedure is similar to the one we performed for 2NF: we extract the columns lecturer and department into a separate table, Lecturers. The remaining column in Courses2 is course. We need to keep the course-lecturer connection, so the other table is Courses3 with columns course and lecturer.
Here is the result:
Lecturers
| lecturer | department |
|---|---|
| Jeremy Brown | Computer Science |
| James Cormen | Computer Science |
| Jane Downing | Mathematics |
| John Smith | Mathematics |
Courses3
| course | lecturer |
|---|---|
| Relational Databases | Jeremy Brown |
| Cloud Databases | Jeremy Brown |
| Algorithms | James Cormen |
| Data Science | Jane Downing |
| Calculus | John Smith |
The problems we had in 2NF are now resolved:
INSERT: You can now add a course with no textbooks. Simply add the course toCourses3without adding rows toCourseTextbooks.UPDATE: You can change the lecturer forRelational Databasesby changing just one row inCourses3.DELETE: Removing theAlgorithmscourse does not remove the lecturerJames Cormen.
Get Started With Database Normalization
To normalize a table from 1NF to 3NF, you need to normalize it to 2NF first then to 3NF. In the normalization process, you decompose a table into multiple tables that contain the same information as the original table. The normalization process usually removes many problems related to data modification.
If you liked this article, check out other normalization articles on our blog.
If you’re a student taking database classes, make sure to create a free academic account in Vertabelo, our online ER diagram drawing tool. It allows you to draw logical and physical ER diagrams directly in your browser.
Vertabelo supports PostgreSQL, SQL Server, Oracle, MySQL, Google BigQuery, Amazon Redshift, and other relational databases. Try it out – see how easy it is to get started!