How Does Database Design Help Organize Teachers, Lessons, and Students?
In this article we will show you how to design a database structure for an educational institution to store information about students, lecturers, classes, attendance, students’ achievements, etc.
An investment in knowledge pays the best interest.
Benjamin Franklin
In the modern world, education is omnipresent. Now more than ever before, it plays an important role in our society. It’s so important, in fact, that many of us continue our education well after finishing school or college.
We have all heard of lifelong learning, non-formal education, and workshops for all ages. These methods differ from formal education in many ways, but they also have things in common. There are classes, lessons, teachers, and students. And just as in a traditional setting, we’ll want to keep track of the class schedule, attendance data, and instructor or student achievement. How can we design a database to meet these needs? That’s what we’ll cover in this article.
Introducing Our Education Database Model
The model presented in this article enables us to store data about:
- classes/lectures
- instructors/lecturers
- students
- lecture attendance
- students’ / lecturers’ achievement
We could also use this model as a school timetable, for other group activities (swimming lessons, dance workshops) or even for one-on-one activities like tutoring. There is still a lot of space for improvements, such as storing class location data or workshop duration; we’ll cover these in upcoming articles.
Let’s get started with our basic Education database elements: the tables.
The Big Three: Student, Instructor, and Class Tables
The student, instructor, and class tables make up the core of our database.
The student table, shown above, is used to store basic data about students, but it can be expanded according to specific needs. With the exception of the three contact attributes, all attributes in the table are required:
first_name– the student’s namelast_name– the student’s surnamebirth_date– the student’s birth datecontact_phone– the student’s phone numbercontact_mobile– the student’s mobile phone numbercontact_mail– the student’s email addresscategory_id– is a reference to thecategorycatalog. With this structure, we’re limited to only one category per student. That works in most cases, but in some situations we may need room to list multiple categories. As you can see, adding a many-to-many relation that connects thestudenttable with thecategorydictionary solves this problem. In this scenario, though, we’ll need to write rather more complex queries to handle our data.
Since we’ve mentioned it, let’s go ahead and discuss the category table here.
This table is a dictionary used to group students based on certain criteria. The name attribute is the only data in the table (besides id, the primary key) and it’s mandatory. One set of values that could be stored here is the student’s employment status: “student”, “employed”, “unemployed” and “retired”. We could also use other sets based on some highly specific criteria, such as “likes yoga”, “likes hiking”, “likes bike riding” and “does not like anything”.
The instructor table contains list of all instructors/lecturers in the organization. The attributes in the table are:
first_name– the instructor’s namelast_name– the instructor’s surnametitle– the instructor’s title (if any)birth_date– the instructor’s birth datecontact_phone– the instructor’s phone numbercontact_mobile– the instructor’s mobile phone numbercontact_mail– the instructor’s email address
The title and all three contact attributes are not mandatory.
The student table and instructor table share a similar structure, but there is another possibility for organizing this information. A second approach would be to have a person table (that stores all employee and student data) and has a many-to-many relation that tells us all of the roles assigned to that person. The most important advantage to the second approach is that we’ll store data only once. If someone is an instructor in one class and a student in another, they’ll appear only once in the database, but with both roles defined.
Why did we select the two-table approach for our educational database model? Generally, students and instructors behave differently, both in real life and in our database. Because of that, it could be wise to store their data separately. We can find other ways to merge the any same-person information that appears in both tables (e.g. pair of insert/update queries based on an external id, such as a social security number or VAT number).
The class table is a catalog that contains details about all classes. We can have multiple instances of each class type. The attributes in the table are as follows (all are mandatory except end_date):
class_type_id– is a reference to theclass_typedictionary.name– is a short name of the class.description– this description is more specific than the one in theclass_typetable.start_date– the start date of the class.end_date– the end date of the class. It’s not mandatory because we might not always know the exact end date for each class in advance.completed– is a Boolean value that denotes whether all planned class activities are finished. This is handy when we’ve reached the plannedend_timefor a class but other class activities have yet to be completed.
The class_type table is a simple catalog, intended to store basic information about the lectures or classes offered to students. It could contain values like “English language (group)”, “Polish language (group)”, “Croatian language (group)”, “English language (in person)“, or “Dance lessons”. It has only two mandatory attributes – name and description, both of which need no further explanation.
The class_schedule table contains specific times for lectures and classes. All attributes in the table are mandatory. The class_id attribute is a reference to the class table, while start_time and end_time are the starting and ending times of that specific lecture.
Who’s Here? Attendance-Related Tables
The attend table stores information regarding which student attended which class and the final result. The attributes in the table are:
student_id– is a reference to thestudenttableclass_id– is a reference to theclasstableclass_enrollment_date– is the date when student started attending that classclass_drop_date– the date when the student quit the class. This attribute shall have value only if the student dropped class before the class end date. In that case, thedrop_class_reason_idattribute value also must be set.drop_class_reason_id– is a reference to thedrop_class_reasontableattendance_outcome_id– is a reference to theattendance_outcometable
All data except class_drop_date and drop_class_reason_id is required. These two will be filled if and only if a student drops the class.
The drop_attendance_reason table is a dictionary containing the various reasons why a student might drop a course. It has only one attribute, reason_text, and it is mandatory. An example set of values is might include: “illness”, “lost interest”, “does not have enough time” and “other reasons”.
The attendance_outcome table contains descriptions about student activity in a given course. The outcome_text is the only attribute in the table and it’s required. A set of possible values is: “in progress”, “completed successfully”, “completed partially” and “has not completed class”.
Who’s In Charge? Teaching-Related Tables
The teach, drop_teach_reason and teach_outcome tables use the same logic as do the attend, drop_attendance_reason and attendance_outcome tables. All these tables store data about instructors’ course-related activities.
The teach table is used to store information about which instructor is teaching which class. The attributes in the table are:
instructor_id– is a reference to theinstructortable.class_id– is a reference to theclasstable.start_date– is the date when instructor started working on that class.end_date– is the date when instructor stopped working on that class. It’s not mandatory because we can’t know in advance if the instructor will teach to the class end date.drop_teach_reason_id– is a reference to thedrop_teach_reasontable. It’s not mandatory because the instructor may not drop the class.teach_outcome_id– is a reference to theteach_outcome_reasontable.
The drop_teach_reason table is a simple dictionary. It contains a set of possible explanations why the instructor finished teaching class before its end date. There is only one mandatory attribute: reason_text. This could be “illness”, “moved to other project/job”, “quit”, or “other reason”.
The teach_outcome table describes instructor’s success on a particular course. The outcome_text is the table’s sole attribute and it’s required. Possible values for this table could be: “in progress”, “completed successfully”, “completed partially” and “has not completed teaching class”.
The student_presence table is used to store data about student presence for a specific lecture. We can assume that for each lecture the instructor will note the presence and/or absence for all students. The attributes in the table are:
student_id– is a reference to thestudenttableclass_schedule_id– is a reference to theclass_scheduletablepresent– is a Boolean marking whether the student is present on lecture or not
We could monitor students’ presence on a specific class with a query like the one that follows (assuming that @id_class contains the class id we want).
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
SELECT a.id, CONCAT(a.first_name, ' ', a.last_name) AS student_name, a.number_total, CONCAT(CONVERT(a.number_present / a.number_total * 100, DECIMAL(5,2)), '%') AS percentage, a.attendance_outcome FROM ( SELECT student.id, student.first_name, student.last_name, SUM(CASE WHEN student_presence.present = True THEN 1 ELSE 0 END) AS number_present, COUNT(DISTINCT class_schedule.id) AS number_total, attendance_outcome.outcome_text AS attendance_outcome FROM class INNER JOIN attend ON class.id = attend.class_id INNER JOIN student ON attend.student_id = student.id LEFT JOIN class_schedule ON class_schedule.class_id = class.id LEFT JOIN student_presence ON student_presence.student_id = student.id AND student_presence.class_schedule_id = class_schedule.id LEFT JOIN attendance_outcome ON attendance_outcome.id = attend.attendance_outcome_id WHERE class.id = @id_class GROUP BY student.id, student.first_name, student.last_name, attendance_outcome.outcome_text ) a |
The “instructor_presence” table uses the same logic as the “student_presence” table, but here we want to focus on the instructors. The attributes in the table are:
instructor_id– is a reference to theinstructortableclass_schedule_id– is a reference to theclass_scheduletablepresent– is a Boolean value representing if the instructor present on lecture or not
We could use the query below to monitor the instructor’s activity in class:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
SELECT a.id, CONCAT(a.first_name, ' ', a.last_name) AS instructor_name, a.number_total, CONCAT(CONVERT(a.number_present / a.number_total * 100, DECIMAL(5,2)), '%') AS percentage, a.teach_outcome FROM ( SELECT instructor.id, instructor.first_name, instructor.last_name, SUM(CASE WHEN instructor_presence.present = True THEN 1 ELSE 0 END) AS number_present, COUNT(DISTINCT class_schedule.id) AS number_total, teach_outcome.outcome_text AS teach_outcome FROM class INNER JOIN teach ON class.id = teach.class_id INNER JOIN instructor ON teach.instructor_id = instructor.id LEFT JOIN class_schedule ON class_schedule.class_id = class.id LEFT JOIN instructor_presence ON instructor_presence.instructor_id = instructor.id AND instructor_presence.class_schedule_id = class_schedule.id LEFT JOIN teach_outcome ON teach_outcome.id = teach.teach_outcome_id WHERE class.id = @id_class GROUP BY instructor.id, instructor.first_name, instructor.last_name, teach_outcome.outcome_text ) a |
Now, let’s finish up by discussing the Contact Person tables.
Who Can We Call? Contact Person Tables
In most cases, we don’t need to store emergency contact information (i.e. in case of an emergency, contact this person). However, this changes when we’re teaching children. By law or by custom, we need to have a Contact Person for each child we’re teaching. In our model tables – contact_person, contact_person_type and contact_person_student – we demonstrate how this can be done.
The contact_person table is list of people that are related to students. Of course, we don’t need to list all relatives; mostly we’ll have one or two contacts per student. This is a good way to find “who you gonna call” when the student needs or wants to leave early. The attributes in the table are:
first_name– is the contact person’s namelast_name– is the person’s surnamecontact_phone– is the person’s phone numbercontact_mobile– is the person’s mobile phone numbercontact_mail– is the person’s email address
Contact details are not mandatory, although they are very useful.
The contact_person_type table is a dictionary with a single, required attribute: type_name. Examples of values stored in this table are: “mother”, “father”, “brother”, “sister” or “uncle”.
The contact_person_student table is a many-to-many relation that connects Contact Persons and their type with students. The attributes in the table are (all are mandatory):
contact_person_id– is a reference to thecontact_persontablestudent_id– is a reference to thestudenttablecontact_person_type_id– is a reference to thecontact_person_typetable
It may be worth mentioning that this many-to-many relation connects three tables together. The attribute pair contact_person_id and student_id is used as alternate (UNIQUE) key. That way, we’ll disable duplicate entries that connect individual students with the same contact person. The attribute contact_person_type_id is not a part of the alternate key. If so, we could have multiple relations for the same contact person and the same student (using different types of relationship), and that makes no sense in real life situations.
The model presented in this article should be able to cover most common needs. Still, parts of the model could be excluded in some cases, e.g. we probably wouldn’t need the entire contact person segment if our students are adults. As I said before, we’ll be adding improvements to this in time. Feel free to add suggestions and share your experience in the discussion sections.