Database Normalization
| SWatch Reader Favorite! A poor database design can cripple an application, producing problems with data redundancy, accuracy, consistency and concurrency. Normalization reduces, if not eliminates, these problems. |
First, normal forms requires there be no multi-valued attributes, and no repeating groups. A multi-valued attribute would contain more than one value for that field in each row.
Consider the following StudentCourses table:
| StudentID | Course |
| 12345 | 3100,3600,3900 |
| 54321 | 1300,2300,1200 |
In this table, the Course field is a multi-valued attribute. There is not a single value for each field.
Now, consider this StudentCourses table:
| StudentID | Course1 | Course2 | Course3 |
| 12345 | 3100 | 3600 | 3900 |
| 54321 | 1300 | 2300 | 1200 |
The Course1, Course2, Course3 fields represent repeating groups.
The
proper way to store this data follows. First Normal form is
satisfied.
| StudentID | Course |
| 12345 | 3100 |
| 12345 | 3600 |
| 12345 | 3900 |
| 54321 | 1300 |
| 54321 | 2300 |
| 54321 | 1200 |
In the first two designs, selecting students that are enrolled in a certain course is difficult. Say I want to do the following:
Tell me all of the students enrolled in course 3100. In the first design, you'll have to pull all of the course data and parse it somehow. And in the second design, you'll have to check 3 different fields for course 3100. In the final design, a simple Select StudentID from StudentCourses where Course=3100.
Original date of publication, 02/21/2001

