SWatch Reader Favorite! A poor database design can cripple an application, producing problems with data redundancy, accuracy, consistency and concurrency. Normalization reduces, if not eliminates, these problems. |
Unsure About an Acronym or Term? |
A poor database design can cripple an application, producing problems
with redundancy, inaccuracy, consistency and concurrency of
data. Normalization is a process that serves to reduce, if not
eliminate, these problems with data. Since most businesses use
3rd normal form in the logical model, I’ll take you
through 1st, 2nd, and 3rd NF’s.
First,
normal forms requires there be no multi-valued attributes, and
no repeating groups. A multi-valued attribute would contain more
than one value for that field in each row.
Consider
the following StudentCourses table:
StudentID | Course |
12345 | 3100,3600,3900 |
54321 | 1300,2300,1200 |
In
this table, the Course field is a multi-valued attribute. There is
not a single value for each field.
Now, consider this StudentCourses table:
StudentID | Course1 | Course2 | Course3 |
12345 | 3100 | 3600 | 3900 |
54321 | 1300 | 2300 | 1200 |
The Course1, Course2, Course3 fields represent repeating groups.
The
proper way to store this data follows. First Normal form is
satisfied.
StudentID | Course |
12345 | 3100 |
12345 | 3600 |
12345 | 3900 |
54321 | 1300 |
54321 | 2300 |
54321 | 1200 |
In the first two designs, selecting students that are enrolled in a
certain course is difficult. Say I want to do the following:
Tell me all of the students enrolled in course 3100.
In the first design, you’ll have to pull all of the course data and
parse it somehow. And
in the second design, you’ll have to check 3 different fields for
course 3100. In
the final design, a simple Select StudentID from StudentCourses where Course=3100.
Original date of publication, 02/21/2001