What is Normalization?Normalization is the process of designing a database table structure that organizes the data to minimize data redundancy and facilitate data integrity. The main objective of "normalizing" the data is to isolate the information so modifications to the data can be performed in just one table and then propagate throughout the database by means of related foreign keys.
There are defined "rules" to follow when going through the normalization process. When a database is properly normalized, it is considered to be in "normal form." For example, a database complying with the first rule is said to be in "First Normal Form (1NF)." Although there are several levels or rules for normalizing a database, most developers will attempt to comply only up to the "Third Normal Form (3NF)." 3NF is considered to be the highest level necessary for most database applications in use today.
First Normal Form (1NF)A database must comply with the following rules to be considered 1NF:
- Eliminate repeating groups in individual tables
- Create a separate table for each set of related data
- Identify each set of related data with a primary key
Example: The following table violates 1NF.
Second Normal Form (2NF)A database must comply with the following rules to be considered 2NF:
- All the rules for 1NF
- Create separate tables for sets of values that apply to multiple records (duplicates)
- Relate these tables with a foreign key
Example: The following table is in 1NF but violates 2NF:
Third Normal Form (3NF)A database must comply with the following rules to be considered 3NF:
- All the rules for 1NF and 2NF
- Eliminate fields that do not depend on the key
Example: The following table is in 2NF but violates 3NF:
How can I remember all that?Here are the main ideas to help you remember each normal form:
- 1NF - Eliminate repeating groups (e.g. Price1, Price2, Price3)
- 2NF - Eliminate duplicate or redundant data (e.g. multiple records with same values)
- 3NF - Ensure all fields in the table are dependent on the primary key (e.g. city is dependent on the venue and not on the artist)
How do I fix this table structure?The following tables show a properly normalized structure:
If you can think of a better example, please let me know. Thank you.
Original post date: January 26, 2016