The process of breaking bad relations/tables smaller good relations
→ leads to a better DB design
Avoids updating, deleting, and inserting anomalies
The normalization process involves achieving different normal forms step by step
1NF (First Normal Form)
2NF (Second Normal Form)
3NF (Third Normal Form)
2 Tools for achieving 3 different Normal Forms
Key : what is the key/PK of a relations?
Functional Dependency
Functional Dependency
A set of attributes X functionally determines a set of attributes Y if the values of X determine a unique value for Y
Simple saying, X determines Y; written as X->Y
Possible FD: Text → Course, [what is special about Text?]
Not FDs : Teacher → Course, Teacher → Text, Couse → Text
Does Text→ Teacher? : Not possible
→ 반드시 X가 primary key일 필요는 없다. 물론 primary key면 당연히 만족한다.
Example
B → C, C → B
1NF
Composite Attributes or Attributes with different data types are not allowed in a relation
A relation must have a primary key
Nested relations are not allowed
Multivalued Attributes is not allowed
Summary
Must have a primary key
No Composite attribute
No Multivalued attribute
No Nested relationship
Issues with DBMS Design
Redundant Information in Tuples
Composite PK = (Player_ID, Item_Type)
→ Delete Anomaly : what if we delete ‘gilal9’?
→ Update Anomaly : what if we change Plyaer_Rating of ‘jdog21’?
→ Insert Anomaly : what if we insert ‘tine42’ with Player_rating = Beginner?
Delete Anomaly
Delete anomalies occur when deleting a certain row inadvertently leads to the deletion of other important data. For example, if a department is shut down and all rows containing that department are deleted, the data of employees working solely in that department will also be deleted.
Example : what if we delete ‘gilal9’?
The Player_Rating of gilal9 (i.e., Beginner) will be deleted, which is not the case for other players.
Update Anomaly
Update anomalies occur when the same data is repeated in multiple rows, and changes are made in some but not all instances. For example, if an employee's address changes and the update is made in one row but not in others, the database will contain inconsistent data.
Example : what if we change Player_Rating of ‘jdong21’?
→ The Player_Rating might display two different ratings
Insert Anomaly
Insert anomalies occur when certain attributes cannot be inserted into the database due to missing additional data. For example, if a new employee is not assigned a department, their data cannot be inserted into the table if the department field does not allow null values.
Example : what if we insert ‘tine42’ with Player_Rating = Beginner?
→ Not possible as part of PK (Item_Type) is missing
2NF
1NF + Each non-key attribute must be depended on the entire primary key : avoid Partial Dependencies
Functional Dependency :
{Player_ID, Item_Type} → {Item_Quantity} : in 2NF
{Player_ID} → {Player_Rating} : not in 2NF
How to achieve 2NF?
Draw the Functional Dependencies
Decompose it into different relations
3NF
Think about this case.
What if we UPDATE Player_Skill_Level of jdog21 from 4 to 8?
It could cause some inconsistency
Solution
2NF + Every non-key attribute depends on only the Key
→ Avoid Transitive Dependency
How to achieve 3NF?
Draw Functional Dependencies and find the Transitive Dependencies
Decompose Transitive dependency into different relations
Conclusion
1NF : All attributes depends on the key
2NF : All attributes depend on the whole key
3NF : All attributes depend on nothing but the key
Formal Normalization Jargons
Candidate key: an attribute (or multiple attributes) uniquely identifies a row
Prime Attributes: an attribute belongs to at least one candidate key
Non-prime attribute: an attribute that doesn’t belong to any candidate key