Functional Dependency
Consider a relation R defined over a set of attributes (A1,A2,…..An) and let X and Y be subset or equal to (A1,A2,……...An), then
X ---> Y
Y is functionally dependent on X if and only ,whenever two tuples in R agree on their X value ,they also agree on their Y value . i.e if each X value in (A1,A2,…..An) has associated with it one Y value in (A1,A2,……..An)
A functional dependency of the form
X ---> Y
where Y subset or equal to X is said to be trivial .
Candidate Key
Consider a relation R on set of n-attributes, then a candidate key K is a set of one or more fields if and only if it satisfies following :
Uniqueness : K uniquely identifies each tuple, i.e. no two tuples can have same value of K
Non - redundancy : K is non - redundant, i.e. no proper subset of K has the uniqueness property
There is no subset of the key which has above above properties and in general more than one candidate key may exist.
Example:
STUDENT( name, father-name, course, enroll-number, grade)
Candidate keys assuming they give an unique identification : {name,father-name}
Candidate keys assuming they give an unique identification : {name,father-name}
{enroll-number}
Non Candidate Key :
{name, enroll-number} violates property 2.
{ course} violates property 1.
Non Candidate Key :
{name, enroll-number} violates property 2.
{ course} violates property 1.
Prime/ Non Prime
Prime attribute : Attribute participates in a candidate key.
Non Prime attribute: Attribute does not participate in any candidate key
Example:
STUDENT( name, father-name, course, enroll-number, grade)
Where key be enroll-number, then
Prime attribute : enroll-number.
Non Prime attribute : name, father-name, course, grade
Where key be enroll-number, then
Prime attribute : enroll-number.
Non Prime attribute : name, father-name, course, grade
Primary Key
A designated candidate key is a primary key. Primary key must be fully defined i.e no unknown values or NULL values are permitted.
Example:
Let candidate key of STUDENT be rollno, the
Every student must have a rollno.
rollno is a PRIME attribute
name, fname, course, grade are NON PRIME attributes.
Every student must have a rollno.
rollno is a PRIME attribute
name, fname, course, grade are NON PRIME attributes.
Super Key
A collection of attributes which is a unique identifier, may be non minimal.
Axioms
Reflexivity rule
If A is a set of attributes and B subset or equal to A
=>A ----> B
Augmentation rule
If A ---->B holds and C is a set of attributes
=>CA---> CB
=>A ----> B
Augmentation rule
If A ---->B holds and C is a set of attributes
=>CA---> CB
Transitivity rule
If A ----> B holds and B ----> C holds
=>A ----> C
=>A ----> C
These axioms are sound and complete as they generate all other functional dependencies for a given set F of functional dependencies.
This collection of rules is known as Armstrong’s axioms
This collection of rules is known as Armstrong’s axioms
Additional Rules
Union rule
If A---->B holds and A---->C holds
=> A----> BC
If A---->B holds and A---->C holds
=> A----> BC
Decomposition rule
If A---->BC holds
=A---->B and A----->C
pseudo transitivity rule
If A---->B holds and CB----->D holds
=> AC----->D
If A---->B holds and CB----->D holds
=> AC----->D
Normalization
Two broad approaches to normalization :
1. Decomposition approach
2. Synthesis approach
1. Decomposition approach
2. Synthesis approach
Decomposition approach:
Treat all the attributes as defining the properties of one entity
Determine the functional/multi-valued dependencies.
Decompose the global entity into its components. Repeatedly decompose each entity thus obtained till no further decomposition is possible.
Synthesis approach:
Identify all the functional / multi-valued dependencies.
Group together into entities all those attributes which exhibit these dependencies.
Definition of Decomposition:
Let r be a relation on relation scheme R and let ri=Ri(r) for
i=1,2,…. then
i=1,2,…. then
r subset or equal to r1 join r2 ………..join rn
The Decomposition of the relational scheme R={A1,A2,A3…An} in its replacement by a set of relation schemes {R1,R2,R3….Rn} such that R1 join R2 join R3…..Rn = R.
Decomposition can be Lossless-Join Decomposition : All the original information can be recovered by joining
Decomposition can be Non-Lossless-Join or Lossy Decomposition : Only partial information can be recovered
Lossless-Join Decomposition
R a relation and F a set of FDs
Decompose R into R1 and R2
Decomposition is lossless if either
R1 ∩ R2 → R1 or
R1 ∩ R2 → R2 is in F+
Decompose R into R1 and R2
Decomposition is lossless if either
R1 ∩ R2 → R1 or
R1 ∩ R2 → R2 is in F+
EmpDept ( empno, empname, job, deptno, dname, dloc)
F = { deptno → dname deptno → dloc empno → empname
empno → deptno empno → job }
F = { deptno → dname deptno → dloc empno → empname
empno → deptno empno → job }
Decompose EmpDept into two relations
Emp ( empno, empname, job, deptno )
Dept( deptno, dname, dloc)
Emp ( empno, empname, job, deptno )
Dept( deptno, dname, dloc)
Decomposition is lossless-join
deptno → dname deptno → dloc
deptno → dname, dloc (union)
deptno → dname, dloc, deptno ( augmentation)
Emp ∩ Dept = { deptno } → Dept
deptno → dname, dloc (union)
deptno → dname, dloc, deptno ( augmentation)
Emp ∩ Dept = { deptno } → Dept
Decompose EmpDept into two relations
Emp ( empno, empname, job)
Ejob( deptno, dname, dloc, job)
Emp ( empno, empname, job)
Ejob( deptno, dname, dloc, job)
Emp ∩ Dept = { job } !→ Emp or Ejob
Decomposition is non-lossless-join
Decomposition is non-lossless-join
Dependency Preserving
A relation R with a set of functional dependencies F be decomposed into the relations R1, R2, ……., Rn .
The restriction of F to Ri = Fi = {FDs in F+ which include attributes only of Ri }
F| = F1 U F2 U … U Fn
Decomposition is dependency preserving if F| = F or F|+ = F+
The restriction of F to Ri = Fi = {FDs in F+ which include attributes only of Ri }
F| = F1 U F2 U … U Fn
Decomposition is dependency preserving if F| = F or F|+ = F+
EmpDept ( empno, empname, job, deptno, dname, dloc)
F = { deptno → dname deptno → dloc empno → empname
empno → deptno empno → job }
F = { deptno → dname deptno → dloc empno → empname
empno → deptno empno → job }
Decompose EmpDept into two relations
Emp ( empno, empname, job, deptno )
Femp = {empno → empname, empno → deptno, empno → job }
Dept( deptno, dname, dloc)
Fdept = {deptno → dname, deptno → dloc }
F| = Femp U Femp = F hence dependency preserving
Notion of Normalization
Normalization refers to the procedure of successive decomposition of a given relation into smaller relations. Normalization should be information preserving.
1st Normal Form
A relation R defined (A1, A2, ……., An) is said to be in 1 NF if : Values in the domain of each attribute of the relation are atomic . i.e the value is not a set of values or a list of values. Relational model expects relations to be in 1 NF.
Example:
STUDENT(name, fname, roll-no, course,grade)
Every attribute takes on a simple value.Thus it is in 1 NF.
EMPLOYEE(name, address, child)
child has attributes like child-name,age,sex.It is not atomic, thus is not in 1NF
PRODUCT(product-no, price, qty)
It is in 1 NF as every attribute has as atomic value
Enforcing the 1 NF
Replacement method: Systematically replaces all complex attributes by their constituents
Example:
For EMPLOYEE (name, address, child) define as
EMPLOYEE(name,address, child-name, cage, csex)
Decomposition method: Split the relation into two components, each of which are in 1 NF.
Example:
Example:
For EMPLOYEE define
EMPLOYEE(ename, address) and CHILD(cname, cage, asex)
Notion Of Anomaly
Knowledge of the relation that is required to perform an operation on relation without creating any data inconsistencies.
Three anomalies are:
INSERT: One can not insert the fact that a particular supplier is located in a particular city until that supplier supplies at least one part
DELETE: If one deletes a tuple for a particular supplier, the information about the supplier’s City is also lost.
UPDATE: The city value for a given supplier appears many times.This redundancy causes problems.For eg is a supplier S1 moves from LONDON to AMSTERDAM, then there is a problem either to find every tuple connecting S1 and LONDON or to produce an inconsistence result.
Partial functional Dependency
An attribute is partially functionally dependent upon another when it is functionally dependent upon it and also upon a proper subset of it.
Example:
A , B------>C
A ----->C
A ----->C
It leads to redundancy
2nd Normal Form
A relation is said to be in 2 NF if and only if it is in 1 NF and every non-key attribute is irreducibly dependent on the primary key.
It is concerned with the elimination of redundancy due to partial functional dependencies.
Redundancy causes inconsistent modifications :
Update Anomaly : In supplier if X shifts business from Delhi to Bangalore then time dependent behavior on the number of parts being supplied at that time.Number of updates performed may be less than required
Deletion Anomaly : In supplier if X stops supplying parts 1, 2 and 3 then all three rows are deleted. And thus information about city of X is lost.
Insertion Anomaly : A new supplier C starts operating from Calcutta then, one can not insert since it will cause an undefined value in the primary key
If partial functional dependency does not involve the primary key then time dependent behavior is shown
Assume (S, P#) is not the primary key then, INSERTION is okay but when first P# is defined then an UPDATE has to be performed and for subsequent P# insertion has to be done.
Again inconsistent modifications are possible.
Assume (S, P#) is not the primary key then, INSERTION is okay but when first P# is defined then an UPDATE has to be performed and for subsequent P# insertion has to be done.
Again inconsistent modifications are possible.
Eliminating Redundancy due to Partial Dependency
Eliminate partial functional dependency by having only full functional dependencies.
This is known as the 2NF norm.
An entity is in 2 NF if it is in 1 NF and if each non-prime field is fully dependent upon each candidate key
Represent the offending partial functional dependency as as separate entity.Thus split the original relation into two.
Represent the offending partial functional dependency as as separate entity.Thus split the original relation into two.
Transitive Dependency
Let A, B, C be three distinct collections of attributes of an entity and following functional dependencies hold :
A → B, B !→ A, B → C
Then we say that A → C transitively or that C is transitively functionally dependent upon A
Then we say that A → C transitively or that C is transitively functionally dependent upon A
Transitive functional dependencies give rise to redundancies and thus inconsistencies.
3rd Normal Form
Codd’s Definition
A relation is in 3NF if it satisfies 2NF and no non prime attribute of R is transitively dependent on the primary key
A relation is in 3NF if it satisfies 2NF and no non prime attribute of R is transitively dependent on the primary key
Equivalently,
A relation is in 3 NF if for every functional dependency of the form X---->A, one of the following statements is true:
A relation is in 3 NF if for every functional dependency of the form X---->A, one of the following statements is true:
1. it is a trivial FD
2. X is a super key
2. X is a super key
3. A is a prime attribute
Example:
Consider a relation Stdinf (Name, Phoneno, Course, Major, Prof. ,Grade ,Major-Elective) with following FD’s
The key of the relation is {Name Course}
The partial dependencies are caused by Name → Phoneno
Name → Major and Course → Prof. as Name, Course is the key
The only transitive dependency is
Name → Major, Major → Major-Elective.
2NF Decomposition:
R1 {Name, Phoneno, Major, Major-Elective}
R2 {Course,Prof..}
R3 {Name,Course,Grade}
R1 {Name, Phoneno, Major, Major-Elective}
R2 {Course,Prof..}
R3 {Name,Course,Grade}
3NF Decomposition:
R1-1{Name,Phoneno,Major}
R1-2{Major,Major-Elective}
R2 {Course,Prof.}
R3 {Name,Course,Grade}
R1-1{Name,Phoneno,Major}
R1-2{Major,Major-Elective}
R2 {Course,Prof.}
R3 {Name,Course,Grade}
Preserves all the Functional Dependencies existing in the original relation
BC Normal Form
Need For BCNF arises when X → A and A → B where B is a subset of X
A relation is in BCNF if whenever a functional dependency
X → A holds then, either
i) X is a super key of R, or
ii) X → A is trivial (A is subset of X)
Lossless BCNF Decomposition:
If A-->B in R violates BCNF then create
R1(A,B ), R2 (R - B)
R1(A,B ), R2 (R - B)
Case where BCNF is Lossless and Dependency Preserving:
Consider Relation Shipping(Ship,Capacity,Date,Cargo,Value) with following FD’s
Ship-->Capacity
Ship,Date-->Cargo
Cargo,Capacity-->Value
In this case first and third FD’s are nontrivial and also determinant is not super key so it is not in BCNF
Ship-->Capacity
Ship,Date-->Cargo
Cargo,Capacity-->Value
In this case first and third FD’s are nontrivial and also determinant is not super key so it is not in BCNF
(Ship,Capacity,Date,Cargo,Value)
(Capacity,Cargo,Value)
With FD(Cargo,Capacity)---->Value
(Ship,Capacity)
With FD Ship---->Capacity
(Ship,Date,Cargo)
With FD Ship ,Date---->Cargo
No comments:
Post a Comment