**Functional Dependency**Consider a relation R defined over a set of attributes (A1,A2,…..An) and let X and Y be subset or equal to (A1,A2,……...An), then

X ---> Y

Y is functionally dependent on X if and only ,whenever two tuples in R agree on their X value ,they also agree on their Y value . i.e if each X value in (A1,A2,…..An) has associated with it one Y value in (A1,A2,……..An)

A functional dependency of the form

X ---> Y

where Y subset or equal to X is said to be trivial .

**Candidate Key**Consider a relation R on set of n-attributes, then a candidate key K is a set of one or more fields if and only if it satisfies following :

Uniqueness : K uniquely identifies each tuple, i.e. no two tuples can have same value of K

Non - redundancy : K is non - redundant, i.e. no proper subset of K has the uniqueness property

There is no subset of the key which has above above properties and in general more than one candidate key may exist.

**Example:**

STUDENT( name, father-name, course, enroll-number, grade)

Candidate keys assuming they give an unique identification : {name,father-name}

Candidate keys assuming they give an unique identification : {name,father-name}

{enroll-number}

Non Candidate Key :

{name, enroll-number} violates property 2.

{ course} violates property 1.

Non Candidate Key :

{name, enroll-number} violates property 2.

{ course} violates property 1.

**Prime/ Non Prime**
Prime attribute : Attribute participates in a candidate key.

Non Prime attribute: Attribute does not participate in any candidate key

**Example:**

STUDENT( name, father-name, course, enroll-number, grade)

Where key be enroll-number, then

Prime attribute : enroll-number.

Non Prime attribute : name, father-name, course, grade

Where key be enroll-number, then

Prime attribute : enroll-number.

Non Prime attribute : name, father-name, course, grade

**Primary Key**
A designated candidate key is a primary key. Primary key must be fully defined i.e no unknown values or NULL values are permitted.

**Example:**

Let candidate key of STUDENT be rollno, the

Every student must have a rollno.

rollno is a PRIME attribute

name, fname, course, grade are NON PRIME attributes.

Every student must have a rollno.

rollno is a PRIME attribute

name, fname, course, grade are NON PRIME attributes.

**Super Key**
A collection of attributes which is a unique identifier, may be non minimal.

**Axioms**
Reflexivity rule

If A is a set of attributes and B subset or equal to A

=>A ----> B

Augmentation rule

If A ---->B holds and C is a set of attributes

=>CA---> CB

=>A ----> B

Augmentation rule

If A ---->B holds and C is a set of attributes

=>CA---> CB

Transitivity rule

If A ----> B holds and B ----> C holds

=>A ----> C

=>A ----> C

These axioms are sound and complete as they generate all other functional dependencies for a given set F of functional dependencies.

This collection of rules is known as Armstrong’s axioms

This collection of rules is known as Armstrong’s axioms

**Additional Rules**
Union rule

If A---->B holds and A---->C holds

=> A----> BC

If A---->B holds and A---->C holds

=> A----> BC

Decomposition rule

If A---->BC holds

=A---->B and A----->C

pseudo transitivity rule

If A---->B holds and CB----->D holds

=> AC----->D

If A---->B holds and CB----->D holds

=> AC----->D

**Normalization**
Two broad approaches to normalization :

1. Decomposition approach

2. Synthesis approach

1. Decomposition approach

2. Synthesis approach

**Decomposition approach:**

Treat all the attributes as defining the properties of one entity

Determine the functional/multi-valued dependencies.

Decompose the global entity into its components. Repeatedly decompose each entity thus obtained till no further decomposition is possible.

**Synthesis approach:**

Identify all the functional / multi-valued dependencies.

Group together into entities all those attributes which exhibit these dependencies.

**Definition of Decomposition:**

Let r be a relation on relation scheme R and let ri=Ri(r) for

i=1,2,…. then

i=1,2,…. then

r subset or equal to r1 join r2 ………..join rn

The Decomposition of the relational scheme R={A1,A2,A3…An} in its replacement by a set of relation schemes {R1,R2,R3….Rn} such that R1 join R2 join R3…..Rn = R.

Decomposition can be Lossless-Join Decomposition : All the original information can be recovered by joining

Decomposition can be Non-Lossless-Join or Lossy Decomposition : Only partial information can be recovered

**Lossless-Join Decomposition**

R a relation and F a set of FDs

Decompose R into R1 and R2

Decomposition is lossless if either

R1 ∩ R2 → R1 or

R1 ∩ R2 → R2 is in F+

Decompose R into R1 and R2

Decomposition is lossless if either

R1 ∩ R2 → R1 or

R1 ∩ R2 → R2 is in F+

EmpDept ( empno, empname, job, deptno, dname, dloc)

F = { deptno → dname deptno → dloc empno → empname

empno → deptno empno → job }

F = { deptno → dname deptno → dloc empno → empname

empno → deptno empno → job }

Decompose EmpDept into two relations

Emp ( empno, empname, job, deptno )

Dept( deptno, dname, dloc)

Emp ( empno, empname, job, deptno )

Dept( deptno, dname, dloc)

Decomposition is lossless-join

deptno → dname deptno → dloc

deptno → dname, dloc (union)

deptno → dname, dloc, deptno ( augmentation)

Emp ∩ Dept = { deptno } → Dept

deptno → dname, dloc (union)

deptno → dname, dloc, deptno ( augmentation)

Emp ∩ Dept = { deptno } → Dept

Decompose EmpDept into two relations

Emp ( empno, empname, job)

Ejob( deptno, dname, dloc, job)

Emp ( empno, empname, job)

Ejob( deptno, dname, dloc, job)

Emp ∩ Dept = { job } !→ Emp or Ejob

Decomposition is non-lossless-join

Decomposition is non-lossless-join

Dependency Preserving

Dependency Preserving

A relation R with a set of functional dependencies F be decomposed into the relations R1, R2, ……., Rn .

The restriction of F to Ri = Fi = {FDs in F+ which include attributes only of Ri }

F| = F1 U F2 U … U Fn

Decomposition is dependency preserving if F| = F or F|+ = F+

The restriction of F to Ri = Fi = {FDs in F+ which include attributes only of Ri }

F| = F1 U F2 U … U Fn

Decomposition is dependency preserving if F| = F or F|+ = F+

EmpDept ( empno, empname, job, deptno, dname, dloc)

F = { deptno → dname deptno → dloc empno → empname

empno → deptno empno → job }

F = { deptno → dname deptno → dloc empno → empname

empno → deptno empno → job }

Decompose EmpDept into two relations

Emp ( empno, empname, job, deptno )

Femp = {empno → empname, empno → deptno, empno → job }

Dept( deptno, dname, dloc)

Fdept = {deptno → dname, deptno → dloc }

F| = Femp U Femp = F hence dependency preserving

**Notion of Normalization**

Normalization refers to the procedure of successive decomposition of a given relation into smaller relations. Normalization should be information preserving.

**1st Normal Form**
A relation R defined (A1, A2, ……., An) is said to be in 1 NF if : Values in the domain of each attribute of the relation are atomic . i.e the value is not a set of values or a list of values. Relational model expects relations to be in 1 NF.

**Example:**

STUDENT(name, fname, roll-no, course,grade)

Every attribute takes on a simple value.Thus it is in 1 NF.

EMPLOYEE(name, address, child)

child has attributes like child-name,age,sex.It is not atomic, thus is not in 1NF

PRODUCT(product-no, price, qty)

It is in 1 NF as every attribute has as atomic value

**Enforcing the 1 NF**
Replacement method: Systematically replaces all complex attributes by their constituents

**Example:**

For EMPLOYEE (name, address, child) define as

EMPLOYEE(name,address, child-name, cage, csex)

Decomposition method: Split the relation into two components, each of which are in 1 NF.

**Example:**For EMPLOYEE define

EMPLOYEE(ename, address) and CHILD(cname, cage, asex)

**Notion Of Anomaly**
Knowledge of the relation that is required to perform an operation on relation without creating any data inconsistencies.

Three anomalies are:

INSERT: One can not insert the fact that a particular supplier is located in a particular city until that supplier supplies at least one part

DELETE: If one deletes a tuple for a particular supplier, the information about the supplier’s City is also lost.

UPDATE: The city value for a given supplier appears many times.This redundancy causes problems.For eg is a supplier S1 moves from LONDON to AMSTERDAM, then there is a problem either to find every tuple connecting S1 and LONDON or to produce an inconsistence result.

**Partial functional Dependency**
An attribute is partially functionally dependent upon another when it is functionally dependent upon it and also upon a proper subset of it.

**Example:**

A , B------>C

A ----->C

A ----->C

It leads to redundancy

**2nd Normal Form**
A relation is said to be in 2 NF if and only if it is in 1 NF and every non-key attribute is irreducibly dependent on the primary key.

It is concerned with the elimination of redundancy due to partial functional dependencies.

Redundancy causes inconsistent modifications :

Update Anomaly : In supplier if X shifts business from Delhi to Bangalore then time dependent behavior on the number of parts being supplied at that time.Number of updates performed may be less than required

Deletion Anomaly : In supplier if X stops supplying parts 1, 2 and 3 then all three rows are deleted. And thus information about city of X is lost.

Insertion Anomaly : A new supplier C starts operating from Calcutta then, one can not insert since it will cause an undefined value in the primary key

If partial functional dependency does not involve the primary key then time dependent behavior is shown

Assume (S, P#) is not the primary key then, INSERTION is okay but when first P# is defined then an UPDATE has to be performed and for subsequent P# insertion has to be done.

Again inconsistent modifications are possible.

Assume (S, P#) is not the primary key then, INSERTION is okay but when first P# is defined then an UPDATE has to be performed and for subsequent P# insertion has to be done.

Again inconsistent modifications are possible.

**Eliminating Redundancy due to Partial Dependency**

Eliminate partial functional dependency by having only full functional dependencies.

This is known as the 2NF norm.

An entity is in 2 NF if it is in 1 NF and if each non-prime field is fully dependent upon each candidate key

Represent the offending partial functional dependency as as separate entity.Thus split the original relation into two.

Represent the offending partial functional dependency as as separate entity.Thus split the original relation into two.

**Transitive Dependency**
Let A, B, C be three distinct collections of attributes of an entity and following functional dependencies hold :

A → B, B !→ A, B → C

Then we say that A → C transitively or that C is transitively functionally dependent upon A

Then we say that A → C transitively or that C is transitively functionally dependent upon A

Transitive functional dependencies give rise to redundancies and thus inconsistencies.

3rd Normal Form3rd Normal Form

Codd’s Definition

A relation is in 3NF if it satisfies 2NF and no non prime attribute of R is transitively dependent on the primary key

A relation is in 3NF if it satisfies 2NF and no non prime attribute of R is transitively dependent on the primary key

Equivalently,

A relation is in 3 NF if for every functional dependency of the form X---->A, one of the following statements is true:

A relation is in 3 NF if for every functional dependency of the form X---->A, one of the following statements is true:

1. it is a trivial FD

2. X is a super key

2. X is a super key

3. A is a prime attribute

**Example:**

Consider a relation Stdinf (Name, Phoneno, Course, Major, Prof. ,Grade ,Major-Elective) with following FD’s

The key of the relation is {Name Course}

The partial dependencies are caused by Name → Phoneno

Name → Major and Course → Prof. as Name, Course is the key

The only transitive dependency is

Name → Major, Major → Major-Elective.

2NF Decomposition:

R1 {Name, Phoneno, Major, Major-Elective}

R2 {Course,Prof..}

R3 {Name,Course,Grade}

R1 {Name, Phoneno, Major, Major-Elective}

R2 {Course,Prof..}

R3 {Name,Course,Grade}

3NF Decomposition:

R1-1{Name,Phoneno,Major}

R1-2{Major,Major-Elective}

R2 {Course,Prof.}

R3 {Name,Course,Grade}

R1-1{Name,Phoneno,Major}

R1-2{Major,Major-Elective}

R2 {Course,Prof.}

R3 {Name,Course,Grade}

Preserves all the Functional Dependencies existing in the original relation

**BC Normal Form**
Need For BCNF arises when X → A and A → B where B is a subset of X

A relation is in BCNF if whenever a functional dependency

X → A holds then, either

i) X is a super key of R, or

ii) X → A is trivial (A is subset of X)

Lossless BCNF Decomposition:

If A-->B in R violates BCNF then create

R1(A,B ), R2 (R - B)

R1(A,B ), R2 (R - B)

Case where BCNF is Lossless and Dependency Preserving:

Consider Relation Shipping(Ship,Capacity,Date,Cargo,Value) with following FD’s

Ship-->Capacity

Ship,Date-->Cargo

Cargo,Capacity-->Value

In this case first and third FD’s are nontrivial and also determinant is not super key so it is not in BCNF

Ship-->Capacity

Ship,Date-->Cargo

Cargo,Capacity-->Value

In this case first and third FD’s are nontrivial and also determinant is not super key so it is not in BCNF

(Ship,Capacity,Date,Cargo,Value)

(Capacity,Cargo,Value)

With FD(Cargo,Capacity)---->Value

(Ship,Capacity)

With FD Ship---->Capacity

(Ship,Date,Cargo)

With FD Ship ,Date---->Cargo

## No comments:

## Post a Comment