DBMS

Created by	BBorhan
Last edited time	@October 21, 2024 9:29 PM
Tag

Resources:

Videos :
- Bangla → https://www.youtube.com/watch?v=S4cBelRt2D8&list=PLgrAmbRAezuhRWTyOKPxvoiyNQfPo6Iv3,
- To Understand normalization, FD → https://www.youtube.com/playlist?list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y,
- Understanding Discriminator : https://www.youtube.com/watch?v=qgoXPY2DmO0&ab_channel=RituKapurClasses,
- ERD to Relational Schema : https://www.youtube.com/watch?v=OwdFzygGZqk&list=PLKDJE8BkZ4wzsaO-eutJTega6iXhDqjNC&ab_channel=OrangeOutputs,

PDF and Articles:
- Best One : https://www.poriyaan.in/paper/database-management-system-79/
- Theory → https://www.mbbcollege.in/db/notes/233.pdf, https://www.du.ac.in/du/uploads/departments/Operational Research/24042020_E-R Model.pdf
- DIU : https://elearn.daffodilvarsity.edu.bd/course/view.php?id=10248
- Have some good Example → https://kp.kiit.ac.in/pdf_files/06/4th-Sem_CSE_Database-Management-System.pdf
- ER Model Example → https://www.vssut.ac.in/lecture_notes/lecture1423726199.pdf
- ERD to Relational Schema Example : https://mebrahimii.github.io/comp440-fall2020/lecture/week_12/Lecture 16.pdf
- Example on Normalization : https://cse.poriyaan.in/topic/example-on-normalization-50872/
- Integrity Constraints : https://www.linkedin.com/pulse/integrity-constraints-dbms-sarbani-sahoo-cvobe/, https://medium.com/@reetesh043/types-of-constraints-in-dbms-73cce32aca80#:~:text=A key constraint in a,in establishing relationships between tables
- Anomalies : https://testbook.com/gate/anomalies-in-dbms-notes#:~:text=Last Updated on Jul 31,data loss%2C and incorrect data,
- Degree of relationship with example : https://afteracademy.com/blog/what-is-the-degree-of-relation-in-dbms/#:~:text=degree of relationship.-,Degree of Relationship,the degree of that relationship

ER Diagram : Entity Relationship Diagram

ER diagram shows the relationship among the entity sets and also helps to visualize the logical structure of database.

purpose of defining the relationship between the database entities.

Entity

An entity is an object that exists and is distinguishable from other objects. Example : Company, Student. (It is the logical representation of table.)

Types:
- Strong entity: which have a primary key
- Weak entity
  - doesn’t have a PK
  - dependent on a strong entity
  - A weak entity set doesn’t have a primary key, but we need a means of distinguish between all entities in the weak entity set that depend/linked with a particular strong entity.

Example:

Discriminator/Partial Key: The discriminator of a weak entity set is a set of attributes that allows the distinguish to be made. For an example, we may have same course_id multiple times in “section” table, but how could we uniquely identify each of them ? We can find them out by sec_id, semester, year, so they are discriminator for this weak entity. sec_id, semester, year can be same for each of “course_id”, but a “course_id” must not have same sec_id, semester, year for multiple times.

Strong Entity Set	Weak Entity Set
Strong entity set always has a primary key.	It does not have enough attributes to build a primary key.
It is represented by a rectangle symbol.	It is represented by a double rectangle symbol.
It contains a Primary key represented by the underline symbol.	It contains a Partial Key which is represented by a dashed underline symbol.
The member of a strong entity set is called as dominant entity set.	The member of a weak entity set called as a subordinate entity set.
In the ER diagram the relationship between two strong entity set shown by using a diamond symbol	The relationship between one strong and a weak entity set shown by using the double diamond symbol.
The connecting line of the strong entity set with the relationship is single.	The line connecting the weak entity set for identifying relationship is double.

Attributes

Attributes are the properties which define the entity type. Example: Roll_No, Name, DOB, Age.

Types of Attributes:

Type	Description
Simple Attribute/ Atomic Attribute	- Can’t be sub-divided further into any other components. - Example : Roll, reg_no
Composite Attribute	- Can further sub-divided into different components - combination of other attributes - Example: name →{ first_name, last_name}
Single-valued	- Consists of a single value for each entity. - Cannot store more than one value. Example : Roll_no, DOB, reg; Not: Phone number, because each might have more than one phone number
Multi-valued	- Can store more than one value - Represented by double ellipse/ ovals Example : Phone, email
Derived	- Can be derived from the values of other attributes - Represented by dashed ellipse/oval Example : Age. We can find out the age from DOB.
Key	- can uniquely identify an entity from an entity set - identify each tuple according to this key - represented by underlining Example : Roll, ID
Stored	- fixed value, that won’t be changed in future Example : DOB

Relationship and Mapping Cardinality

Relationships are association between or among entities.

Mapping Cardinality represents the number of entities to which another entity can be associated via a relationship set.

Types of Relationship:

One to One (1:1) : One instance in an entity (parent) refers to one and only one instance in the related entity (child). For example, a person (parent table) has only one passport and a passport/NID (child) is given to one person.

(As a rule of thumb, you should add a foreign key on the child table referencing the parent table) [Ref]

(Here, the Person table has a PersonID column that’s referenced by the PersonID column in the Passport table. The diagram shows a one-to-one relationship by having a 1 against each column reference. From : Database Guide)

One to Many (1:M): One instance in an entity (parent) refers to one or more instances in the related entity (child). For example – a customer can place many orders but a order cannot be placed by many customers.

(In this example the primary key field in the Customers table, Customer ID, is designed to contain unique values. The foreign key field in the Orders table, Customer ID, is designed to allow multiple instances of the same value.) [Ref]

Many to One (M:1) : When more than one instances of an entity is associated with a single instance of another entity then it is called many to one relationship. For example – many students can study in a single college but a student cannot study in many colleges at the same time.

Many to Many (M:N) : Many to Many relationship exists when one instance of the first entity (parent) can relate to many instances of the second entity (child), and one instance of the second entity can relate to many instances of the first entity. For example, a can be assigned to many classes and a class can be assigned to many students.

(The primary key Student ID uniquely identifies each student in the Students table. The primary key Class ID uniquely identifies each class in the Classes table. The Enrollments table contains the foreign keys Student ID and Class ID.) [Ref]

Degree of Relationships

It represents the number of entity types that associate in a relationship.

Types:

Unary : when both the participating entity type are the same, degree: 1

Binary : when exactly two entity type participates, degree: 2

Ternary: when exactly three entity type participates, degree: 3

N-ary: when exactly n entity type participates, degree: n

Participation Constraints

Participation constraints define the least number of relationship instances in which an entity must compulsorily participate.

Types:

Total/Mandatory Participation: each entity in the entity set must compulsorily participate in at least one relationship instance. It represents by double lines, minimum cardinality 1.

Partial participation: each entity in the entity set may or may not participate in the relationship instance in that relationship set. It represents by single line, cardinality 0.

Relational Model

Integrity Constraints

Integrity constraints are a set of rules defined in a database to maintain data accuracy, consistency and reliability.

Types of Integrity Constraints

Domain Constraint : This constraint is related to attributes and ensures the data entered into a column satisfies the rules defined for that column. Example: age column can never accept negative integers.

Entity Integrity Constraint : A table should have at least one primary key which will uniquely identify each row, a primary key cannot be null.

Relational Integrity Constraint: This is totally related to foreign key. This integrity is applied to establish relationship between tables in a database. two tables are related to each other through foreign key. the primary key of one table serves as the foreign key for its related table.

Key Constraint : A key constraint in a Database Management System (DBMS) refers to a set of rules applied to one or more columns in a database table to ensure the uniqueness and integrity of data. Keys are used to uniquely identify rows in a table, and they play a fundamental role in establishing relationships between tables.

Normalization

Normalization is a database design technique which organizes tables in a manner that reduces redundancy and dependency of data

It divides larger tables to smaller tables and links them using relationships.

Anomaly

Anomalies in the relational model refer to inconsistencies or errors that can arise when working with relational databases.

Types of Anomalies:

[ The table has four columns: e_id (employee's id), e_name (employee's name), e_address (employee's address), and e_dept (employee's department). Suppose the table has not been normalized and has redundant data. ]

Insert Anomaly:
1. Insert anomalies occur when certain attributes cannot be inserted into the database due to missing additional data.
1. For example, if a new employee is not assigned a department, their data cannot be inserted into the table if the department field does not allow null values.

Update Anomaly:
1. Update anomaly is something when we are trying to update some records in table, and that update is causing data inconsistency.
1. For example, if an employee's address changes and the update is made in one row but not in others, the database will contain inconsistent data

Delete anomaly
1. Delete anomaly is something when we delete some data from the table, and due to that delete operation we loss some other useful data.
1. For example, if a department is shut down and all rows containing that department are deleted, the data of employees working solely in that department will also be deleted.

Prerequisite for Normal forms:

Functional Dependency : relationship between attributes that if we are given the value of one of the attributes we can look up the value of the other. Example : s_id → s_name, s_id determines s_name.
- Functional dependency describes relationship between attributes. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B), if each value of A in R is associated with exactly one value of B in R
- Properties of FD:
  - Reflexivity: If Y is a subset of X, then X → Y
  - Augmentation: If X → Y, then XZ → YZ
  - Transitivity: If X → Y and Y → Z, then X → Z
  - Decomposition: If A→BC holds, then A→B holds and A→C holds.
  - Union rule: If A→B holds and A→C then A→BC holds
  - Pseudo transitivity rule: If A→B holds and BC→D holds, then AC→D holds.
- Types:
  - Trivial : This type occurs when a set of attributes is functionally dependent on a subset of itself. Example : {name, roll} → {name}
  - Non-Trivial: This type occurs when a set of attributes is functionally dependent on another set of attributes that is not a subset of itself. Example: {roll} → {name}
  - Multivalued: In Multivalued functional dependency, entities of the dependent set are not dependent on each other. Example : a → {b, c} and there exists no functional dependency between b and c.
  - Transitive: If X → Y and Y → Z, then X → Z
  - Partial Dependency:
    - Partial dependency in a relational database occurs when a non-prime attribute (i.e., not part of any candidate key) is functionally dependent on only a part of the candidate key, rather than the entire candidate key. [Ref]
    - Given a relation dependencies F defined on the attributes of R and K as a candidate key ,if X is a proper subset of K and if F|= X→A, then A is said to be partial dependent on K
  [Ref]

Closure Method
- Helps to find all candidate keys of a table
$R(ABCD)$
FD: $\{ A\rightarrow B, B\rightarrow C, C\rightarrow D\}$
Closure of A, $A^+ = \{A BCD\}$ , $B^+ = BCD, C^+ = CD, D^+=D$
Explanation of $A^+$ : A can determines itself. A determines B so B is here. B determines C and A determines B, by transitive property A determines C too. By the same way, A can determine D.
Hence, A can determines all attributes, so it is a Candidate Key.
$CK = A$
$\text{Prime attribute} = {A}$ , which is used in making of the candidate key.
${\text {Non-prime}} = {B, C, D}$ , which is not used in making of the candidate key.
$(AB)^+ = ABCD$ , super key.

$FD = \{ A\rightarrow B, B\rightarrow C, C\rightarrow D, D\rightarrow A, \}$
$A^+ = B^+ = C^+ = D^+ = ABCD$ , so all of them are candidate key.
$CK = {A,B,C,D}$
$\text{Prime attribute (PA)} = A, B, C, D$
${\text {Non-prime attribute (NPA)}} = \{\}$

How to find CK ?
- Find the attribute which are absent in the right side.
$R(ABCDE)$
$FD = \{A \rightarrow B, BC \rightarrow D, E \rightarrow C, D\rightarrow A\}$
$E$ is absent in right side. So, each CK must contain $E$ .
$E^+ = EC$ , so it can’t determines all attribute. We have to add another attribute from left side.
$(AE)^+ = ABECD$ , is a CK. Now we can check BE, CE, DE etc. But, there is another easy way to find it out, just replace the $A$ with $X$ when $X \rightarrow A$ . Do it recursively.
So, $CK = \{ AE, DE, BE\}$

Decompositions
- A decomposition of a relation schema R consists of replacing the relation schema by two (or more) relation schemas that each contain a subset of the attributes of R and together include all attributes in R.
- Types
  - Lossless-join Decomposition
    - A decomposition of a relation scheme $R<S,F>$ into the relation schemes $R_i(1<=i<=n)$ is said to be a lossless join decomposition or simply lossless if for every relation R that satisfies the FDs in F, the natural join of the projections or $R_i$ gives the original relation R, i.e,
      $R=∏_{R1}( R ) ∏_{R2}( R ) …….. ∏_{Rn}( R )$
      To be lossless, $R_1, ... R_n$ should have a common attribute which is a CK or Super Key of either $R_1, or ... R_n$ or both.
      Example : Let R(A,B,C) AND F={A→B}. Then the decomposition of R into R1(A,B) and R2(A,C) is lossless because the FD { A→B} is contained in R1 and the common attribute A is a key of R1.
  - Lossy Decomposition
    - if $R \text{ is a subset of } ∏_{R1}( R ) ∏_{R2}( R ) …….. ∏_{Rn}( R )$
    The common attribute of $R_1, ... R_n$ is not a CK, Super Key.
    Example: Let R(A,B,C) AND F={A→B}. Then the decomposition of R into R1(A,B) and R2(B,C) is not lossless because the common attribute B does not functionally determine either A or C. i.e, it is not a key of R1 or R 2.
  - Dependency Preserving Decomposition:
    - If we decompose $R$ into $R_1, R_2, ... R_n$ , then
    $FD_1 \space U \space FD_2 ... \space U \space FD_n = FD^+$ , then dependency is preserved.

Normal forms

Normal forms are series of guidelines that help to ensure that the design of a database is efficient, organized and free from data anomalies.

First Normal Form:
1. The value of each attribute is atomic.
1. Each record needs to be unique
1. Each column must have a unique name
1. Attribute domain should be same

Not in First normal form

In First normal form

Second Normal Form
1. Table should be in 1NF first.
1. Partial dependency is not allowed = All the non-prime attributes should be fully functional dependent on CK = $\text{NOT } (PartOfCK \text{(proper subset of CK)} \rightarrow NPA )$
$R(ABCDEF)$
$FD = \{C \rightarrow F, E \rightarrow A, EC \rightarrow D, A \rightarrow B \}$
$EC^{+} = ABCDEF$ , why EC ? because they are absent in right side.
$CK = E C, \\ NPA=A, B, D, F, \\ PA=E, C$
Now, ${C \rightarrow F, E \rightarrow A }$ where a part of CK determines a NPA. So, there are partial dependency.

Third Normal Form
- The table should be 2NF
- There should be no transitive dependency in table = NO $NPA \rightarrow NPA$ in functional dependency = $A \rightarrow B,$ $A$ candidate key or $B$ PA

[Ref]

$FD = {AB \rightarrow C, C \rightarrow D}$

$CK = AB$

$PA = A, B \\ NPA = C, D$

This is in 2nd NF. Hence, $C \rightarrow D,$ is $NPA \rightarrow NPA$ so there is transitive dependency. It is not in 3rd NF.

$FD = {AB \rightarrow CD, D \rightarrow A} \\ CK = AB \\ PA = A, B\\ NPA = C, D$

This is in 2nd NF . Hence, there is no $NPA \rightarrow NPA,$ = No transitive dependency. So, it is in 3rd NF.

BCNF/Boyce Codd Normal Form/3.5 NF
- Table should be in 3NF
- For any function dependencies, $A \rightarrow B$ , $A$ must be a super key

Fourth Normal Form
- Table should be in BCNF
- There mustn’t multi-value dependency
Multi-valued Dependency
- Table should have at least 3 columns
- For a FD A→ B, if for a single value of A, multiple values of B exists
- For a relation R(A, B, C), if there is multi-valued dependency between A and B , A and C, then B and C should be independent of each other.

Here sid and course are dependent but the Course and Skill are independent. The multivalued dependency is denoted as :

sid→ Course

sid→ Skill

Convert it to 4 NF:

Fifth Normal Form
- It should be in 4NF
- it is a lossless decomposition

(5)

(c)

(6)

(a) A weak entity set can always be made into a strong entity set by adding to its attributes the primary-key attributes of its identifying entity set. Outline what sort of redundancy will result if we do so.

The redundancy resulting from this conversion leads to the unnecessary duplication of primary key attributes in both the weak entity set and the relationship set.

The primary key of a weak entity set can be inferred from its relationship with the strong entity set. If we add primary key attributes to the weak entity set, they will be present in both the entity set and the relationship set and they have to be the same. Hence there will be redundancy.

(b) What are the different steps involved in normalizing a database? Why do you need to use functional dependencies while normalizing databases?

Normalization is a database design technique which organizes tables in a manner that reduces redundancy and dependency of data.

Step:

Unnormalized Form

BCNF

Why do we need to use functional dependency

Eliminating Redundancy: By analyzing FDs, we can break tables into smaller, related tables to reduce redundancy.

Ensuring Data Integrity: When a functional dependency is violated, data integrity problems can occur.

Detecting Partial Dependencies: 2NF

Detecting Transitive Dependencies: 3NF

Candidate Key Determination: All NF

Achieving Higher Normal Forms: To check the highest NF of a relation

(c)