Friday, June 2, 2017

[DBMS] Unit 5- Design Normalization of Database Tables


  ♦ Database Tables and Normalization:
While data normalization rules tend to increase the duplication of data, it does not introduce data redundancy, which is unnecessary duplication. Database normalization is typically a refinement process after the initial exercise of identifying the data objects that should be in the relational database, identifying their relationships and defining the tables required and the columns within each table.

♦ The Normalization Process (1NF, 2NF, 3NF):
First normal form (1NF). This is the "basic" level of database normalization, and it generally corresponds to the definition of any database, namely:
•    It contains two-dimensional tables with rows and columns.
•    Each column corresponds to a sub-object or an attribute of the object represented by the entire table.
•    Each row represents a unique instance of that subobject or attribute and must be different in some way from any other row (that is, no duplicate rows are possible).
•    All entries in any column must be of the same kind. For example, in the column labeled "Customer," only customer names or numbers are permitted.
Second normal form (2NF). At this level of normalization, each column in a table that is not a determiner of the contents of another column must itself be a function of the other columns in the table. For example, in a table with three columns containing the customer ID, the product sold and the price of the product when sold, the price would be a function of the customer ID (entitled to a discount) and the specific product.
Third normal form (3NF). At the second normal form, modifications are still possible because a change to one row in a table may affect data that refers to this information from another table. For example, using the customer table just mentioned, removing a row describing a customer purchase (because of a return, perhaps) will also remove the fact that the product has a certain price. In the third normal form, these tables would be divided into two tables so that product pricing would be tracked separately.

♦ Surrogate Key Considerations:

A surrogate key has the following characteristics:
1) It is typically an integer.
2) It has no meaning. You will not be able to know the meaning of that row of data based on the surrogate key value.
3) It is not visible to end users. End users should not see a surrogate key in a report.
Surrogate keys can be generated in a variety of ways, and most databases offer ways to generate surrogate keys. For example, Oracle uses SEQUENCE, MySQL uses AUTO_INCREMENT, and SQL Server uses IDENTITY.
 
♦ DE normalization:
Denormalization is an approach to speeding up read-oriented data retrieval performance in a relational database, where the database administrator selectively adds back specific instances of redundant data after the data structure has been normalized.

No comments:

Post a Comment