Friday, June 2, 2017

[DBMS] Unit 5- Design Normalization of Database Tables


  ♦ Database Tables and Normalization:
While data normalization rules tend to increase the duplication of data, it does not introduce data redundancy, which is unnecessary duplication. Database normalization is typically a refinement process after the initial exercise of identifying the data objects that should be in the relational database, identifying their relationships and defining the tables required and the columns within each table.

♦ The Normalization Process (1NF, 2NF, 3NF):
First normal form (1NF). This is the "basic" level of database normalization, and it generally corresponds to the definition of any database, namely:
•    It contains two-dimensional tables with rows and columns.
•    Each column corresponds to a sub-object or an attribute of the object represented by the entire table.
•    Each row represents a unique instance of that subobject or attribute and must be different in some way from any other row (that is, no duplicate rows are possible).
•    All entries in any column must be of the same kind. For example, in the column labeled "Customer," only customer names or numbers are permitted.
Second normal form (2NF). At this level of normalization, each column in a table that is not a determiner of the contents of another column must itself be a function of the other columns in the table. For example, in a table with three columns containing the customer ID, the product sold and the price of the product when sold, the price would be a function of the customer ID (entitled to a discount) and the specific product.
Third normal form (3NF). At the second normal form, modifications are still possible because a change to one row in a table may affect data that refers to this information from another table. For example, using the customer table just mentioned, removing a row describing a customer purchase (because of a return, perhaps) will also remove the fact that the product has a certain price. In the third normal form, these tables would be divided into two tables so that product pricing would be tracked separately.

♦ Surrogate Key Considerations:

A surrogate key has the following characteristics:
1) It is typically an integer.
2) It has no meaning. You will not be able to know the meaning of that row of data based on the surrogate key value.
3) It is not visible to end users. End users should not see a surrogate key in a report.
Surrogate keys can be generated in a variety of ways, and most databases offer ways to generate surrogate keys. For example, Oracle uses SEQUENCE, MySQL uses AUTO_INCREMENT, and SQL Server uses IDENTITY.
 
♦ DE normalization:
Denormalization is an approach to speeding up read-oriented data retrieval performance in a relational database, where the database administrator selectively adds back specific instances of redundant data after the data structure has been normalized.

[DBMS] Unit 4- Conceptualize Design using the Entity Relationship (ER) Modeling


   ♦ The Entity Relationship (ER) model :

An entity relationship model, also called an entity-relationship (ER) diagram, is a graphical representation of entities and their relationships to each other, typically used in computing in regard to the organization of data within databases or information systems. An entity is a piece of data-an object or concept about which data is stored.

♦ Developing an ER Diagram.
An Entity Relationship Diagram (ERD) is a visual representation of different data using conventions that describe how these data are related to each other.

 

Fig:ER diagram example with entity having attributes

 ♦ Database Design Challenges:  Conflicting Goals:


Database designer often must make design compromises that are triggered by conflicting goals such as;

• Observance  to design standards

• Processing speed, and

• Information requirements

 

►Design Standards :

The database design must conform to design standards. Such standards have guided you in developing logical structures that minimize data redundancies. Without the design standards, it is nearly impossible to formulate a proper design process, to evaluate an existing design, or to trace the likely logical impart of changes in design.

Thursday, June 1, 2017

Unit 3- Conceptualize Design using the Relational Database Model (RDBMS)


  A Logical View of Data:
Develop a logical view of data to represent and access data in an enterprise. After you develop a logical view of data, you
can add it to a data service to make virtual data available for end users.
Before you develop a logical view of data, you can define the physical data objects that you want to use in a logical data
object mapping. You can also profile the physical data sources to analyze data quality.
1. Create or import a logical data model.
2. Optionally, add logical data objects to the logical data object model and define relationships between objects.
3. Create a logical data object mapping to read data from a logical data object or write data to a logical data object. A
logical data object mapping can contain transformation logic to transform the data. The transformations can
include data quality transformations to validate and cleanse the data.
4. View the output of the logical data object mapping.
 
Keys: Keys are very important part of Relational database. They are used to establish and identify relation between tables. They also ensure that each record within a table can be uniquely identified by combination of one or more fields within a table.
For example: Primary Key, Foregin key
Integrity Rules:
Integrity Rules are imperative to a good database design. Most RDBMS have these rules automatically, but it is safer to just make sure that the rules are already applied in the design. There are two types of integrity mentioned in integrity rules, entity and reference.
·         Integrity rules are needed to inform the DBMS about certain constraints in the real world.
·         Specific integrity rules apply to one specific database.
Example: part weights must be greater than zero.
·         General integrity rules apply to all databases.
Relational Set Operators.
One of the characteristics of RDBMS is that it should support all the transaction on the records in the table by means relational operations. That means it should have strong query language which supports relational algebra. There are three main relational algebras on sets – UNION, SET DIFFERENCE and SET INTERSECT.  The same is implemented in database query language using set operators.
The Data Dictionary and the System Catalog:
Data dictionary provides a detailed description of all tables found within the user/designer-created database. It contains the attribute names and characteristics for each table in the system. In short it contains metadata (Descriptive statistical information about the element of a set of data.)

System catalog contains metadata .the system catalog can be described as detailed system data dictionary that describes all objects within the database, including data about table name, the table s creator and creation date, the number of columns in each table, the data type corresponding to each column, index filenames, index creators, authorized users, and access privileges. Current relational database software generally provides only a system catalog, from which the designer s data dictionary information may be derived.
In effect, the system catalog automatically produces database documentation. As new tables are added to the database, that documentation also allows the RDBMS to check for and eliminate homonyms and synonyms.
Homonyms indicate the use of the same attribute name to label different attributes. For example, you might use C-NAME to label a customer name attribute in a CUSTOMER table and also use C-NAME to label a consultant name attribute in a CONSULTANT table.
Synonym is the opposite of homonym and indicates the use of different names to describe the same attribute.
Relationships within the Relational Database:
A heavy part of designing a relational database is dividing the data elements into related tables. Once you're ready to start working with the data, you rely on relationships between the tables to pull the data together in meaningful ways. For instance, order information is useless unless you know which customer placed a particular order. By now, you probably realize that you don't store customer and order information in the same table. Instead, you store order and customer data in two related tables and then use a relationship between the two tables to view each order and its corresponding customer information at the same time.
There are three types of relationships:
  • One-to-one: Both tables can have only one record on either side of the relationship. Each primary key value relates to only one (or no) record in the related table. They're like spouses—you may or may not be married, but if you are, both you and your spouse have only one spouse. Most one-to-one relationships are forced by business rules and don't flow naturally from the data. In the absence of such a rule, you can usually combine both tables into one table without breaking any normalization rules.
  • One-to-many: The primary key table contains only one record that relates to none, one, or many records in the related table. This relationship is similar to the one between you and a parent. You have only one mother, but your mother may have several children.
  • Many-to-many: Each record in both tables can relate to any number of records (or no records) in the other table. For instance, if you have several siblings, so do your siblings (have many siblings). Many-to-many relationships require a third table, known as an associate or linking table, because relational systems can't directly accommodate the relationship.

Data Redundancy Revisited:
Data redundancy in database means that some data fields are repeated in the database.
This data repetition may occur either if a field is repeated in two or more tables or if the field is repeated within the table.
Data can appear multiple times in a database for a variety of reasons. For example, a shop may have the same customer’s name appearing several times if that customer has bought several different products at different dates.

Disadvantages of data redundancy

  1. Increases the size of the database unnecessarily.
  2. Causes data inconsistency.
  3. Decreases efficiency of database.
  4. May cause data corruption.

Indexes:
Indexing is a data structure technique to efficiently retrieve records from the database files based on some attributes on which the indexing has been done. Indexing in database systems is similar to what we see in books.
The only minor disadvantage of using index is that it takes up a little more space than the main table. Additionally, index needs to be updated periodically for insertion or deletion of records in the main table.

Thursday, March 26, 2015

Online learning portal should be reviewed.


I have been always willing to promote such portals which will help peoples got engaged and learn, learn lot's  of things in free. It might sound like are you fucking kidding me if all the faculty that will be taught by any of the portal with sound record of attendance and assuring that student are qualified of what they have been learning, I am damm sure that such websites or such online portals which has been earning with students or their user for the education they have been taking Such websites will be no more in competition. In internet now every body should be focused in the services they have been providing or taking it's how much cheaper or if they have already been made free by any other of startups for their good marketing and getting popular in overnight will.
These days education has become so easier to get any thing we need to know but if it comes up to certification and qualification then every body with lower income got stucked, or those who don't have enough time it will be help full for them too.
Free Education is what we all want! Don't monetize Knowledge but use knowledge to get monetized.