What is important when designing a database
So if tomorrow we want to update the syllabus we have to update it for each student, which is painstaking and not logical. It makes more sense to move these fields out and associate them with the Standard table.
If you are working on OLTP applications, getting rid of derived columns would be a good thought, unless there is some pressing reason for performance. In case of OLAP where we do a lot of summations, calculations, these kinds of fields are necessary to gain performance.
In the above figure you can see how the average field is dependent on the marks and subject. This is also one form of redundancy. So for such kinds of fields which are derived from other fields, give a thought: are they really necessary? If the redundant data is calculative data, see the situation and then decide if you want to implement the 3 rd normal form.
Do not make it a strict rule that you will always avoid redundancy. If there is a pressing need for performance think about de-normalization.
In normalization, you need to make joins with many tables and in denormalization, the joins reduce and thus increase performance. OLAP projects mostly deal with multidimensional data. For instance you can see the below figure, you would like to get sales per country, customer, and date.
In simple words you are looking at sales figures which have three intersections of dimension data. For such kinds of situations a dimension and fact design is a better approach.
In simple words you can create a simple central sales fact table which has the sales amount field and it makes a connection with all dimension tables using a foreign key relationship. Many times I have come across name value tables. Must Learn Expand child menu Expand. Big Data Expand child menu Expand. Live Project Expand child menu Expand. AI Expand child menu Expand. Toggle Menu Close. Search for: Search. Factless identifiers are ideal for use as a primary key because they do not change.
A primary key that contains facts about a row — a telephone number or a customer name, for example — is more likely to change, because the factual information itself might change. A column set to the AutoNumber data type often makes a good primary key. No two product IDs are the same.
In some cases, you may want to use two or more fields that, together, provide the primary key of a table. When a primary key employs more than one column, it is also called a composite key. Now that you have divided your information into tables, you need a way to bring the information together again in meaningful ways. For example, the following form includes information from several tables. Access is a relational database management system.
In a relational database, you divide your information into separate, subject-based tables. You then use table relationships to bring the information together as needed. Consider this example: the Suppliers and Products tables in the product orders database.
A supplier can supply any number of products. It follows that for any supplier represented in the Suppliers table, there can be many products represented in the Products table.
The relationship between the Suppliers table and the Products table is, therefore, a one-to-many relationship. To represent a one-to-many relationship in your database design, take the primary key on the "one" side of the relationship and add it as an additional column or columns to the table on the "many" side of the relationship. In this case, for example, you add the Supplier ID column from the Suppliers table to the Products table.
Access can then use the supplier ID number in the Products table to locate the correct supplier for each product. The Supplier ID column in the Products table is called a foreign key. The Supplier ID column in the Products table is a foreign key because it is also the primary key in the Suppliers table.
You provide the basis for joining related tables by establishing pairings of primary keys and foreign keys. If you are not sure which tables should share a common column, identifying a one-to-many relationship ensures that the two tables involved will, indeed, require a shared column. A single order can include more than one product. On the other hand, a single product can appear on many orders.
Therefore, for each record in the Orders table, there can be many records in the Products table. And for each record in the Products table, there can be many records in the Orders table.
This type of relationship is called a many-to-many relationship because for any product, there can be many orders; and for any order, there can be many products. Note that to detect many-to-many relationships between your tables, it is important that you consider both sides of the relationship. The subjects of the two tables — orders and products — have a many-to-many relationship.
This presents a problem. To understand the problem, imagine what would happen if you tried to create the relationship between the two tables by adding the Product ID field to the Orders table. To have more than one product per order, you need more than one record in the Orders table per order. You would be repeating order information for each row that relates to a single order — resulting in an inefficient design that could lead to inaccurate data.
You run into the same problem if you put the Order ID field in the Products table — you would have more than one record in the Products table for each product. How do you solve this problem? The answer is to create a third table, often called a junction table, that breaks down the many-to-many relationship into two one-to-many relationships. You insert the primary key from each of the two tables into the third table. As a result, the third table records each occurrence or instance of the relationship.
Each record in the Order Details table represents one line item on an order. But together, the two fields always produce a unique value for each record. In the product sales database, the Orders table and the Products table are not related to each other directly. Instead, they are related indirectly through the Order Details table. The many-to-many relationship between orders and products is represented in the database by using two one-to-many relationships:.
The Orders table and Order Details table have a one-to-many relationship. Each order can have more than one line item, but each line item is connected to only one order. The Products table and Order Details table have a one-to-many relationship. Each product can have many line items associated with it, but each line item refers to only one product. From the Order Details table, you can determine all of the products on a particular order. You can also determine all of the orders for a particular product.
After incorporating the Order Details table, the list of tables and fields might look something like this:. Another type of relationship is the one-to-one relationship. For instance, suppose you need to record some special supplementary product information that you will need rarely or that only applies to a few products. Like the Products table, you use the ProductID as the primary key. The relationship between this supplemental table and the Product table is a one-to-one relationship.
For each record in the Product table, there exists a single matching record in the supplemental table. When you do identify such a relationship, both tables must share a common field. When you detect the need for a one-to-one relationship in your database, consider whether you can put the information from the two tables together in one table.
If the two tables have the same subject, you can probably set up the relationship by using the same primary key in both tables. If the two tables have different subjects with different primary keys, choose one of the tables either one and insert its primary key in the other table as a foreign key. Determining the relationships between tables helps you ensure that you have the right tables and columns. When a one-to-one or one-to-many relationship exists, the tables involved need to share a common column or columns.
When a many-to-many relationship exists, a third table is needed to represent the relationship. Once you have the tables, fields, and relationships you need, you should create and populate your tables with sample data and try working with the information: creating queries, adding new records, and so on.
Doing this helps highlight potential problems — for example, you might need to add a column that you forgot to insert during your design phase, or you may have a table that you should split into two tables to remove duplication. See if you can use the database to get the answers you want. Create rough drafts of your forms and reports and see if they show the data you expect. Look for unnecessary duplication of data and, when you find any, alter your design to eliminate it.
As you try out your initial database, you will probably discover room for improvement. Here are a few things to check for:. Did you forget any columns? If so, does the information belong in the existing tables? If it is information about something else, you may need to create another table. Create a column for every information item you need to track. Are any columns unnecessary because they can be calculated from existing fields? If an information item can be calculated from other existing columns — a discounted price calculated from the retail price, for example — it is usually better to do just that, and avoid creating new column.
Are you repeatedly entering duplicate information in one of your tables? These kinds of fields are termed as "Repeating groups". If we have to manipulate this data, the query would be complex and also I doubt performance of the queries. These kinds of columns which have data stuffed with separator's need special attention and a better approach would be to move that field to a different table and link the same with keys for better management. So now let's apply the second rule of 1st normal form "Avoid repeating groups".
You can see in the above figure I have created a separate syllabus table and then made a many-to-many relationship with the subject table. With this approach the syllabus field in the main table is no more repeating and having data separators. Rule Watch for partial dependencies. Watch for fields which are depending partially on primary keys. For instance in the above table we can see primary key is created on roll number and standard.
Now watch the syllabus field closely. Syllabus field is associated with a standard and not with a student directly roll number. Syllabus is associated with the standard in which the student is studying and not directly with the student. So if tomorrow we want to update syllabus we have to update for each student which is pain staking and not logical. It makes more sense to move these fields out and associate them with the standard table.
This rule is nothing but second normal form "All keys should depend on the full primary key and not partially". Rule Choose derived columns preciously. If you are working on OLTP applications must be getting rid of derive columns would be good thought, until there is some pressing reason of performance.
In case of OLAP where we do lot of summations, calculations these kinds of fields are necessary to gain performance. In the above figure you can see how average field is dependent on marks and subject.
This is also one of form of redundancy. So for such kind of fields which are derived from other fields give a thought are they really necessary. This rule is also termed as 3rd normal form "No columns should depend on other non-primary key columns". My personal thought is do not apply this rule blindly see the situation; it's not that redundant data is always bad.
If the redundant data is calculative data , see the situation and then decide do you want to implement the third normal form. Do not make it a strict rule that you will always avoid redundancy. If there is a pressing need of performance think about de-normalization. In normalization you need to make joins with many table and in denormalization the joins reduces and thus increasing performance. OLAP projects mostly deal with multidimensional data.
0コメント