Lecture 1 – Introduction to Database



What is a database? Why do we need one? What are the benefits of a database? What are the other options?

Questions like these will arise during this course. I shall give you a tour of databases so that you will be able to answer them yourself.

Database


A database is a set of logically related data organized as a single unit.

What are data?

Data


Data are the raw facts and figures which are or may be input to some process.

For example: your name, date of birth, city, gender, height, weight, etc.

Information


Information is the processed form of data, or the data that is actually required. You need to apply some process to data in order to get information. Examples of information are:

Age – As it is calculated on the basis of Date of Birth


BMI (Body Mass Index) – As it requires two pieces of data i.e. Height and Weight
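The two examples above can be sketched in a few lines of Python. This is only an illustration: the sample person, the field names and the helper functions `age_in_years` and `body_mass_index` are assumptions made for this sketch, and BMI is taken as weight in kilograms divided by the square of height in metres.

```python
from datetime import date


def age_in_years(date_of_birth: date, today: date) -> int:
    """Return the number of completed years between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in metres squared."""
    return weight_kg / (height_m ** 2)


# Stored data (hypothetical sample values).
person = {
    "name": "Ali",
    "date_of_birth": date(2000, 5, 20),
    "height_m": 1.75,
    "weight_kg": 70.0,
}

# Information is produced by processing the stored data.
age = age_in_years(person["date_of_birth"], today=date(2024, 5, 19))
bmi = body_mass_index(person["weight_kg"], person["height_m"])

print(age)            # 23
print(round(bmi, 1))  # 22.9
```

Only the date of birth, height and weight are stored; the age and BMI are recalculated whenever they are needed.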


 


Data vs Information


Data                                 | Information
Data are the raw facts and figures   | Information is the processed form of data
Data is stored                       | Information is calculated/generated
Requires a lot of storage space      | Requires less storage space
Data are independent of information  | Information is dependent on data
Difficult to gather                  | Easy to generate

Table 1 – Data vs Information

 
 
Now the question arises: why do we need a database?

To answer this question we need to learn how data was organized before databases existed and what the drawbacks of that approach were.

We used to store data in files, which were application dependent, redundant, inconsistent, non-flexible and not shareable.

Drawbacks of File Manipulation System


Application Dependent


These files were dependent on the application program that created them. For example, the MS Word data file (myFile.doc) is dependent on the MS Word program file (winword.exe). You need to have MS Word installed in order to open your data file.

Redundancy


The term redundancy refers to the “unnecessary duplication of data”. When data is duplicated without any need, the duplication is called redundancy, and the data in files was redundant. When data is duplicated on purpose, it is called replication or backup. The difference between the two is that a backup copy is kept at a safe location and is used only if the original data is destroyed or damaged.
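As a rough illustration of redundancy, the sketch below models two separate application files as Python lists and reports which fields are stored in both. The file names, the sample record and the helper `redundant_fields` are all hypothetical and exist only for this example.

```python
# Two separate application files, modelled here as lists of records
# (hypothetical sample data).
library_file = [{"student_id": 1, "name": "Sara", "address": "12 Canal Road"}]
accounts_file = [{"student_id": 1, "name": "Sara", "address": "12 Canal Road"}]


def redundant_fields(file_a, file_b, key="student_id"):
    """Return {key value: [field names stored in both files]}."""
    index_b = {record[key]: record for record in file_b}
    duplicates = {}
    for record in file_a:
        other = index_b.get(record[key])
        if other is None:
            continue  # this record exists in only one file
        shared = sorted(f for f in record if f != key and f in other)
        if shared:
            duplicates[record[key]] = shared
    return duplicates


print(redundant_fields(library_file, accounts_file))
# {1: ['address', 'name']}
```

Every field reported here has to be typed in, stored and updated twice, which is the cost of unnecessary duplication.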

Inconsistency


The term inconsistency refers to the situation where a portion of the data is missing or incomplete. For example, if I count from 1 to 10 as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, the data is consistent. Now consider 1, 2, 3, 4, 6, 7, 9, 10. Here some pieces of information (5 and 8) are missing. This is called an inconsistent state. In file-based systems, inconsistency also arises when one copy of duplicated data is updated and the other is not, so the copies no longer agree.
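The counting example can be checked mechanically. The following is a minimal sketch; the helper name `missing_values` is an assumption made for illustration.

```python
def missing_values(values, start, end):
    """Return the numbers from start to end (inclusive) absent from values."""
    present = set(values)
    return [n for n in range(start, end + 1) if n not in present]


complete = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
partial = [1, 2, 3, 4, 6, 7, 9, 10]

print(missing_values(complete, 1, 10))  # []
print(missing_values(partial, 1, 10))   # [5, 8]
```

An empty result means the sequence is complete; anything else lists the missing pieces.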

Non-Flexible


Data in files was not flexible: we could not easily generate whatever information was required. The data had to be rearranged in a particular sequence each time we needed to generate a different type of information.

Non-Shareable


Files do not support sharing of data among different users; they cannot serve more than one person at a time. But consider scenarios where many people are reserving seats on an airplane, using credit cards, or logging into their email accounts at the same moment. A File Manipulation System does not provide these facilities.

 

There are three categories on the basis of which we can differentiate files.

1. Usage

2. Functionality

3. Access Mechanism

Different File Types


Usage Basis


There are three sub-types on the basis of usage

1. Master File – Contains data which remains constant for a long period, e.g. Student Profile (Name, Address, Phone etc.)

2. Transaction File – Contains input data, such as a record of daily transactions.

3. Backup File – Contains a backup of important pieces of information.

Functionality Basis


There are two sub-types of files on the basis of Functionality

1. Data File – Contains data, e.g. myFile.doc, DSC0001.JPG, myFile.xls, song.mp3

2. Program File – Contains the software code, e.g. WINWORD.EXE, EXCEL.EXE, myProgram.EXE etc.

Access Mechanism


There are two sub-types of files on the basis of Access Mechanism

1. Sequential/Serial – Can be accessed only in sequence, e.g. audio or video cassette

2. Random/Direct – Can be accessed randomly, no sequence required, e.g. CD, DVD, HDD
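The difference between the two access mechanisms can be sketched with an ordinary file of fixed-length records. This is an illustration under assumptions: the 16-byte record size, the file name students.dat and the helper functions are invented for this example.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length in bytes


def write_records(path, names):
    """Store each name as one fixed-length record, padded with spaces."""
    with open(path, "wb") as f:
        for name in names:
            f.write(name.encode("ascii").ljust(RECORD_SIZE))


def read_sequential(path, index):
    """Reach record `index` by reading every record that comes before it."""
    with open(path, "rb") as f:
        for _ in range(index):
            f.read(RECORD_SIZE)  # earlier records must be passed over first
        return f.read(RECORD_SIZE).decode("ascii").rstrip()


def read_direct(path, index):
    """Jump straight to record `index` using its byte offset."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE).decode("ascii").rstrip()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "students.dat")
    write_records(path, ["Ali", "Sara", "Omar", "Hina"])
    print(read_sequential(path, 2))  # Omar
    print(read_direct(path, 2))      # Omar
```

Both functions return the same record, but the sequential version does work proportional to the record's position, much like winding a cassette forward, while the direct version goes to the record in one step.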
