Lecture 1 – Introduction to Database
What is a database? Why do we need one? What are the benefits of a database? What are the other options?
These kinds of questions might arise during this course. I shall give you a tour of databases, after which you should be able to answer these questions yourself.
Database
A database is a set of logically related data organized as a single unit.
What are data?
Data
Data are the raw facts and figures which are or may be input to some process.
For example: your name, date of birth, city, gender, height, weight, etc.
Information
Information is the processed form of data. You need to apply some process to data in order to get information. Examples of information are:
Age – calculated on the basis of date of birth
BMI (Body Mass Index) – requires two pieces of data, i.e. height and weight
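The two examples above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample person, and the choice of units (kilograms and metres, the usual units for BMI) are assumptions, not part of the lecture.

```python
from datetime import date

def age_in_years(date_of_birth: date, today: date) -> int:
    """Whole years elapsed between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / (height_m ** 2)

# Hypothetical stored data (raw facts) for one person.
person = {
    "name": "Ali",
    "date_of_birth": date(2000, 5, 20),
    "height_m": 1.75,
    "weight_kg": 70.0,
}

# Information is produced by processing the stored data.
print(age_in_years(person["date_of_birth"], date(2024, 5, 19)))  # 23
print(round(bmi(person["weight_kg"], person["height_m"]), 1))    # 22.9
```

Note that only the date of birth, height and weight are stored; age and BMI are generated whenever they are needed.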
Data vs Information
| Data | Information |
|---|---|
| Data are the raw facts and figures | Information is the processed form of data |
| Data is stored | Information is calculated/generated |
| Requires a lot of storage space | Requires less storage space |
| Data are independent of information | Information is dependent on Data |
| Difficult to gather | Easy to generate |
Table 1 – Data vs Information
Now the question arises: why do we need a database?
To answer this question, we need to learn how data were organized before databases and what the drawbacks of that approach were.
We used to store data in files, which were application dependent, redundant, inconsistent, non-flexible, not shareable, etc.
Drawbacks of File Manipulation System
Application Dependent
These files were dependent on the application program used to create them. For example, an MS Word data file (myFile.doc) is dependent on the MS Word program file (winword.exe). You need to have MS Word installed in order to open your data file.
Redundancy
The term redundancy refers to the "unnecessary duplication of data". When data is duplicated and we do not need the duplicate, it is called redundancy. The data in files were redundant. When data is duplicated on purpose, it is called replication or backup. The difference between replication and backup is that a backup copy is kept at a safe location and is used only if the original data is destroyed or damaged.
Inconsistency
The term inconsistency refers to the situation where a portion of the data is missing or incomplete. For example, if I count from 1 to 10 as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, the data is consistent. Now consider 1, 2, 3, 4, 6, 7, 9, 10. You can see that some pieces of information (5 and 8) are missing. This is called an inconsistent state.
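The counting example can be checked mechanically. The sketch below is a hypothetical helper (`missing_values` is not from the lecture) that reports which numbers of an expected range are absent:

```python
def missing_values(values, start, end):
    """Return the whole numbers in start..end (inclusive) that are absent from values."""
    present = set(values)
    return [n for n in range(start, end + 1) if n not in present]

complete = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
incomplete = [1, 2, 3, 4, 6, 7, 9, 10]

print(missing_values(complete, 1, 10))    # []
print(missing_values(incomplete, 1, 10))  # [5, 8]
```

An empty result means nothing is missing from the range; any value in the result marks a gap in the data.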
Non-Flexible
Data in files were not flexible. That means we could not easily generate whatever information was required. The data needed to be rearranged in a particular sequence each time we needed to generate a different type of information.
Non-Shareable
Files do not support sharing of data among different users. It is not possible to serve more than one person at a time. But consider scenarios where multiple people are reserving seats on an airplane, using credit cards, logging into their email accounts, etc. These facilities are not provided by a File Manipulation System.
There are three categories on the basis of which we can differentiate files:
1. Usage
2. Functionality
3. Access Mechanism
Different File Types
Usage Basis
There are three sub-types of files on the basis of usage:
1. Master File – Contains data which remains constant for a long period, e.g. Student Profile (Name, Address, Phone, etc.)
2. Transaction File – Stores input data, such as daily transactions.
3. Backup File – Contains a backup of the important pieces of information.
Functionality Basis
There are two sub-types of files on the basis of functionality:
1. Data File – Contains data, e.g. myFile.doc, DSC0001.JPG, myFile.xls, song.mp3
2. Program File – Contains the software code, e.g. WINWORD.EXE, EXCEL.EXE, myProgram.EXE, etc.
Access Mechanism
There are two sub-types of files on the basis of access mechanism:
1. Sequential/Serial – Can be accessed only in sequence, e.g. audio or video cassette
2. Random/Direct – Can be accessed randomly, no sequence required, e.g. CD, DVD, HDD
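The difference between the two access mechanisms can be illustrated with an ordinary file on disk. This is a minimal sketch under stated assumptions: the fixed record size of 8 bytes, the file name `students.dat`, the sample names and the function names are all hypothetical.

```python
import os
import tempfile

RECORD_SIZE = 8  # assumption: every record occupies exactly 8 bytes

def write_records(path, names):
    """Write each name as a fixed-width, space-padded record."""
    with open(path, "wb") as f:
        for name in names:
            f.write(name.encode("ascii").ljust(RECORD_SIZE))

def read_sequential(path, index):
    """Reach record `index` by reading every record before it, like winding a cassette."""
    with open(path, "rb") as f:
        for _ in range(index):
            f.read(RECORD_SIZE)  # earlier records must be passed over first
        return f.read(RECORD_SIZE).decode("ascii").strip()

def read_direct(path, index):
    """Jump straight to record `index` by computing its byte offset."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE).decode("ascii").strip()

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "students.dat")  # hypothetical file name
    write_records(path, ["Ali", "Sara", "Bilal", "Hina"])
    print(read_sequential(path, 2))  # Bilal
    print(read_direct(path, 2))      # Bilal
```

Both functions return the same record. The sequential version has to pass over every earlier record, while the direct version moves to the required position in one step.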