34
8/6/2019 Chapter 1 III 3 Topic http://slidepdf.com/reader/full/chapter-1-iii-3-topic 1/34

Chapter 1 III 3 Topic

Embed Size (px)

Citation preview

Page 1: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 1/34

Page 2: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 2/34

The transactional database may have additional

tables associated with it, which contain other

information regarding the sale, such as the date

of the transaction, the customer ID number, the

ID number of the salesperson and of the branch

at which the sale occurred, and so on.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 3: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 3/34

Example 1.3 A transactional database for

 AllElec tronic s. Transac tions c an be stor ed  in a table,

w ith one record per transaction. A fragment of a

transactional database for AllElec tronic sis shown in

Figure 1.9. From the relational database point of 

view, the sales table in Figure 1.9 is a nested

relation because the attribute list of it  em IDs

c ontains a set of it ems.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 4: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 4/34

As an analyst of the AllElec tronic s d atabase,

 y ou ma y  ask  , ´S how me all the it ems purchased

by Sandy Smithµ or ´How many transactions

include item number I3?µ Answering suchqueries may require a scan of the entire

transactional database.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 5: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 5/34

1.3.4 Advanced Data and InformationSystems and Advanced Applications

With the progress of database technology,

various kinds of advanced data and information

systems have emerged and are undergoing

development to address the requirements of 

new applications.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 6: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 6/34

The new database applications include handling� Object-oriented and object-relational databases

� Spatial databases

� Time-series data and temporal data

� Text databases and multimedia databases

Heterogeneous and legacy databases� WWW

These applications require efficient data structures and

scalable methods for handling complex object structures;

variable-length records; semistructured or unstructured data;text, spatiotemporal, and multimedia data; and database

schemas with complex structures and dynamic changes.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 7: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 7/34

1.3.4 .1 Object-Relational Databases

Object-relational databases are constructed

based on an object-relational data model.

This model extends the relational model by

providing a rich data type for handling complex

objects and object orientation.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 8: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 8/34

the object-relational data model inherits the

essential concepts of object-oriented databases,

where, in general terms, each entity is considered

as an object. Following the AllElec tronic s ex ample,

obj ec ts c an be ind ivid ual emplo yees , c ustomers , or

items. Data and code relating to an object are

enc a psulat ed  into a single unit. Each object has

associated with it the following:

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 9: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 9/34

1. A set of variables that describe the objects. Thesecorrespond to attributes in the entity-relationship and

relational models.

2. A set of messages that the object can use to

communicate with other objects, or with the rest of the

database system.

3. A set of methods, where each method holds the code to

implement a message. Upon receiving a message, the

method returns a value in response. For instance, the

method for the message get  photo( emplo yee ) w ill

r etriev e and r eturn a photo of the giv en employee object.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 10: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 10/34

� Objects that share a common set of 

properties can be grouped into an object

class.

� Each object is an instance of its class.

� Object classes can be organized into

class/subclass hierarchies so that each classrepresents properties that are common to

objects in that class.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 11: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 11/34

For instance, an emplo  yee class c an c ontain

variables lik e name, add r ess , and birthd at e.

Suppose that the class, sales person , is a

subclass of the class , emplo yee. A sales person

object would inherit all of the variables

pertaining to its superclass of emplo yee.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 12: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 12/34

For data mining in object-relational systems,

techniques need to be developed for handling

complex object structures, complex data types,

class and subclass hierarchies, property

inheritance, and methods and procedures.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 13: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 13/34

1.3.4.2 Temporal Databases, SequenceDatabases, and Time-Series Databases

� A temporal database typically stores

relational data that include time-related

attributes.

� These attributes may involve several

timestamps, each having different semantics.

A sequence database stores sequences of ordered events, with or without a concrete

notion of time.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 14: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 14/34

� A time-series database stores sequences of values or

events obtained over repeated measurements of time

(e.g., hourly, daily, weekly). Examples include data

collected from the stock exchange, inventory control, and

the observation of natural phenomena (like temperatureand wind).

� Data mining techniques can be used to find the

characteristics of object evolution, or the trend of 

changes for objects in the database. Such information canbe useful in decision making and strategy planning.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 15: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 15/34

1.3.4.3 Spatial Databases

� Spatial databases contain spatial-related

information.

� Examples include geographic (map)

databases, very large-scale integration

(VLSI) or computed-aided design databases,

and medical and satellite image databases.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 16: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 16/34

� Spatial data may be represented in rasterformat, consisting of n-d imensional bit  ma ps or 

 pi xel ma ps.

� Maps can be represented in vector format,

where roads, bridges, buildings, and lakes are

represented as unions or overlays of basic

geometric constructs, such as points, lines,

polygons, and the partitions and networks

formed by these components.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 17: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 17/34

Geographic databases have numerous

applications, ranging from forestry and ecology

planning to providing public service

information regarding the location of 

telephone and electric cables, pipes, and

sewage systems.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 18: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 18/34

Geographic databases are commonly used in

vehicle navigation and dispatching systems. An

example of such a system for taxis would store

a city map with information regarding one-way

streets, suggested routes for moving from

region A to region B during rush hour, and thelocation of restaurants and hospitals, as well as

the current location of each driver.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 19: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 19/34

´W hat kind  of  d ata mining c an be perfor med  ons patial d atabases?µ 

Data mining may uncover patterns describing the

characteristics of houses located near a specified

kind of location, such as a park, for instance. Other

patterns may describe the climate of mountainous

areas located at various altitudes, or describe the

change in trend of metropolitan poverty rates

based on city distances from major highways.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 20: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 20/34

A spatial database that stores spatial objects

that change with time is called a

spatiotemporal database, from which

interesting information can be mined.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 21: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 21/34

1.3.4.4 Text Databases andMultimedia Databases

Text databases are databases that contain word

descriptions for objects. These word

descriptions are usually not simple keywords

but rather long sentences or paragraphs, such

as product specifications, error or bug reports,

warning messages, summary reports, notes, orother documents.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 22: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 22/34

Page 23: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 23/34

Multimedia databases store image, audio, and

video data. They are used in applications such

as picture content-based retrieval, voice-mail

systems, video-on-demand systems, the World

Wide Web, and speech-based user interfaces

that recognize spoken commands.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 24: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 24/34

Page 25: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 25/34

1.3.4.5 Heterogeneous Databasesand Legacy Databases

� A heterogeneous database consists of a set

of interconnected, autonomous component

databases.

� The components communicate in order to

exchange information and answer queries.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 26: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 26/34

� A legacy database is a group of  het erogeneousd atabases that  c ombines different kinds of data

systems, such as relational or object-oriented

databases, hierarchical databases, network databases, spreadsheets, multimedia databases,

or file systems.

� The heterogeneous databases in a legacy

database may be connected by intra or inter-

computer networks.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 27: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 27/34

Information exchange across such databases is difficult

because it would require precise transformation rules

from one representation to another, considering diverse

semantics.

� Data mining techniques may provide an interestingsolution to the information exchange problem by

performing statistical data distribution and correlation

analysis, and transforming the given data into higher,

more generalized, conceptual levels (such as fair  , good,

or  excellent for student grades) , from w hic h infor mation

exc hange c an then more easily be performed.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 28: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 28/34

1.3.4.6 Data Streams

Many applications involve the generation and

analysis of a new kind of data, called stream

data, where data flow in and out of an

observation platform (or window) dynamically.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 29: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 29/34

Typical ex

amples of data streams include various kinds of � scientific and engineering data,

� time-series data,

� data produced in other dynamic environments, such as� power supply,� network traffic,

� stock exchange,

� telecommunications,

� Web click streams,

video surveillance, and� weather or environment monitoring.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 30: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 30/34

� Because data streams are normally not stored inany kind of data repository, effective and

efficient management and analysis of stream

data poses great challenges to researchers.

� Mining data streams involves the efficient

discovery of general patterns and dynamic

changes within stream data. For example, wemay like to detect intrusions of a computer

network based on the anomaly of message flow

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 31: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 31/34

1.3.4.7 The World Wide Web

The World Wide Web and its associated distributedinformation services, such as Yahoo!, Google,

America Online, and AltaVista, provide rich,

worldwide, on-line information services, where data

objects are linked together to facilitate interactive

access.

� Users seeking information of interest traverse from

one object via links to another.� Such systems provide ample opportunities and

challenges for data mining.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 32: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 32/34

� Although Web pages may appear fancy andinformative to human readers, they can be

highly unstructured and lack a predefined

schema, type, or pattern.

� Thus it is difficult for computers to understand

the semantic meaning of diverse Web pages and

structure them in an organized way forsystematic information retrieval and data

mining.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL

Page 33: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 33/34

Page 34: Chapter 1 III 3 Topic

8/6/2019 Chapter 1 III 3 Topic

http://slidepdf.com/reader/full/chapter-1-iii-3-topic 34/34

Web community analysis helps identify hidden Websocial networks and communities and observe their

evolution.

� Web mining is the development of scalable and effective

Web data analysis and mining methods.

� It may help us learn about the distribution of 

information on the Web in general, characterize and

classify Web pages, and uncover Web dynamics and the

association and other relationships among different Web

pages, users, communities, and Web-based activities.

Ms. S.Bharathi MCA., MBA., M.Phil., PGDPM&LL