Database structure & file organization
LIS 670
Bair-Mundy
Electronic databases
Governmental databases
Commercial databases
OPAC
Specialty databases
ERIC
EBSCO Host
Trust Territory
Online Public Access Catalog
The online catalog
Bibliographic data: title, publisher, date of publication, extent of item
Name authority data: Caroll, Lewis; Karol, Luis; Kerroll, L.; Dodgson, Charles
Holdings data: copies of a book; issues of a journal
Circulation data: who checked out what; for how long; patron data
Entity-relationship diagram
NAME
  Key: NAME_ID
  Attributes: PERS_NAM, CRP_NAM, CNF_NAM, AD_P_NAM
SUBJECT
  Key: SUBJ_ID
  Attributes: SUBJECT, USE_FOR, REL_TERM, NAR_TERM
BIBL-ENTITY
  Key: LCN
  Attributes: ISBN, PERS_NAM (FK), TITLE, PLACE_PUBL, PUBLISHER, SUBJECT (FK)
CHKED_OUT_ITEM
  Key: TRANS_ID
  Attributes: TR_PA_ID (FK), TR_DATE, DUE_DATE, TR_HLD_ID (FK)
PATRON
  Key: PATRON_ID
  Attributes: PAT_NAME, STREET_1, STREET_2, CITY, STATE
HOLDING
  Key: CP_NO, HLD_LCN (FK)
  Attributes: LOCATION, CALL_NO, HLD_DTE (FK)
Bibliographic Entity (Resource)
BIBL-ENTITY: abstraction of a book, journal, video, photograph, or artifact
  Unique ID (Primary Key): LCN
  Attributes: ISBN, PERS_NAM (FK), TITLE, EDITION, PLACE_PUBL, PUBLISHER, DATE_PUBL, EXTENT_OF_ITEM, SERIES, SUBJECT (FK)
Name Entity
NAME
  Unique ID: NAME_ID
  Attributes: PERS_NAM, CRP_NAM, CNF_NAM, AD_P_NAM
Shown alongside BIBL-ENTITY, which carries PERS_NAM (FK).
Subject Entity
SUBJECT
  Unique ID: SUBJ_ID
  Attributes: SUBJECT, USE_FOR, REL_TERM, NAR_TERM
Shown alongside NAME and BIBL-ENTITY, which carries SUBJECT (FK).
Additional entities
Added to NAME, SUBJECT, and BIBL-ENTITY:
CHKED_OUT_ITEM
  Key: TRANS_ID
  Attributes: TR_PA_ID (FK), TR_DATE, DUE_DATE, TR_HLD_ID (FK)
PATRON
  Key: PATRON_ID
  Attributes: PAT_NAME, STREET_1, STREET_2, CITY, STATE
HOLDING
  Key: CP_NO, HLD_LCN (FK)
  Attributes: LOCATION, CALL_NO, HLD_DTE (FK)
Entity Relationships
(Entity-relationship diagram of NAME, SUBJECT, BIBL-ENTITY, CHKED_OUT_ITEM, PATRON, and HOLDING; attributes as listed above.)
Foreign Keys
BIBL-ENTITY: PERS_NAM (FK), SUBJECT (FK)
CHKED_OUT_ITEM: TR_PA_ID (FK), TR_HLD_ID (FK)
HOLDING: HLD_LCN (FK), HLD_DTE (FK)
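A minimal sketch of what a foreign key does, using Python's built-in sqlite3 module. It is only an illustration under assumptions: the table name BIBL_ENTITY (underscore instead of hyphen), the choice of NAME_ID as the target of PERS_NAM (FK), and the helper functions are not taken from the slides.

```python
import sqlite3

# Hypothetical two-table schema. The REFERENCES target is an assumption
# inferred from the attribute names on the slide.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE NAME (NAME_ID INTEGER PRIMARY KEY, PERS_NAM TEXT)")
conn.execute(
    "CREATE TABLE BIBL_ENTITY ("
    " LCN INTEGER PRIMARY KEY,"
    " TITLE TEXT NOT NULL,"
    " PERS_NAM INTEGER REFERENCES NAME(NAME_ID))"
)

conn.execute("INSERT INTO NAME VALUES (19568, 'Eco, Umberto')")
conn.execute("INSERT INTO BIBL_ENTITY VALUES (1, 'The name of the rose', 19568)")


def insert_is_rejected(lcn, title, name_id):
    """Return True if the database refuses the row (no such NAME_ID)."""
    try:
        conn.execute(
            "INSERT INTO BIBL_ENTITY VALUES (?, ?, ?)", (lcn, title, name_id)
        )
    except sqlite3.IntegrityError:
        return True
    return False


def author_of(lcn):
    """Follow the foreign key from a bibliographic record to its NAME row."""
    row = conn.execute(
        "SELECT n.PERS_NAM FROM BIBL_ENTITY b"
        " JOIN NAME n ON b.PERS_NAM = n.NAME_ID WHERE b.LCN = ?",
        (lcn,),
    ).fetchone()
    return row[0] if row else None
```

With foreign keys enabled, a bibliographic record can only point at a name record that actually exists, which is what keeps the two tables consistent.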
Relational databases
The logical view: data and data relationships in the database.
Main Author: Eco, Umberto.
Uniform Title: Nome della rosa. English
Title: The name of the rose / Umberto Eco ; translated from the Italian by William Weaver.
Publisher: San Diego : Harcourt Brace, 1994.
Description: 1st Harvest ed. 536 p. ill. ; 21 cm.
Series: Harvest in translation; A Harvest book
Subject(s): Historical fiction; Detective and mystery stories
Call Number: PQ4865 .C6 N613 1994
Status: Not Checked Out
Relational databases
Main Author: Eco, Umberto.
Uniform Title: Nome della rosa. English
Title: The name of the rose / Umberto Eco ; translated from the Italian by William Weaver.
Publisher: San Diego : Harcourt Brace, 1994.
...
Name authority file:
  NAF 13789  Norton, Peter
  NAF 29563  Frost, Robert
  NAF 19568  Eco, Umberto
Main Author as stored in the bibliographic record: NAF 19568
Relational databases
Three bibliographic records point to name authority record NAF 25793:
  Title: Try your best / by Jen Dalyrimple. …
  Title: The name of my nose / by Jen Dalyrimple. …
  Title: My name and fame / by Jen Dalyrimple. …
Authority record as entered: NAF 25793  Dalyrimple, Jan
  All three records display Author: Dalyrimple, Jan. Oops, typo!
Authority record corrected: NAF 25793  Dalyrimple, Jen; Dalyrimple, J.S.
  All three records now display Author: Dalyrimple, Jen.
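The typo example can be sketched in a few lines of Python. This is an illustration only: the dictionaries and the display_authors helper are hypothetical stand-ins for the name authority file and the bibliographic records, not the catalog's actual storage.

```python
# Hypothetical data: one authority record shared by three bibliographic records.
name_authority = {25793: "Dalyrimple, Jan"}  # typo as first entered

bib_records = [
    {"title": "Try your best", "name_id": 25793},
    {"title": "The name of my nose", "name_id": 25793},
    {"title": "My name and fame", "name_id": 25793},
]


def display_authors(records, authority):
    """Look up each record's author through its authority record number."""
    return [authority[record["name_id"]] for record in records]


before = display_authors(bib_records, name_authority)

# One correction in the authority file...
name_authority[25793] = "Dalyrimple, Jen"

# ...changes what every linked bibliographic record displays.
after = display_authors(bib_records, name_authority)
```

Because the name is stored once and referenced by number, a single correction reaches every record that uses it.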
Cataloging record
Main Author: Eco, Umberto.
Uniform Title: Nome della rosa. English
Title: The name of the rose / Umberto Eco ; translated from the Italian by William Weaver.
Publisher: San Diego : Harcourt Brace, 1994.
Description: 1st Harvest ed. 536 p. ill. ; 21 cm.
Series: Harvest in translation; A Harvest book
Subject(s): Historical fiction; Detective and mystery stories
Call Number: PQ4865 .C6 N613 1994
Status: Not Checked Out
SUBJECT records
  12340005  Historical fiction
  12341117  Detective and mystery stories
Subject(s) as stored in the cataloging record: 12340005, 12341117
Subject heading change
Title: This is how we flow : rhythm in Black cultures / edited by Angela M.S. Nelson.
Publisher: Columbia, S.C. : University of South Carolina Press, c1999.
Description: vi, 160 p. : ill., maps, music ; 24 cm.
Subject(s): Afro-Americans.
SUBJECT record
  Before: 12342001  Afro-Americans
  After:  12342001  African Americans
Tuples
BIBL-ENTITY (each row is one tuple)
LCN    TITLE                      PLACE_PUBL   PUBLISHER
00001  My life of crime / …       London       Bugsy Press
00002  Gone with the wind / …     Athens, Ga.  Tara Pub. Co.
00003  Life after library school  Paris        Beau Gens
00004  Drudgery made fun /…       San Diego    Bowring Press
00005  Mudpies…                   Fresno       Earth Press
Data descriptions for a table
BIBL-ENTITY
Attribute name  Type     Size  Req'd  Data updateable  Attribute
LCN             Integer  5     Yes    False            Counter
ISBN            Text     20    No     True             Fixed length
PERS_NAM (FK)   Integer  75    No     True             Variable length
TITLE           Text     200   Yes    True             Variable length
PLACE_PUBL      Text     100   No     True             Variable length
PUBLISHER       Text     100   No     True             Variable length
PUB_DATE        Text     100   Yes    True             Variable length
SUB_HD (FK)     Integer  10    No     True             Fixed length
Telephone numbers
Room no.  Extension no.
101       67321
102       69518
103       65835
104       69112
105       69345
106       68123
107       67721
Fixed-length fields
The same amount of space is allocated for every instance of the field.
Extension no. (5 characters each):
  6 7 3 2 1
  6 9 5 1 8
  6 5 8 3 5
Office no. (3 characters each):
  1 0 1
  1 0 2
  1 0 3
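Records with fixed-length fields
Phone directory:
  office number (3 digits)
  telephone extension (5 digits)
10167321$10269518$10365835$
Each record is an office no. followed by an extension no.
$ = end-of-record marker
The first record starts at the beginning of the file; the last marker is followed by the end of file.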
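The phone directory layout can be read back with simple slicing, because every field has a known width. A minimal sketch, assuming the 3-digit and 5-digit widths and the $ marker shown on the slide; the function names are made up for illustration.

```python
def parse_fixed(data, office_len=3, ext_len=5, eor="$"):
    """Split a file of fixed-length records into (office no., extension no.) pairs."""
    records = []
    for chunk in data.split(eor):
        if not chunk:  # nothing follows the final end-of-record marker
            continue
        if len(chunk) != office_len + ext_len:
            raise ValueError(f"record has wrong length: {chunk!r}")
        records.append((chunk[:office_len], chunk[office_len:]))
    return records


def nth_record(data, n, office_len=3, ext_len=5):
    """Jump straight to record n (0-based) without reading the records before it."""
    record_len = office_len + ext_len + 1  # +1 for the end-of-record marker
    start = n * record_len
    return data[start:start + office_len + ext_len]


PHONE_FILE = "10167321$10269518$10365835$"
```

Because every record occupies the same number of characters, the position of record n can be computed rather than searched for, which is the practical payoff of fixed-length fields.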
Titles
Godzilla / by Simian Amicus
Voyage around the world in the vessel La Perouse under Captain Swashbuckler during the years 1887, 1888, and 1889 with the full blessings of Her Majesty the Queen of Elbonia / by A. Hoy Maytees
Variable-length fields
Length of field varies according to the amount of data stored within.
G o d z i l l a
V o y a g e a r o u n d t h e w o r l d i n t h e v e s s e l L a
Records with variable-length fields
Header: 008000110001392450154 …
Data: …Maytees, A. HoyVoyage around the world in the vessel La Perouse under Captain Swashbuckler during the years 1887, 1888, and 1889 with the full blessings of Her Majesty the Queen of Elbonia / by A. Hoy Maytees…
Pos. 139
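One common way to handle variable-length fields is to keep a header (directory) that records where each field starts and how long it is. The sketch below illustrates that general idea only; it does not reproduce the exact header layout on the slide, and the tags, the short title, and the function names are assumptions.

```python
def pack(fields):
    """Build a directory of (tag, length, start) entries plus one run of data.

    fields is a list of (tag, text) pairs; the texts are stored back to back
    with no padding, so each one uses only the space it needs.
    """
    directory, body, position = [], "", 0
    for tag, text in fields:
        directory.append((tag, len(text), position))
        body += text
        position += len(text)
    return directory, body


def get_field(directory, body, tag):
    """Use the directory to slice one field out of the data."""
    for entry_tag, length, start in directory:
        if entry_tag == tag:
            return body[start:start + length]
    return None


# Illustrative tags and a hypothetical short title.
directory, body = pack([("100", "Maytees, A. Hoy"), ("245", "A short title")])
```

The data portion has no separators at all, which is why the author and title run together when printed; the directory is what tells the program where one field ends and the next begins.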
File structures
Physical views of data
Methods for organizing records
Sequential files
Indexed files
Lists
Balanced trees
Direct access structures
Sequential files (1)
Records stored contiguously in order on a sort key
Record no.  Publ Date (sort key)  Title
1           1902                  Prospect for a new century
2           1928                  My life as a flapper
3           1937                  Why the market crashed
4           1978                  Polyester pantsuit revolution
5           1984                  Where is Big Brother?
6           1999                  The end is near
Sequential files (2)
Slow - when add new record must re-sort file
Record no.  Publ Date  Title
1           1902       Prospect for a new century
2           1928       My life as a flapper
3           1937       Why the market crashed
Good for high search/record addition ratio
Requires less space than indexed files
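To see why additions are costly, the sketch below inserts a record into a file kept in sort-key order and counts how many existing records must move to make room. The in-memory list stands in for the file, and the new 1950 record is hypothetical.

```python
import bisect


def insert_sorted(records, new_record):
    """Insert a (publ_date, title) record in sort-key order.

    Returns the number of existing records that had to shift down one place.
    """
    position = bisect.bisect_right(records, new_record)
    records.insert(position, new_record)
    return len(records) - 1 - position


SEQUENTIAL_FILE = [
    (1902, "Prospect for a new century"),
    (1928, "My life as a flapper"),
    (1937, "Why the market crashed"),
    (1978, "Polyester pantsuit revolution"),
    (1984, "Where is Big Brother?"),
    (1999, "The end is near"),
]

# Hypothetical new record; it belongs between 1937 and 1978.
shifted = insert_sorted(SEQUENTIAL_FILE, (1950, "Hypothetical new title"))
```

Here half the file moves for one insertion; on disk, that means rewriting every record after the insertion point.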
Searching sequential files
Examine each record in sequence
Binary search
Examine each record in sequence
Record #  Name
1         Au
2         Baker
3         Chou
4         Dietrich
5         Doi
6         Ing
7         Kawamoto
8         Liebowitz
9         Marcuse
10        Rowling
11        Seuss
12        Smith
13        Tanaka
14        Torrance
15        Zeus
Searching for Rowling
Accession 1: Does Au = Rowling?
Accession 2: Does Baker = Rowling?
...
Accession 10: Does Rowling = Rowling?
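The record-by-record search can be sketched as a loop that counts accessions. The list below copies the fifteen names from the slide; the function name and return shape are choices made for this illustration.

```python
NAMES = [
    "Au", "Baker", "Chou", "Dietrich", "Doi", "Ing", "Kawamoto", "Liebowitz",
    "Marcuse", "Rowling", "Seuss", "Smith", "Tanaka", "Torrance", "Zeus",
]


def sequential_search(names, target):
    """Examine records in order; return (record no., accessions used)."""
    for record_no, name in enumerate(names, start=1):
        if name == target:
            return record_no, record_no  # one accession per record examined
    return None, len(names)  # every record was examined without a match


found = sequential_search(NAMES, "Rowling")
```

Finding Rowling takes ten accessions because nine records come before it; a name that is not in the file costs a look at all fifteen.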
Binary search: step one
Same 15 records (1 Au ... 15 Zeus); the middle record is no. 8, Liebowitz.
Searching for Rowling
Accession 1: Does Liebowitz = Rowling?
Is Rowling below Liebowitz in the alphabet? Yes, so only records 9-15 remain.
Binary search: step two
The middle of records 9-15 is no. 12, Smith.
Searching for Rowling
Accession 2: Does Smith = Rowling?
Is Rowling below Smith in the alphabet? No, so only records 9-11 remain.
Binary search: step three
The middle of records 9-11 is no. 10, Rowling.
Searching for Rowling
Accession 3: Does Rowling = Rowling?
Maximum no. of accessions to find a record is about log2 n, where n is the number of records in the file (exactly: log2 n rounded down, plus 1, which gives 4 for the 15 records above)
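The three steps can be sketched as a loop that keeps halving the range still under consideration. The names are those from the slides; the function names are illustrative, and the worst-case bound uses the rounded-down logarithm plus one.

```python
import math

NAMES = [
    "Au", "Baker", "Chou", "Dietrich", "Doi", "Ing", "Kawamoto", "Liebowitz",
    "Marcuse", "Rowling", "Seuss", "Smith", "Tanaka", "Torrance", "Zeus",
]


def binary_search(names, target):
    """Search a sorted list; return (record no., accessions used)."""
    low, high = 0, len(names) - 1
    accessions = 0
    while low <= high:
        middle = (low + high) // 2
        accessions += 1
        if names[middle] == target:
            return middle + 1, accessions  # record numbers start at 1
        if names[middle] < target:
            low = middle + 1   # target is further down the alphabet
        else:
            high = middle - 1  # target is further up the alphabet
    return None, accessions


def max_accessions(n):
    """Worst-case number of accessions for a file of n records."""
    return math.floor(math.log2(n)) + 1


found = binary_search(NAMES, "Rowling")
worst_case = max(binary_search(NAMES, name)[1] for name in NAMES)
```

The search examines Liebowitz, Smith, and then Rowling, matching the three accessions on the slides, compared with ten for the sequential search.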
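Binary search: our bead game
Each question (left or right?) halves the number of cups still in play.
No. cups  Questions
1         0
2         1
4         2
8         3
n         log2 n
Cups correspond to records; questions correspond to accessions.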
Index files
Main file (BIBL-ENTITY):
LCN    ISBN        TITLE                     PLACE_PUBL   PUBLISHER
00001  7534678945  My life of crime / …      London       Bugsy Press
00002  5675849246  Gone with the wind / …    Athens, Ga.  Tara Pub. Co.
00003  1234567890  Life after library sch…   Paris        Beau Gens
00004  4378159721  Drudgery made fun /…      San Diego    Bowring Press
00005  4678591357  Mudpies…                  Fresno       Earth Press
ISBN Index (sorted on ISBN):
ISBN        LCN
1234567890  00003
4378159721  00004
4678591357  00005
5675849246  00002
7534678945  00001
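A small sketch of how an index is built and used: the index holds only (ISBN, LCN) pairs, kept in ISBN order, so it can be binary-searched and then used to fetch the full record. The dictionary standing in for the main file and the function name are assumptions for illustration.

```python
import bisect

# Main file keyed by LCN (values copied from the slide; titles abbreviated there).
MAIN_FILE = {
    "00001": {"isbn": "7534678945", "title": "My life of crime"},
    "00002": {"isbn": "5675849246", "title": "Gone with the wind"},
    "00003": {"isbn": "1234567890", "title": "Life after library school"},
    "00004": {"isbn": "4378159721", "title": "Drudgery made fun"},
    "00005": {"isbn": "4678591357", "title": "Mudpies"},
}

# The index: one small (ISBN, LCN) entry per record, sorted on ISBN.
ISBN_INDEX = sorted((record["isbn"], lcn) for lcn, record in MAIN_FILE.items())
ISBN_KEYS = [isbn for isbn, _ in ISBN_INDEX]


def find_by_isbn(isbn):
    """Binary-search the index, then fetch the record from the main file."""
    position = bisect.bisect_left(ISBN_KEYS, isbn)
    if position < len(ISBN_KEYS) and ISBN_KEYS[position] == isbn:
        lcn = ISBN_INDEX[position][1]
        return MAIN_FILE[lcn]
    return None
```

The main file stays in LCN order while the index supplies a second, ISBN-ordered route into it; adding more indexes gives more routes at the cost of more storage and re-indexing.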
Multiple indexes to main file
Main file: Bibliographic records
Indexes:
  ISBN Index
  Browse Title Index
  Keyword Index
  Call no. Index
  Publisher Index
  Series Index
Index files - advantages
Fast searches
  Index file smaller than main file
  Index file sorted so can use sequential or binary search
Good for system with high volume of searches
Index files - disadvantages
Use additional storage space
When add new records must re-index
Lists
Record #  Name       Forward pointer  Backward pointer
0         Baker      5                9
1         Doi        3                5
2         Rowling    eol              6
3         Drew       4                1
4         Ing        7                3
5         Chou       1                0
6         Marcuse    2                8
7         Kawamoto   8                4
8         Liebowitz  6                7
9         Au         0                bol
Pointers tell the computer where to find the next record.
(eol = end of list; bol = beginning of list)
Searching lists
Record #  Name       Forward pointer  Backward pointer
1         Baker      6                10
2         Doi        4                6
3         Rowling    eol              7
4         Drew       5                2
5         Ing        8                4
6         Chou       2                1
7         Marcuse    3                9
8         Kawamoto   9                5
9         Liebowitz  7                8
10        Au         1                bol
Follow the pointers
Searching for Doi
Accession 1: Does Baker = Doi? Doi after Baker? Use forward pointer
Accession 2: Does Chou = Doi? Doi after Chou? Use forward pointer
Accession 3: Does Doi = Doi?
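Following forward pointers can be sketched with a dictionary keyed by record number. The pointer values are those from the slide (None stands for eol); the function names are illustrative.

```python
# record no. -> (name, forward pointer); None marks the end of the list.
RECORDS = {
    1: ("Baker", 6), 2: ("Doi", 4), 3: ("Rowling", None), 4: ("Drew", 5),
    5: ("Ing", 8), 6: ("Chou", 2), 7: ("Marcuse", 3), 8: ("Kawamoto", 9),
    9: ("Liebowitz", 7), 10: ("Au", 1),
}


def search_list(records, start, target):
    """Follow forward pointers from start; return (record no., accessions)."""
    current, accessions = start, 0
    while current is not None:
        name, forward = records[current]
        accessions += 1
        if name == target:
            return current, accessions
        if name > target:  # already past where the target would be
            break
        current = forward
    return None, accessions


def walk(records, start):
    """List the names in pointer order, beginning at record start."""
    names, current = [], start
    while current is not None:
        name, current = records[current]
        names.append(name)
    return names


found = search_list(RECORDS, 1, "Doi")
```

The records sit in the file in arrival order, yet walking the pointers from record 10 (Au) yields them alphabetically, so a new record only needs a few pointers changed rather than a re-sort.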
Balanced trees
Implement binary search logic in list form.
root:            Goo
internal nodes:  Baker, Rowling
leaves:          Au, Chou (under Baker); Ing, Tanaka (under Rowling)
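The seven-name tree can be searched by moving left or right at each node, which is the binary search decision expressed through pointers. A minimal sketch; the nested-tuple representation and function name are choices made for this illustration.

```python
# Each node is (name, left subtree, right subtree); None means no child.
TREE = (
    "Goo",
    ("Baker", ("Au", None, None), ("Chou", None, None)),
    ("Rowling", ("Ing", None, None), ("Tanaka", None, None)),
)


def tree_search(node, target):
    """Descend from the root; return (found?, accessions used)."""
    accessions = 0
    while node is not None:
        name, left, right = node
        accessions += 1
        if target == name:
            return True, accessions
        # Earlier in the alphabet: go left. Later: go right.
        node = left if target < name else right
    return False, accessions


found = tree_search(TREE, "Ing")
```

Any of the seven names is reached in at most three accessions (root, internal node, leaf), the same bound binary search gives for seven sorted records.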
Direct-access structures
Do not go through index or follow list - use algorithm to yield address where record is stored.
Example: Divide sort key value by prime number 11, use remainder as address
ISBN 1234567892 / 11  Remainder 8
ISBN 4834567891 / 11  Remainder 10
ISBN 5489234831 / 11  Remainder 3
Addresses: 0 1 2 3 4 5 6 7 8 9 10
Direct access pros & cons
Advantage - fast
  Do not go through indexes or follow sequence of a list
  Computing algorithm faster than multiple disk accessions
Disadvantage - may hash to same address (collision)
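A minimal sketch of the divide-by-11 scheme, including what a collision looks like. Keeping a small list of records at each address (chaining) is just one possible way to cope with collisions and is an assumption here, not something the slides specify; the third ISBN is hypothetical and chosen to collide.

```python
TABLE_SIZE = 11  # the prime number used as the divisor


def address(isbn, table_size=TABLE_SIZE):
    """Divide the key by the table size and use the remainder as the address."""
    return isbn % table_size


def store(table, isbn, record):
    """Place a record at its computed address; return True if it collided."""
    slot = table[address(isbn)]
    collided = len(slot) > 0
    slot.append((isbn, record))
    return collided


def fetch(table, isbn):
    """Compute the address again and look only in that slot."""
    for key, record in table[address(isbn)]:
        if key == isbn:
            return record
    return None


table = [[] for _ in range(TABLE_SIZE)]
first = store(table, 1234567892, "record A")   # remainder 8
second = store(table, 4834567891, "record B")  # remainder 10
third = store(table, 1234567903, "record C")   # hypothetical ISBN, remainder 8 again
```

Retrieval needs no index and no list traversal, only the same arithmetic; the price is that two different keys can land on the same address and must then share or compete for it.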