Hashed Files
Text Versus Binary
Meghan Cavanagh
Hashed Files
A file that is searched using one of the hashing methods.
The user gives the key, the hash function maps the key to an address and passes it to the operating system, and the record is then retrieved.
Mapping in a Hashed File
Key -> Hash Function -> Address
Hashing Methods
Direct Hashing
Modulo Division
Digit Extraction
Mid-square
Folding
Rotational
Pseudorandom
Direct Hashing Method
The key is the address, with no algorithmic manipulation.
The file must contain a record for every possible key.
This limits the method to a few situations.
Where it applies it is very powerful, because it guarantees that there are no synonyms or collisions.
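As an illustration, here is a minimal Python sketch of direct hashing. It assumes, purely for the example, that the keys are day-of-month numbers 1 through 31, so the table can hold one slot per possible key. The names `DAYS_IN_MONTH`, `sales_by_day`, and `record_sale` are hypothetical and do not come from the slides.

```python
# Direct hashing: the key itself is the address, with no computation.
# Assumption for illustration: keys are day numbers 1..31.
DAYS_IN_MONTH = 31

# One slot for every possible key; slot 0 is unused so that key == index.
sales_by_day = [0] * (DAYS_IN_MONTH + 1)


def record_sale(day: int, amount: int) -> None:
    """Add amount to the record whose address is the key itself."""
    if not 1 <= day <= DAYS_IN_MONTH:
        raise ValueError("key outside the range of possible keys")
    sales_by_day[day] += amount


record_sale(5, 100)
record_sale(5, 50)
record_sale(17, 20)
print(sales_by_day[5], sales_by_day[17])  # 150 20
```

Because every possible key has its own slot, two different keys can never land on the same address. The cost is that the table must be as large as the key range.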
Modulo Division Method
Also called division remainder hashing: divides the key by the file size (the list size) and uses the remainder plus one as the address.
The algorithm works with any list size, but a prime list size produces fewer collisions than other list sizes.
The list size in the equation below is the number of elements in the file.
address = key % list_size + 1
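The formula can be tried out with a short Python sketch. The list size of 307 (a prime) and the sample keys are assumptions chosen only for illustration, and `modulo_division_hash` is a hypothetical name.

```python
def modulo_division_hash(key: int, list_size: int) -> int:
    """Return an address in the range 1..list_size."""
    return key % list_size + 1


LIST_SIZE = 307  # a prime list size, chosen here only for illustration

print(modulo_division_hash(121267, LIST_SIZE))  # 3
print(modulo_division_hash(45723, LIST_SIZE))   # 288
print(modulo_division_hash(379452, LIST_SIZE))  # 1
```

The remainder alone falls in the range 0 to 306. Adding one shifts it to 1 to 307, which suits a file whose records are numbered from 1.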
Digit Extraction Method
Selected digits are extracted from the key and used as the address.
For example, to hash a six-digit employee number to a three-digit address, you could select the first, third, and fourth digits and use them as the address:
125870 -> 158
122801 -> 128
121267 -> 112
123413 -> 134
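The four examples above can be reproduced with a small Python sketch. The function name `digit_extraction_hash` and its parameters are hypothetical. Positions are counted from 1, left to right, as in the slide.

```python
def digit_extraction_hash(key: int, positions=(1, 3, 4), width: int = 6) -> int:
    """Build an address from the digits of key at the given 1-based positions."""
    digits = str(key).zfill(width)  # pad so that short keys still have `width` digits
    return int("".join(digits[p - 1] for p in positions))


for key in (125870, 122801, 121267, 123413):
    print(key, "->", digit_extraction_hash(key))
# 125870 -> 158
# 122801 -> 128
# 121267 -> 112
# 123413 -> 134
```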
Collision: occurs when a hashing algorithm produces an address for an insertion and that address is already occupied.
Synonyms: two or more keys that hash to the same home address.
Home Address: the first address produced by the hashing algorithm.
Prime Area: the memory that contains all of the home addresses.
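To make these terms concrete, the following Python sketch groups a handful of keys by home address and reports the groups that are synonyms. It assumes the modulo division method with a list size of 7. The keys and the names `home_address` and `find_synonyms` are made up for the example.

```python
from collections import defaultdict


def home_address(key: int, list_size: int) -> int:
    """Modulo division hash: the first address produced for a key."""
    return key % list_size + 1


def find_synonyms(keys, list_size: int) -> dict:
    """Map each home address shared by two or more keys to those keys."""
    by_address = defaultdict(list)
    for key in keys:
        by_address[home_address(key, list_size)].append(key)
    return {addr: ks for addr, ks in by_address.items() if len(ks) > 1}


print(find_synonyms([10, 17, 22, 24], 7))  # {4: [10, 17, 24]}
```

Keys 10, 17, and 24 are synonyms here. Inserting 17 after 10 would cause a collision at home address 4.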
Collision Resolution
Open Addressing Resolution: when a collision occurs, the prime area addresses are searched for an open or unoccupied record where the new data can be placed.
Linked List Resolution: avoids the drawback of open addressing, where each collision resolution increases the probability of future collisions. The first record is stored in the home address, but it contains a pointer to the second record.
Bucket Hashing: uses a location (a bucket) that can accommodate multiple data units, to reduce collisions.
Combination Approaches: use several approaches to resolve the collision.
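The first two approaches can be contrasted in a short Python sketch. It makes several assumptions that are not in the slides: a prime area of 7 slots, 0-based addresses (so the "plus one" is dropped), and a linear probe as the open addressing search, which is only one of several possible probe strategies. All names are hypothetical.

```python
LIST_SIZE = 7  # small prime area, for illustration only


def home_address(key: int) -> int:
    """Modulo division hash, 0-based to match Python list indexes."""
    return key % LIST_SIZE


def insert_open_addressing(table: list, key: int) -> int:
    """Linear probe: step through the prime area until a free slot is found."""
    addr = home_address(key)
    for step in range(LIST_SIZE):
        slot = (addr + step) % LIST_SIZE
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("prime area is full")


class Node:
    """A record plus a pointer to the next synonym."""

    def __init__(self, key: int):
        self.key = key
        self.next = None


def insert_linked_list(table: list, key: int) -> None:
    """Store the first record at the home address; chain synonyms after it."""
    addr = home_address(key)
    node = Node(key)
    if table[addr] is None:
        table[addr] = node
        return
    tail = table[addr]
    while tail.next is not None:
        tail = tail.next
    tail.next = node


# Keys 10, 17, and 24 are synonyms: all hash to address 3.
open_table = [None] * LIST_SIZE
slots = [insert_open_addressing(open_table, k) for k in (10, 17, 24)]
print(slots)  # [3, 4, 5]

chained = [None] * LIST_SIZE
for k in (10, 17, 24):
    insert_linked_list(chained, k)
print(chained[3].key, chained[3].next.key, chained[3].next.next.key)  # 10 17 24
print(chained[4])  # None
```

With open addressing the synonyms spill into slots 4 and 5, which are the home addresses of other keys. With the linked list they hang off slot 3 and leave the rest of the prime area free.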
Text File
A file of characters.
Cannot contain integers, floating-point numbers, or any other data structures in their internal memory format.
To store these data types, they must be converted to their character-equivalent formats.
The best-known text files are the file streams for keyboards, monitors, and printers.
Binary Files
A collection of data stored in the internal format of the computer.
The data can be an integer, a floating-point number, a character, or any other structured data (except a file).
The data is meaningful only if it is properly interpreted by the program.
Textual Data: 1 byte is used to represent one character.
Numeric Data: 2 or more bytes are considered one data item.
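The difference can be seen by encoding the same integer both ways. This Python sketch assumes ASCII for the text form and a 4-byte little-endian signed integer for the binary form. Real internal formats depend on the machine and the language, so the `"<i"` layout is only an example.

```python
import struct

value = 123456789

# Text form: one byte per character of the decimal representation.
as_text = str(value).encode("ascii")

# Binary form: assumed here to be a 4-byte little-endian signed integer.
as_binary = struct.pack("<i", value)

print(len(as_text), as_text)            # 9 b'123456789'
print(len(as_binary), as_binary.hex())  # 4 15cd5b07

# The binary bytes are meaningful only when read back with the same layout.
print(struct.unpack("<i", as_binary)[0])  # 123456789
```

The text form can be read by anything that understands characters. The binary form needs a program that knows the four bytes make up a single integer.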
The End