Hashed Files, Text Versus Binary. Meghan Cavanagh

Page 1: Hashed Files  Text Versus Binary

Hashed Files
Text Versus Binary

Meghan Cavanagh

Page 2: Hashed Files  Text Versus Binary

Hashed Files

A file that is searched using one of the hashing methods.

The user gives the key, the hash function maps the key to an address and passes it to the operating system, and the record is then retrieved.

Mapping in a Hashed File

Key -> Hash Function -> Address

Page 3: Hashed Files  Text Versus Binary

Hashing Methods

Direct Hashing

Modulo Division

Digit Extraction

Mid-square

Folding

Rotational

Pseudorandom

Page 4: Hashed Files  Text Versus Binary

Direct Hashing Method

The key is the address, without any algorithmic manipulation.

The file must contain a record for every possible key.

This limits the situations in which the method can be used.

It is very powerful because it guarantees that there are no synonyms or collisions.
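
As an illustration of direct hashing, here is a minimal Python sketch. It assumes a hypothetical file whose keys are the integers 1 to 100; the names MAX_KEY, records, store and fetch are invented for this example and do not come from the slides.

```python
# Hypothetical setup: keys are integers 1..MAX_KEY, one slot per possible key.
MAX_KEY = 100
records = [None] * (MAX_KEY + 1)  # index 0 is unused so the key is the address


def direct_hash(key):
    """Direct hashing: the address is the key itself, with no manipulation."""
    if not 1 <= key <= MAX_KEY:
        raise ValueError("key outside the range the file was built for")
    return key


def store(key, data):
    """Write data into the slot addressed by the key."""
    records[direct_hash(key)] = data


def fetch(key):
    """Read the slot addressed by the key."""
    return records[direct_hash(key)]


store(42, "Ada")
store(7, "Grace")
print(fetch(42), fetch(7))
```

Because every possible key has its own slot, two different keys can never land on the same address. The cost is that the file needs as many slots as there are possible keys.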

Page 5: Hashed Files  Text Versus Binary

Modulo Division Method

Also called division remainder hashing: divides the key by the file size and uses the remainder plus one as the address.

The algorithm works with any list size, but a prime list size produces fewer collisions than other list sizes.

The list size in the equation below is the number of elements in the file.

address = key % list_size + 1
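
The formula can be tried out with a short Python sketch. The list size of 307 (a prime) and the sample keys are assumptions chosen for illustration, and the function name modulo_division_hash is invented here.

```python
def modulo_division_hash(key, list_size):
    """Return an address in 1..list_size: the remainder plus one."""
    return key % list_size + 1


LIST_SIZE = 307  # hypothetical file size; 307 is a prime number

# Hypothetical sample keys.
for key in (121267, 45723, 379452):
    print(key, "->", modulo_division_hash(key, LIST_SIZE))
```

Adding one shifts the remainder range 0..306 to the address range 1..307, so no key maps to address 0.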

Page 6: Hashed Files  Text Versus Binary

Digit Extraction Method

Selected digits are extracted from the key and used as the address.

For example, to hash a six-digit employee number to a three-digit address, you could select the first, third and fourth digits and use them as the address:

125870 -> 158

122801 -> 128

121267 -> 112

123413 -> 134
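
The employee-number example can be reproduced with a small Python sketch. The function name and its parameters (positions, width) are hypothetical; the digit positions are counted from 1, left to right, as on the slide.

```python
def digit_extraction_hash(key, positions=(1, 3, 4), width=6):
    """Build an address from the selected digits of a fixed-width key."""
    digits = str(key).zfill(width)  # pad with leading zeros to the full width
    return int("".join(digits[p - 1] for p in positions))


# The four employee numbers from the slide.
for key in (125870, 122801, 121267, 123413):
    print(key, "->", digit_extraction_hash(key))
```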

Page 7: Hashed Files  Text Versus Binary

Collision
Occurs when a hashing algorithm produces an address for an insertion and that address is already occupied.

Synonyms
Two or more keys that hash to the same home address.

Home Address
The first address produced by the hashing algorithm.

Prime Area
The memory that contains the home addresses.

Page 8: Hashed Files  Text Versus Binary

Collision Resolution

Open Addressing Resolution - when a collision occurs, the prime area addresses are searched for an open or unoccupied record where the new data can be placed.

Linked List Resolution - avoids the increased probability of future collisions that open addressing causes: the first record is stored in the home address, and it contains a pointer to the second record.

Bucket Hashing - uses a location that can accommodate multiple data units, which reduces collisions.

Combination Approaches - use several approaches to resolve the collision.
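
Two of these approaches can be compared in a minimal Python sketch. Several things here are assumptions rather than content from the slides: the prime area has 7 addresses, home addresses come from modulo division, open addressing searches with a linear probe (try the next address, wrapping around), and a Python list stands in for the chain of pointers in linked list resolution.

```python
LIST_SIZE = 7  # hypothetical small prime area, addresses 1..7


def home_address(key):
    """Home address by modulo division (remainder plus one)."""
    return key % LIST_SIZE + 1


def insert_open_addressing(table, key):
    """Store key at its home address, or at the next unoccupied address.

    Linear probing is an assumed probe order; the slides do not name one.
    """
    address = home_address(key)
    for _ in range(LIST_SIZE):
        if table[address] is None:
            table[address] = key
            return address
        address = address % LIST_SIZE + 1  # next address, wrapping 7 -> 1
    raise OverflowError("prime area is full")


def insert_linked_list(table, key):
    """Append key to the chain that starts at its home address."""
    address = home_address(key)
    table[address].append(key)
    return address


open_table = [None] * (LIST_SIZE + 1)  # index 0 is unused
chained_table = [[] for _ in range(LIST_SIZE + 1)]

# 15, 22 and 8 are synonyms: all three have home address 2.
open_addresses = [insert_open_addressing(open_table, k) for k in (15, 22, 8)]
chain_addresses = [insert_linked_list(chained_table, k) for k in (15, 22, 8)]
print(open_addresses, chain_addresses)
```

With open addressing the synonyms spill into neighbouring addresses, which can then collide with keys whose home addresses those are. With chaining they all stay attached to their own home address.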

Page 9: Hashed Files  Text Versus Binary

Text File

A file of characters.

Cannot contain integers, floating-point numbers or any other data structures in their internal memory format.

To store these data types, they must be converted to their character equivalent formats.

The best-known text files are the file streams for keyboards, monitors and printers.
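
The conversion to and from characters can be seen in a short Python sketch. The file name values.txt and the sample values are hypothetical.

```python
import os
import tempfile

values = [42, 3.5]  # hypothetical data: one integer, one floating-point number

path = os.path.join(tempfile.mkdtemp(), "values.txt")

# Writing: each number is converted to its character form first.
with open(path, "w") as f:
    for v in values:
        f.write(str(v) + "\n")

# Reading: the file gives back characters, which must be converted again.
with open(path) as f:
    lines = f.read().split()

restored = [int(lines[0]), float(lines[1])]
print(lines, restored)
```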

Page 10: Hashed Files  Text Versus Binary

Binary Files

A collection of data stored in the internal format of the computer.

Data can be an integer, a floating-point number, a character or any other structured data (except a file).

Contains data that is meaningful only if it is properly interpreted by the program.

Textual Data: 1 byte is used to represent one character.

Numeric Data: 2 or more bytes are considered a data item.
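
The difference between the character form and the internal form of a number can be sketched with Python's struct module. The choice of a 4-byte little-endian integer is an assumption made for this example; real internal formats depend on the machine and the program.

```python
import struct

value = 12345

# Text form: one byte per character, five characters.
as_text = str(value).encode("ascii")

# Binary form: assumed here to be a 4-byte little-endian integer.
as_binary = struct.pack("<i", value)

# The same four bytes are meaningful only if interpreted correctly.
right = struct.unpack("<i", as_binary)[0]  # read back as an integer
wrong = struct.unpack("<f", as_binary)[0]  # misread as a 4-byte float

print(as_text, as_binary, right, wrong)
```

Reading the four bytes as a float yields a number unrelated to 12345, which is why binary data must be interpreted by a program that knows its layout.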

Page 11: Hashed Files  Text Versus Binary

The End