ppt on VSAM

1-1

VSAM

Virtual Storage Access Method

Allows for interactive updates (adds, changes, and deletes)

1-2

Three Types of VSAM files

KSDS: Key-Sequenced Data Set

RRDS: Relative Record Data Set

ESDS: Entry-Sequenced Data Set

1-3

KSDS

Allows users to access records sequentially or randomly

Includes an index component and a data component
  • A CLUSTER consists of both the index and data components together
  • Index relates a value in a key field to the actual location of the record on disk
  • Index is only used in random (or dynamic) processing

1-4

VSAM Data Component Storage Concepts

Control Interval (CI)
  • Fixed amount of storage; must be a multiple of 512 bytes
  • Usually 2048 or 4096 bytes
  • Data records are stored in a CI
    • A CI of 4096 bytes can store forty 100-byte records

Control Area (CA)
  • One cylinder in size
  • Size of a cylinder varies by disk drive
  • Contains numerous CIs

1-5

Approximate number of records in a CI

(CI size - 10) / record length
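
The formula can be checked with a few lines of Python. This is only a sketch of the slide's approximation for fixed-length records; the function name is made up for illustration, and the 10 bytes of overhead are taken directly from the formula above.

```python
def records_per_ci(ci_size: int, record_length: int) -> int:
    """Approximate number of fixed-length records that fit in one CI."""
    if ci_size % 512 != 0:
        raise ValueError("CI size must be a multiple of 512")
    # The formula sets aside 10 bytes of the CI before dividing by record length.
    return (ci_size - 10) // record_length


print(records_per_ci(4096, 100))  # 40
print(records_per_ci(2048, 100))  # 20
```

The first result matches the earlier example of forty 100-byte records in a 4096-byte CI.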

1-6

Records in a CI

Calculate the number of records in a CI (formula on previous slide)

Determine the percentage of records expected to be added or inserted in an average CI
  • Concept of FreeSpace
    • Space where records can be added “on the fly”

1-7

CI in a CA

A fixed number based on CI size and disk drive
  • For disk drive with ACD001:
    • CI of 2048: 315 CI in a CA
    • CI of 4096: 180 CI in a CA

Determine the percentage of the CIs in a CA to be left open for additions
  • Known as CA freespace
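
A rough way to see what the two freespace percentages do at load time is sketched below. It is an approximation under stated assumptions, not the exact VSAM calculation: the function name and the 20% / 10% figures are hypothetical, and VSAM's own rounding rules may differ.

```python
def estimate_loaded_records(ci_size, record_length, ci_per_ca,
                            ci_free_pct, ca_free_pct):
    """Estimate records loaded per CI, CIs loaded per CA, and records per CA."""
    per_ci = (ci_size - 10) // record_length            # slide formula
    loaded_per_ci = per_ci * (100 - ci_free_pct) // 100  # rest is CI freespace
    loaded_cis = ci_per_ca * (100 - ca_free_pct) // 100  # rest is CA freespace
    return loaded_per_ci, loaded_cis, loaded_per_ci * loaded_cis


# Assumed values: 4096-byte CI, 100-byte records, 180 CIs per CA,
# 20% CI freespace and 10% CA freespace (the percentages are made up).
print(estimate_loaded_records(4096, 100, 180, 20, 10))  # (32, 162, 5184)
```

With these numbers each loaded CI keeps room for about 8 more records, and 18 CIs in the CA stay completely empty for later CI splits.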

1-8

Freespace in a CI

Is used to add records which belong in the CI

Records in a CI are shifted automatically within the CI to accommodate the inserted record

1-9

CI Splits

If there is no freespace in the CI in which the record is to be inserted: CI split
  • Half of the records in the CI are moved to a free CI in the same CA
  • The record is then inserted in the proper CI
  • These happen routinely and are accomplished quickly
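
The CI split steps can be sketched as a small simulation. This is a toy model, not how VSAM is implemented: a CA is assumed to be a list of CIs in key order, each CI a sorted list of keys, with a simple count of free CIs. The names and the capacity of 4 records are made up, and duplicate keys are not checked.

```python
import bisect


def insert_record(used_cis, free_cis, key, capacity):
    """Insert key into a CA; return (free CIs remaining, whether a split occurred).

    used_cis: list of sorted key lists, one per CI, in key order.
    free_cis: number of empty CIs still available in this CA.
    """
    if not used_cis:
        if free_cis == 0:
            raise RuntimeError("no free CI in this CA")
        used_cis.append([key])
        return free_cis - 1, False

    # Pick the first CI whose highest key is >= key (or the last CI).
    highs = [ci[-1] for ci in used_cis]
    idx = min(bisect.bisect_left(highs, key), len(used_cis) - 1)
    ci = used_cis[idx]

    if len(ci) < capacity:
        bisect.insort(ci, key)  # freespace available: shift records, no split
        return free_cis, False

    if free_cis == 0:
        raise RuntimeError("no free CI in this CA: a CA split would be needed")

    # CI split: move the upper half of the records to a free CI.
    half = len(ci) // 2
    new_ci = ci[half:]
    del ci[half:]
    used_cis.insert(idx + 1, new_ci)
    bisect.insort(ci if key < new_ci[0] else new_ci, key)
    return free_cis - 1, True


ca = [[10, 20, 30, 40]]  # one full CI (capacity 4), one free CI in the CA
free, split = insert_record(ca, 1, 25, 4)
print(ca, free, split)   # [[10, 20, 25], [30, 40]] 0 True
```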

1-10

CA splits

If a CI needs to split and there is no free CI in the CA: CA split
  • Half of the CIs are moved to a free CA (usually at the end of the file, where there is unused space)
  • Therefore, each of the two CAs has about 50% free space
  • The original CI can now split

CA splits have much overhead and should be avoided!!!
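
A CA split can be pictured the same way. The sketch below assumes a CA is modelled as a list of CIs in key order; the function name and the tiny CA of 4 CIs are hypothetical, and the real cost (moving data on disk and updating the index) is not modelled.

```python
def split_control_area(full_ca, cis_per_ca):
    """Move the upper half of the CIs to a new CA.

    Returns (old CA, new CA, free CIs in old CA, free CIs in new CA).
    """
    half = len(full_ca) // 2
    old_ca, new_ca = full_ca[:half], full_ca[half:]
    return old_ca, new_ca, cis_per_ca - len(old_ca), cis_per_ca - len(new_ca)


full = [[1, 2], [3, 4], [5, 6], [7, 8]]  # every CI in the CA is in use
old, new, free_old, free_new = split_control_area(full, cis_per_ca=4)
print(old, free_old)  # [[1, 2], [3, 4]] 2
print(new, free_new)  # [[5, 6], [7, 8]] 2
```

After the split each CA has half of its CIs free, so the CI split that triggered it can go ahead.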

1-11

Avoid CA splits

Reorganization of files
  • Export: Copy the VSAM file to a sequential data set
  • Import: Delete and reload the VSAM file from the sequential data set
    • Resets the freespace as in the define cluster

Allocate adequate freespace

Analyze the primary key

1-12

Primary Key Analysis

Is there a pattern in the PK?

Example: The Financial Aid office must keep three years of data on-line.
  • Previous year: State reporting
  • Current year: Distributing aid to students
  • Next year: Granting/guaranteeing aid for next year

1-13

Primary Key Analysis

Key options for the financial aid file

1-digit year + SSN
  • Little on-line activity on first third of file
  • Most “adds” are in last third: CI/CA splits likely

SSN + 1-digit year
  • Activity is spread evenly throughout the file
  • Recommended
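
The difference between the two key layouts can be made visible with a small experiment. Everything in this sketch is invented for illustration: the SSNs are evenly spaced dummy values, the year digits 3, 4, and 5 stand for previous, current, and next year, and "position" means how far through the sorted file a new key would land (0.0 = start, 1.0 = end).

```python
import bisect


def insert_positions(existing_keys, new_keys):
    """Relative position in the sorted file where each new key would be inserted."""
    existing = sorted(existing_keys)
    return [bisect.bisect_left(existing, k) / len(existing) for k in new_keys]


YEARS = ["3", "4", "5"]  # previous, current, next year (hypothetical digits)
old_ssns = [f"{n:09d}" for n in range(100_000_000, 1_000_000_000, 100_000_000)]
new_ssns = [f"{n:09d}" for n in range(150_000_000, 1_000_000_000, 100_000_000)]

# Option 1: year digit first. New next-year records all sort to the end.
year_first = insert_positions([y + s for y in YEARS for s in old_ssns],
                              ["5" + s for s in new_ssns])

# Option 2: SSN first. New records are scattered through the whole file.
ssn_first = insert_positions([s + y for s in old_ssns for y in YEARS],
                             [s + "5" for s in new_ssns])

print(round(min(year_first), 2), round(max(year_first), 2))  # 0.7 1.0
print(round(min(ssn_first), 2), round(max(ssn_first), 2))    # 0.11 1.0
```

With the year first, every add lands in the last third of the file, which concentrates the splits there; with the SSN first, the adds are spread out.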

1-14

VSAM Index Component

Every KSDS has an index component for each primary key AND each foreign key

Base cluster: Primary key index and Data components

Foreign key index is known as the alternate index

1-15

Primary Key Index and Data

Two parts of the index:
  • Sequence set
  • Index set

1-16


Sequence Set

Lowest level of the index component

Contains information that relates key values to a specific CI

Links the highest PK in a CI to the address of that CI
  • Stores all the key-address pairs for the CIs in a CA in one CI of the sequence set
  • There is a separate CI in the sequence set for each CA in the data

1-17


Index Set

Highest level of the index component

Key-address pairs stored in one CI (can be stored in main memory for processing efficiency)
  • Address pointer links to the address of the appropriate sequence set CI
  • Based on the size of the data component (number of cylinders or CAs needed to store it), you may need an intermediate index
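
A random read through the two index levels might look like the sketch below. It is a simplified model with made-up keys: the data component is two CAs of two CIs each, the index set holds one (highest key, CA number) pair per CA, and each sequence set entry holds (highest key, CI number) pairs for one CA. Real index records are compressed and hold disk addresses rather than list positions.

```python
import bisect

# Hypothetical data component: 2 CAs, each with 2 CIs of sorted keys.
control_areas = [
    [[5, 12, 20], [27, 31, 40]],    # CA 0
    [[44, 58, 63], [70, 85, 99]],   # CA 1
]

# Sequence set: one entry per CA, pairing the highest key in each CI with that CI.
sequence_set = [[(ci[-1], n) for n, ci in enumerate(ca)] for ca in control_areas]

# Index set: pairs the highest key in each CA with its sequence set entry.
index_set = [(ca[-1][-1], n) for n, ca in enumerate(control_areas)]


def first_at_or_above(pairs, key):
    """Return the pointer of the first pair whose high key is >= key, else None."""
    highs = [high for high, _ in pairs]
    pos = bisect.bisect_left(highs, key)
    return pairs[pos][1] if pos < len(pairs) else None


def find(key):
    """Return (CA number, CI number) holding key, or None if it is not on file."""
    ca_no = first_at_or_above(index_set, key)                # index set level
    if ca_no is None:
        return None
    ci_no = first_at_or_above(sequence_set[ca_no], key)      # sequence set level
    return (ca_no, ci_no) if key in control_areas[ca_no][ci_no] else None


print(find(58))   # (1, 0)
print(find(41))   # None
```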

1-18

Alternate Index and Data Component

Alternate index relates alternate key to primary key
  • Then uses the primary key index to locate the data

Can have unique or non-unique AI
  • Figure 2-4 (unique)
  • Figure 2-5 (non-unique)
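
The two-step lookup through an alternate index can be sketched with plain dictionaries. All names and records here are invented: `base_cluster` stands in for the primary key index plus data component, and `dept_aix` is a non-unique alternate index, so one alternate key may map to several primary keys.

```python
# Base cluster: primary key -> record (hypothetical student records).
base_cluster = {
    "1001": {"name": "Adams", "dept": "FIN"},
    "1002": {"name": "Baker", "dept": "REG"},
    "1003": {"name": "Clark", "dept": "FIN"},
}

# Non-unique alternate index: alternate key -> list of primary keys.
dept_aix = {"FIN": ["1001", "1003"], "REG": ["1002"]}


def read_by_alternate_key(aix, alt_key):
    """Step 1: alternate key -> primary keys. Step 2: primary key -> record."""
    return [base_cluster[pk] for pk in aix.get(alt_key, [])]


print([r["name"] for r in read_by_alternate_key(dept_aix, "FIN")])  # ['Adams', 'Clark']
```

A unique AI would hold exactly one primary key per alternate key instead of a list.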

1-19

Alternate Index

Systems analyst determines whether to update the AI each time the cluster is changed

Can cause much overhead to update all indices especially in the case of a CA split

The other alternative is to rebuild the index periodically (for example, every night)

1-20

Relative Record Data Set (RRDS)

Lets the user access each record at random without the overhead of maintaining an index

Instead, each record in an RRDS is numbered, starting with 1 for the first record
  • Known as the relative record number (RRN)

RRDS consists of a specified number of areas or slots

1-21

RRDS

May need a routine to convert the PK of a record to a relative record number
  • Hashing
    • Most common hashing routine: The remainder option on the divide

Can cause empty slots
  • Can waste storage, but we avoid CI/CA splits

Difficult to HASH if PK is non-numeric
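
The division-remainder hash can be sketched as follows. The function name, the slot count of 97, and the sample keys are assumptions for illustration; 1 is added because RRNs start at 1 while a remainder starts at 0.

```python
def rrn_for_key(primary_key: int, slot_count: int) -> int:
    """Hash a numeric primary key to a relative record number in 1..slot_count."""
    # The remainder is in 0..slot_count-1; RRNs begin at 1.
    return primary_key % slot_count + 1


SLOTS = 97  # hypothetical number of slots in the RRDS

print(rrn_for_key(1000, SLOTS))  # 31
print(rrn_for_key(1097, SLOTS))  # 31 (same RRN as key 1000)
print(rrn_for_key(96, SLOTS))    # 97
```

Keys 1000 and 1097 hash to the same slot, so a file that holds both needs some way to handle the clash.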

1-22

RRDS

Collision
  • If the hashing routine results in the same RRN for two different PKs

Must set a secondary searching technique in case of collisions
  • Usually linear probing: check the next record, up to a maximum number of tries
    • Needed to know if the record to be added already exists
    • Needed to determine if the record to be retrieved exists without reading the entire file
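
Linear probing with a maximum number of tries might look like this. It is a sketch under assumptions, not a COBOL routine: the file is modelled as a Python list of slots (index 0 unused so that indexes match RRNs), the hash is the division remainder plus 1, and `MAX_TRIES` and all names are hypothetical.

```python
MAX_TRIES = 5  # hypothetical limit on how many slots to check


def probe_sequence(key, slot_count, max_tries):
    """Yield the home RRN for key, then the following RRNs, wrapping at the end."""
    start = key % slot_count
    for step in range(max_tries):
        yield (start + step) % slot_count + 1


def add_record(slots, key, record, max_tries=MAX_TRIES):
    """Add a record; return its RRN. Checks every probe slot for a duplicate first."""
    first_free = None
    for rrn in probe_sequence(key, len(slots) - 1, max_tries):
        entry = slots[rrn]
        if entry is None:
            if first_free is None:
                first_free = rrn
        elif entry[0] == key:
            raise KeyError(f"record {key} already exists at RRN {rrn}")
    if first_free is None:
        raise RuntimeError("no free slot within the maximum number of tries")
    slots[first_free] = (key, record)
    return first_free


def find_record(slots, key, max_tries=MAX_TRIES):
    """Return the RRN holding key, or None, reading at most max_tries slots."""
    for rrn in probe_sequence(key, len(slots) - 1, max_tries):
        entry = slots[rrn]
        if entry is not None and entry[0] == key:
            return rrn
    return None


slots = [None] * (97 + 1)               # 97 slots; index 0 is unused
print(add_record(slots, 1000, "A"))     # 31
print(add_record(slots, 1097, "B"))     # 32 (collision at 31, next slot used)
print(find_record(slots, 1097))         # 32
print(find_record(slots, 1194))         # None
```

Because both add and find stop after the same number of tries, a missing record is detected after at most `MAX_TRIES` reads instead of a scan of the whole file.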

1-23

RRDS

Advantages
  • No index overhead
  • Direct relationship between data and location of the data
  • Permits both random and sequential processing
  • If good hashing routine with minimal collisions: performance efficiency is excellent

1-24

RRDS

Disadvantages
  • Storage efficiency
  • Collisions
  • Difficulty in determining good hashing technique
    • Difficulty with alphabetic key
  • Does NOT support the concept of FK or AI
  • Not widely used

1-25

ESDS

Entry-Sequenced Data Set

The simplest type of VSAM file

Records are stored sequentially at time of entry

1-26

ESDS

Similar to sequential processing

Does allow OPEN EXTEND to add records to the end of the file

1-27

ESDS

Author does not recommend it
  • Says it is restricted to sequential processing

BUT you can build an AI for a FK

BUT as of my current COBOL manuals, COBOL cannot use the AI and processing must be sequential.

This is my current topic for career day questions
