Text of May 2008 John Mycroft - WAVV 2008 VSE/VSAM - Under the covers...

VSE/VSAM: Under the covers
John Mycroft, Product Development Manager, CSI International
www.csi-international.com
email@example.com

Acknowledgement
- With grateful thanks to Dan Janda, The Swami of VSAM, from whom much of this presentation was stolen
- To CSI for providing me with Data-Miner, CSI-Sort and a machine to create the examples
- To my fellow developers at CSI who put up with my hogging the machine for hours on end

Abstract
Overview of VSAM & its components. We take a look at what a VSAM file really looks like and how to soup up its performance. We also look at some common mistakes and how to avoid them.
This presentation and its materials are copyrighted and developed by John Mycroft from a presentation originally copyrighted by Dan Janda. Permission is granted for WAVV to reproduce this presentation for distribution to its members at no charge.
Trademarks: IBM, VSE, VSE/ESA, z/VSE, CICS & DL/I are trademarks or registered trademarks of the IBM Corporation. The Swami of VSAM is a trademark of Dan Janda.

VSE/VSAM Overview
- Virtual Storage Access Method
- For disk files
- Sequential: Entry Sequence Dataset or ESDS
  - Begin at the beginning, go on till you get to the end and then stop
- Indexed: Keyed Sequence Dataset or KSDS
  - Process by key or sequentially or a mixture
- Direct: Relative Record Dataset or RRDS (fixed) or VRDS (variable)
  - Calculate a record's location in the file to access it
- Alternate index (AIX) gives an alternative route to a KSDS
  - Allows unique & non-unique keys

VSE/VSAM Functional areas
- Catalog
  - Volume & file information
  - Usage statistics
- Disk space management
  - Space allocation including secondary allocations
  - VSAM and VSAM/SAM files
  - System files
  - Libraries

VSE/VSAM Functional areas (continued)
- Integrity
- Performance
  - Data transfer size
  - Buffering
- Backup / restore
- File sharing between jobs and systems

Processing a VSAM file
- Sequentially (ESDS)
  - Forward or backward
- Keyed access (KSDS)
  - Direct by full or partial (generic) key
  - Sequentially, forward or backward
  - Skip sequential, forward or backward
- Addressed access (RRDS, VRDS)
  - Direct, by record address
  - Sequential & skip sequential
- Alternate Index Access
  - Same as keyed access
  - Also direct access by non-unique key

How VSAM stores data
- We're going to look at
  - How VSAM stores records logically on disk
    - Performance considerations
  - How VSAM physically stores data on disk
    - Disk space usage calculations
    - Optimizing disk capacity
    - Performance considerations
- VSAM jargon
  - Control Interval
  - Control Area
  - CI & CA splits
  - Freespace
  - RDF, CIDF

VSAM Jargon: Control Interval (CI)
- Smallest unit of data transfer between main & disk storage
  - In other words, when you read a record, VSAM reads the whole CI that contains that record
  - Think of it as the same as a block of records in a sequential file if you like (though it's laid out differently)
- A CI can initially contain 1 or more records
  - More can be inserted
  - Some or all can be deleted
- When you try to add a new record to a CI with no room, a CI split takes place (more about that later)

Layout of a control interval
- ALL VSAM FILES ARE VARIABLE LENGTH, even if all the records are the same size
- Rec 1 ... Rec n: 1 to n logical records of any length
- Freespace: unused space in CI for inserting records or making existing records longer
- RDFs: 3 byte record descriptor field
  - ESDS/KSDS: 1 per LRECL, 1 for all consecutive records of same length
  - RRDS: one per numbered record slot
- CIDF: 4 byte Control Interval Descriptor Field

Rec 1 | Rec 2 | Rec 3 | Rec n | Freespace | RDFs | CIDF
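To make the layout concrete, here is a minimal Python sketch that packs records into a CI and reads the free space back from the CIDF. It is an illustration only, not VSAM's own code. The field sizes (3 byte RDF, 4 byte CIDF) come from the slide; the encodings are assumptions: a CIDF made of a 2 byte free-space offset plus a 2 byte free-space length, an RDF made of one control byte plus a 2 byte record length, and RDFs stored right to left from the CIDF. The names `build_ci` and `free_space` are hypothetical, and the sketch keeps one RDF per record rather than sharing RDFs across runs of same-length records.

```python
import struct

CIDF_LEN = 4  # control interval descriptor field, per the slide
RDF_LEN = 3   # record descriptor field, per the slide


def build_ci(records, ci_size):
    """Lay out records in one CI: data at the front, RDFs and CIDF at the end.

    Simplification: one RDF per record (the variable-length case).
    """
    data = b"".join(records)
    control_len = RDF_LEN * len(records) + CIDF_LEN
    free_len = ci_size - len(data) - control_len
    if free_len < 0:
        raise ValueError("records do not fit in the CI")
    # Assumed ordering: the RDF for record 1 sits next to the CIDF,
    # so the RDFs are emitted in reverse record order.
    rdfs = b"".join(
        struct.pack(">BH", 0x00, len(rec)) for rec in reversed(records)
    )
    # Assumed CIDF encoding: free-space offset, then free-space length.
    cidf = struct.pack(">HH", len(data), free_len)
    return data + bytes(free_len) + rdfs + cidf


def free_space(ci):
    """Read the free-space offset and length back out of the CIDF."""
    return struct.unpack(">HH", ci[-CIDF_LEN:])
```

For two records of 100 and 50 bytes in a 512 byte CI, the control information takes 10 bytes (two RDFs plus the CIDF), which leaves 352 bytes of free space starting at offset 150.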

Control Area (CA)
- CA size is the smallest of:
  - One cylinder, or
  - The size of the primary allocation
  - The size of the secondary allocation
- The number of CIs per CA depends on the device and the CI and CA sizes
- It is generally a good idea to go for the biggest CA possible
- A CA is a group of CIs. In a KSDS, all the data CIs in a CA are indexed by one index CI

CI 0 | CI 1 | CI 2 | CI 3 | CI 4 | CI 5 | CI 6 | CI 7 | CI 8 | CI 9
CI10 | CI11 | CI12 | CI13 | CI14 | CI15 | CI16 | CI17 | CI18 | CI19
CI20 | CI21 | CI22 | CI23 | CI24 | CI25 | CI26 | CI27 | CI28 | CI29
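The "smallest of" rule is easy to check by hand, but a small Python sketch shows why a skimpy secondary allocation shrinks the CA. This assumes both allocations are given in tracks and are non-zero, and uses 15 tracks per cylinder (the 3390 figure). The function name `ca_size_tracks` is hypothetical.

```python
def ca_size_tracks(primary_tracks, secondary_tracks, tracks_per_cyl=15):
    """CA size in tracks: the smallest of one cylinder, primary and secondary.

    Assumes both allocations are expressed in tracks and are non-zero.
    """
    return min(tracks_per_cyl, primary_tracks, secondary_tracks)
```

With a primary of 150 tracks and a secondary of 5 tracks the CA is only 5 tracks; raising the secondary to 30 tracks gives the full 15 track (one cylinder) CA.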

Index Control Interval (Index CI)
- A CI in an index containing pointers to
  - The next level in the index, or
  - The Data CI in the CA (this is referred to as a Sequence Set CI)

Index CI
CI 0 | CI 1 | CI 2 | CI 3 | CI 4 | CI 5 | CI 6 | CI 7 | CI 8 | CI 9
CI10 | CI11 | CI12 | CI13 | CI14 | CI15 | CI16 | CI17 | CI18 | CI19
CI20 | CI21 | CI22 | CI23 | CI24 | CI25 | CI26 | CI27 | CI28 | CI29

Index and data structure
- Balanced tree
- Sparse index
- Always just 1 high-level index CI
- There can be 0 to many intermediate level index CIs
- There can be one or more low-level (sequence set) index CIs
  - If there is only 1 sequence set CI, it is also the high-level index CI
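A sparse index holds one entry per CI rather than one per record. The Python sketch below walks a two-level index (one high-level CI over several sequence set CIs) to find the data CI that would hold a key. It assumes each index entry carries the highest key of the CI it points to; the names `lookup` and `find_data_ci` and the list and dict structures are hypothetical stand-ins for the real index CI format.

```python
from bisect import bisect_left


def lookup(entries, key):
    """Return the pointer of the first entry whose high key is >= key.

    entries is a list of (high_key, pointer) pairs sorted by high_key.
    """
    keys = [high_key for high_key, _ in entries]
    i = bisect_left(keys, key)
    return entries[i][1] if i < len(entries) else None


def find_data_ci(high_level, sequence_sets, key):
    """Walk high-level index -> sequence set -> data CI number."""
    seq = lookup(high_level, key)
    if seq is None:
        return None  # key is beyond the highest key in the file
    return lookup(sequence_sets[seq], key)
```

For example, with sequence sets `{"SS0": [("C", 0), ("F", 1)], "SS1": [("M", 2), ("T", 3)]}` and a high-level index `[("F", "SS0"), ("T", "SS1")]`, key `"D"` resolves to data CI 1 and key `"G"` to data CI 2.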

And now the bit you've all been waiting for

Performance rules of thumb
- Use largest data CI possible, especially for sequential work
- Use as small an index CI as you can (but not too small!)
- Use large data CA: allocate primary and secondary as at least 1 cylinder
- Avoid too many extents / allocations

Allocation calculations
- CI freespace = CI Size * Freespace %
- Number of records per CI
  - Fixed length: (CI Size - 10 - Freespace) / LRECL
  - Variable length: (CI Size - 7 - Freespace) / (Average LRECL + 3)
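The formulas above translate directly into Python. This is a sketch under stated assumptions: the freespace percentage is a whole number, and both the freespace bytes and the record count are rounded down (the slide does not say how to round). The function names are hypothetical; the constants 10, 7 and 3 are taken from the slide as given.

```python
def ci_freespace_bytes(ci_size, freespace_pct):
    """CI freespace = CI size * freespace %, rounded down (assumption)."""
    return ci_size * freespace_pct // 100


def records_per_ci_fixed(ci_size, freespace_pct, lrecl):
    """Fixed length: (CI size - 10 - freespace) / LRECL."""
    usable = ci_size - 10 - ci_freespace_bytes(ci_size, freespace_pct)
    return usable // lrecl


def records_per_ci_variable(ci_size, freespace_pct, avg_lrecl):
    """Variable length: (CI size - 7 - freespace) / (average LRECL + 3)."""
    usable = ci_size - 7 - ci_freespace_bytes(ci_size, freespace_pct)
    return usable // (avg_lrecl + 3)
```

For a 4096 byte CI with 20% freespace, 819 bytes are held back, so 100 byte fixed-length records load 32 to a CI and variable-length records averaging 100 bytes load 31.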

What's in a CI?
- Data and control info (end of CI)

CI control information
- At the end of each data CI

Data records

The CIDF
- Note (back 2 slides): the free space has data in it from an earlier CI split

The Index

Allocation calculations
- Calculate Freespace in each CA
  - Get number of CIs per CA from LISTCAT or device characteristics (3390: 12 x 4K CIs/track, 180/cyl)
  - CA freespace = No of CIs per CA * CA Freespace %, rounded up
- Number of CIs loaded per CA = CIs per CA - CA freespace
- Number of records loaded per CA = Loaded CIs in CA * No of recs in CI
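The CA-level arithmetic can be sketched the same way in Python. The function name `ca_load` is hypothetical, and the sketch assumes a whole-number freespace percentage; the rounding up of free CIs follows the slide.

```python
def ca_load(cis_per_ca, ca_freespace_pct, records_per_ci):
    """Return (free CIs, loaded CIs, records loaded) for one CA.

    Free CIs = CIs per CA * CA freespace %, rounded up as the slide states.
    """
    free_cis = (cis_per_ca * ca_freespace_pct + 99) // 100  # integer ceiling
    loaded_cis = cis_per_ca - free_cis
    return free_cis, loaded_cis, loaded_cis * records_per_ci
```

Taking the 3390 figure of 180 4K CIs per cylinder, 10% CA freespace and 32 records per CI, 18 CIs stay free, 162 are loaded, and each CA holds 5184 records at load time.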

VSAM Catalogs
- Exactly one master catalog
  - Assigned at IPL with DEF CAT or DEFINE MCAT IDCAMS command
- User catalogs
  - 0 to many
  - No more than 1 per volume
- Catalog can own multiple spaces on a volume
- Many catalogs can own space on a volume

VSAM Catalogs
- Catalog contains:
  - Self-describing records
  - User catalog pointers
  - Volume definitions
  - Space definitions
  - Cluster (file) definitions
  - Component (data, index) definitions
  - AIX & Path definitions

Catalog recommendations
- Use naming conventions
  - Name Cluster, Data and Index components explicitly
  - Use partition / system independent names where applicable
- Separate
  - Files seldom defined or deleted
  - Files often defined or deleted
  - Online critical files
  - Batch files
- Multiple baskets: all the eggs won't get broken

More recommendations
- Don't use recoverable catalogs
  - Hangover from 2314 / 3330
- Backup is vastly better
  - IDCAMS, Faver, Maxback, Dr D, user-written

CI & CA splits and freespace
- You try to insert a record in a CI or extend a record already there
- If there is enough free space in the CI, everyone moves up, the record is inserted and the CI is rewritten
- BUT what if there isn't enough free space?

CI & CA splits
- CI split: 4 physical IOs
  - Set "Split in progress", write CI
  - Move half of records to new CI & write it
  - Update sequence set, write index CI
  - Erase moved records from old CI, turn off "Split in progress", write old CI
- BUT...
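The record movement in a CI split can be sketched in a few lines of Python. This is a toy in-memory model, not what VSAM does on disk: a CA is a list of CIs, each CI is a sorted list of keys, capacity is counted in records rather than bytes, and the "Split in progress" flag and the sequence set update are left out. The name `insert_record` is hypothetical.

```python
from bisect import insort


def insert_record(ca, ci_no, key, capacity):
    """Insert key into data CI ci_no of a CA; split the CI if it is full.

    ca is a list of CIs; each CI is a sorted list of keys, [] if free.
    Returns the number of the CI taken by a split, or None if no split.
    """
    ci = ca[ci_no]
    if len(ci) < capacity:
        insort(ci, key)  # enough room: everyone moves up
        return None
    free = next((n for n, c in enumerate(ca) if not c), None)
    if free is None:
        raise RuntimeError("no free CI in the CA: a CA split is needed")
    half = len(ci) // 2
    ca[free] = ci[half:]  # move the upper half of the records to the new CI
    del ci[half:]         # erase the moved records from the old CI
    target = ci if key < ca[free][0] else ca[free]
    insort(target, key)
    return free
```

Starting from `[[10, 20, 30, 40], [50, 60], []]` with a capacity of 4 records, inserting key 25 into CI 0 splits it into `[10, 20, 25]` and a new CI holding `[30, 40]`. With no empty CI left in the CA, the function raises instead, which is the point where a CA split would be needed.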

Failure in CI split
- System failure
  - Corrected next time CI is updated
- No free CI in the CA
  - CA split is needed
- Remember: 1 physical IO = 30,000 to 40,000 CPU instructions
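Putting the two rules of thumb together (4 physical IOs per CI split, 30,000 to 40,000 instructions per IO) gives a rough cost range. The Python sketch below only multiplies those two figures; the function name `ci_split_cost` is hypothetical and the result is an order-of-magnitude estimate, not a measurement.

```python
IO_COST_RANGE = (30_000, 40_000)  # CPU instructions per physical IO
CI_SPLIT_IOS = 4                  # physical IOs per CI split


def ci_split_cost(splits):
    """Return the (low, high) CPU instruction estimate for a number of CI splits."""
    low, high = IO_COST_RANGE
    return splits * CI_SPLIT_IOS * low, splits * CI_SPLIT_IOS * high
```

One CI split therefore costs roughly 120,000 to 160,000 instructions.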

CA Split
- MANY physical reads and writes
  - Set "Split in progress", write sequence set CI
  - Maybe get new extent
  - Format new CA at HURBA position
  - Read / write half of CIs to new CA
  - Write new sequence set CI for new CA
  - Update higher level index CIs
  - Erase moved CIs from old CA, write empty CIs
  - Write updated original sequence set CI

Recommendations
- Don't worry about CI splits
- Avoid excessive CA splits by defining CA freespace
- Don't do a reorg just because you have done n CI / CA splits