

0097-8485/86 $3.00 + .00 © 1986 Pergamon Journals Ltd.

BIN SORT MODULE TO ORDER LARGE LISTS OF SMALL ITEMS: A MODULE FOR SCIENTIFICALLY ORIENTED APPLICATIONS

GERARDO CISNEROS

Sección de Graduados, Escuela Superior de Ingeniería Mecánica y Eléctrica, Instituto Politécnico Nacional, 07738 México D.F., México

ENRIQUE POULAIN

Investigación Básica de Procesos, Instituto Mexicano del Petróleo, Apartado Postal 14-805, 07000 México D.F., México

and

CARLOS F. BUNGE†

Departamento de Química, Universidad Autónoma Metropolitana, Unidad Iztapalapa, Apartado Postal 55-534, 09340 México D.F.

(Received 5 August 1985)

Abstract - We present a modular FORTRAN program to sort large lists of small items when all keys are distinct and the range of key values is not much greater than the number of keys. A bin sort method is employed. Modular key-decoding, key-modifying and key-analyzing capabilities, which are transparent to a casual user, considerably enhance the scope of the present code. Applications are discussed where the present program is more efficient than the general external sort program given in the previous paper.

1. INTRODUCTION

In the previous article (Bunge & Cisneros, 1986), a general external sort module was presented. Some applications, however, involve large lists of N items (stored in an array A) with N distinct keys (stored in an array K), where the range KRANGE of key values, bounded by KLOW and KHIGH, is not much greater than N. In these cases, it is possible to perform the entire sort without actually sorting in the classical sense, as shown by the following FORTRAN statements:

      LARGE=2**30+(2**30-1)   ! largest 4-byte integer
      KLOWM1=KLOW-1
      KRANGE=KHIGH-KLOWM1
      DO 10 I=1,KRANGE
   10 KSCR(I)=LARGE
      DO 20 J=1,N
      KSCR(K(J)-KLOWM1)=0
   20 ASCR(K(J)-KLOWM1)=A(J)
      J=0
      DO 30 I=1,KRANGE
      IF (KSCR(I) .NE. LARGE) THEN
      J=J+1
      K(J)=I+KLOWM1
      A(J)=ASCR(I)
      END IF
   30 CONTINUE

† On sabbatical leave from Instituto de Física, Universidad Nacional Autónoma de México, Apartado Postal 20-364, 01000 México D.F., México.

If all arrays do not fit in the central memory, however, a direct access device or virtual memory must be used. The latter, if available, provides a suitable answer unless the original order is such that zeroes and successive items in A have to be scattered into successively different pages of KSCR and ASCR in the virtual memory, as is often the case. Thus was conceived the bin sort method (Yoshimine, 1969), whereby both the scatter of successive zeroes and A elements into KSCR and ASCR, and the subsequent sequential gathering of keys and ASCR elements to be moved into their final destinations in K and A, are carried out with the help of a direct access device.

Bin sort finds application in quantum chemistry: in four-index transformations to generate the list [(pq | rs)] of two-electron molecular orbital integrals (McLean, 1971; Bagus et al., 1973), and in a most efficient procedure to construct the Hamiltonian matrix for large configuration interaction calculations (Yoshimine, 1973). Yoshimine's algorithm is known to be part of the ALCHEMY system of programs (McLean, 1971), which is available through the Quantum Chemistry Program Exchange (Bagus et al., 1971), and it has reportedly been incorporated (Lucchese et al., 1978) into more recent systems for molecular electronic structure calculations.

Unfortunately, previous work on bin sort has neither been documented nor translated into readily available programs. In this paper we present a thorough analysis of bin sort together with a corresponding general purpose module. In Section 2 we explain bin sort. Module names are discussed in Section 3. Interfaces are concisely given in the headings of the program listings appearing in the Appendix. In Section 4 we describe a bin sort module. Modular key-decoding options which serve to minimize the range of key values are discussed in Section 5. These, together with the modular key-modifying and key-analyzing capabilities discussed in the previous article (Bunge & Cisneros, 1986), extend the scope of the present module to almost any conceivable scientific application without significant degradation in efficiency. Execution times are reported in Section 6 together with an assessment of those situations where bin sort is expected to outperform multiway merge external sort.

2. BIN SORT

In order to describe bin sort, let us first define a few quantities involved in the algorithm (all quantities are integers, and those representing amounts of memory are given in single precision words):

IUNIT is the size of the smallest addressable unit on a disk,

LARGE is the largest number compatible with the given key type, viz., 2**31-1 for integer keys on a VAX computer (notice that both IUNIT and LARGE are machine dependent parameters),

BAREA is the size of the area allocated for all bins,

PAREA is the size of a partition, which is an area reserved to collect the successive fillings of a given bin (clearly, BAREA and PAREA are parameters dependent upon the size of the allocated central memory),

KTYPE is the size of a key, equal to one since bin sort requires integer keys,

ITYPE is the size of an item (0, 1 or 2),

SZALLB = BAREA/(KTYPE + ITYPE) is the maximum number of keys (and items) that fit in the bin area,

SZPAR = PAREA/(KTYPE + ITYPE) is the maximum number of keys (and items) that fit in a partition,

NBINS is the total number of bins,

BAREA1 ≤ BAREA/NBINS is the size of a single bin, and is chosen to be a multiple of IUNIT,

SZ1B = BAREA1/(KTYPE + ITYPE) is the maximum number of keys (and items) filling a bin,

NPARM(IBIN) is the extant number of keys and items in the partition assigned to bin IBIN,

NFILLS is the maximum number of times that any given bin will be filled, and

ISZPAR = NFILLS*SZ1B (≤ SZPAR) is the maximum number of keys (and items) that will be put into a partition.
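The relations among these quantities can be checked with a few lines of code. The following is a minimal sketch in free-form Fortran (rather than the FORTRAN 77 of the listings), using the values quoted in the Appendix for a 200 Kb working set (IUNIT = 128, 184 units for the bin area, 160 units for the partition area, KTYPE = 1, ITYPE = 2). The number of bins NBINS = 92 is a hypothetical choice, and taking NFILLS as the largest value compatible with ISZPAR ≤ SZPAR is an assumption, not necessarily how BSTID chooses it.

```fortran
program bin_params
  implicit none
  ! Values quoted in the Appendix for a 200 Kb working set on a VAX-11/780.
  integer, parameter :: IUNIT = 128, NPBA = 184, NPPA = 160
  integer, parameter :: KTYPE = 1, ITYPE = 2
  integer :: BAREA, PAREA, SZALLB, SZPAR
  integer :: NBINS, BAREA1, SZ1B, NFILLS, ISZPAR

  BAREA  = IUNIT*NPBA                  ! area for all bins, in words
  PAREA  = IUNIT*NPPA                  ! area for one partition, in words
  SZALLB = BAREA/(KTYPE + ITYPE)       ! keys (and items) fitting in the bin area
  SZPAR  = PAREA/(KTYPE + ITYPE)       ! keys (and items) fitting in a partition

  NBINS  = 92                          ! hypothetical choice, not from the paper
  BAREA1 = (BAREA/NBINS/IUNIT)*IUNIT   ! largest multiple of IUNIT <= BAREA/NBINS
  SZ1B   = BAREA1/(KTYPE + ITYPE)      ! keys (and items) filling one bin
  NFILLS = SZPAR/SZ1B                  ! assumed: largest value with ISZPAR <= SZPAR
  ISZPAR = NFILLS*SZ1B                 ! keys (and items) put into a partition

  print *, 'BAREA, PAREA   =', BAREA, PAREA
  print *, 'SZALLB, SZPAR  =', SZALLB, SZPAR
  print *, 'BAREA1, SZ1B   =', BAREA1, SZ1B
  print *, 'NFILLS, ISZPAR =', NFILLS, ISZPAR
end program bin_params
```

With these inputs the integer arithmetic gives BAREA = 23552, PAREA = 20480, SZALLB = 7850, SZPAR = 6826, BAREA1 = 256, SZ1B = 85, NFILLS = 80 and ISZPAR = 6800.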

In bin sort, the unsorted keys and items are read sequentially from a file and scattered into NBINS bins of total size not exceeding BAREA. Each bin IBIN receives keys (and corresponding items) in the range [(IBIN-1)*ISZPAR + 1, IBIN*ISZPAR]. Whenever a bin fills, it is written to a direct access file at record (IBIN-1)*NFILLS + NPARM(IBIN)/SZ1B. At the end of this process, all partly filled bins are written at records (IBIN-1)*NFILLS + (NPARM(IBIN) - 1)/SZ1B + 1, respectively.
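As an illustration of this addressing scheme, the sketch below computes the bin number of a key and the record numbers used for a full and for a partly filled bin. It is written in free-form Fortran under several assumptions: keys have already been shifted so that the lowest key is 1, the sizes SZ1B = 85 and NFILLS = 80 are hypothetical, and the record number of a partly filled bin is taken to be (IBIN-1)*NFILLS + (NPARM-1)/SZ1B + 1, i.e. the next unused record of that bin's partition.

```fortran
program bin_address
  implicit none
  integer, parameter :: SZ1B = 85, NFILLS = 80      ! hypothetical sizes
  integer, parameter :: ISZPAR = NFILLS*SZ1B        ! keys per partition
  integer :: KEY, IBIN, NPARM, IRFULL, IRPART

  KEY  = 13601                          ! key assumed already shifted so KLOW = 1
  IBIN = (KEY - 1)/ISZPAR + 1           ! bin holding keys (IBIN-1)*ISZPAR+1 .. IBIN*ISZPAR

  ! Record used when the bin has just filled for the third time.
  NPARM  = 3*SZ1B
  IRFULL = (IBIN - 1)*NFILLS + NPARM/SZ1B

  ! Record used for a final, partly filled bin holding 10 more keys.
  NPARM  = 3*SZ1B + 10
  IRPART = (IBIN - 1)*NFILLS + (NPARM - 1)/SZ1B + 1

  print *, 'bin =', IBIN, ' full record =', IRFULL, ' partial record =', IRPART
end program bin_address
```

For these inputs the key falls in bin 3, the third filling goes to record 163, and the trailing partial filling goes to record 164, so the records of one partition stay contiguous.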

Next, each partition produced for each bin is read into a scratch array containing both keys and items, and sorted by key using distribution (scattering) into arrays KSCR and ASCR, respectively, where KSCR is initially filled with LARGE. For I = 1 to ISZPAR, all keys (and corresponding items) for which KSCR (I) is different from LARGE are collected (gathered) into arrays KOUT and AOUT holding the final sorted list.

In short, bin sort consists of three scatter operations (one scattering keys and items into bins in memory, another scattering bins into a disk file and a final one scattering each bin’s contents into scratch arrays in memory), and a gather operation producing a sorted file.
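The three scatters and the final gather can be followed on a toy list. The sketch below is not the BSTID module: it is a small free-form Fortran program in which an in-memory array (FILEK, FILEA, FILEN, all hypothetical names) stands in for the direct access file, the sizes are chosen only for illustration, and keys are assumed to lie between 1 and NBINS*ISZPAR. The record numbers follow the formulas of this section, with the partly filled bins written to the next unused record of their partition.

```fortran
program bin_sort_demo
  implicit none
  ! Toy sizes, chosen for illustration only (not the paper's values).
  integer, parameter :: N = 11, NBINS = 3, SZ1B = 2, NFILLS = 3
  integer, parameter :: ISZPAR = NFILLS*SZ1B, NREC = NBINS*NFILLS
  integer, parameter :: LARGE = huge(0)
  integer :: K(N) = (/ 17, 3, 9, 12, 1, 6, 14, 8, 5, 11, 18 /)
  double precision :: A(N), BINA(SZ1B, NBINS), FILEA(SZ1B, NREC)
  double precision :: ASCR(ISZPAR), AOUT(N)
  integer :: BINK(SZ1B, NBINS), NINBIN(NBINS), NPARM(NBINS)
  integer :: FILEK(SZ1B, NREC), FILEN(NREC)    ! stand-in for the direct access file
  integer :: KSCR(ISZPAR), KOUT(N)
  integer :: I, J, M, IBIN, IREC, NOUT

  A = 0.5d0*K                       ! items: any payload that travels with its key
  BINK = 0;  BINA = 0.0d0
  NINBIN = 0;  NPARM = 0;  FILEN = 0

  ! Scatters 1 and 2: items into bins, full bins into the "file".
  do J = 1, N
     IBIN = (K(J) - 1)/ISZPAR + 1
     NINBIN(IBIN) = NINBIN(IBIN) + 1
     NPARM(IBIN)  = NPARM(IBIN) + 1
     BINK(NINBIN(IBIN), IBIN) = K(J)
     BINA(NINBIN(IBIN), IBIN) = A(J)
     if (NINBIN(IBIN) == SZ1B) then
        IREC = (IBIN - 1)*NFILLS + NPARM(IBIN)/SZ1B
        call write_bin(IBIN, IREC)
     end if
  end do
  ! Partly filled bins go to the next record of their partition.
  do IBIN = 1, NBINS
     if (NINBIN(IBIN) > 0) then
        IREC = (IBIN - 1)*NFILLS + (NPARM(IBIN) - 1)/SZ1B + 1
        call write_bin(IBIN, IREC)
     end if
  end do

  ! Scatter 3 and gather: one partition at a time.
  NOUT = 0
  do IBIN = 1, NBINS
     KSCR = LARGE
     do IREC = (IBIN - 1)*NFILLS + 1, IBIN*NFILLS
        do M = 1, FILEN(IREC)
           I = FILEK(M, IREC) - (IBIN - 1)*ISZPAR   ! position inside the partition
           KSCR(I) = 0
           ASCR(I) = FILEA(M, IREC)
        end do
     end do
     do I = 1, ISZPAR
        if (KSCR(I) /= LARGE) then
           NOUT = NOUT + 1
           KOUT(NOUT) = I + (IBIN - 1)*ISZPAR
           AOUT(NOUT) = ASCR(I)
        end if
     end do
  end do

  print '(11I5)', KOUT
  print '(11F5.1)', AOUT

contains

  ! Copy bin IB to record IR of the simulated file and empty the bin.
  subroutine write_bin(IB, IR)
    integer, intent(in) :: IB, IR
    FILEK(:, IR) = BINK(:, IB)
    FILEA(:, IR) = BINA(:, IB)
    FILEN(IR)    = NINBIN(IB)
    NINBIN(IB)   = 0
  end subroutine write_bin

end program bin_sort_demo
```

The keys come out in ascending order (1, 3, 5, 6, 8, 9, 11, 12, 14, 17, 18), each followed by its own item, without any key comparison having taken place.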

3. MODULES AND INTERFACE

Our bin sort module, contained in a module file named BSTID.FOR, consists of six subroutines, listed in Table 1. Subroutines in BSTID.FOR require modules from module files GUTIL.FOR (Cisneros & Bunge, 1986) and KHAND.FOR (Bunge & Cisneros, 1986).

The central module, BSTID, has 29 arguments which are described in detail in the heading of its listing (see Appendix). Such a large number of arguments is necessary to allow for key-decoding, -modifying and -analyzing options, logical unit numbers for input, output, message and two scratch files, attributes of the input and output files, the number of the record of the input file where the list to be sorted starts, and the number of the last record produced on output (if any).

An important consideration in the building of modular libraries is the concept of reusability. For a module to be reusable, it should have no side effects and should emphasize the general aspects of the algorithm it embodies. For example, the file-handling subroutines FIDO1B and FIDO2B exhibit the same structure as FIDO1A and FIDO2A contained in module file ESTID.FOR (Bunge & Cisneros, 1986) and differ only in their non-reusable features, viz., the type of I/O performed.

4. BIN SORT FORTRAN MODULE

Subroutine BSTID is the controlling subroutine of the bin sort module and carries out the following tasks:

(1) it sets default values for the output file.

(2) it performs validation checks on key and item types and other parameters.

(3) it finds the lowest and the highest keys (and thus the range KRANGE of key values).

(4) if KRANGE is greater than the number of keys NSIZE, it calls subroutine SORTST to decide whether an external multiway merge algorithm is expected to be more efficient than bin sort, in which case it returns to the calling program with 33 ≤ IERR ≤ 34, after writing a warning in the message file IWM.

(5) it calls subroutine FIDO1B to copy the file to be sorted into a sequential scratch file of fixed size (= 10*IUNIT) records. This step could be omitted on computers allowing dynamic memory allocation, but this requires use of a nonstandard extension of FORTRAN.

(6) it computes parameters BAREA1, NFILLS, SZ1B, NBINS and ISZPAR, described in Section 2.

(7) it opens a second scratch file, in direct-access mode, to hold bin fillings.

(8) it calls subroutine FIDO4A to perform the first two scatters (items into bins, full bins into the second scratch file).

(9) if VAX is .TRUE. the bin file is closed and reopened as a sequential file with a large block size; otherwise this file continues to be treated as a direct-access file.

(10) it calls subroutine FIDO5A to perform the final scattering of each bin's partition contents into scratch arrays in memory followed by a gather operation producing a sorted scratch file (the first scratch file is reused for this purpose).

(11) it calls subroutine FIDO2B to copy the sorted scratch file into an output file with user-definable attributes and according to key-analyzing options. This step may be merged with the previous one (with elimination of the sorted scratch file) on computers allowing dynamic memory allocation.

Table 1. Contents of module file BSTID.FOR and other module files required

Module    Modules required from BSTID.FOR    Other module files required
BSTID     all other                          GUTIL.FOR†, KHAND.FOR‡
SORTST    none                               GUTIL.FOR
FIDO4A    none                               GUTIL.FOR, KHAND.FOR
FIDO5A    none                               GUTIL.FOR, KHAND.FOR
FIDO1B    none                               GUTIL.FOR
FIDO2B    none                               GUTIL.FOR, KHAND.FOR

† GUTIL.FOR contains mainly I/O modules and is discussed in Cisneros & Bunge, 1986.

‡ KHAND.FOR contains key-handling modules and is discussed partly in Bunge & Cisneros, 1986, and partly in Section 5.

5. MODULAR KEY-HANDLING ENHANCEMENTS

Besides the key-modifying and key-analyzing capabilities discussed in our previous article (Bunge & Cisneros, 1986), BSTID makes optional use of key-decoding and key-range-determining subroutines. (These are listed in the Appendix and are considered to belong to the same module file KHAND.FOR discussed in the previous paper.)

Key decoding may allow a substantial reduction in key size and thus in key range. For example, for keys characterized by four 8-bit indices I, J, K, L, the numerical value of keys with I = J = K = L = n is 16,843,009 and 168,430,090 for n = 1 and n = 10, respectively. If I, J, K and L are indices running independently of each other up to a maximum value of 10, then for the purpose of bin sort the same keys may be assigned key values of 1 and 10,000, respectively, thus achieving a factor of 15,000 reduction in key range. None of these key-handling options is mandatory, and they are enabled or disabled through arguments KDEC and KRGE of subroutine BSTID. Corresponding subroutine names are listed in Table 2.
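The arithmetic of this example can be reproduced directly. The sketch below, in free-form Fortran, packs four 8-bit indices into one 32-bit key, unpacks them, and forms a compact key as a base-10 mixed-radix number. The compact formula and the names pack4 and KNEW are assumptions made for illustration; they reproduce the values 1 and 10,000 quoted above but are not taken from the KDECn subroutines.

```fortran
program key_decode_demo
  implicit none
  integer, parameter :: MAXIDX = 10      ! largest index value, as in the text
  integer :: N, KEY, I, J, K, L, KNEW

  do N = 1, MAXIDX, MAXIDX - 1           ! N = 1 and N = 10
     KEY = pack4(N, N, N, N)
     ! Unpack the four 8-bit fields.
     I = iand(ishft(KEY, -24), 255)
     J = iand(ishft(KEY, -16), 255)
     K = iand(ishft(KEY, -8), 255)
     L = iand(KEY, 255)
     ! Hypothetical compact key: mixed-radix number with base MAXIDX.
     KNEW = (((I - 1)*MAXIDX + (J - 1))*MAXIDX + (K - 1))*MAXIDX + L
     print *, 'packed key =', KEY, '  compact key =', KNEW
  end do

contains

  ! Pack four indices, each smaller than 256, into a single integer key.
  integer function pack4(I, J, K, L)
    integer, intent(in) :: I, J, K, L
    pack4 = ishft(I, 24) + ishft(J, 16) + ishft(K, 8) + L
  end function pack4

end program key_decode_demo
```

The packed keys are 16,843,009 and 168,430,090, and the compact keys are 1 and 10,000, so the key range shrinks from about 1.5 x 10**8 to 10**4.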

In order to speed up the execution of key-analyzing subroutines, we introduce an array IDCD (transparent to uninterested users) containing suitable indices. Array IDCD is optionally generated through subroutines OPINO1 and KOIN1 belonging to module files GUTIL.FOR (Cisneros & Bunge, 1986) and KHAND.FOR.

6. COMPARATIVE ANALYSIS OF EXECUTION TIMES

Execution times were taken on a VAX-11/780 computer with 2 Mb of memory under VMS version 3.3, with an RP07 500 Mb disk having a maximum transfer rate of 1.3 Mb/s. A working set of 200 Kb was used throughout. Elapsed times rather than CPU times are reported; the latter vary between 85 and 90 per cent of the total elapsed times. Efficient I/O is obtained through the use of a modular package (Cisneros & Bunge, 1986). In all cases we consider the sorting of an [(ub lg)] list into an [(ij/&)] list with 32-bit integer keys and 64-bit items as discussed in the introduction to the previous paper (Bunge & Cisneros, 1986).

Table 2. Key-decoding and key-range-determining subroutines in module file KHAND.FOR. All modules need array IDCD. In the present implementation, array IDCD is generated through subroutines OPINO1 and KOIN1 belonging to module files GUTIL.FOR and KHAND.FOR, respectively

Module                Modules required from KHAND.FOR    Other module files required
KDECn, 0<n<5          none                               none
VAX-KDECn, 0<n<5      none                               none
KRGE1                 KDECn, 0<n<5                       GUTIL.FOR
KRGEn, 1<n<5          none                               none

There are four major steps, associated with tasks (5), (8), (10) and (11) of Section 4. Timings are insensitive to the actual key distribution in the original list except for the second step, which accounts for as much as 45% of the total time.

In Fig. 1 we compare total running times of bin sort and a straight scatter and gather program using virtual memory (basically the program given in the introduction supplemented with the input and output steps (5) and (11) of Section 4, and key-handling subroutines). As expected, the virtual memory scatter-gather program is faster than bin sort for small lists, but it slows down for lists larger than 35,000 items on account of virtual memory thrashing during the scattering step.

In Fig. 2 we show accumulated elapsed times for the four steps of our bin sort module. If N is the number of items in the list, the total elapsed time t (in seconds) satisfies:

t = N/1900

to within one per cent or better, which is of the same order as the relative difference of t values for different runs on the same list. All steps in bin sort are implemented to run linearly in N. Since I/O transfer rates are extremely sensitive to record size (Cisneros & Bunge, 1986), linearity requires: (i) increasing the number of buffers in the second step, and (ii) using larger block sizes to read partition contents in the third step, as bin size diminishes with increasing N. This effect causes the efficiency of bin sort to dwindle somewhat when bin size BAREA1 decreases from twice IUNIT to IUNIT. Of course, BAREA1 can be increased at the expense of multiple passes on the original list. In such a case, however, the obtained elapsed times are larger than those for single pass bin sort with BAREA1 = IUNIT = 128.

Fig. 1. Comparison of elapsed times between our bin sort module BSTID and a scatter and gather program using virtual memory. (Horizontal axis: No. of 8-byte items.)

Fig. 2. Accumulated elapsed times for the four steps of our bin sort module BSTID. (Horizontal axis: No. of 8-byte items.)

In Fig. 3 we show the regions of relative superiority of bin sort and external multiway merge sort (Bunge & Cisneros, 1986) as a function of RATIO = (KHIGH - KLOW + 1)/N and key range, for the stated working set. Our bin sort module is designed to perform a single pass on the original list, and thus it can only be used for lists up to MAXSIZ = BAREA*PAREA/(IUNIT*(KTYPE + ITYPE)) = 1,256,106. For MAXSIZ/2 < N ≤ MAXSIZ, i.e. when the bin size is IUNIT, multiway merge external sort is more efficient than bin sort for RATIO > 2.5; for N ≤ MAXSIZ/2, i.e. when the bin size is ≥ 2*IUNIT, our general external sort algorithm outperforms bin sort only when RATIO ≥ 5. It must be noted, however, that bin sort will not work when repeated keys are present.
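These thresholds amount to a simple decision rule. The sketch below, in free-form Fortran, encodes it literally as stated in the text for the 200 Kb working set; the function name use_bin_sort is hypothetical, and the actual test performed by subroutine SORTST may differ in detail.

```fortran
program choose_sort
  implicit none
  integer, parameter :: MAXSIZ = 1256106   ! from the text, 200 Kb working set

  print *, use_bin_sort(100000, 1, 300000)     ! N <= MAXSIZ/2, RATIO = 3.0
  print *, use_bin_sort(100000, 1, 500000)     ! N <= MAXSIZ/2, RATIO = 5.0
  print *, use_bin_sort(1000000, 1, 2000000)   ! N >  MAXSIZ/2, RATIO = 2.0
  print *, use_bin_sort(1000000, 1, 3000000)   ! N >  MAXSIZ/2, RATIO = 3.0

contains

  ! .TRUE. when bin sort is expected to beat the multiway merge sort.
  logical function use_bin_sort(N, KLOW, KHIGH)
    integer, intent(in) :: N, KLOW, KHIGH
    real :: RATIO
    RATIO = real(KHIGH - KLOW + 1)/real(N)
    if (N > MAXSIZ) then
       use_bin_sort = .false.            ! too large for single-pass bin sort
    else if (N > MAXSIZ/2) then
       use_bin_sort = RATIO <= 2.5       ! bin size is IUNIT
    else
       use_bin_sort = RATIO < 5.0        ! bin size is at least 2*IUNIT
    end if
  end function use_bin_sort

end program choose_sort
```

The four calls give .TRUE., .FALSE., .TRUE. and .FALSE. in that order. A caller must still guarantee that no repeated keys are present, since the rule says nothing about them.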

The modular nature of the present code should greatly facilitate its migration into more powerful computers (IUNIT and central memories at least one hundred times larger than used here; vector processors, parallel architecture and magnetic bubble memories). In this connection, the greatest challenge appears to be keeping bin sort elapsed times proportional to N.

Acknowledgments - We wish to express our appreciation to Ing. Manuel U- for his outstanding dedication as System Manager of the VAX-11/780 computer at Instituto de Física. The work of G.C. was partially supported by the Comisión de Operación y Fomento de Actividades Académicas del I.P.N. The authors gratefully acknowledge a fellowship from the Sistema Nacional de Investigadores.

Fig. 3. Regions of relative superiority of our bin sort module BSTID and of our multiway merge external sort module ESTID. (Horizontal axis: key range / No. of keys, with boundaries at 2.5 and 5.0; the upper limit marked is 1,256,106.)

REFERENCES

Bagus, P. S., Liu, B., McLean, A. D. & Yoshimine, M. (1971), ALCHEMY computer program, QCPE #199, may be obtained from Quantum Chemistry Program Exchange, Department of Chemistry, Indiana University, Bloomington, IN 47401, U.S.A.

Bagus, P. S., Liu, B., McLean, A. D. & Yoshimine, M. (1973), in Energy Structure and Reactivity (Eds. D. W. Smith and W. B. McRae), p. 130, Wiley, New York.

Bunge, C. F. & Cisneros, G. (1986), Sorting large lists of small items: A module for scientifically oriented applications. Comput. Chem. 10, 109.

Cisneros, G. & Bunge, C. F. (1986), A modular package for efficient I/O operations. Comput. Chem. 10, 153.

Knuth, D. E. (1973), The Art of Computer Programming, Vol. 3, Addison-Wesley, Reading, Massachusetts.

Lucchese, R. R., Brooks, B. R., Meadows, J. H., Swope, W. C. & Schaefer, H. F. (1978), J. Comput. Phys. 26, 243.

McLean, A. D. (1971), in Potential Energy Surfaces from ab-initio Computation, Proceedings of the Conference on Potential Energy Surfaces in Chemistry (Ed. W. Lester), Publication RA 18, IBM Research Library, Monterey & Cottle Roads, San Jose, CA 95114, U.S.A.

Yoshimine, M. (1969), IBM Research Report RJ555, IBM Research Library, Monterey & Cottle Roads, San Jose, CA 95114, U.S.A.

Yoshimine, M. (1973), J. Comput. Phys. 11, 449.

APPENDIX

C     Module file BSTID.FOR
C---------------------------------------------------------------------
      SUBROUTINE BSTID (KDEC,KRGE,KMOD,KANA,NOKEYS,IUNV,KTYPEX,
     *                  ITYPEX,NSIZE,LUINP,LUOUT,LUSCR1,LUSCR2,IWM,
     *                  IRLUI,NBUI,NBLI,IRLUO,STATO,ACCTYO,NBUO,NBLO,
     *                  K,A,IFA,IBEGRC,ILASRC,IERR,IDCD)
C---------------------------------------------------------------------
C
C     Title: module BSTID to sort large lists in a scientific
C            environment.

C
C     Version: 1.001
C        This version handles integer keys and double precision
C        items. Program modifications in item types are easy to
C        implement:
C        1. Firstly, modify the IMPLICIT statement introducing
C           integer, real or double precision where it corresponds
C           (array names for items begin with an A). Do the same in
C           subroutines of module file GUTIL.FOR.
C        2. Secondly, set the appropriate value of parameter ITYPE
C           defining the number of words per item (=1 for real
C           items, and =0 for no items).
C        3. For key sort only, in addition to 1. and 2., delete
C           arrays beginning with an A and all statements where they
C           occur, and delete parameter ISW in subroutines FIDO4A and
C           FIDO5A. Also, replace all occurrences of statements:
C           CALL WESID (0,...) and CALL RESID (0,...) by
C           CALL WESID (2,...) and CALL RESID (2,...), respectively.
C        In any of these cases, suffixes ID in every subroutine name
C        should be replaced appropriately by IR, II, or I to indicate
C        the types of keys and items (I=integer, R=real, D=double
C        precision).

C
C     Abstract:
C        A given list of distinct keys and items is assumed to be on
C        an input file LUINP. Sort according to ascending key values
C        is performed on NSIZE elements starting at record IBEGRC.
C        (If descending order is desired, the signs of the keys must
C        be changed before entering and after returning from this
C        procedure, and all optional subroutines in module file
C        KHAND.FOR need revision). The sorted list is written in an
C        output file LUOUT. A bin sort method is used. Attributes of
C        LUINP and LUOUT are chosen by the user. Optionally, (a) the
C        input keys may be modified according to user's needs, (b) the
C        number of items per each record of the output file may be
C        regulated at user's discretion, and (c) the input file may be
C        destroyed by setting LUOUT=LUINP.
C
C     Environment: Standard Fortran 77 with optional VAX-11
C        extensions.
C
C     Copyright by Gerardo Cisneros, Enrique Poulain and Carlos F.
C        Bunge, 1986.
C
C     Reference: G. Cisneros, E. Poulain and C.F. Bunge,
C        Comput. Chem. 10, 135 (1986).
C
C     User library calls:
C        1. Subroutines in the same module file:
C           SORTST,FIDO4A,FIDO5A,FIDO1B,FIDO2B.
C        2. Subroutines from other module files:
C           from module file KHAND.FOR.
C           from module file GUTIL.FOR:
C           OES,RESID,RESX1,WESID,WESX1,RPA,RRAX1,WRA,WRAX1,?OZSSGE.

C
C     Machine dependent parameters:
C
C     IUNIT=128 is the size of the smallest addressable unit on a
C            disk. Present value is for a VAX-11/780 computer, and it
C            is equal to the size of a virtual memory page (512
C            bytes).
C     LARGE  is the largest 4-byte integer. For a VAX-11 computer
C            LARGE=2**31-1=2147483647. All key values must be smaller
C            than LARGE, which is used for initialization purposes in
C            subroutine FIDO5A.
C     VAX    is a parameter whose value is .TRUE. for all machines
C            which accept in the OPEN statement: RECORDTYPE,
C            INITIALSIZE, BUFFERCOUNT, BLOCKSIZE, RECL in sequential
C            access mode, and which allow a file created in direct
C            access mode to be reopened in sequential access mode and
C            vice-versa. It is .FALSE. otherwise.
C
C     Parameters dependent on user allocated central memory size:
C
C     NPSCR=10 is related to the user record length IRLU of a scratch
C            file LUSCR1: IRLU=IUNIT*NPSCR. For a working set of
C            IWKSET Kb put NPSCR=NPSCR*IWKSET/200.
C     NPBA=184 is the number of units of IUNIT reserved for a bin
C            area. Present value is optimum for a working set of 200
C            Kb on a VAX-11/780 computer. For a working set of IWKSET
C            Kb put NPBA=NPBA*IWKSET/200.
C     NPPA=160 is the number of units of IUNIT reserved for a
C            partition area used to collect all fillings of a given
C            bin. Present value is optimum for a working set of 200
C            Kb on a VAX-11/780. For a working set of IWKSET Kb put
C            NPPA=NPPA*IWKSET/200.

C     RATIO1 =5. for a working set of 200 Kb on a VAX-11/780
C            computer. When the ratio between the range of key values
C            KRANGE and the total number of keys NSIZE satisfies:
C                 KRANGE/NSIZE .GE. RATIO1,
C            subroutine ESTID (external multi-way merge algorithm) is
C            always more efficient than the present one. (Usually
C            this ... is obtained already at RATIO1=3.5.)
C            =... for a working set of 100 Kb.
C     RATIO2 =2.5 for a working set of 200 Kb on a VAX-11/780
C            computer. It is the analogous of RATIO1 for large NSIZE:
C            NSIZE .GT. MAXSIZ/2, where MAXSIZ is equal to the
C            maximum range of key values handled with present
C            dimensions: MAXSIZ=IUNIT*NPPA*NPBA/(KTYPE+ITYPE), where
C            all quantities on the right side are fixed parameters.
C            (RATIO2=1.5 for a working set of 100 Kb.)

C
C     Parameter values:
C
C     KTYPE  =1 for integer keys (only possibility).
C     ITYPE  =2 for double precision items (present version).
C            =1 for integer or real items.
C            =0 no items (key sort only).
C
C     Formal arguments:
C
C     KDEC   may cause keys to be decoded by subroutines of the
C            family KDECn (module file KHAND.FOR), which may be
C            extended in scope by the user.
C            =0, no decoding of keys (no use of KDECn subroutines).
C            =n, use is made of subroutine KDECn (0<n<5).
C     KRGE   may cause the key range to be calculated by subroutines
C            of the family KRGEn (module file KHAND.FOR), which may
C            be extended in scope by the user.
C            =0 the key range is assumed to be bound by KLOW=1 and
C               KHIGH=NSIZE.
C            =n use is made of subroutine KRGEn (0<n<5).
C     KMOD   may cause modifications in the original keys according
C            to subroutines of the family KMODn (module file
C            KHAND.FOR), which may be supplied by the user.
C            =0 individual keys are left unmodified.
C            =n use is made of subroutine KMODn (0<n<4).
C     KANA   may cause the no. of items in output records to be
C            regulated by subroutines of the family KANAn (module
C            file KHAND.FOR), which may be supplied by the user.
C            =0 normal run (no use of KANAn subroutines).
C            =n use is made of subroutine KANAn (0<n<5).
C     NOKEYS =0 in normal runs. NOKEYS .NE. 0 produces an output file
C            without keys.
C     KTYPEX =1 (always) in the calling program; if KTYPEX is
C            replaced by 1 in the argument list, its purpose
C            (type-validation) is lost.
C     ITYPEX =2 (present version) in the calling program; if ITYPEX
C            is replaced by 2 in the argument list, its purpose
C            (type validation) is lost.

C     NSIZE  number of items in the file to be sorted.
C            IF (NSIZE .LE. 0) a message is printed and IERR=129.
C     LUINP  logical unit for input, which may be of sequential or
C            direct access type. This logical unit is not closed
C            unless LUINP=LUOUT.
C     LUOUT  logical unit for output, which may be of sequential or
C            direct access type. For sequential access, this logical
C            unit must be rewound before use. Default: LUOUT=LUINP.
C     LUSCR1 scratch logical unit of sequential access type.
C     LUSCR2 another scratch logical unit first opened in direct
C            access type, and later reopened in sequential access
C            type.
C     IWM    message file. Default: IWM=6.
C     IRLUI  user record length in file LUINP.
C     NBUI   number of buffers in LUINP.
C     NBLI   number of blocks per logical record in LUINP.
C     IRLUO  user record length in file LUOUT (=IRLUI by default).
C     STATO  ='...' for a scratch output file.
C            ='...' for a new output file.
C     ACCTYO ='SEQ' for a sequential access output file.
C            ='DIR' for a direct access output file.
C     NBUO   number of buffers in LUOUT (=NBUI by default).
C     NBLO   number of blocks per logical record in LUOUT (=NBLI by
C            default).
C     K      working array of dimension IUNV=IRLW/(KTYPE+ITYPE)-1,
C            where IRLW is .GE. MAX(IRLUI,IRLUO).
C     A      working array, also of dimension IUNV.
C     IFA    array of dimension equal to 10 (transparent to the user)
C            holding file attributes:
C            (a) of the input file, on input, and
C            (b) of the output file, on output.
C     IBEGRC On input, it is the record number of the input file
C            marking the beginning of the list to be sorted. If
C            IBEGRC.GT.1, LUOUT must differ from LUINP. On output,
C            IBEGRC is equal to the number of the last record sorted
C            plus one.
C     ILASRC On input, it is the number of the last record previously
C            written in the output file. On output, ILASRC is the
C            (new) number of the last record written in the output
C            file.
C     IERR   completion status indicator.
C     IDCD   array (transparent to the user) utilized when KANA.NE.0.

C
C     Implicit input:
C
C        The file to be sorted, made up of any number of records,
C        each user record IRECU consisting (as far as the user is
C        concerned) of approximately IRLUI single precision words as
C        follows:
C           NKI    the number of keys.
C           ND     integer keys.
C           ND     double precision items.
C        NKI .LE. ND may vary from record to record, where
C        ND=IRLUI/(KTYPE+ITYPE)-1.
C        K(ND) holds keys. Key values cannot exceed LARGE-1. Key
C        values may equal zero, but equal key values are not allowed.
C        A(ND) holds items.
C        NKI and the arrays above must be written in logical unit
C        LUINP by means of subroutine WESID as:
C           CALL WESID (0,LUINP,IRECU,IFA,NKI,K,A).
C        An example is given in subroutine IMPINP, and details about
C        subroutine WESID are given in module file GUTIL.FOR, given
C        in G. Cisneros and C. F. Bunge, Comput. Chem. 10, (1986).

Implicit output:

Sorted file of keys and items in logical unit LUOUT, written as: CALL WESID (IOPT,LUOUT,IRECU,IFA,NKO,K,A) with IOPT=0 for normal usage and IOPT=1 when no keys (only items) are wanted.

KANA=0     NKO=NKLIM=IRLUO/(KTYPE+ITYPE)-1 except for the last record, where keys and items are completed up to NSIZE.

KANA.NE.0  NKO .LE. NKLIM is computed in subroutines of the family KANAn.

An example is given in subroutine IMPOUT.

Optional input:

Array IDCD, which contains data pertinent to any key-analyzing routines the user might introduce. An example is given in subroutines OPINO1 and KOIN1, belonging to module files GUTIL.FOR and KHAND.FOR, respectively.

Completion status:

IERR=0 normal successful completion.
IERR.GT.32 but IERR.LE.64: warnings; the calling program may choose an alternative sort method.
IERR.GE.129 indicates a fatal error: the calling program must choose a recovery path and exit. Suggestions for corrective actions are printed in file IWM.

Caveats:

1. Since the output list of keys and items is written in logical unit LUOUT, the calling program must properly assign the value of LUOUT, ensuring compatibility with previous file assignments.

2. Two scratch unit numbers, LUSCR1 and LUSCR2, must be assigned, ensuring compatibility with previous file assignments.

3. Since bin sort methods preclude the use of equal keys, their existence will produce wrong results. The program does not check for their occurrence, as that would imply an actual sort, so it is up to the user to make sure that no equal keys are present.

      IMPLICIT DOUBLE PRECISION (A)
      REAL RATIO1,RATIO2
      INTEGER PAREA,SZPAR,BAREAX,BAREA1,SZALLB,SZ1B
      LOGICAL VAX
      CHARACTER*3 STATO,ACCTYO
      PARAMETER (IUNIT=128, LARGE=2**30+(2**30-1), VAX=.TRUE.,
     *           NPSCR=10, IM'~-;84, NPPA=160, RATIO1=5., RATIO2=2.5,
     *           IRLUS=IUNIT*NPSCR, PAREA=IUNIT*NPPA,
     *           SZPAR=PAREA/(KTYPE+ITYPE))
      DIMENSION NPARM(NPBA),KSCR(SZPAR),ASCR(SZPAR),IFAS1(10),IFAS2(10),
     *          IFA(*),K(*),A(*),IDCD(*)

      IERR = 0
      IF (LUOUT .EQ. 0) LUOUT=LUINP
      IF (IRLUO .EQ. 0) IRLUO=IRLUI


C

C ****** END IF

      IF (LUINP .EQ. LUOUT .AND. IBEGRC .GT. 1) THEN
         CALL MESSGE (IWM,' LUINP=LUOUT AND IBEGRC .GT. 1')
         IERR=129
C        ******
      END IF
      IF (KDEC .EQ. 0 .AND. (KRGE .NE. 0 .OR. KANA .NE. 0)) THEN
         CALL MESSGE (IWM,' IF KDEC=0 IT MUST HOLD KRGE=KANA=0')
         IERR=129
C        ******
      END IF
      IF IKDEC .NE. 0 - .NE. 01 THmd cgLr;,m&N~&r~~;mM~ .
C     ******
         IF (KDEC .NE. KANA) THEN
            IF (KDEC .EQ. 1 .AND. KANA .EQ. 2) THEN
               IF (KRGE .NE. 1 .AND. KRGE .NE. 2) THEN
                  CALL MESSGE (IWM,' INVALID KDEC-KRGE-KANA'//
     *                         ' COMBINATION')
C
               END IF
            ELSE IF (KANA .NE. 0) THEN
               CALL MESSGE (IWM,' INVALID KDEC-KRGE-KANA'//
     *                      ' COMBINATION')
C
               IERR=129
C              ******
            END IF
         END IF
      ELSE
         IF (KRGE .NE. KDEC .AND. KRGE .NE. 1) THEN
            CALL MESSGE (IWM,' INVALID KDEC-KRGE-KANA'//
     *                   ' COMBINATION')
            IERR=129
C           ******
         END IF
      END IF
      END IF

      IF (KANA .EQ. 1 .OR. KANA .EQ. 3 .OR. KANA .EQ. 4) THEN
         %?~;NE. 0) KIFYPET-0
         K4=1
         IF (KANA .EQ. K4) K4=2
         IF ,~~WIC~EIZT;~Y”‘-’ .LT. KI~IDCDrIPCD(1~+3!+1-K4) THEN
            CALL MESSGE (IWM,' IRLUO TOO SMALL TO SATISFY'//
     *                   ' SUBROUTINE KANAn REQUIREMENTS')
C
         END IF
      END IF

      IF 'zzi ;y- 0) THEN
         KHIGH=NSIZE
      END IF
      IF (KRGE .EQ. 1) CALL KRGE1 (IUNITX,LARGEX,VAX,KTYPEX,ITYPEX,LUINP,
     *                             IFA,NSIZE,K,KLOW,KHIGH,IWM,IERR,IDCD)
      IF (IERR .GE. 129) RETURN
C     ******


      IF (KRGE .EQ. 1 .AND. IFA( 1 .) .EQ. 1) THEN
         REWIND LUINP
         DO 10 IDVN=1,IBEGRC-1
   10    READ (LUINP)
      END IF
      IF (KRGE .EQ. 2) CALL KRGE2 (IDCD,KLOW,KHIGH)
      IF (KRGE .EQ. 3) CALL KRGE3 (IDCD,KLOW,KHIGH)
      IF (KRGE .EQ. 4) CALL KRGE4 (IDCD,KLOW,KHIGH)
      KLOWM1=KLOW - 1
      KRANGE=KHIGH-KLOWM1
      IF (IERR .EO. -11 m
C
      MAXSIZ=NPPA*NPBA*(IUNIT/(KTYPE+ITYPE))
      CALL SORTST (KRANGE,NSIZE,MAXSIZ,RATIO1,RATIO2,IWM,IERR)
      IF (IERR .GE. 129) THEN

      BAREA1=PAREA /NFILLS
      BAREA1=BAREA1-MOD(BAREA1,IUNITX)
      IF (BAREA1 .EQ. 0) THEN
         CALL MESSGE (IWM,' TOO LARGE NSIZE FOR PRESENT NS.'//
     *                ' SUB. ESTID MIGHT TAKE OVER THE SORT')
         IERR=
C        ******
      END IF

      IAS =IRLUS/(KTYPE+ITYPE)-1
      NRU =(NSIZE-1)/IAS+1
      CALL OES ('IOGN',' ',LUSCR1,IRLUSX,'SCR','SEQ',2,3,NRU,
     *          KTYPEX,ITYPEX,IUNITX,IFAS1,IWM,IERR)
      IF (IERR .GE. 129) RETURN
C     ******
      CALL FIDO1B (IUNITX,NPSCRX,KTYPEX,ITYPEX,NSIZE,LUINP,LUSCR1,
     *             IFA,IFAS1,K,A,NRSCR,IWM,IBEGRC,IERR)
      IF (IERR .GE. 129) RETURN

      IF (LUINP .EQ. LUOUT) CLOSE (UNIT=LUINP)
      REWIND LUSCR1
      NBINS =BAREAX/BAREA1
      NFILLS=PAREA /BAREA1
      SZ1B  =BAREA1/(KTYPE+ITYPE)

      IF (SZ1B .GT. KRANGE) THEN
         SZ1B  =KRANGE
         NFILLS=1
         BAREA1=SZ1B*(KTYPE+ITYPE)
         IF (MOD(BAREA1,IUNITX) .NE. 0) BAREA1=BAREA1+IUNITX
         BAREA1=BAREA1-MOD(BAREA1,IUNITX)
      ELSE
   20    IF (NFILLS*SZ1B*NBINS .LT. KRANGE) THEN
            BAREA1=BAREA1-IUNITX
            SZ1B  =BAREA1/(KTYPE+ITYPE)
            NBINS =BAREAX/BAREA1
            NFILLS=PAREA /BAREA1
            GO TO 20
         ELSE
   30       IF (NFILLS*SZ1B*(NBINS-1) .GE. KRANGE) THEN
               NBINS=NBINS-1
               GO TO 30
            END IF
         END IF

      END IF
      ISZPAR=NFILLS*SZ1B

      NBUFF=36*IUNIT/(BAREA1+IUNIT)
      IF (NBUFF .LT. 2) NBUFF=2
      NBLKS=1
      NRU  = NBINS*NFILLS
      CALL OES ('IORA',' ',LUSCR2,BAREA1,'~','DIR',NBUFF,NBLKS,NRU,
     *          KTYPEX,ITYPEX,IUNITX,IFAS2,IWM,IERR)
      IF (IERR .GE. 129) RETURN
C     ******
      CALL FIDO4A (IUNITX,VAX,NPSCRX,BAREAX,KTYPEX,ITYPEX,ISZPAR,KDEC,
     *             KMOD,NSIZE,LUSCR1,IFAS1,LUSCR2,IFAS2,NFILLS,BAREA1,
     *             SZ1B,NBINS,NPARM,IWM,IERR,IDCD)
      IF (IERR .GE. 129) RETURN

      REWIND LUSCR1
      IF (VAX) THEN
         CLOSE (UNIT=LUSCR1)
         NBUFF=1
         NBLKS=24*IUNIT/(BAREA1+IUNIT)
         NRU  = NBINS*NFILLS
         CALL OES ('IORA',' ',LUSCR2,BAREA1,'OLD','SEQ',NBUFF,NBLKS,
     *             NRU,KTYPEX,ITYPEX,IUNITX,IFAS2,IWM,IERR)
         IF (IERR .GE. 129) RETURN
C        ******
      END IF

      CALL FIDO5A (IUNITX,LARGEX,VAX,NPSCRX,BAREAX,KTYPEX,ITYPEX,KDEC,
     *             KMOD,NOKEYS,ISZPAR,KLOW,NSIZE,LUSCR1,IFAS1,IRLUSX,
     *             LUSCR2,IFAS2,NFILLS,SZ1B,NBINS,NPARM,KSCR,ASCR,
     *             IWM,IERR,IDCD)
      IF (IERR .GE. 129) RETURN
C     ******
      REWIND LUSCR1
      CLOSE (UNIT=LUSCR2, STATUS='DELETE')
      NRU =(NSIZE-1)/(IRLUO/(KTYPE+ITYPE)-1)+1
      CALL OES ('IOGN',' ',LUOUT,IRLUO,STATO,ACCTYO,NBUO,NBLO,NRU,
     *          KTYPEX,ITYPEX,IUNITX,IFA,IWM,IERR)
      IF (IERR .GE. 129) RETURN
C     ******
      CALL FIDO2B (KANA,NOKEYS,IUNITX,VAX,NPSCRX,KTYPEX,ITYPEX,NSIZE,
     *             LUSCR1,LUOUT,IFAS1,IFA,IRLUO,K,A,IWM,ILASRC,IERR,
     *             IDCD)
      CLOSE (UNIT=LUSCR1)
      END
C-----------------------------------------------------------------------

      SUBROUTINE SORTST (KRANGE,NSIZE,MAXSIZ,RATIO1,RATIO2,IWM,IERR)
C-----------------------------------------------------------------------
C     Purpose: to choose between a bin sort method and an external
C              multi-way merge algorithm.

      REAL RATIO1,RATIO2,RATIO
      RATIO=FLOAT(KRANGE)/FLOAT(NSIZE)
      IF (RATIO .GT. RATIO1) THEN
         CALL MESSGE (IWM,' KRANGE/NSIZE .GT. RATIO1. SUBROUTINE'//
     *                ' ESTID MIGHT TAKE OVER THE SORT.')
         IERR=
C        ******
      END IF
      IF (KRANGE GT. ~SIZIZ cALL I&sGE txw. NSI;~.n~~,~G~~~~~'~~TI*2
     *                ' SUB. ESTID MIGHT TAKE OVER THE SORT.')
         IERR=
C        ******
      END IF
      END
C-----------------------------------------------------------------------
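The decision made in SORTST can be illustrated outside FORTRAN. The following Python sketch is not part of the published module: the function name `choose_sort` and the returned warning strings are hypothetical, the default ratios are the RATIO1=5. and RATIO2=2.5 values from the main routine's PARAMETER statement, and the exact form of the second test is an assumption, since that statement is not fully legible in the listing.

```python
def choose_sort(krange, nsize, maxsiz, ratio1=5.0, ratio2=2.5):
    """Return warnings suggesting that an external sort might be preferable.

    An empty list means the bin sort looks suitable for this key range.
    """
    warnings = []
    ratio = float(krange) / float(nsize)
    # First test, as in the listing: the key range is much larger than
    # the number of keys, so most scratch slots would stay empty.
    if ratio > ratio1:
        warnings.append("KRANGE/NSIZE > RATIO1")
    # Second test: ASSUMED form (the original statement is garbled).
    # The key range exceeds the in-memory capacity and is still sparse.
    if krange > maxsiz and ratio > ratio2:
        warnings.append("KRANGE > MAXSIZ and KRANGE/NSIZE > RATIO2")
    return warnings
```

A dense list such as `choose_sort(1000, 900, 5000)` yields no warnings, whereas a sparse one such as `choose_sort(6000, 1000, 5000)` trips both tests.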

      SUBROUTINE FIDO4A (IUNITX,VAX,NPSCRX,BAREAX,KTYPEX,ITYPEX,ISZPAR,
     *                   KDEC,KMOD,NSIZE,LUSCR1,IFAS1,LUSCR2,IFAS2,
     *                   NFILLS,BAREA1,SZ1B,NBINS,NPARM,IWM,IERR,IDCD)

C     Purpose: to scatter keys and items into bins in memory, and to
C              scatter bins from memory into a direct access file LUSCR2.
C
C     Abstract:
C     Unsorted keys and items are read sequentially from a temporary
C     file LUSCR1 and scattered into NBINS bins of total size not
C     exceeding BAREA. Each bin IBIN receives keys (and correspond-
C     ing items) in the range [(IBIN-1)*ISZPAR+1, IBIN*ISZPAR].
C     Whenever a bin fills, it is written to a direct access file
C     LUSCR2 at record (IBIN-1)*NFILLS+NPARM(IBIN)/SZ1B. At the end of
C     this process, all partly filled bins are written at records
C     (IBIN-1)*NFILLS+(NPARM(IBIN)-1)/SZ1B+1, respectively.

      IMPLICIT DOUBLE PRECISION (A)
      REAL RA,RBIN
      INTEGER BAREA,BAREAX,BAREA1,SZ1B,BAREAA,BAREAK
      LOGICAL VAX
      PARAMETER (IUNIT=128,
     *           KTYPE=1, ITYPE=2,
     *           IRLUS=IUNIT*NPSCR, IAS=IRLUS/(KTYPE+ITYPE)-1,
     *           BAREA=IUNIT*NPBA, ISZA=BAREA/ITYPE, ISZK=BAREA/KTYPE)
      DIMENSION RA(IRLUS),K(IAS),A(IAS),
     *          RBIN(BAREA),ASCR(ISZA),KSCR(ISZK),ICFILL(NPBA),
     *          IFAS1(*),IFAS2(*),NPARM(*),IDCD(*)
      EQUIVALENCE (RA(1),NKI),(RA(2),K(1)),(RA(KTYPE*IAS+2),A(1)),
     *            (RBIN,ASCR,KSCR)

      IF (IUNITX .NE. IUNIT) THEN
         CALL MESSGE (IWM,' CHECK IUNIT IN SUBROUTINE FIDO4A')
         IERR=129
C        ******
      END IF
      IF (NPSCRX .NE. NPSCR .OR. BAREAX .NE. BAREA) THEN
         CALL MESSGE (IWM,' CHECK NPSCR AND/OR NPBA IN SUB. FIDO4A')
         IERR=129
         RETURN
C        ******
      END IF
      IF (KTYPEX .NE. KTYPE .OR. ITYPEX .NE. ITYPE) THEN
         CALL MESSGE (IWM,' CHECK KTYPE AND/OR ITYPE IN SUB. FIDO4A')
         IERR=129
         RETURN
C        ******
      END IF
      IF (MOD(BAREA1,KTYPE) .NE. 0) THEN
         CALL MESSGE (IWM,' BAREA1 NOT A MULTIPLE OF KTYPE')
         IERR=129
C        ******
      END IF
      IF (ITYPE .GT. 0 .AND. MOD(BAREA1,ITYPE) .NE. 0) THEN
         CALL MESSGE (IWM,' BAREA1 NOT A MULTIPLE OF ITYPE')
         IERR=129
         RETURN
C        ******
      END IF

      IF (ITYPE .GT. 0) BAREAA = BAREA1/ITYPE
      BAREAK = BAREA1/KTYPE

      KBDISP = (SZ1B*ITYPE)/KTYPE
      IF (MOD(SZ1B*ITYPE,KTYPE) .NE. 0) KBDISP=KBDISP+1

      DO 10 IBIN=1,NBINS
         NPARM(IBIN)=0
   10 ICFILL(IBIN)=0

      ISIZE=0
      IRINP=0
   20 IRINP=IRINP+1
      CALL RRA (LUSCR1,IRINP,IFAS1,RA)
      IF (KMOD .EQ. 1) THEN
         IF (VAX) THEN
            CALL VAX_KMOD1 (NKI,K)
         ELSE
            CALL KMOD1 (NKI,K)
         END IF
      END IF
      IF (KMOD .EQ. 3) THEN
         IF (VAX) THEN
            CALL VAX_KMOD3 (NKI,K)
         ELSE
            CALL KMOD3 (NKI,K)
         END IF
      END IF
      DO 30 I=1,NKI
         INDEX=K(I)
         NINDEX=INDEX
         IF (VAX) THEN
            IF (KDEC .EQ. 1) CALL VAX_KDEC1 (INDEX,IDCD,NINDEX)
            IF (KDEC .EQ. 2) CALL VAX_KDEC2 (INDEX,IDCD,NINDEX)
            IF (KDEC .EQ. 3) CALL VAX_KDEC3 (INDEX,IDCD,NINDEX)
            IF (KDEC .EQ. 4) CALL VAX_KDEC4 (INDEX,IDCD,NINDEX)
         ELSE
            IF (KDEC .EQ. 1) CALL KDEC1 (INDEX,IDCD,NINDEX)
            IF (KDEC .EQ. 2) CALL KDEC2 (INDEX,IDCD,NINDEX)
            IF (KDEC .EQ. 3) CALL KDEC3 (INDEX,IDCD,NINDEX)
            IF (KDEC .EQ. 4) CALL KDEC4 (INDEX,IDCD,NINDEX)
         END IF

         NINDEX=NINDEX-KLOWM1
         IBIN  =(NINDEX-1)/ISZPAR+1
         ICFILL(IBIN)=ICFILL(IBIN)+1
         IRES  = ICFILL(IBIN)
         ISCATA=(IBIN-1)*BAREAA+IRES
         ASCR(ISCATA)=A(I)
         ISCATK=(IBIN-1)*BAREAK+IRES+KBDISP
         KSCR(ISCATK)=INDEX
         IF (IRES .EQ. SZ1B) THEN
            NPARM(IBIN) =NPARM(IBIN)+SZ1B
            IDXL =(IBIN-1)*BAREA1+1
            IRBIN=(IBIN-1)*NFILLS+NPARM(IBIN)/SZ1B
            CALL WRA (LUSCR2,IRBIN,IFAS2,RBIN(IDXL))
            ICFILL(IBIN)=0
         END IF
   30 CONTINUE
      ISIZE=ISIZE+NKI
      IF (ISIZE .LT. NSIZE) GO TO 20
      DO 40 IBIN=1,NBINS
         IF (ICFILL(IBIN) .GT. 0) THEN
            NPARM(IBIN)=NPARM(IBIN)+ICFILL(IBIN)
            IDXL =(IBIN-1)*BAREA1+1
            IRBIN=(IBIN-1)*NFILLS+(NPARM(IBIN)-1)/SZ1B+1
            CALL WRA (LUSCR2,IRBIN,IFAS2,RBIN(IDXL))
         END IF
   40 CONTINUE
      RETURN
      END
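The record arithmetic of this scatter phase may be easier to follow in a compact form. The Python sketch below is an illustration only, not the authors' code: the function `scatter_to_bins` is hypothetical, a dictionary stands in for the direct access file LUSCR2, keys and items travel together as tuples instead of occupying separate key and item areas of a bin, and no key decoding (KDEC) or key modification (KMOD) is applied.

```python
def scatter_to_bins(pairs, klow, iszpar, sz1b, nfills):
    """Scatter (key, item) pairs into bins; flush each bin when it holds sz1b pairs.

    Returns (records, nparm): records maps a 1-based record number of the
    simulated direct access file to the list of pairs written there, and
    nparm maps a bin number to the number of pairs written for that bin.
    """
    klowm1 = klow - 1
    buffers = {}   # bin number -> pairs currently held in memory
    nparm = {}     # bin number -> pairs already written out
    records = {}   # stands in for the direct access file
    for key, item in pairs:
        nindex = key - klowm1                 # 1-based position in key range
        ibin = (nindex - 1) // iszpar + 1     # bin receiving this key
        buf = buffers.setdefault(ibin, [])
        buf.append((key, item))
        if len(buf) == sz1b:                  # bin full: write it out
            nparm[ibin] = nparm.get(ibin, 0) + sz1b
            irbin = (ibin - 1) * nfills + nparm[ibin] // sz1b
            records[irbin] = buf
            buffers[ibin] = []
    for ibin, buf in buffers.items():         # flush partly filled bins
        if buf:
            nparm[ibin] = nparm.get(ibin, 0) + len(buf)
            irbin = (ibin - 1) * nfills + (nparm[ibin] - 1) // sz1b + 1
            records[irbin] = buf
    return records, nparm
```

With ISZPAR=NFILLS*SZ1B, as set in the main routine, each bin owns NFILLS consecutive records, so the records of bin IBIN can later be read back as one contiguous run.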


C-----------------------------------------------------------------------
      SUBROUTINE FIDO5A (IUNITX,LARGE,VAX,NPSCRX,BAREAX,KTYPEX,ITYPEX,
     *                   KDEC,KMOD,NOKEYS,ISZPAR,KLOW,NSIZE,LUSCR1,
     *                   IFAS1,IRLUSX,LUSCR2,IFAS2,NFILLS,SZ1B,NBINS,
     *                   NPARM,KSCR,ASCR,IWM,IERR,IDCD)

C     Purpose: to scatter each bin's contents into scratch arrays in
C              memory, followed by a gather operation to obtain a
C              sorted list which is written in logical unit LUSCR1.
C
C     Abstract:
C     Each partition produced for each bin in subroutine FIDO4A is
C     read sequentially from logical unit LUSCR2 into a scratch array
C     containing both keys and items, and sorted by key using
C     distribution (scattering) into arrays KSCR and ASCR,
C     respectively, where KSCR is filled initially with LARGE. For
C     I=1 to ISZPAR, all keys (and corresponding items) for which
C     KSCR(I) is different from LARGE are collected (gathered) into
C     arrays KOUT and AOUT holding a sorted list which is temporarily
C     written in logical unit LUSCR1.

      IMPLICIT DOUBLE PRECISION (A)
      REAL RA,RBIN
      INTEGER BAREA,BAREAX,SZ1B
      LOGICAL VAX
      PARAMETER (IUNIT=128,
     *           NPSCR=10, NPBA-16lr.
     *           KTYPE=1, ITYPE=2,
     *           IRLUS=IUNIT*NPSCR, IAS=IRLUS/(KTYPE+ITYPE)-1,
     *           BAREA=IUNIT*NPBA, ISZA=BAREA/ITYPE, ISZK=BAREA/KTYPE)
      DIMENSION RA(IRLUS),KOUT(IAS),AOUT(IAS),
     *          RBIN(BAREA),ABIN(ISZA),KBIN(ISZK),
     *          IFAS1(*),IFAS2(*),NPARM(*),IDCD(*),KSCR(*),ASCR(*)
      EQUIVALENCE (RA(1),NKO),(RA(2),KOUT(1)),(RA(KTYPE*IAS+2),AOUT(1)),
     *            (RBIN,ABIN,KBIN)

      IF (IUNITX .NE. IUNIT) THEN
         CALL MESSGE (IWM,' CHECK IUNIT IN SUBROUTINE FIDO5A')
         IERR=129
      END IF
      IF (NPSCRX .NE. NPSCR .OR. BAREAX .NE. BAREA) THEN
         CALL MESSGE (IWM,' CHECK NPSCR AND/OR NPBA IN SUB. FIDO5A')
         IERR=129
         RETURN
C        ******
      END IF
      IF (KTYPEX .NE. KTYPE .OR. ITYPEX .NE. ITYPE) THEN
         CALL MESSGE (IWM,' CHECK KTYPE AND/OR ITYPE IN SUB. FIDO5A')
         IERR=129
C        ******
      END IF

      KBDISP= (SZ1B*ITYPE)/KTYPE
      IF (MOD(SZ1B*ITYPE,KTYPE) .NE. 0) KBDISP=KBDISP+1
      KLOWM1= KLOW-1
      NKL   = IRLUS/(KTYPE+ITYPE)-1
      NKO   = NSIZE
      IF (NSIZE .GT. NKL) NKO=NKL
      ISIZE = 0
      IROUT = 0
      IOUT  = 0
      DO 60 IBIN=1,NBINS
         IF (NPARM(IBIN) .GT. 0) THEN
            INDEXB=(IBIN-1)*ISZPAR+KLOWM1
            DO 10 I=1,ISZPAR
   10       KSCR(I)=LARGE
            NKPAR =NPARM(IBIN)
            MFILLS=(NKPAR-1)/SZ1B+1
            IRSCR =(IBIN-1)*NFILLS
            DO 30 IFILL=1,MFILLS
               IRSCR=IRSCR+1
               NKSCR=SZ1B
               IF (IFILL .EQ. MFILLS) NKSCR=NKPAR-(IFILL-1)*SZ1B
               CALL RRA (LUSCR2,IRSCR,IFAS2,RBIN)
               DO 20 J=1,NKSCR
                  INDEX=KBIN(J+KBDISP)
                  NINDEX=INDEX
                  IF (VAX) THEN
                     IF (KDEC .EQ. 1) CALL VAX_KDEC1 (INDEX,IDCD,NINDEX)
                     IF (KDEC .EQ. 2) CALL VAX_KDEC2 (INDEX,IDCD,NINDEX)
                     IF (KDEC .EQ. 3) CALL VAX_KDEC3 (INDEX,IDCD,NINDEX)
                     IF (KDEC .EQ. 4) CALL VAX_KDEC4 (INDEX,IDCD,NINDEX)
                  ELSE
                     IF (KDEC .EQ. 1) CALL KDEC1 (INDEX,IDCD,NINDEX)
                     IF (KDEC .EQ. 2) CALL KDEC2 (INDEX,IDCD,NINDEX)
                     IF (KDEC .EQ. 3) CALL KDEC3 (INDEX,IDCD,NINDEX)
                     IF (KDEC .EQ. 4) CALL KDEC4 (INDEX,IDCD,NINDEX)
                  END IF
                  NINDEX=NINDEX-INDEXB
                  KSCR(NINDEX)=INDEX
                  ASCR(NINDEX)=ABIN(J)
   20          CONTINUE
   30       CONTINUE

            DO 40 I=1,ISZPAR
               KEY=KSCR(I)
               IF (KEY .EQ. LARGE) GO TO 40
               IOUT = IOUT + 1
               KOUT(IOUT)=KEY
               AOUT(IOUT)=ASCR(I)
               IF (IOUT .GE. NKO) THEN
                  IROUT=IROUT+1
                  ISIZE=ISIZE+NKO
                  IF (KMOD .EQ. 2) THEN
                     IF (VAX) THEN
                        CALL VAX_KMOD2 (NKO,KOUT)
                     ELSE
                        CALL KMOD2 (NKO,KOUT)
                     END IF
                  END IF
                  CALL WRA (LUSCR1,IROUT,IFAS1,RA)
                  IF (ISIZE .EQ. NSIZE) RETURN
C                 ******
                  NKO = NKL
                  IF ((ISIZE+NKL) .GT. NSIZE) NKO = NSIZE-ISIZE
                  IOUT = 0
               END IF
   40       CONTINUE
         END IF
         mm
         NREADS=NFILLS-MFILLS
         IF (NREADS .GT. 0) THEN
            NREC =IFAS2(4)
            r++zzSZS
   50       READ (LlJ&R2!
         END IF
   60 CONTINUE
      END
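The scatter-then-gather step applied to one bin can be summarized in a few lines. The Python sketch below is illustrative and makes several assumptions: the function `gather_bin` is hypothetical, a bin arrives as a list of (key, item) tuples rather than as records read from LUSCR2, and keys are used undecoded. The offset corresponds to INDEXB=(IBIN-1)*ISZPAR+KLOWM1, which maps the keys of bin IBIN onto slots 1 to ISZPAR.

```python
LARGE = 2**30 + (2**30 - 1)   # sentinel marking an empty slot, as in the listing


def gather_bin(bin_pairs, ibin, iszpar, klow):
    """Sort the (key, item) pairs of one bin by direct placement.

    Each key is scattered into the slot given by its offset inside the bin;
    a sequential sweep then gathers the occupied slots in ascending order.
    Keys must be distinct: a repeated key would overwrite an earlier entry.
    """
    indexb = (ibin - 1) * iszpar + (klow - 1)
    kscr = [LARGE] * iszpar    # key slots, initially empty
    ascr = [None] * iszpar     # item slots
    for key, item in bin_pairs:
        slot = key - indexb    # 1-based slot inside this bin
        kscr[slot - 1] = key
        ascr[slot - 1] = item
    # Gather: keep only slots that received a key.
    return [(k, a) for k, a in zip(kscr, ascr) if k != LARGE]
```

No comparisons between keys are made; the cost is one pass over the pairs plus one pass over the ISZPAR slots, which is why the method pays off only when the key range is not much larger than the number of keys.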

C-----------------------------------------------------------------------
      SUBROUTINE FIDO1B (IUNITX,NPAGEX,KTYPEX,ITYPEX,NSIZE,LUINP,LUSCR,
     *                   IFA,IFAS,KINP,AINP,NRSCR,IWM,IBEGRC,IERR)

C     Purpose: to transfer the contents of an input file into another
C              file having suitable attributes for a particular purpose.
C
C     Abstract:
C     A given list of keys and items is assumed to be on an input
C     file LUINP. NSIZE elements are read starting at record IBEGRC
C     and written in a scratch file LUSCR having attributes defined
C     through a prior call to subroutine OES.

      IMPLICIT DOUBLE PRECISION (A)
      REAL RSCR
      PARAMETER (IUNIT=128,
     *           NPSCR=10,
     *           KTYPE=1, ITYPE=2,
     *           IRLUS=IUNIT*NPSCR, IAS=IRLUS/(KTYPE+ITYPE)-1)
      DIMENSION KSCR(IAS),ASCR(IAS),RSCR(IRLUS),
     *          IFA(*),IFAS(*),KINP(*),AINP(*)
      EQUIVALENCE (RSCR(1),NKS),(RSCR(2),KSCR(1)),
     *            (RSCR(KTYPE*IAS+2),ASCR(1))

      IF (IUNITX .NE. IUNIT) THEN
         CALL MESSGE (IWM,' CHECK IUNIT IN SUBROUTINE FIDO1B')
         IERR=129
C        ******
      END IF
      IF (NPAGEX .NE. NPSCR) THEN
         CALL MESSGE (IWM,' CHECK NPSCR IN SUBROUTINE FIDO1B')
         IERR=129
      END IF
      IF (KTYPEX .NE. KTYPE .OR. ITYPEX .NE. ITYPE) THEN
         CALL MESSGE (IWM,' CHECK KTYPE AND/OR ITYPE IN SUB. FIDO1B')
         IERR=129
C        ******
      END IF

      KS = IFMl51 - NSIZE
      IF (NSIZE .GT. ND) NKS=ND

      ISCR  = 0
      IRSCR = 0
      IRINP = IBEGRC-1
      ISIZE = 0

   10 IF (ISCR .EQ. NKS) THEN
         IRSCR = IRSCR + 1
         ISIZE = ISIZE + NKS
         CALL WRA (LUSCR,IRSCR,IFAS,RSCR)
         IF (ISIZE .EQ. NSIZE) GO TO 20
         IF (ISIZE+NKS .GT. NSIZE) NKS=NSIZE-ISIZE
         ISCR = 0
      END IF

      IF (IINP .EQ. NKI) THEN
         IRINP = IRINP + 1
         CALL RESID (0,LUINP,IRINP,IFA,NKI,KINP,AINP)
         IINP = 0
      END IF

      ISCR = ISCR + 1
      IINP = IINP + 1

      KSCR(ISCR)=KINP(IINP)
      ASCR(ISCR)=AINP(IINP)
      GO TO 10

   20 NRSCR =IRSCR
      IBEGRC=IRINP+1
      RETURN
      END
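The copy loop above re-blocks input records of varying length into scratch records of fixed length. A minimal Python sketch of that idea follows; it is not the authors' routine. The function `reblock` and its argument names are hypothetical, records are plain Python lists, and the record-number bookkeeping (IBEGRC, NRSCR) is left out.

```python
def reblock(input_records, nsize, nks_max):
    """Copy the first nsize elements of variable-length input records
    into output records holding nks_max elements each.

    Only the last output record may be shorter than nks_max.
    """
    out = []      # completed fixed-length records
    cur = []      # record currently being filled
    taken = 0     # elements transferred so far
    for rec in input_records:
        for element in rec:
            if taken == nsize:
                break
            cur.append(element)
            taken += 1
            if len(cur) == nks_max:   # record full: write it and start anew
                out.append(cur)
                cur = []
        if taken == nsize:
            break
    if cur:                           # final, possibly short, record
        out.append(cur)
    return out
```

For example, `reblock([[1, 2, 3], [4], [5, 6, 7, 8]], 7, 3)` regroups seven elements into records of three, three and one element.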

C-----------------------------------------------------------------------
      SUBROUTINE FIDO2B (KANA,NOKEYS,IUNITX,VAX,NPAGEX,KTYPEX,ITYPEX,
     *                   NSIZE,LUSCR,LUOUT,IFAS,IFA,IRLUO,KOUT,AOUT,
     *                   IWM,ILASRC,IERR,IDCD)

C     Purpose: to transfer a list of keys and items from logical unit
C              LUSCR to another logical unit LUOUT with freely chosen
C              file attributes.
C
C     Abstract:
C     A given list of keys and items is assumed to reside in
C     successive records ISCR of a file (assigned to logical unit
C     LUSCR) having fixed length records of size distinctly
C     determined by program parameters NPSCR and IRLUS. An output
C     list is delivered in successive records IROUT of logical unit
C     LUOUT. The number of items NKO per each output record is equal
C     to NKL, NKL=IRLUO/(KTYPE+ITYPE)-1 (or NKL=IRLUO/ITYPE-1 if no
C     keys are wanted in LUOUT), unless KANA=n .NE. 0, in which case
C     NKO .LE. NKL may vary from record to record as dictated by
C     subroutine KANAn.

      IMPLICIT DOUBLE PRECISION (A)
      REAL RSCR
      LOGICAL VAX
      PARAMETER (IUNIT=128,
     *           NPSCR=10,
     *           KTYPE=1, ITYPE=2,
     *           IRLUS=IUNIT*NPSCR, IAS=IRLUS/(KTYPE+ITYPE)-1)
      DIMENSION KSCR(IAS),ASCR(IAS),
     *          RSCR(IRLUS),
     *          IFAS(*),IFA(*),KOUT(*),AOUT(*),IDCD(*)
      EQUIVALENCE (RSCR(1),NKS),(RSCR(2),KSCR(1)),
     *            (RSCR(KTYPE*IAS+2),ASCR(1))

      IF (IUNITX .NE. IUNIT) THEN
         CALL MESSGE (IWM,' CHECK IUNIT IN SUBROUTINE FIDO2B')
         IERR=129
         RETURN
C        ******
      END IF
      IF (NPAGEX .NE. NPSCR) THEN
         CALL MESSGE (IWM,' CHECK NPSCR IN SUBROUTINE FIDO2B')
         IERR=129
         RETURN
C        ******
      END IF
      IF (KTYPEX .NE. KTYPE .OR. ITYPEX .NE. ITYPE) THEN
         CALL MESSGE (IWM,' CHECK KTYPE AND/OR ITYPE IN SUB. FIDO2B')
         IERR=129
         RETURN
C        ******
      END IF
      IF (KANA .LT. 0 .OR. KANA .GT. 4) THEN
         CALL MESSGE (IWM,' KANA OUTSIDE RANGE IN SUBROUTINE FIDO2B')
         IERR=129
C        ******
      END IF

      NKL  =IRLUO/(KTYPE+ITYPE)-1
      IF (NOKEYS .NE. 0) NKL=IRLUO/ITYPE-1
      NKO  =NSIZE
      IF (NSIZE .GT. NKL) NKO=NKL
      IOLD =0
      IOUT =0
      IROUT=ILASRC
      NKS  =0
      ISCR =0
      IRSCR=0
      ISIZE=0

   10 IF (IOUT .GE. NKO) THEN
         IROUT = IROUT+1
         ISIZE = ISIZE+NKO
         IF (NOKEYS .EQ. 0)
     *      CALL WESID (0,LUOUT,IROUT,IFA,NKO,KOUT,AOUT)
         IF (NOKEYS .NE. 0)
     *      CALL WESID (1,LUOUT,IROUT,IFA,NKO,KOUT,AOUT)
         IF (ISIZE .EQ. NSIZE) GO TO 20
         IF (IOUT .GT. NKO) ISCR = ISCR - 1
         NKO = NKL
         IF ((ISIZE + NKL) .GT. NSIZE) NKO=NSIZE-ISIZE
         IOUT = 0
      END IF

         CALL RRA (LUSCR,IRSCR,IFAS,RSCR)
         ISCR = 0
      END IF

      IOUT = IOUT + 1
      ISCR = ISCR + 1
      KEY  = KSCR(ISCR)

      IF (VAX) THEN
         IF (KANA .EQ. 1) CALL VAX_KANA1 (KEY,IOUT,NKL,IDCD,IOLD,NKO)
         IF (KANA .EQ. 2) CALL VAX_KANA2 (KEY,IOUT,NKL,IOLD,NKO)
         IF (KANA .EQ. 3) CALL VAX_KANA3 (KEY,IOUT,NKL,IDCD,IOLD,NKO)
         IF (KANA .EQ. 4) CALL VAX_KANA4 (KEY,IOUT,NKL,IDCD,IOLD,NKO)
      ELSE
         IF (KANA .EQ. 1) CALL KANA1 (KEY,IOUT,NKL,IDCD,IOLD,NKO)
         IF (KANA .EQ. 2) CALL KANA2 (KEY,IOUT,NKL,IOLD,NKO)
         IF (KANA .EQ. 3) CALL KANA3 (KEY,IOUT,NKL,IDCD,IOLD,NKO)
         IF (KANA .EQ. 4) CALL KANA4 (KEY,IOUT,NKL,IDCD,IOLD,NKO)
      END IF
      IF (NOKEYS .EQ. 0) KOUT(IOUT)=KEY
      AOUT(IOUT)=ASCR(ISCR)
      GO TO 10

   20 ILASRC=IROUT