IEEE TRANSACTIONS ON COMPUTERS, DECEMBER 1971
Correspondence
Comments on "A Nonlinear Mapping for Data Structure Analysis"
JOSEPH B. KRUSKAL
Abstract-A recent paper by Sammon¹ describes a useful technique for analyzing data and helping find clusters. It consists of a nonlinear mapping of the original data into two dimensions (for visual inspection) which is superior for many purposes to use of the first two principal components. The present correspondence points out that essentially the same mapping can be achieved by a widely used, easily obtainable computer program for multidimensional scaling called M-D-SCAL.
In a recent paper Sammon¹ described a very interesting method for nonlinear mapping of multivariate data into two dimensions in order to permit the discovery of various data structures, particularly clusters. It is now possible to perform a procedure virtually identical to Sammon's by use of a modern, easily obtainable, widely used program for multidimensional scaling. (Incidentally, an application to medical data of very similar ideas has been published by Thompson and Woodbury [1].)
Briefly, in Sammon's notation $d_{ij}^*$ is the distance between the given data points $i$ and $j$, and $d_{ij}$ is the distance between the corresponding points $i$ and $j$ in two dimensions. To find these points in two dimensions, he minimizes the expression
$$E = \frac{1}{\sum_{i<j} d_{ij}^*} \sum_{i<j} \frac{[d_{ij}^* - d_{ij}]^2}{d_{ij}^*}$$
where each summation runs over $i < j$, $i, j = 1, \cdots, N$. (Notice that the $d_{ij}^*$ are constant, and only the $d_{ij}$ vary as the points in two dimensions are moved around.) In the program M-D-SCAL (version 5M), which has been distributed on request to dozens of computer centers and whose earlier versions had been requested by hundreds of computer centers all over the world, one of the options permits this expression to be minimized:
$$\left[\frac{\text{Sammon's expression}}{\sum_{i<j} d_{ij}^2 / d_{ij}^*}\right]^{1/2}$$
In practice, the denominator under Sammon's expression is so nearly constant over the region of interest that it hardly changes the resulting configuration. To use M-D-SCAL in this way, it is necessary to use the control phrases WFUNCTION, SFORM1, REGRESSION = POLYNOMIAL = 1, REGRESSION = NOCONSTANT, and to add the Fortran function subroutine
      FUNCTION WTRAN(ARGUMT)
      WTRAN = 1.0/ARGUMT
      END
to the program deck.
A complete description of this particular type of multidimensional scaling was first published in Kruskal [2], [3].²
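For readers who wish to experiment with the mapping discussed above, the following sketch minimizes Sammon's error numerically. It is an illustrative reconstruction in Python, not M-D-SCAL itself; the data set, the optimizer, and all names are hypothetical. The $1/d_{ij}^*$ factor inside the sum plays the role of the weight function supplied to M-D-SCAL through WTRAN.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import pdist

def sammon_stress(flat_y, dstar, c):
    # Sammon's error: (1/c) * sum over i<j of (d*_ij - d_ij)^2 / d*_ij,
    # where c = sum of the d*_ij and d_ij are distances in the 2-D configuration.
    d = pdist(flat_y.reshape(-1, 2))
    return np.sum((dstar - d) ** 2 / dstar) / c

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 5))        # hypothetical high-dimensional data
dstar = pdist(x)                    # the given interpoint distances d*_ij
c = dstar.sum()                     # normalizing constant (fixed)

y0 = rng.normal(size=(20, 2)).ravel()          # random 2-D starting configuration
res = minimize(sammon_stress, y0, args=(dstar, c), method="L-BFGS-B")
assert res.fun < sammon_stress(y0, dstar, c)   # the error has decreased
```

A general-purpose optimizer stands in here for the gradient iteration used by the original programs; any descent method that lowers the error yields a comparable configuration.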
REFERENCES
[1] H. Thompson, Jr., and M. Woodbury, "Clinical data representation in multidimensional space," Comput. Biomed. Res., vol. 3, pp. 58-73, 1970.
[2] J. Kruskal, "Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis," Psychometrika, vol. 29, pp. 1-27, Mar. 1964.
[3] J. Kruskal, "Nonmetric multidimensional scaling: A numerical method," Psychometrika, vol. 29, pp. 115-129, June 1964.
Manuscript received March 22, 1971; revised April 2, 1971.
The author is with Bell Telephone Laboratories, Inc., Murray Hill, N. J. 07974.
¹ J. W. Sammon, Jr., IEEE Trans. Comput., vol. C-18, pp. 401-409, May 1969.
² The computer program together with a carefully written instruction manual may be obtained from the Coordinator of Computer Applications, Marketing Science Institute, 1033 Massachusetts Ave., Cambridge, Mass. 02138 (attention: Miss W. H. Tsiang) for approximately $10. An annotated selective bibliography on multidimensional scaling, with references to 3 books and 16 articles (excluding application articles), may be obtained from the author.
Comments on "A New Algorithm for Generating Prime Implicants"
S. R. DAS
One of the major areas in switching theory research has been concerned with obtaining suitable algorithms for the minimization of Boolean functions in connection with the general problem of their economic realization. A solution of the minimization problem, in general, involves consideration of two distinct phases. In the first phase all the prime implicants of the function are found, while in the second phase, from this set of all the prime implicants, a minimal subset (according to some criterion of minimality) of prime implicants is selected such that their disjunction is equivalent to the function and from which none of the prime implicants can be dropped without sacrificing equivalence. Many different algorithms exist for solving both the first and the second phase of this minimization problem. In a recent paper,¹ Slagle et al. describe a new algorithm for the generation of all the prime implicants of a Boolean function. As claimed by the authors, this algorithm is different from those previously given in the literature. The algorithm is efficient, does not generate the same prime implicant more than once (though the algorithm sometimes generates some nonprime implicants), and does not need a large capacity of memory for implementation on a digital computer. The algorithm works equally well with either the conjunctive or the disjunctive (both canonical and noncanonical) form of the function. In the conjunctive case, the algorithm is applied once to get all the prime implicants, while in the disjunctive case, the algorithm is first applied to get all the prime implicates, and is next applied to the set of all these prime implicates to get all the prime implicants of the function.
The function being specified algebraically in a sum-of-products or in a product-of-sums form, the basic approach of the authors consists in first finding the frequency ordering of the different literals appearing in the product or sum terms, respectively, and next carrying out a process of expansion of the function around the different variables in one or more levels through a series of trees, called semantic trees. A semantic tree is defined by the authors as a tree to each node of which is attached either a circle or a cross, called a terminating node, or a set of clauses, called a nonterminating node, and to each branch of which is attached a literal. From the final semantic tree, the prime implicates or prime implicants are finally found by collecting the sets of all the literals at the branches on the paths from the top down to all the circled nodes. Slagle et al. also showed how the same algorithm can be used to find the minimal forms of Boolean functions as well. On carefully going through the paper and studying the algorithm developed by the authors, I would like to make the following comments.
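The flavor of this first phase can be conveyed with a short sketch. The code below is not the semantic-tree algorithm of Slagle et al.; it is a minimal Quine-McCluskey-style generator, offered purely for illustration, that repeatedly merges implicants differing in a single variable until no further merges are possible.

```python
from itertools import combinations

def prime_implicants(minterms):
    # An implicant is a (value, mask) pair over the input bits; a 1-bit in
    # mask marks a variable eliminated ("don't care") by a previous merge.
    implicants = {(m, 0) for m in minterms}
    primes = set()
    while implicants:
        merged, used = set(), set()
        for a, b in combinations(implicants, 2):
            # Two implicants combine when they share the same mask and
            # their fixed parts differ in exactly one bit position.
            if a[1] == b[1] and bin(a[0] ^ b[0]).count("1") == 1:
                merged.add((a[0] & b[0], a[1] | (a[0] ^ b[0])))
                used.update((a, b))
        primes |= implicants - used  # implicants that merge no further are prime
        implicants = merged
    return primes

# f(x1, x0) true on every minterm: one prime implicant covering everything.
assert prime_implicants({0, 1, 2, 3}) == {(0, 3)}
```

Unlike the semantic-tree method, this sketch starts from a canonical minterm list; it merely shows what "generating all the prime implicants" means in the two-phase scheme described above.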
It may be emphasized first that the algorithm described by the authors of the aforementioned paper for the generation of prime implicants is very straightforward and simple, and represents a worthwhile contribution in the field, though the basic idea utilized in the development of the algorithm is not altogether new and has previously been used by other authors [1]-[3] in finding the prime implicants of Boolean functions in a more or less similar manner. Scheinman [1] developed a tabular technique for the generation of prime implicants of a Boolean function based on successive expansion around the variables starting from the minterm-type expression. In the method of Scheinman all the prime implicants of the function are not generated
Manuscript received February 16, 1971; revised May 5, 1971. This work was supported in part by the National Research Council of Canada under Grants A-875 and A-1690.
The author is with the Department of Electrical Engineering, University of Ottawa, Ottawa, Ont., Canada.
¹ J. R. Slagle, C.-L. Chang, and R. C. T. Lee, IEEE Trans. Comput., vol. C-19, pp. 304-310, Apr. 1970.
sometimes, though the prime implicants necessary for finding the minimal solutions are always obtained. Further, this method, like that of Slagle et al., is directly applicable for finding the minimal forms of the function. An algebraic approach based primarily on successive expansion to generate all the prime implicants of a Boolean function utilizing the maxterm-type expression was first proposed by Nelson [2]. This basic idea of Nelson was subsequently utilized by Das and Choudhury [3] in developing a tabular method for a more efficient generation of all the prime implicants of a Boolean function starting from the maxterm-type expression represented in decimal mode. The semantic tree approach of Slagle et al. is quite similar to the method of Das and Choudhury, except that, in the method of Das and Choudhury, the expansion, unlike that by Slagle et al., is carried out successively about all the variables starting from the highest weighted one in different levels. The authors also extended their tabular method for generating prime implicants of functions having many unspecified fundamental products, utilizing a very novel idea suggested in a paper by McCluskey [4]. In developing their algorithm the authors of the aforesaid paper failed to mention these existing and closely related techniques. The idea of the present communication is thus to draw the readers' attention to the existence of these related papers.
REFERENCES
[1] A. H. Scheinman, "A method for simplifying Boolean functions," Bell Syst. Tech. J., vol. 41, pp. 1337-1346, July 1962.
[2] R. J. Nelson, "Simplest normal truth functions," J. Symbolic Logic, vol. 20, pp. 105-108, June 1955.
[3] S. R. Das and A. K. Choudhury, "Maxterm type expressions of switching functions and their prime implicants," IEEE Trans. Electron. Comput., vol. EC-14, pp. 920-923, Dec. 1965.
[4] E. J. McCluskey, Jr., "Minimal sums for Boolean functions having many unspecified fundamental products," AIEE Trans. (Commun. Electron.), vol. 81, pp. 387-392, Nov. 1962.
Comments on "An Algorithm for Finding Intrinsic Dimensionality of Data"
G. V. TRUNK
In the above paper,¹ Fukunaga and Olsen present an alternative method of estimating the intrinsic dimensionality of data. Their proposed algorithm differs from others in that it relies heavily on operator interaction and provides a method of specifying variable local regions. The authors state: "This variability is critical as the practical problem of determining dimensionality depends on the size and number of samples in the local regions." This is illustrated in their summary Table II (B), in which, for local region sizes containing five and ten samples, the indicated dimensionalities are one and three, respectively, when using the 1 percent eigenvalue criterion; and one and two, respectively, when using the 10 percent criterion. While the authors may have a decision rule to select the correct answer from the summary table, I did not see it in their paper; and without such a rule, I do not believe the problem has been solved satisfactorily.
While the size of the local region is critical for Fukunaga and Olsen, it is not nearly as important for the statistical method [1]. In order to demonstrate this, consider the three Gaussian examples presented by Fukunaga and Olsen. One hundred cases of each example, an example consisting of 50 20-dimensional vectors, were analyzed by the statistical method; the results are shown in Table I: out of 300 cases, only two incorrect answers were obtained. For all these cases, the local region for each point was defined by its five nearest neighbors. The statistical method is also very fast: the running times for examples 1, 2, and 3 were 2.7, 2.9, and 3.1 s, respectively, on a CDC 3800 computer.²
Manuscript received May 4, 1971.
The author is with the Radar Division, Naval Research Laboratory, Washington, D. C. 20390.
¹ K. Fukunaga and D. R. Olsen, IEEE Trans. Comput., vol. C-20, pp. 176-183, Feb. 1971.
² While the running times used by Fukunaga and Olsen were 76, 120, and 140 s, no valid comparison can yet be made since their computer was not identified.
TABLE I
ESTIMATED DIMENSIONALITIES OF THE 100 CASES CONSIDERED FOR EACH GAUSSIAN EXAMPLE

Gaussian Example in         Estimated Dimensionality
Fukunaga and Olsen        1      2      3      4
        1               100      0      0      0
        2                 0    100      0      0
        3                 0      0     98      2
The authors state that previous investigators had not considered the noise problem and then attack the problem by using a large number of samples. They estimate the eigenvalues very accurately and note the small difference between the eigenvalues due to the parameters and those due to the noise. However, in 1968, a "filtering" method [2], which does not require a large number of samples, was proposed as a solution to the noise problem. This method defines a pseudo signal-to-noise ratio R:
$$R = \frac{D}{\sigma[2(K-N)]^{1/2}} \qquad (1)$$
where K is the dimensionality of the vector space, N is the intrinsic dimensionality of the data, σ is the standard deviation of the noise on each basis vector, and D is the average distance from a point to its (N+1)-nearest neighbor. It was shown that when R > 12, the noise does not affect the estimation of dimensionality. When R < 12, the data can be filtered in the following manner. The original M points are randomly divided into L subgroups, each subgroup containing M/L points. This filtering has increased the signal-to-noise ratio in each subgroup since the average distance D has been increased. The algorithm for finding the optimal number of subgroups L and the manner of combining the results of the various subgroups is presented in [2]. Whether or not the filtering method can be used in conjunction with Fukunaga and Olsen's method is not known, since the latter requires a minimum number of points in the local region in order to estimate the covariance matrix.
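The pseudo SNR and the effect of subgroup filtering can be sketched as follows. This is a reconstruction in Python, taking R = D/(σ[2(K−N)]^{1/2}) as above; the line-plus-noise data set, the choice N = 1, and all names are hypothetical and chosen only for illustration.

```python
import numpy as np

def pseudo_snr(points, n_intrinsic, sigma):
    # R = D / (sigma * sqrt(2 * (K - N))), where D is the average distance
    # from a point to its (N+1)-nearest neighbor in the K-dimensional space.
    k = points.shape[1]
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)             # ignore each point's distance to itself
    d = np.sort(dists, axis=1)[:, n_intrinsic]  # column N holds the (N+1)-th neighbor
    return d.mean() / (sigma * np.sqrt(2 * (k - n_intrinsic)))

rng = np.random.default_rng(1)
sigma = 0.01
t = rng.uniform(0.0, 10.0, size=200)
# A one-dimensional curve (a line) embedded in 3-space, plus noise: K = 3, N = 1.
pts = np.stack([t, 2.0 * t, -t], axis=1) + sigma * rng.normal(size=(200, 3))

r_full = pseudo_snr(pts, 1, sigma)
r_sub = pseudo_snr(pts[:50], 1, sigma)  # one subgroup of M/L points (here L = 4)
assert r_sub > r_full                   # fewer points per subgroup raises D, hence R
```

Keeping only M/L points spreads the nearest neighbors farther apart, which is exactly why the subgroup division described above raises the pseudo signal-to-noise ratio.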
REFERENCES
[1] G. V. Trunk, "Statistical estimation of the intrinsic dimensionality of data collections," Inform. Contr., vol. 12, pp. 508-525, May-June 1968.
[2] G. V. Trunk, "Representation and analysis of signals: Statistical estimation of intrinsic dimensionality and parameter identification," Gen. Syst., vol. 13, pp. 49-76, 1968.
Authors' Reply³
K. FUKUNAGA AND D. R. OLSEN
In the statistical method proposed by Trunk [2], [3], he calculates the density function $p(X \mid N)$, where $X = [x_1 \cdots x_K]^T$ is the observed random vector and $N$ $(= 1, 2, \cdots)$ is the intrinsic number of parameters for the random vector. The most likely $N$ for the observed vectors is determined by applying the multihypotheses test to these density functions. The number of random variables is reduced by using sufficient statistics such as the ratios of local distances and certain angles between local vectors, rather than the original $x_1, \cdots, x_K$.
As far as the estimation of an intrinsic dimensionality is concerned, the statistical method is in its essential nature a more accurate but more complex estimate than the local eigenvalue method of [1], where only second moments are considered. Although the statistical method requires the selection of some control parameters, these are readily set. Several assumptions were required in the derivation of the density
³ Manuscript received May 25, 1971.
The authors are with the School of Electrical Engineering, Purdue University, Lafayette, Ind. 47907.