

The International Journal of Museum Management and Curatorship (1987), 6, 201-206

CMASS: A Response to the Smithsonian Institution’s Statement of Problem Document

ROB DIXON

The Smithsonian Institution’s document CMASS: A Statement of Problem was distributed to a wide range of institutions in Autumn 1985, together with a request for comments. It was published in full in this Journal,¹ but a copy of the document has also been passed to me by the Tate Gallery, who are users of our specially developed museum computer service, STIPPLE.² Having created STIPPLE to overcome just the type of problems that the Smithsonian were describing, I replied to them, and followed this up with a successful live demonstration of STIPPLE at the Smithsonian in May 1986. The Smithsonian Institution document defined the problems clearly, and these can be simply divided into two broad areas, complexity and quantity. Theoretically both of these can be overcome in time, after considerable work, but the implementation of a practical solution tends to be delayed or deferred indefinitely by the dynamic nature of the problem, and this was also highlighted. Since museums are dealing with history, and history is, by its nature, fixed and therefore defined, this may seem somewhat surprising.

Museums are trying to collect, conserve and publish objects and information about them. They are documenting history, not trying to change it. But their perception of the detailed requirements of such actions is continually evolving, and is therefore dynamic. Many museums have large collections, libraries and archives, and have achieved at best only a small part of their ambitions for creating systems for cataloguing, collections management, etc. These ambitions, and the methods for achieving them, are continually altered by the changing priorities and individual interests of directors, curators, conservators and administrators, by changes in public taste and in sponsorship, by fluctuating resources and many other factors. CMASS is intended to provide computer systems for collections management. It is not possible to apply standard procedures to a very diverse range of object types. In addition, for particular object types, the tasks or functions to be performed may change depending on the individual items and the resources available, and the selection of the functions can also be affected by time and purpose.

The reason why the computer industry has so far failed to provide appropriate and timely solutions for its end users lies in the traditional method of creating computer systems. This can be considered in two parts, the specification of the requirements, and the creation of an integrated computer system to satisfy them. Whilst the latter is extremely difficult, the former, in a complex environment, because of the wide range of problems to be solved and tasks to be performed and their interdependence, is far beyond the capabilities of any committee (even of two people) and beyond those of any individual I have ever met. In addition, the end users will not be sure of their real requirement until they have used a computer system of the type to be created. As soon as they learn from this experience, their requirements will inevitably evolve. External factors also require continuing change.

0260-4779/87/02 0201-06 $03.00 © 1987 Butterworth & Co (Publishers) Ltd


This does not mean the problems are beyond definition, merely that a new way has to be found to specify them, one that can be readily understood by the people with the greatest knowledge of these problems. These are not computer experts but, in the case of museums, the administrators, curators, conservators and others. Above all others, these people should be able to define their own problems, but they can only do so in their own (natural) language, and in small steps. Although they can consider the relationships between all steps, a few at a time, like other human beings they are not capable of considering the whole at one instant. Most people know or can easily learn the rules of chess, but even the most skilled player can think only a few moves ahead. None can envisage at any instant the total implications of all possible moves by both players.

Humans are certainly not capable of defining the problems using the traditional tools of the computer, the restrictive unnatural programming languages which are understood only by computer experts. Whilst the computer expert may be capable of translating the natural language problem definitions into a favourite unnatural computer language, the end users of the system being created cannot relate to this, nor should they need to. They cannot check whether the computer expert has understood their definition of their requirements. They cannot check whether their own definition was correct. They have to wait until the computer solution is completed by the expert before they can try and judge its suitability. By this time, vast technical and financial resources will have been used, and these are in limited supply.

The traditional application development cycle has a series of steps which must be undertaken sequentially:

1. Requirements analysis
2. System specification
3. User sign-off specification
4. Program specification
5. Program coding
6. Program test
7. System test
8. Documentation
9. Implementation

Each computer expert has his own view of the steps involved; the list above is an oversimplification. The traditional method of developing systems requires the DP expert to go sequentially through a large number of steps with the end users being involved in only some of these. One of the problems is that it is only when the last step, implementation, is reached that about 50 per cent of the weaknesses in the specification will be noticed. To correct these the process has to be started again from the beginning, and the whole series of steps retraced. This results in unacceptably long development times.

The end users are clearly the people best qualified to create systems which fulfil their requirements, but the computer industry so far has failed to provide them with the appropriate tools. A lot of publicity has recently been given to fourth generation languages, which speed up the programming part of the development cycle, but they tend to be merely a more productive way of following the traditional method. They do not overcome the communications gulf which exists between the end users and the computer experts. Whilst every effort should be made to improve the communications process, it is impossible to avoid breakdowns in communication, because what is totally obvious to one side is a whole new world to the other.


The voluminous specifications produced in the traditional development cycle are generally a waste of time and effort, even though much hard, well-intentioned work has gone into them. Traditional methods of system development produce static, rigid solutions, whilst the real world of the end users evolves dynamically. If the traditionally produced systems are wrongly specified in the first place, as so often happens, the problems are further exacerbated, and often the system has to be totally rewritten to include missing features or to change its requirements. Statistics produced by IBM suggest that the cost of correcting a problem after a system has been completed is about 100 times greater than the extra cost of specifying the problem correctly during the original analysis and specification phases. Where, as in the case of the Smithsonian Institution, the system functions have to be changed dynamically, and the problems defy traditional analysis methods, there is little point, as the Smithsonian Institution is well aware, in using such traditional methods.

As our understanding of the problems evolves, so should the solutions. The facilities of STIPPLE have been enhanced both since my earlier articles and even since my reply to the Smithsonian Institution. These enhancements provide solutions to the dynamically changing problems of wide-ranging collections in many museums. STIPPLE was already well able to handle the enormous volume of some collections, and was created from ERROS (Expert Real-time Relational Open Systems) without any program coding or program generation being required. Any application, such as STIPPLE, which has been created from ERROS automatically inherits the properties of ERROS, and thus each application is itself an expert system. ERROS takes a revolutionary approach to the creation of computer systems in that it is a data-driven rather than a program-driven system. Computer programs are the traditional means by which the computer industry has created computer applications, but, except where programs manipulate data by performing mathematical calculations, programs are of no real interest to end users. They are interested only in the data available to them, and programs are merely the traditional method with which they access and change those data.

It may seem a heretical view to suggest scrapping computer programs, since the computer industry always uses them and has convinced both itself and the end users that this is the only way forward. Yet programming is one of the biggest bottle-necks in a computer department, and there will never be enough skills in the industry to write even a fraction of the programs which are really required. The only solution is to find a new way of creating computer systems, with less involvement by the computer department and, most important, a much greater degree of involvement by the end users than is the case with traditionally developed systems.

End users understand the information or data relating to their particular job. In ERROS, and in applications which are developed from it, such as STIPPLE, the definitions of data and the rules with which those data can be accessed and updated are stored, together with application definitions, procedures and menus, in the same database as the user data they define and control. The data definitions, etc., are presented to the users in exactly the same way as their own data, so there is a totally standard and consistent operator interface which is used whatever the task being undertaken by the end users. Since ERROS and its applications use natural language words and phrases for data and application definitions, these can be readily understood by the end users, even though they may have no computer skills. They can therefore create and change such definitions, and the ERROS data-driven approach becomes a user-driven approach with minimal support being required from computer experts.

As an example, consider a catalogue of a collection of prints and engravings. Let us assume that, when the catalogue was originally created on a computer system, it was decided that method of engraving was not an item of information which needed to be recorded for each print, perhaps because all the prints were line engravings. At some later point, recording the method of engraving became an additional requirement. This requires the addition of an extra piece of information, which might be described as a field or an attribute or a data element, to each record of a print in the catalogue. In traditionally created systems this would normally involve a total reorganization of the existing print catalogue to expand the records, so that the additional element could be included. Although this is not particularly difficult, it would mean shutting down the computer system whilst it was being done, and it would also involve either some elementary programming or some utility which could undertake the task. Where traditional systems use data dictionaries to define the record length for each print and the length of the data element to be added, these functions would have to be changed. Rather more complex changes would be required for the application programs to allow updating and accessing of the new data element, method of engraving, with traditional system development methods, and these would take some time.
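The reorganization described above can be illustrated with a minimal Python sketch. It assumes a fixed-length record layout invented purely for illustration (an 8-byte accession number and a 40-byte title); the field names, widths and sample titles are hypothetical and are not taken from any real catalogue system.

```python
import struct

# Hypothetical fixed-length layouts (not taken from any real system):
# an 8-byte accession number and a 40-byte title, then the same
# layout with a 20-byte "method of engraving" field appended.
OLD_LAYOUT = struct.Struct("<8s40s")
NEW_LAYOUT = struct.Struct("<8s40s20s")


def pack_old(accession: str, title: str) -> bytes:
    """Store one print as a fixed-length record; short values are null-padded."""
    return OLD_LAYOUT.pack(accession.encode("ascii"), title.encode("ascii"))


def reorganize(old_file: bytes) -> bytes:
    """Copy every record into the longer layout, leaving the new field blank."""
    new_file = bytearray()
    for offset in range(0, len(old_file), OLD_LAYOUT.size):
        accession, title = OLD_LAYOUT.unpack_from(old_file, offset)
        new_file += NEW_LAYOUT.pack(accession, title, b"")
    return bytes(new_file)


# Two invented records stand in for the existing print catalogue.
old_file = pack_old("P0001", "Landscape with bridge") + pack_old("P0002", "Portrait study")
new_file = reorganize(old_file)
```

In this sketch every record grows by 20 bytes whether or not a method of engraving is ever entered, the whole file has to be rewritten while nothing else is using it, and any program that reads the old layout has to be changed to read the new one.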

The approach used by ERROS, and thus STIPPLE, is quite different. The method of data storage is totally independent of the application, and extra data elements can be added to records without any need for restructuring existing data. What is more, they can be added whilst the database is in use and without shutting down access to it. The ERROS and STIPPLE approach uses the thesaurus to define and structure data, but in a way that can readily be understood by an end user who does not understand, and does not want to understand, how computer systems are created and how they work.

The steps are as follows (with only more senior staff having the authority to make the changes):

1. The operator selects the thesaurus and types in method of engraving. If it is not there already, he presses a command function key to add it. He then types in the different words or phrases describing method of engraving, such as etching, line engraving, stipple, mezzotint, wood cut, aquatint, etc., and he adds those which are not already in the thesaurus in the same way that he added method of engraving.

2. He then selects method of engraving, the record he first created, and from the list of attributes about method of engraving he selects narrower range term, and includes there the various terms which he has already entered into the thesaurus. What he is doing is creating hierarchical relationships between method of engraving and the various terms. He cannot include as a method of engraving a word or phrase which is not already in the thesaurus.

3. It is now necessary to define method of engraving as an attribute or data element of the print catalogue, so the user includes method of engraving as a narrower range term of data element.

4. The operator then specifies method of engraving as a data element in the print catalogue.

5. The only remaining step is to decide who has authority to change and access the new data element in the print catalogue called method of engraving. Each operator can be given the whole or a subset of the list of data elements or attributes for the print catalogue. This list can even change where the function he is undertaking changes. For instance, the list of attributes to which he has access when undertaking collections management of prints would presumably be different if he were recording conservation work on a print, or if he were studying the print from a curatorial viewpoint.

Once authority has been given to all or some operators for method of engraving, the phrase will immediately appear in the list of data elements about each print for those operators. If an operator selects it, STIPPLE will automatically display the first ten engraving methods which are valid, and he can select the appropriate one for each print. Multiple operators can be doing this at the same time without any problems. It is possible to set up the facility so that operators can select more than one method of engraving, where appropriate, such as etching and aquatint, or restrict the choice to one for each print, and perhaps put up a further method called mixed method where appropriate.

At no point in this process have any computer skills been required, nor have any record or field lengths been defined, yet if initially the method of engraving is entered for only a few records in the print catalogue, no space is wasted in the remaining records. No work by a computer expert has been required. No programs have been changed.
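The five steps above can be sketched in Python as a toy model in which definitions are held as data alongside the records. This is an illustration only: the class, its method names and the operator name are hypothetical, and nothing here describes how ERROS or STIPPLE is actually implemented.

```python
class Catalogue:
    """Toy data-driven catalogue. All names are hypothetical, not the ERROS API."""

    def __init__(self, record_ids):
        self.thesaurus = {"data element"}   # known words and phrases
        self.narrower = {}                  # term -> its narrower range terms
        self.data_elements = set()          # attributes defined for this catalogue
        self.authority = {}                 # operator -> data elements they may use
        # Only values that have actually been entered are stored per record.
        self.records = {rid: {} for rid in record_ids}

    def add_term(self, term):
        """Step 1: add a word or phrase to the thesaurus."""
        self.thesaurus.add(term)

    def add_narrower(self, broader, term):
        """Steps 2 and 3: link two terms, both of which must already exist."""
        for t in (broader, term):
            if t not in self.thesaurus:
                raise ValueError(f"{t!r} is not in the thesaurus")
        self.narrower.setdefault(broader, []).append(term)

    def define_data_element(self, term):
        """Step 4: make a term an attribute of this catalogue."""
        if term not in self.narrower.get("data element", []):
            raise ValueError(f"{term!r} is not defined as a data element")
        self.data_elements.add(term)

    def grant(self, operator, element):
        """Step 5: give an operator authority over a data element."""
        if element not in self.data_elements:
            raise ValueError(f"{element!r} is not in this catalogue")
        self.authority.setdefault(operator, set()).add(element)

    def choices(self, operator, element):
        """List the valid values an authorised operator may choose from."""
        if element not in self.authority.get(operator, set()):
            raise PermissionError(f"{operator!r} has no authority for {element!r}")
        return list(self.narrower.get(element, []))

    def set_value(self, operator, record_id, element, value):
        """Record a value, which must be one of the thesaurus terms offered."""
        if value not in self.choices(operator, element):
            raise ValueError(f"{value!r} is not a valid {element}")
        self.records[record_id][element] = value


catalogue = Catalogue(["P0001", "P0002", "P0003"])   # existing prints
element = "method of engraving"
catalogue.add_term(element)                           # step 1
for method in ["etching", "line engraving", "stipple",
               "mezzotint", "wood cut", "aquatint"]:
    catalogue.add_term(method)                        # step 1
    catalogue.add_narrower(element, method)           # step 2
catalogue.add_narrower("data element", element)       # step 3
catalogue.define_data_element(element)                # step 4
catalogue.grant("cataloguer", element)                # step 5
catalogue.set_value("cataloguer", "P0001", element, "mezzotint")
```

In this sketch the existing records are never rewritten: the prints for which no method has been entered simply hold no value for the new element, and adding a further method later is one more thesaurus entry rather than a change to any code.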

This is only a simple example, but much more complex functions can be created or changed in the same way with ERROS and STIPPLE. The list of methods of engraving can be increased or changed at any point. Perhaps initially it was decided not to catalogue photogravures as they are reproductions, but at some later time a change of policy might result in this becoming an additional requirement. All that has to be done is to add the word photogravure to the thesaurus, and include it as a narrower range term of method of engraving. All the earlier types of print mentioned will require the name of the engraver in a catalogue, whereas since a photogravure is not hand-engraved, this will not be a requirement. It is possible to set up STIPPLE so that the attributes to be recorded about each print change with the method of engraving, if this is felt desirable. All this can be done without any programming, and no programming code is generated by making changes. It is all done by storing the definition of the facilities required in data rather than in programs.

The method is totally open-ended, and does not make assumptions about what will or will not be required in the future. As users’ perceptions of their requirements change, so can STIPPLE be changed to meet those requirements. It can be changed for all operators in a particular area, such as print cataloguing, or just for a few. It allows multiple institutions to create and share union catalogues, and yet, where they so desire, to put their own data into some fields. This is a very powerful facility, and the data in these fields cannot be accessed by operators from other institutions. What it means is that the facilities can be made to fit exactly the requirements of each institution, or operator within an institution, even when they are sharing the same totally integrated database.
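Two of the facilities just described, attribute lists that depend on the method of engraving and private fields within a shared union catalogue, can be sketched as rules held in data. The Python below is a hypothetical illustration under invented names (the institutions, field names and record are placeholders); it is not a description of STIPPLE's internal design.

```python
# Hypothetical rules held as data: which attributes are recorded for a print.
DEFAULT_ATTRIBUTES = ["title", "method of engraving", "engraver"]
ATTRIBUTES_BY_METHOD = {
    # A photogravure is not hand-engraved, so no engraver is recorded.
    "photogravure": ["title", "method of engraving"],
}


def attributes_for(shared_fields):
    """Return the attribute list that applies to a print, given its method."""
    method = shared_fields.get("method of engraving")
    return ATTRIBUTES_BY_METHOD.get(method, DEFAULT_ATTRIBUTES)


def view(record, institution):
    """Return the shared fields plus only the asking institution's own fields."""
    visible = dict(record["shared"])
    visible.update(record["private"].get(institution, {}))
    return visible


# One invented union-catalogue entry shared by two placeholder institutions.
record = {
    "shared": {"title": "Untitled print", "method of engraving": "photogravure"},
    "private": {
        "Institution A": {"valuation note": "held by A only"},
        "Institution B": {"loan status": "held by B only"},
    },
}

for_a = view(record, "Institution A")
for_b = view(record, "Institution B")
```

Changing policy in this sketch means editing an entry in `ATTRIBUTES_BY_METHOD`, which is data, rather than altering the functions that read it.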

The open-ended data-driven approach of ERROS makes changes to computer systems a much simpler task which generally requires no computer skills. But more fundamentally, it eliminates the need for the detailed requirements analysis and system specification which are the first two items in the list above. It is no longer necessary to try and design as one complete exercise a total solution for a problem which cannot readily be defined. The necessary system can be created in small steps. As each addition is totally integrated with all that exists already, the implication of the creation of each extra step can be readily seen, tested and understood by users, both on a stand-alone basis and in relation to all that already exists. Since applications created from ERROS are totally open-ended, there is no requirement to plan the system as a whole. Extra data types, procedures, etc., can be added or changed as required.

The idea of an integrated system which allows an operator to navigate freely throughout all the data recorded without any restriction is anathema to most computer experts on the basis that it just cannot be done. STIPPLE has had this facility since it went ‘live’ in 1983, and operators can navigate, for instance, from the catalogue entry of a print to the catalogue entry of a watercolour or drawing, which was the engraver’s preparatory reduction before engraving the plate, and from that drawing to the original oil-painting from which the reduction was taken, and from the catalogue entry for the oil-painting to, perhaps, all other works by the same artist, or which have some common iconographical content, or to other works of art stored in the same location. Since ERROS applications use only a single integrated database, the query facilities of ERROS do not need to be enhanced as different types of data or different attributes are added to the system.
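The kind of navigation described above can be pictured as following named relationships between catalogue entries held in one store. The Python sketch below is illustrative only: the record identifiers and relationship names are invented placeholders, and it makes no claim about how ERROS stores or queries its data.

```python
from collections import defaultdict

# Every relationship is stored in both directions in a single structure,
# so any record can be reached from any related record.
links = defaultdict(list)   # record -> list of (relationship, related record)


def relate(a, relation, b, inverse):
    """Record that a --relation--> b and b --inverse--> a."""
    links[a].append((relation, b))
    links[b].append((inverse, a))


def follow(start, *relations):
    """Follow a chain of named relationships and return the records reached."""
    current = {start}
    for relation in relations:
        current = {
            target
            for record in current
            for rel, target in links[record]
            if rel == relation
        }
    return current


# Invented placeholder entries: a print, its preparatory drawing, the
# original oil-painting, and a second painting by the same artist.
relate("print:P1", "engraved after", "drawing:D1", "engraved as")
relate("drawing:D1", "reduced from", "oil:O1", "reduced as")
relate("oil:O1", "by artist", "artist:A", "artist of")
relate("oil:O2", "by artist", "artist:A", "artist of")

other_works = follow("print:P1", "engraved after", "reduced from",
                     "by artist", "artist of")
```

Because the traversal works on relationship names rather than on record types, adding a new kind of record or relationship in this sketch needs no change to `follow`.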

Traditional methods of system development used by the computer industry have failed to give users, both in the fine art and museum world and in ordinary commercial applications, the solutions they require at the time they require them. Improving traditional methods would never give the necessary productivity gains nor systems with adequate flexibility to keep up with end user requirements. A major change is required if the computer industry is to make any real progress so that it spends less of its time reinventing the wheel rather inefficiently.

The computer industry is currently enthusiastic about the virtues of so-called fourth generation languages as a considerable improvement on their existing methods of system creation. These, however, still depend on the same analysis and specification procedures which are so faulty. All that the fourth generation languages do is to produce more bad systems more quickly, creating even greater problems of systems maintenance for the future. The data-driven approach of ERROS and the applications which are created from it, such as STIPPLE, represents a major change, and eliminates at least 80-90 per cent of traditional programming work. It is not a fourth generation language in that it does not generate programs. It is an expert system which allows an integrated knowledge base of both user data and definitions of the applications to be created. STIPPLE was created from ERROS without any program coding or automatic program generation. Other ways of creating computer systems may well evolve, but after extensive searches, both in Europe and in North America, it seems that the approach of ERROS is still unique. Thus it is likely to be the best solution available for some time to come, and STIPPLE is outstanding for its response to the problems formulated by the Smithsonian Institution.

Notes and References

1. Michel Vulpe, ‘CMASS: A Statement of the Problem’, The International Journal of Museum Management and Curatorship, 5, 1986, pp. 349-356.

2. See The Tate Gallery Illustrated Biennial Report 1984-86, pp. 120-121. The features of STIPPLE (System for Tabulating and Indexing People, their Possessions, Limnings & Ephemera) have been described by the present author in this Journal: in ‘A Modern Computer Cataloguing and Administrative System for Museums’, 2, 1983, pp. 335-346; and ‘Using Computers for Art History and Collections Management’, 4, 1985, pp. 56-63.