MEDICAL FACT NET

5.7.2023

Abstract

If a medical information system is to mediate between experts and non-experts, then it must understand both expert and non-expert medical vocabulary, and it must be able to map between the two. Much research effort has been devoted to the study of expert medical terminologies. As computers become increasingly important to the delivery and processing of medical information, it becomes ever more urgent to understand also the language used by non-experts.

Natural language processing (NLP) applications that retrieve, classify, and evaluate information are indispensable to online information retrieval systems in the domain of consumer health. We propose to boost the power of such systems by means of a new type of NLP technology, focusing not on single words but rather on the beliefs such words are used to express and on the medical facts to which such beliefs correspond.

Our goal is to construct two very large databases of statements relating to the medical domain, called Medical Belief Net (MBN) and Medical Fact Net (MFN). MBN will consist of statements collected from non-expert human subjects and from a variety of existing corpora, including online sources targeted to consumers. MFN will be a subset of MBN, created on the basis of a rigorous process of validation by medical experts, and representing the common knowledge of medical phenomena shared by experts and non-experts. MBN and MFN will be equipped with a robust formal architecture that is designed to make their content usable as a supplement to existing databases of expert medical and biological knowledge.

In constructing MBN/MFN we will draw on our experiences in the construction of the lexical database WordNet. One sub-goal of the project is to create a consumer health lexicon, called Medical Word Net (MWN), via the thorough revision and extension of the coverage of medical phenomena in the existing Princeton WordNet.

The results of our work will have theoretical and practical implications for medical education and for our understanding of the communication between medical experts and non-experts. It will allow new types of research on consumer health from the perspectives of both psychology and linguistics, for example in exploring individual and group divergences in medical knowledge and in understanding non-expert medical reasoning and decision-making. It will also provide a valuable supplement to existing medical ontologies and terminologies.

Draft Consortium Agreement

The contractual agreement is between SUNY Buffalo and Princeton University as sub-contractor. The work will be carried out by Barry Smith and other members of the working group in medical ontology in Buffalo together with Christiane Fellbaum of the Psychology Department in Princeton.

The Buffalo group has participants from the Department of Philosophy, School of Dentistry, School of Medicine and Faculty of Computer Science and Engineering. Its primary focus is the use of formal tools in the construction, integration and alignment of ontologies and terminologies in the domain of biomedical research. It collaborates closely with the Institute for Formal Ontology and Medical Information Science of the University of Leipzig.

Dr. Fellbaum is one of the two authors of the lexical database WordNet and is a member of the Cognitive Science Laboratory, where WordNet was created and is maintained. Dr. Fellbaum also directs a project "Collocations in the German Language of the 20th Century" at the Berlin-Brandenburg Academy of Sciences, which uses computational methods to study word-sequences that act as single syntactic or semantic units. Smith and Fellbaum have been involved for some time in research at the interface between WordNet and formal ontologies.

The present project responds to calls from the research community to strengthen the medical component of WordNet. The division of labor and responsibility reflects the expertise in Buffalo in the field of medical ontology and terminology and in Princeton in the field of lexical databases.

The responsibilities will be divided as follows:

Shared between the two sites:
- Compilation of MBN / MFN databases

Buffalo:
- Expert validation of MFN database
- Formal architecture for MBN / MFN / MWN
- Testing in query-answering systems and biomedical information retrieval
- Mappings to standard terminology systems (primarily UMLS)

Princeton:
- Compilation of MWN (lexical database for the medical/consumer health domain)
- Non-expert validation of MBN / MFN databases

Most grant review committees will be looking for good evidence of direct scientific advancement within the scope of the grant, not just the promise of future science after the development of a set of resources. You can do this in any of three ways:

- show that your method for creating these resources is itself a scientific process that is applicable to other tasks or that it will serve as a pilot to scale up to some larger expansion of the current task. For example, you might describe a new interview method for getting experts to verify large sets of medical facts. You may be doing this now, but the innovation of your method doesn't come out in the proposal. You just say you will ask the subjects to answer on a three-point scale.

- show that the result you create is verifiably "good" - for example, that it has some internal consistency that suggests it is a coherent whole.

- show that the products are good for some external task, such as literature retrieval

Each of the above approaches requires evaluation in order to qualify as science.

A. SPECIFIC AIMS

[Should contain testable hypotheses]

A1. MBN / MFN / MWN: We will assemble a large, heterogeneous, open-source corpus of medical sentences in the English language expressed in the form of grammatically complete statements and assessed by the degree to which they are understandable and assented to by typical non-expert human subjects. Those sentences which receive high marks for understandability and assent will constitute a first sub-corpus, called Medical Belief Net (MBN), designed to represent the (true and false) beliefs about medical phenomena distributed through the population of US English speakers.

The sentences in the MBN corpus will then be assessed for correctness by medical experts. Those entries receiving high marks on the scale of correctness will constitute Medical Fact Net (MFN), a subset of MBN designed to represent the knowledge of medicine – the true beliefs – shared by non-expert English-speakers.

Medical Fact Net is in simple terms the intersection of the two classes of non-expert beliefs about medical phenomena and truths validated by medical experts. It will comprehend those sentences in MBN which (a) originate in high-reliability sources such as MEDLINEplus or (b) originate in common discourse, including in electronic mail and on the internet, or (c) are elicited from non-expert human subjects but receive a near-maximal score for correctness upon being validated by medical experts. The goal is to document the entirety of the medical knowledge that is understood and accepted by average adult consumers of healthcare services in the United States today.
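The selection steps described above can be sketched as a two-stage filter. This is a minimal illustration only: the `Statement` record, the numeric three-point rating scale and the `HIGH` threshold are assumptions introduced here rather than part of the proposed methodology, and the sketch covers only the rating-based route into MFN, not the source-based criteria.

```python
from dataclasses import dataclass

# Assumption: ratings are averaged scores on a three-point scale (1 = low, 3 = high).
# The cut-off below is illustrative, not a value fixed by the proposal.
HIGH = 2.5


@dataclass(frozen=True)
class Statement:
    text: str
    understandability: float  # mean rating by non-expert subjects
    assent: float             # mean rating by non-expert subjects
    correctness: float        # mean rating by medical experts


def build_mbn(candidates):
    """Keep statements that non-experts both understand and assent to."""
    return [s for s in candidates
            if s.understandability >= HIGH and s.assent >= HIGH]


def build_mfn(mbn):
    """Keep the MBN statements that experts rate as correct."""
    return [s for s in mbn if s.correctness >= HIGH]


if __name__ == "__main__":
    # Placeholder statements with hypothetical ratings.
    candidates = [
        Statement("statement A", 3.0, 3.0, 3.0),  # believed and correct
        Statement("statement B", 3.0, 3.0, 1.0),  # believed but incorrect
        Statement("statement C", 1.0, 3.0, 3.0),  # not understood
    ]
    mbn = build_mbn(candidates)
    mfn = build_mfn(mbn)
    print([s.text for s in mbn], [s.text for s in mfn])
```

Because `build_mfn` only ever filters the output of `build_mbn`, MFN is a subset of MBN by construction, which mirrors the relationship between the two corpora stated above.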

MFN and MBN will support the population of Medical Word Net (MWN), a lexical database which will revise and extend the Princeton WordNet by all the medical terms encountered in MBN and including also the non-expert terms employed in the Specialist Lexicon of the UMLS (McCray et al. 1994). MWN will in turn yield new families of candidate sentences for inclusion in MBN and MFN. (See Appendix B1 for a sample of WordNet’s current representation of the medical domain. The inadequacies in this coverage will be addressed as a by-product of this project.) The construction of MWN and MBN/MFN will reinforce each other mutually. Additions to MWN will yield new families of candidate sentences for inclusion in MBN and MFN, and extensions of MBN/MFN to new domains of medical phenomena will in turn yield new lexical entries for inclusion in MWN.

Appendix B3 represents raw data for MWN, which contains 2838 terms. (WordNet’s current medical coverage is of the order of 300 terms.) We anticipate that MWN will stabilize in a lexicon of the order of 10,000 terms in size. We aim to produce at least 40,000 entries for MBN, of which we estimate that at least 30,000 will qualify for inclusion in MFN.

A2. Formal Architecture: We will create for the entire corpus a sophisticated formal architecture, comprehending:

1. a formal representation of the linguistic structure of each sentence;
2. a formal representation (ontology) of the entities and relationships referred to in the corpus;
3. a mapping of the MFN entities and relationships into the UMLS Metathesaurus.

Component 1 will be created using an off-the-shelf tagger such as (Brill XXXX, Collins or LINK), leaving disambiguation (the computationally most difficult part) to manual intervention. On this basis we will be in a position to provide for many of our sentences a semantic interpretation by means of a for

[What is not clear to me is what you do after syntactic parsing. Is your mapping to MWN just based on lexical mappings, or do you intend to jump into phrasal or even sentential (semantic) interpretation? Also, how do you deal with the standard killer topics, e.g., negation, coordination, quantifier scope, etc.? That matters even for syntactically simple English.]

This will be a syntactically parsed representation, with mappings from nouns, verbs and adjectives to the corresponding entries in the MWN lexical database. These mappings will be generated semi-automatically on the basis of a training corpus following precedents described in (Vintar et al., 2003) and in Appendix 2. Disambiguation will be performed manually.

Component 2 will be created along the lines described in (Fielding 2004).

This formal architecture will be used both to control for internal consistency of MFN and also to support alignment with other biomedical ontologies. This will add to MFN two extra levels of quality assurance over and above the validation of single sentences by experts. We anticipate that the resulting architecture will provide a useful augmentation to existing ontology and terminology systems in biomedicine in a way that will rectify some of the problems caused by mismatches between existing controlled medical vocabularies and the concepts used in human reasoning (Zhang 2002, Appendix A3). It will also allow MFN to serve as CORE OF FEDERATED DATABASE. Note also that, because all entries in MFN have been validated by medical experts, the ontology underlying MFN should be compatible with existing biomedical ontologies and with expert biomedical fact databases in the future. MFN would provide annotations to those terms employed (ubiquitously) in biomedical information sources that are taken over from the non-expert lexicon. MFN will have a robust architecture based on sound ontological principles; it will be subject to a rigorous process of validation; and in its final form it should be consistent with the knowledge that is contained also in other fact repositories both at the expert and the non-expert level.

A3. Evaluation: To evaluate the results of our work we will carry out a series of pilot experiments designed to test the utility of the separate components:

1) MFN will be tested by the degree to which it yields benefits to an existing on-line consumer health portal based on term-search technology. We will measure the degree to which using MFN sentences to direct users to information sources results in greater user satisfaction, by setting up an experiment in which customers of the portal are randomly assigned to one of two groups: one to which access to MFN is offered, and the other for which simple term-searching is used. We will measure user satisfaction for both groups by asking "are you satisfied with the information you received?" The results will enable us to estimate the effects of the MFN. We may also employ other measures, for example of the type outlined in (Stanford et al. 2002).

2) MFN will be used as a training corpus for machine-learning based parsing (Streiter, 2001). For testing purposes, MFN will be randomly divided into two parts: one for training a machine-learning based parser, and one for testing the performance of the trained parser on sentences it never saw before. By setting up different training models which offer the parser more MFN components (syntactic annotations, semantic annotations, full ontology, MWN) we can assess the impact of each component.
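The split-and-ablate design of this experiment can be sketched as follows. The component names, the even 50/50 split and the choice to enumerate every subset of components are assumptions made for illustration; the parser itself is out of scope, so the sketch only prepares the data partition and the list of training configurations.

```python
import random
from itertools import combinations

# Assumption: the four MFN components named in the text, as short labels.
COMPONENTS = ("syntactic", "semantic", "ontology", "mwn")


def split_corpus(sentences, seed=0):
    """Randomly divide the corpus into a training half and a held-out test half."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def training_models():
    """Enumerate every subset of components, from none to all four."""
    return [subset
            for size in range(len(COMPONENTS) + 1)
            for subset in combinations(COMPONENTS, size)]


if __name__ == "__main__":
    train, test = split_corpus([f"sentence {i}" for i in range(10)])
    print(len(train), len(test))
    for model in training_models():
        print(model)
```

Comparing the parser's test performance for two configurations that differ in exactly one component gives an estimate of that component's contribution.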

3) We will test the usefulness of MWN for medical information retrieval by following the methodology described in (Jackson and Ceusters 2002): we will take as baseline the performance of an existing information retrieval system on the OHSUMED corpus and compare it with the performance obtained when the system is given additional access to MWN. The OHSUMED corpus also contains a collection of questions together with relevance judgments about documents with respect to these questions (Hersh et al., 1994). Both these relevance judgments and the results of various research groups that participated in the TREC-2002 competition will be used to compare our results. Since OHSUMED is a corpus of documents collated from MEDLINE abstracts, and thus contains mainly information for medical experts, we can also test whether an MFN/MWN with formal representation is beneficial for retrieval of expert information.
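A comparison of this kind is usually scored against the relevance judgments with precision and recall. The following sketch assumes that each retrieval run is a list of document identifiers and that the judgments for a question are a set of relevant identifiers; the identifiers shown are invented placeholders, not OHSUMED data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of one retrieval run against relevance judgments."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical document identifiers for a single question.
    relevant = {"d1", "d3", "d5", "d6"}
    baseline_run = ["d1", "d2", "d3", "d4"]   # existing system alone
    mwn_run = ["d1", "d3", "d5", "d7"]        # same system with access to MWN
    print("baseline:", precision_recall(baseline_run, relevant))
    print("with MWN:", precision_recall(mwn_run, relevant))
```

Averaging these figures over all questions in the collection gives one number per system, which can then be set beside published results on the same corpus.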

4) The results of our work will be made freely available to other researchers at every stage, and they will be invited to propose strategies for evaluation. The community of research groups using WN for applications is already very large, and we already have expressions of strong interest in using the MFN database to test existing query-answering and information retrieval systems.

The results of one pilot study, designed to test the degree to which expert medical knowledge bases can be exploited as sources of entries for MFN, are included as Appendix B2.

B. Background and Significance

[This usually reviews similar work by others, for example construction of other knowledge bases.]

B1. The Importance of Non-Expert Medical Language

The specialist community of medical scientists and practitioners is turning increasingly to computer-based tools for managing knowledge and terminology, and as a means of absorbing the ever more complex data provided by disciplines such as biochemistry and functional genomics. At the same time, however, the source of much of the empirically ascertainable medical data is what human beings report in non-expert language – the language used not only by patients but also by family members, advisors, administrators, lawyers and so forth, and to some degree also by nurses and physicians. It is the computer-processing of this non-expert language that is the subject of this proposal.

If a medical information system is to mediate between experts and non-experts, then it must rest on an understanding of both expert and non-expert medical vocabulary and it must be able to map between the two (Patel, Arocha and Kushniruk 2002). The problem is to capture the way terms are used in all their contexts in the form of a clear-cut semantic representation, and this is one of the hardest tasks facing linguistic science today. There is no straightforward way to postulate a certain standardized semantics or a conceptual definition as the one and only effective meaning of a term because the latter is always inseparably tied to the usage of the term – especially to its usage in sentential contexts (Pustejovsky, 1995).

[Make it clearer that MBN and MFN will provide something crucial that these expert databases do not already provide.]

Everyday language and technical language do not differ in this respect, though postulating standardized meanings is common practice in work on medical terminologies. Technical terms are used primarily in restricted scientific domains – where it is at least theoretically possible to control their meanings by developing and publishing standards of correct usage (Lewalle, 2000). [+ refs to medical sublanguage: http://www.cs.nyu.edu/cs/faculty/sager/NLP-RCD.pdf]

Conspicuously, however, the lexical meaning and contextually dependent usage of medical terms employed by lay persons are much more difficult to capture theoretically. The difficulties are obvious: where the usage of technical medical terms by professionals is subject to control by standardization efforts, the non-experts who establish usage for the terms in the medical lay vocabulary are under no such influence and such usage is resistant to direct manipulation. The state-of-the-art of how to use a lay term is a matter of convention established more or less ephemerally by everyday talk, not only between experts and non-experts but also among the non-experts themselves.

Non-experts use popular terms (also sometimes used by medical experts) such as mumps instead of the technical terms (Parotitis epidemica). The popular medical vocabulary naturally covers only a small segment of the encyclopedic vocabulary of medical professionals. It lexicalizes mainly at the level of taxonomic orders. Popular medical terms (flu) are often fuzzier than technical medical terms. Many popular terms also cover a larger range of referent types than do technical terms; others may cover only part of the extension of their technical counterparts. The lower degree of differentiation in popular language leads to intersections with families of technical terms such that the popular terms fall short of exact coverage. Many single terms used by non-experts – for example bacteria, colon, cyst, dermatitis, embryo, glucose, hepatitis, melanoma, septic, spasm – belong to much larger families of cognate terms whose remaining members (for example acystia, baeocystin, blastocyst, cysteamine, cysteine, polycystic) are used only by experts. Non-expert terms often have highly context-dependent meanings; expert terms are at least ideally more precise, and thus also less dependent on context (Tse, 2002).

Some terms, such as fever or sprain or burn, belong to both the expert and the non-expert languages. Some terms, such as health or pain or consent, are such that the specification of their meanings in the expert language essentially involves a reference to the understanding of non-experts. Some terms, such as boss, felon, messenger, oppressed, plate have specific technical meanings only distantly related to their non-expert usage.

B2. Mismatches in Doctor-Patient Communication

The taxonomies reflecting popular medical lexicalization are much less elaborate at both the upper and lower levels. For instance, there are no popular terms linking infectious disease and mumps, so that in the popular medical taxonomy of diseases the former immediately subsumes the latter. The precise ways in which the conceptual organization of non-expert knowledge of the medical domain differs from that of experts have not thus far been investigated empirically.

The knowledge-acquisition skills of a physician must comprise the ability to acquire relevant and reliable information through communication with patients, and then it is non-expert language that serves as the medium for knowledge exchange across the linguistic divide. The physician must also have the practical knowledge which enables him to convey diagnostic and therapeutic information in ways tailored to the individual patient, as well as information concerning the consequences of given therapies for the patient’s way of living.

Since the physician, too, is a member of the wider community of non-experts and continues to use the non-expert language for everyday purposes, one might assume that there are no difficulties in principle keeping him from being able to formulate medical knowledge in a vocabulary that the patient can understand. (Slaughter 2002) and (Zeng et al., in press) suggest, however, that there are limits to this competence. (Slaughter 2002) examines dialogue between physicians and patients in the form of question-answer pairs, focusing especially on the relations documented in the UMLS Semantic Network. Only some 30% of the UMLS Semantic Relations used by professionals in their answers directly match the relations consumers used in their questions. An example of one such question-answer pair is taken from (Slaughter 2002, p. 224):

Question Text: My seven-year-old son developed a rash today that I believe to be chickenpox. My concern is that a friend of mine had her 10-day-old baby at my home last evening before we were aware of the illness. My son had no contact with the infant, as he was in bed during the visit, but I have read that chickenpox is contagious up to two days prior to the actual rash. Is there cause for concern at this point? [...]

Answer Text: (a) Chickenpox is the common name for varicella infection. [...] (b) You are correct in that a person with chickenpox can be contagious for 48 hours before the first vesicle is seen. [...] (c) The fact that your son did not come in close contact with the infant means he most likely did not transmit the virus. (d) Of concern, though, is the fact that newborns are at higher risk of complications of varicella, including pneumonia. [...] (e) There is a very effective means to prevent infection after exposure. A form of antibody to varicella called varicella-zoster immune globulin (VZIG) can be given up to 48 hours after exposure and still prevent disease. [...]

Such examples illustrate the mismatch in communication (which may in part reflect legal and ethical considerations) between experts and non-experts. Professionals often do not re-use the concepts and relations made explicit in the questions put to them by consumers. In our example, the questioner requests a yes/no-judgment on the possibility of contagion in a 10-day-old baby. In fact, however, only section (c) of the answer responds to this question, and this in a way which involves multiple departures from the type of non-expert language which the questioner can be presumed to understand. Rather, physicians expand the range of concepts and relations addressed (for example through discussion of issues of prevention, etc.). [Can you tie this back to the UMLS semantic relations and how they might be improved upon?]

An information resource, whether it be a primary care physician or an online information system, must respond primarily with generic information to requests that relate to specific and episodic phenomena (occurrences of pain, fever, etc.). In our example, all sections besides (c) are of this generic kind. They contain answers in the form of generic statements about causality, about types of persons or diseases or about typical or possible courses of a disease. (Compare Patel et al., 2002.) Our proposal is to use MFN as a constraint on the output of an information resource in order to ensure that such answers are intelligible to the non-expert in a way that allows the required information about concrete persons, locations, times, and occurrences to be reliably inferred.

B3. Non-Expert Language in Online Communication

Understanding patients requires both explicit medical knowledge and also tacit linguistic competence that is dispersed across large numbers of more or less isolated practitioners. This is not a problem so long as this knowledge is to be applied locally, in face-to-face communication with patients. However, as a result of recent developments in biomedical science and technology, including research drawing on genomic data, as well as advances in the domain of telemedicine and internet-based medical query systems, we now face a situation where such dispersed, practical (human) knowledge does not suffice.

First, the potential of biomedical informatics rests on the possibility of analyzing vast amounts of data, including not only (for example) genomic and pharmacologic data, but also data pertaining to patient reactions, traits and behavior which may be available only in the form of reports formulated in non-expert medical language. Some 90% of the digitally available medical information exists only in the form of unstructured natural language text. The ability to extract structured data therefrom, and to analyze this data in conjunction with technical information, would make possible new forms of computer-based medical research of a scale hitherto unimagined.

Second, governments are increasingly investing in e-health services, for example in Denmark, where all physicians are now strongly encouraged to accept medical questions by email and to respond within 48 hours. Studies have shown that clinical questions are expressed in a small number of different syntactic-semantic patterns (about 60 patterns account for 90% of the questions: Ely 2000, Jacquemart 2003). Such questions are typically of the form "Do hair dyes cause cancer?", "Can I use aspirin to treat a hangover?" Given a resource such as MFN, questions such as these can easily be transformed into statements providing correct answers: "Hair dyes can cause bladder cancer", "Aspirin doesn't help in case of a hangover", such statements being linked further to relevant and authoritative sources.
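The transformation from question pattern to stored statement can be sketched as a pattern match followed by a lookup. Everything below is an assumption made for illustration: the two regular expressions stand in for the roughly 60 patterns reported in the literature, and the `MFN_STATEMENTS` dictionary holds only the two example answers quoted above, keyed by a hypothetical (relation, agent, effect) triple.

```python
import re

# Illustrative question patterns; a real system would need many more.
PATTERNS = [
    ("cause", re.compile(r"^do(?:es)? (?P<agent>.+?) cause (?P<effect>.+?)\?$", re.I)),
    ("treat", re.compile(
        r"^can i use (?P<agent>.+?) to treat (?:an? )?(?P<effect>.+?)\?$", re.I)),
]

# Hypothetical MFN index holding the two example statements from the text.
MFN_STATEMENTS = {
    ("cause", "hair dyes", "cancer"): "Hair dyes can cause bladder cancer.",
    ("treat", "aspirin", "hangover"): "Aspirin doesn't help in case of a hangover.",
}


def answer(question):
    """Return the stored statement matching the question, or None if there is none."""
    text = question.strip()
    for relation, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            key = (relation, match["agent"].lower(), match["effect"].lower())
            return MFN_STATEMENTS.get(key)
    return None


if __name__ == "__main__":
    print(answer("Do hair dyes cause cancer?"))
    print(answer("Can I use aspirin to treat a hangover?"))
```

An exact-match lookup like this is brittle; in practice the agent and effect strings would first have to be normalized against a lexicon such as MWN.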

Third, considerable effort is currently being invested in the task of providing internet-accessible medical knowledge that is both reliable and accessible to the non-expert. But the success of systems such as MEDLINEplus® (www.nlm.nih.gov/medlineplus) or Medicine-Worldwide (www.m-ww.de) will rest on advances in our ability to process non-expert medical language. For such systems must be adaptable automatically to a variety of potential users in contexts where these potential users cannot rely on the locally-focused practical knowledge of the physician.

MEDLINEplus® is described in its online documentation as a source of medical information for both experts and non-experts. It is described as “a goldmine of good health information” and as being such that “Health professionals and consumers alike can depend on it for information that is authoritative and up to date”. Enquirers can use MEDLINEplus® like a dictionary, choosing health topics by keywords. Or they can use the system’s search feature to gain access to a database of relevant online documents selected for reliability and accessibility on the basis of pre-established criteria.

Table 1 shows the problems that arise when the system fails to take account of the special features of the knowledge and vocabulary of typical non-expert users. Here success in finding the needed information depends too narrowly on the precise formulation of the query text. Thus tremble and trembling call forth different responses (one lists caffeine, the other phobias), even though the terms in question differ only in a minor morphological affix which should not bear on the semantics of the query in any way. Experienced internet users are of course familiar with the limitations of search engines, and so they are able to manipulate their query texts in order to get more and better results, though even experienced users will not be able to overcome arbitrary sensitivities such as are illustrated by the tremble/trembling case. Moreover, information systems cannot have the goal of bringing non-experts’ ways of using language into line with that of the system.

Table 1 Online Inquiry to MEDLINEplus® (http://www.nlm.nih.gov/medlineplus)

| Query text | MEDLINEplus® response (with links to documents sorted by the following keywords) |
|---|---|
| tremor | Tremor, Multiple Sclerosis, Parkinson’s Disease, Degenerative Nerve Diseases, Movement Disorders |
| intentional tremor | Tremor, Multiple Sclerosis, Parkinson’s Disease, Spinal Muscular Atrophy, Degenerative Nerve Diseases |
| tremble | Anxiety, Parkinson’s Disease, Panic Disorder, Caffeine, Tremor |
| trembling | Anxiety, Parkinson’s Disease, Panic Disorder, Phobias, Tremor |
| right hand trembles | Phobias, Anxiety, Infant and Toddler Development, Parkinson’s Disease, Diabetes |
| right hand trembles when grasping | Infant and Toddler Development, Sports Fitness, Sports Injuries, Diabetes, Rehabilitation |
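The tremble/trembling sensitivity noted above is the kind of variation that even crude morphological normalization removes. The sketch below is a deliberately naive suffix stripper written for this illustration; it is not a standard stemming algorithm and not the method used by either system discussed here, and the suffix list and minimum stem length are assumptions.

```python
# Assumption: a tiny, ordered suffix list that is enough for the examples in the text.
SUFFIXES = ("ing", "es", "e", "s")
MIN_STEM_LENGTH = 3


def naive_stem(word):
    """Strip the first matching suffix, provided a stem of reasonable length remains."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= MIN_STEM_LENGTH:
            return lowered[: -len(suffix)]
    return lowered


def normalize_query(query):
    """Map a query to a tuple of stems so that inflectional variants coincide."""
    return tuple(naive_stem(word) for word in query.split())


if __name__ == "__main__":
    for query in ("tremble", "trembling", "right hand trembles"):
        print(query, "->", normalize_query(query))
```

With this normalization the queries tremble and trembling map to the same key, so an index built on stems would return the same documents for both.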

More elaborate search engines are able to compensate for these problems to some degree with an on-the-fly parsing of the query text. The search engine of Medicine-Worldwide uses at least a morphological analysis, but Table 2 suggests that this does not help improve the query results to a significant degree:

Table 2 Online Inquiry to Medicine-Worldwide (http://www.medicineworldwide.de/suche/index.html)

| Query text | Medicine-Worldwide response (translated from the German; responses include links to documents generated by query text) |
|---|---|
| tremor | Morbus Parkinson, Tinnitus, Gerstmann-Straeussler-Scheinker Syndrome (GSS), Creutzfeldt-Jakob Disease (CJD), New variant Creutzfeldt-Jakob Disease (nvCJD), Kuru, Alcohol … |
| intentional tremor | (none) |
| tremble (Zittern) | Tinnitus, Morbus Parkinson, Caisson Disease, Gerstmann-Straeussler-Scheinker Syndrome (GSS), Creutzfeldt-Jakob Disease (CJD), Kuru, New variant Creutzfeldt-Jakob Disease (nvCJD), Cocaine and Crack |
| trembling (Zitternd) | Caisson Disease, Gerstmann-Straeussler-Scheinker Syndrome (GSS), Creutzfeldt-Jakob Disease (CJD), New variant Creutzfeldt-Jakob Disease (nvCJD), Kuru, Tommotis, Cocaine and Crack, Amalgam, Sarin (chemical weapon), Tabun (chemical weapon) |
| right hand trembles | (none) |
| right hand trembles while grasping | (none) |

The problem is that the numbers of users at lower education levels are constantly increasing and the system cannot anticipate the way inexperienced users will initiate a query. System designers are thus increasingly called upon to adapt their systems to consumers, instead of forcing consumers to adapt to their systems. Here, we suggest a way to augment language-focused approaches to the problem of supporting online access to consumer medical information. We propose a new kind of communication interface which allows pooling and processing of information on the basis of a constantly growing database of medical knowledge formulated in the vocabulary understood by non-experts.

A resource of this kind will have practical applications not only in the domains of medical natural language processing and query services but also in other domains. Telemedicine systems, for example, will need to be able to anticipate the typical state of knowledge and linguistic (especially lexical) facility of average members of communities of users.

B4. Further Applications

The methodology outlined below focuses narrowly on the construction of MFN as a database of factual generic knowledge of medicine on behalf of English speakers. We anticipate (this is an empirical question) that on the basis of this methodology the difference between MFN and MBN will be very small. As the population of MBN sentences and sources of information is extended, this gap will grow larger, and in such a way as to allow new kinds of research.

We envisage systems for automatic patient diagnosis using now-standard data-mining techniques such as Bayesian indexing (Vasconcelos and Lippman 1998), latent semantic indexing (Rehder et al. 1998, Foltz et al. 1998), and support vector machines (Cristianini and Shawe-Taylor 2000). Thus for example we might associate those collections of utterances stored in MBN which describe symptoms sourced to single patients with metadata recording subsequent diagnosis. The system trained on this corpus could establish patterns of association between specific sequences of utterances and specific diseases; one could then test the degree to which such associations are sufficiently strong as to produce usable automatic diagnosis on the basis of patient inputs. [Cimino: not sure what will be mined…]

In the fields of medical education and medical literacy we envisage MBN/MFN being used to evaluate the reliability of the medical knowledge of different non-expert communities. On the basis of MFN we can imagine the development of tools to support the face-to-face education of lay people in the fields of medicine and health care, e.g. for the purpose of providing general orientation or guidance about a disease or giving general instructions concerning lifestyle, nutrition, etc. On the basis of metadata pertaining to the sources of entries in MBN it will be possible to trace specific kinds of false beliefs to specific kinds of informants. This may prove a valuable source of information, for example in targeting specific groups for specific types of remedial medical education.

In addition, we believe that the extended MBN will provide opportunities for a new type of research in the field of consumer health. Specifically, we envisage experiments to investigate how the domain of medical phenomena is conceptualized by non-expert human subjects. Cognitive psychologists and anthropologists such as E. Rosch and others (Rosch 1973, 1975, 1978) have postulated a level of lexical specification that they call “basic level.” Basic level words correspond to basic kinds in the ontology of language-using subjects. Such words exist in all semantic domains, but they have been studied predominantly among words denoting natural kinds, such as animals, vegetables, and fruit. For example, “tomato” is often cited as an example of a basic level word, whereas “vegetable” is a superordinate and “cherry tomato” is a subordinate. Basic level words have many striking properties: they are universally lexicalized, they are characterized by high frequency of occurrence, and they are learned first by children. The concepts they denote have properties that differ maximally from each other (e.g., a tomato is very different from a cabbage or a bean), but the difference between a basic level word and a subordinate (such as between a tomato and a cherry tomato) is less pronounced. The basic level lexicon in the medical domain has thus far not been explored, but such research promises important theoretical benefits. MBN might be used to determine the basic level in the domain under investigation by examining the difference in the frequency of occurrence of synonyms: highly frequent terms are good candidates for basic level words. Following the precedent set by Rosch (1975), we can then use the results of this work to provide a specification of the non-expert ontology of the medical domain and begin to explore differences between it and the expert ontology of medicine.
In later iterations we envisage pursuing experiments along the lines described in (Keil et al., 1999), designed to elicit a counterpart of MBN representing the ontology of the medical domain as apprehended by children at various ages.
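The frequency heuristic for finding basic level candidates can be sketched as follows. This is a minimal illustration under stated assumptions: the synonym sets, the toy corpus, and the function name are hypothetical, only single-word terms are handled, and raw frequency is taken as the sole criterion.

```python
import re
from collections import Counter

def basic_level_candidates(synonym_sets, corpus):
    """For each synonym set, pick the member that occurs most often in the corpus."""
    counts = Counter(re.findall(r"[a-z]+", corpus.lower()))
    candidates = {}
    for name, members in synonym_sets.items():
        # Ties go to the member listed first.
        candidates[name] = max(members, key=lambda word: counts[word])
    return candidates

# Hypothetical synonym sets and corpus, invented for illustration only.
synonym_sets = {
    "kneecap": ["patella", "kneecap"],
    "windpipe": ["trachea", "windpipe"],
}
corpus = (
    "My kneecap hurts. The doctor said the patella is fine, "
    "but my kneecap still hurts. Some food went down my windpipe."
)
print(basic_level_candidates(synonym_sets, corpus))
```

In practice the counts would come from MBN itself, and frequency would be only one of several indicators of basic level status.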

Note that MBN and MFN have characteristically played different roles in the above. Thus where MBN has been associated with research, for example regarding what people believe about medical phenomena, MFN has been associated with constructing practical tools designed to assist them in coming to believe what is true.

C. Preliminary Results/Progress Report

[This is intended to highlight the related work of your own group. It is where you show the reviewers that you are competent and that you will hit the ground running on the proposed work (for example, that you have experience with collecting evaluations from large groups of subjects).]

C1. From WordNet to Medical Word Net

This project is a radical extension and revision of the medical portions of the Princeton WordNet (Miller, 1995; Fellbaum, 1998). The modifications are threefold: i. WordNet’s coverage will be systematically extended to the medical domain; ii. existing medical terms will be controlled for accuracy; iii. WordNet’s contents, which are presently limited to the lexical (word) level, will be extended to the sentence (propositional) level.

WordNet is a large electronic lexical database of English that has found wide acceptance in areas as diverse as artificial intelligence, natural language processing, and psychology (Agirre et al., 2000; Al-Halimi et al., 1998; Artale et al., 1997; Basili et al., 1997; Burg and Riet, 1998; Cucciarelli and Velardi, 1997; Fellbaum, 1990; Gonzalo et al., 1998; Harabagiu et al., 1996; Magnini et al., 2001; Bewick et al., 1990). Its coverage, comparable to that of a collegiate dictionary, extends over some 130,000 words. The most common application is in information technology, where it is used for information retrieval, document classification, question-answer systems, language generation, and machine translation. WordNet was originally conceived as a full-scale model of human semantic organization, and its design was guided by early experiments in artificial intelligence (Collins and Quillian, 1969).

WordNet is entirely hand-built, reflecting the team’s conviction that automatically compiled dictionaries and thesauri are fraught with errors, as well as the fact that machines are necessarily limited in mimicking the semantic intuitions and linguistic judgments of human beings. Indeed, no automatically compiled lexical resource can compete in coverage and quality with WordNet, which accounts for WordNet’s wide acceptance. Unfortunately, when the WordNet project was initiated, no large text corpora were available (the 1968 Brown Corpus was the only existing balanced corpus, but its coverage is woefully inadequate by today’s standards). WordNet was quickly embraced by the Natural Language Processing (NLP) community, a development that guided its subsequent growth and design, and WordNet is now widely recognized as the lexical database of choice for NLP.

The appeal of WordNet’s design is further reflected in the fact that wordnets have been, and continue to be, built in dozens of languages. In total, about 40 wordnets supporting many European and non-European languages are already available (among them BalkanNet, NordicNet, IndianNet) and more are in process of construction. EuroWordNet supports 8 languages (Czech, Dutch, English, Estonian, French, German, Italian and Spanish) with varying coverage. Information on these projects can be found on the website of the Global WordNet Association (http://www.globalwordnet.org). All wordnets are linked to the original English WordNet, which functions as an “interlingual index.” As a consequence, all wordnets can be mapped to one another. This means that the medical terminology that we propose to add to WordNet will ultimately be translatable into dozens of languages with very little additional effort. Our methods for constructing Medical Word Net and validating its formal architecture will also take into account some of the additional features particular to wordnets of other languages, such as cross-part-of-speech links.

C2. Architecture of WordNet

The building blocks of WordNet are synonym sets (‘synsets’), which are unordered sets of distinct word forms, or lexemes. Membership in a synset requires that the lexemes refer to the same concept and be ‘cognitively synonymous’ (Cruse, 1986). More formally, synset members must be interchangeable in some sentential contexts without altering the truth-value of the sentences involved. Examples of synsets (marked here by curly braces) are {car, automobile} and {shut, close}. The current version (2.0) of WordNet contains some 115,000 synsets.
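A synset can be modeled as an unordered set of word forms, with synonymy defined as shared membership. The sketch below is illustrative only and does not reproduce WordNet's own file format or API; the function names and the third synset (added to show a word form with more than one sense) are assumptions.

```python
# Synsets as unordered, immutable sets of word forms.
SYNSETS = [
    frozenset({"car", "automobile"}),
    frozenset({"shut", "close"}),
    frozenset({"close", "near"}),  # hypothetical extra sense of "close"
]

def are_synonyms(a, b, synsets=SYNSETS):
    """Two word forms are synonyms if at least one synset contains both."""
    return any(a in s and b in s for s in synsets)

def senses(word, synsets=SYNSETS):
    """All synsets in which a word form occurs; more than one means polysemy."""
    return [s for s in synsets if word in s]

print(are_synonyms("car", "automobile"))
print(len(senses("close")))
```

Because a word form may belong to several synsets, the synset rather than the word form is the unit that the relations described next connect.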

The synsets are linked to one another via a small number of binary relations that differ for each of the four syntactic categories covered by WordNet: nouns, adjectives, verbs, adverbs.

Noun synsets are interlinked by means of the subtype or IS-A relation, as exemplified by the pair poodle-dog, and by means of the part-whole or HAS-A relation, linking noun synsets like tire and car. Verb synsets are connected by a variety of lexical entailment pointers (Fellbaum, 1998, 2002, 2003 = Appendices A1 and A2) that express manner elaborations, temporal relations, and causation (walk-limp, snore-sleep, forget-know, show-see). The links among the synsets structure the noun and verb lexicons into hierarchies, with noun hierarchies being considerably deeper than those for verbs.

Relations like IS-A and HAS-A are called ‘conceptual’ or ‘semantic,’ because they hold among all the members of a linked synset. In addition, WordNet has lexical relations, which hold between specific word forms above and beyond their semantic relations. This is the case with adjectives, which are organized into clusters consisting of a pair of direct antonyms (such as expensive and cheap) together with adjectives that are semantically similar to each member of such a pair (costly and low-cost, respectively). The semantically similar adjectives are said to be indirect antonyms of one member of the direct antonym pair. Thus, low-cost is an indirect antonym of expensive, and costly is an indirect antonym of cheap. Although the semantic relation of contrast holds between direct and indirect antonyms, direct antonym pairs stand out by virtue of the strong association between the members (Fellbaum 1995).

WordNet’s appeal for NLP applications stems from the fact that the semantic relations can be exploited for word sense disambiguation; this has been the major stumbling block for NLP applications developed thus far for purposes such as information retrieval, machine translation, question-answer systems, text summarization, and language generation. Although most word forms in English are monosemous, the most frequently occurring words are highly polysemous. The ambiguity of a polysemous word in a context can be resolved by distinguishing the multiple senses in terms of their links to other words. For example, the noun club can be disambiguated by an automatic system that considers the superordinates of the different synsets in which this word form occurs: association, playing card, and stick.

C3. Medical Coverage in WordNet 2.0

WordNet was designed as a general-purpose lexical resource rather than as a dictionary aimed at specific technical applications. It was constructed by lay people, who added only those terms about whose meanings they were confident. As a result, WordNet’s coverage of domains like medicine, physics, and geology is very limited. The most recent version of WordNet (2.0) contains domain labels attached to thousands of synsets, a feature which allows the automatic extraction of all words that are associated with a given domain. One such label is medicine (see Appendix B1). Currently, when asked to output terms associated with medicine, the browser returns several hundred nouns, verbs, and adjectives. But it is obvious that there are many hundreds more that will need to be entered to construct the Medical Word Net database on the basis of our work on MBN/MFN. Examining WordNet’s coverage of medical terms also shows that it represents a mixture of folk and expert vocabulary. Some synsets contain only folk or only technical terms, others both. The idiosyncratic and incomplete coverage reflects the lack of expertise on the part of the lexicographers and will be rectified with Medical Word Net.

WordNet’s design allows users with specific technical applications in mind to augment the database, mostly by adding terms as “leaves” to the existing branches. Such enriched wordnets retain all of the original information, and the added words are semantically specified in terms of WordNet’s relations (see, e.g., Turcato et al., 2000).

C4. Uses of WordNet in Medical Informatics

Buitelaar and Sacaleanu (2002) describe the extension of the German wordnet with synsets pertaining to the medical domain. They use automatic methods, in particular the detection of semantic similarity from co-occurrence patterns in a domain-specific corpus. Their results, while good, are hampered by problems of lexical polysemy and by the characteristically German problem of the need to analyze compounds. One clear conclusion from their study is that fully automated lexical acquisition provides inadequate results, and that much of the work must be performed manually. Our proposal reflects this by combining both approaches.

Bodenreider and Burgun (2002) and Burgun and Bodenreider (2002) characterize the definitions of anatomical concepts in WordNet and in the UMLS. They found that anatomical definitions are characteristically of the form: superordinate + distinguishing feature (the latter expressed through some form of adjectival modification or relative clause, etc.). This way of defining words is in fact the canonical one (for nouns), which lexicographers observe as much as possible. WordNet did not always observe this standard because of its history (see above) and because it was not compiled by professional lexicographers. Clearly, the presence of the superordinate in a word’s definition makes the entry amenable to automatic disambiguation, and brings other benefits (Smith and Rosse, Appendix A5). Hence one component of the proposed work is the augmentation and standardization of the definitions in WordNet’s medical sublexicon.

C5. Lexicographic Issues

Polysemy is a feature of human language that does not pose problems for speakers; however, it has proved not fully tractable in NLP. Of special interest to our proposal are those words that are polysemous along the medical/non-medical axis. For example, calculus can refer either to a branch of mathematics or to a concretion of mineral salts that has formed in an organ, as in renal calculus. Another case is felon, which in ordinary language refers to a criminal but in medical language denotes a purulent infection in a finger or toe. A lexical database must include and clearly differentiate all meanings. For while in communication between laypersons and medical practitioners there is always the possibility of follow-up in cases of misunderstanding, computerized medical information systems must be safeguarded from the outset against the potential consequences of the miscommunication to which such terms can give rise.

Another type of lexical item that requires special attention is the collocation (or “pre-coordination”), such as nape of the neck: a sequence of words separated by blank spaces which constitutes a single syntactic and semantic unit. MWN will recognize collocations as lexical units.

The Medical Fact Net project emanates from the idea of storing propositional knowledge as a counterpart to the lexical knowledge stored in WordNet. Since propositions, judgments, statements etc. are all affected by the issue of truth, the goal of Fact Net is to reveal information about the usage of words in contexts where truth is an issue. Without representing propositions, WordNet’s design is grounded in the notion of truth, as the truth-preserving interchangeability of word forms in a set of contexts is one of the conditions on synonymy and on the construction of synsets, and thus on the delineation of WordNet’s conceptual units. Fact Net can help to make the contexts of the usage of word forms explicit. It can thereby provide empirical evidence for determining the contextual constraints relevant to WordNet both explicitly in the glosses and implicitly in the set-up of synsets.

WordNet represents semantic information about concepts primarily in terms of semantic relations such as parthood or hyper-/hyponymy. Ontology-based approaches use or plan to use these representations as a traceable and thus translatable circumscription of common-sense knowledge. Besides this explicit information, however, a large part of WordNet’s non-linguistic information is stored in the mere structure and the definitions of concepts or synsets. The source of this information is propositional non-linguistic knowledge of two sorts: encyclopedic (i.e. factual) knowledge, represented in WordNet’s glosses, and lexical (i.e. verbal) knowledge, stored in WordNet’s synset architecture. Propositional knowledge involves a large variety of ontological categories, however, and this variety by far exceeds what WordNet represents in its current architecture. Extracting these ontological categories from the corpora will provide us with some insights into how the hidden content of the non-linguistic (encyclopedic and lexical) knowledge in WordNet can be interpreted and, in some cases, re-modeled. WordNet’s definitions and glosses have sometimes been criticized for being constructed largely ab ovo. MFN will rectify this defect since it will be systematically rooted in naturally occurring utterances.

WordNet does not contain any propositions. However, some of the relations that link verbs and verb synsets allow one to move from word to proposition. Statements can be derived both from the hyponymy and meronymy relations among nouns in WordNet and from the arcs linking verbs and verb synsets. Most of the latter are "manner" arcs, that is, they relate two verbs when one refers to a specific manner of the other (Fellbaum, 2002). For example, (to) burn is (to) hurt in some specific manner. When burn (in the relevant sense) is used in a sentence such as The skin on my arm burns, we can infer that The skin on my arm hurts is also true (though the reverse does not hold).

Similarly, WordNet codes verb pairs with an arc labeled "backward presupposition," as in the example "heal" and "treat." On the level of statements, this relation says that if someone (such as a medical practitioner) heals someone (such as a patient), then it must be true that she treated the patient.

Another entailment relation that does not involve manner or backward presupposition is exemplified by the pair "snore" - "sleep." Thus, if someone snores, he necessarily also sleeps (again, the reverse does not hold).

WordNet also links verb synsets via a cause relation. For example, “show” is linked to “see” and marked as its causing event, since showing (something to someone) causes (someone) to see (something).

Finally, some verbs are linked by an opposition relation, as is the case with the pair "survive" and "succumb." They mutually exclude each other, so that only one of the two statements can be true: The burn victim survived. The burn victim succumbed. Contrast relations among verbs allow inferences involving statements that contain such verbs.
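The move from labeled verb arcs to statements can be sketched with one sentence template per relation type. The arcs below are the verb pairs discussed above; the relation labels, the template wordings, and the function name are assumptions made for illustration and are not WordNet's own pointer names.

```python
# (source verb, relation label, target verb), taken from the examples above.
ARCS = [
    ("burn", "manner", "hurt"),
    ("heal", "backward_presupposition", "treat"),
    ("snore", "entailment", "sleep"),
    ("show", "cause", "see"),
    ("survive", "opposition", "succumb"),
]

# Hypothetical templates; infinitive forms avoid the need for inflection.
TEMPLATES = {
    "manner": "To {a} is to {b} in some specific manner.",
    "backward_presupposition": "To {a} a patient, one must first {b} the patient.",
    "entailment": "To {a}, one must {b}.",
    "cause": "To {a} something to someone is to cause them to {b} it.",
    "opposition": "One cannot both {a} and {b}.",
}

def derive_statements(arcs):
    """Turn each labeled arc into an English sentence via its relation's template."""
    return [TEMPLATES[rel].format(a=a, b=b) for a, rel, b in arcs]

for sentence in derive_statements(ARCS):
    print(sentence)
```

Sentences generated in this way would still need the understandability and correctness ratings applied to all other candidate statements.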

There are thousands of labeled arcs in WordNet's verb lexicon, and deriving sentences involving the verbs they link will be relatively straightforward.

[SHOULD BE C2?]

C6. Related work

MFN will constitute a large database of facts with associated formal architecture. In this respect it is comparable to the CYC project (Guha et al. 1990, Lenat 1995), which reflects in part the project of a fact database for ‘naïve physics’ outlined in (Hayes, 1985) and pursued also in the Botany Knowledge Base (Clark and Porter 1996) and in DARPA’s current Rapid Knowledge Formation project. The database of CYC (short for “encyclopedia”) is a collection of hundreds of thousands of statements about the world, such as: The earth is round, Deer live in the woods, Albany is the capital of New York. These sentences were entered over many years by CYC employees in the symbolism of first-order logic. They constitute a family of separate micro-theories devoted to different domains, with new theories being added at intervals. Our project differs in a number of ways from CYC: (i) we focus on one single (albeit very large) domain; (ii) while CYC incorporates machinery for parsing natural language text, it does not store English sentences but rather only formal representations thereof; (iii) CYC does not care about consistency of its separate micro-theories, whereas we will ensure that all the sentences in MFN have been rigorously validated as being true and mutually consistent; (iv) CYC’s coverage is uneven, and its coverage of the medical domain is strong for anatomy and (for example) for vocabulary governing infections such as anthrax, but superficial in relation to medical processes; (v) CYC is largely proprietary and only a reduced part, OpenCyc, is publicly available. (On CYC’s medical coverage see Bodenreider and Burgun, in press.) Compare also expert fact database projects such as the Beilstein Biochemistry fact database.

FrameNet (Baker, 1998; Fillmore, 1982) is a database containing not only lexical information, but information about the meaning and use of lexical items in contexts. The building blocks of FrameNet are Frames, conceptual units such as commerce, religion, and health. With each frame Frame Elements (FEs) are associated. For example, the Cure frame involves FEs like patient, caregiver, medicine, and illness. These FEs can be mapped onto different words in a variety of statements. For example, in Quinine cures malaria, quinine is mapped to medicine and malaria to illness. In John went to the orthopedist, John is the patient and orthopedist the caregiver. A major insight of FrameNet is that semantic constituents (FEs) are independent of specific syntactic categories like subject, object, etc. In the first sentence, quinine (the medicine) is the subject, whereas in the second sentence, John (the patient) is the subject. A frame exists as one of several possible constellations of FEs. The FrameNet database contains several hundred frames, each with multiple FEs; frames are exemplified via sentences drawn from the British National Corpus. The sentences are shown each with its specified FEs. A lexicon shows the FEs to which a given lexical item can be mapped. If a lexical item can be mapped to several FEs (in distinct frames), this means that it is polysemous. For example, cure can be an FE in a health frame and in a cooking frame. FrameNet contains a variety of frames related to medicine and health issues (see Appendix B4); however, its lexical coverage is uneven, and it is accompanied by a statement-oriented corpus of the sort which MFN requires. Penn’s PropBank is similar. Both can be used to help semantic tagging.
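The independence of frame elements from syntactic position can be made concrete with a small data sketch. The record layout and function names below are assumptions for illustration and do not reflect FrameNet's actual data format; the two sentences and their FE assignments are the ones given above.

```python
# The Cure frame with the frame elements named in the text.
CURE_FRAME = {
    "name": "Cure",
    "elements": {"patient", "caregiver", "medicine", "illness"},
}

# Each annotation records the grammatical subject and the word-to-FE mapping.
ANNOTATIONS = [
    {
        "sentence": "Quinine cures malaria",
        "subject": "quinine",
        "fes": {"quinine": "medicine", "malaria": "illness"},
    },
    {
        "sentence": "John went to the orthopedist",
        "subject": "John",
        "fes": {"John": "patient", "orthopedist": "caregiver"},
    },
]

def subject_role(annotation):
    """Return the frame element filled by the grammatical subject."""
    return annotation["fes"][annotation["subject"]]

def is_valid(annotation, frame=CURE_FRAME):
    """Check that every assigned FE belongs to the frame."""
    return set(annotation["fes"].values()) <= frame["elements"]

for a in ANNOTATIONS:
    print(a["sentence"], "->", subject_role(a))
```

The same syntactic slot (subject) is filled by the medicine in one sentence and by the patient in the other, which is the point made in the text.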

D. Research Design and Methods

D1. Sources

The primary sources for MWN will be lexical knowledge bases such as:

a. the relevant general lexical information contained in WordNet (see Appendix B1);
b. lexical knowledge-bases of lay medical vocabulary (Tse, 2002; Appendix B3);
c. medical dictionaries and large medical terminology and ontology systems such as the UMLS Specialist Lexicon and the Foundational Model of Anatomy.

MBN will utilize in addition fact or statement knowledge bases, such as:

d. open-source linguistic corpora, public health documents, internet resources such as MEDLINEPlus, the Mayo Clinic, Medfacts, Merck, as well as internet sites maintained by non-experts for purposes of discussing medical questions;
e. the relevant example sentences of the FrameNet corpus (Baker et al., 2003) and the Penn Proposition Bank (Kingsbury and Palmer 2002);
f. free text sources of the sort described for example in (Slaughter 2002);
g. the results of transforming the content of lexical knowledge bases (especially WordNet) into statements (see Appendix B2). [how many sentences?]

In deriving MBN and MFN sentences from lexical databases and similar resources, including medical ontology resources, we use the following procedure. We regard the database as a set of links tLt' between terms (where L ranges over 'is-a', 'part-of', 'is-caused-by', etc.). We form the subset of this set by restricting the values of t and t' to those which occur in MWN, our lexical database of non-expert medical terms. Some members of the resulting class of tLt' formulae can then be transformed into English sentences automatically. For example, each formula of the form t is-a t' can be transformed into a sentence of the form "a t is a type of t'". Other tLt' formulae can be converted by hand into English sentences; for example, "forearm HAS-PARTIAL-MATERIAL-OVERLAP wrist" can be transformed into "the forearm overlaps with the wrist" and "the wrist overlaps with the forearm". The resulting list of sentences is then subject to post-processing in the usual manner (a) by non-experts for understandability and assent, and (b) by experts for correctness. In this way we can do much to rectify the problem of supplementing the informative statements found in corpora and in normal discourse with those elementary statements required for the completeness of MFN, which would in normal contexts count as uninformative.
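The procedure just described can be sketched as follows. This is a minimal illustration, assuming that links are available as (t, L, t') triples and that MWN is represented as a plain set of terms; the function name and the sample data (apart from the forearm/wrist link) are hypothetical.

```python
def derive_sentences(links, mwn_terms):
    """Split tLt' links over MWN terms into automatic sentences and manual work."""
    automatic, manual = [], []
    for t, relation, t2 in links:
        # Keep only links whose two terms both occur in MWN.
        if t not in mwn_terms or t2 not in mwn_terms:
            continue
        if relation == "is-a":
            automatic.append(f"a {t} is a type of {t2}")
        else:
            # Other relations are queued for conversion by hand.
            manual.append((t, relation, t2))
    return automatic, manual

# Hypothetical sample links and vocabulary, for illustration only.
links = [
    ("thighbone", "is-a", "bone"),
    ("forearm", "HAS-PARTIAL-MATERIAL-OVERLAP", "wrist"),
    ("osteoclast", "is-a", "cell"),  # dropped: "osteoclast" is not in MWN here
]
mwn_terms = {"thighbone", "bone", "forearm", "wrist", "cell"}

automatic, manual = derive_sentences(links, mwn_terms)
print(automatic)
print(manual)
```

Both output lists would then pass through the same rating by non-experts and experts as any other candidate sentence.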

We shall maintain an internet portal through which these sources will be made available online as raw data for use by other researchers.

D2. Selection and Validation

In this initial phase of the MFN project we are interested primarily in generic statements with a relatively simple syntax such as: The heart pumps blood. People need food and water to survive. Sugar is bad for the teeth. Blood flows through the body. Taking an aspirin a day can reduce the risk of heart attack. We will subject all corpora to an initial linguistic preprocessing, selecting syntactically simple sentences that convey generic knowledge in a self-contained way, and storing statements of other types for analysis at a later stage.

Generic statements which survive this preprocessing will be assessed by a representative sample of non-expert and expert human subjects, as follows:

We will gather data from human participants in several stages of the work. In the first experiment, laypersons will be recruited at Princeton to rate the generic statements for understandability and agreement. In the second phase, medically trained participants (“experts”) will rate the statements in Buffalo. For the first experiment, we use the sentences from those sources listed under D1. For the second experiment, in Buffalo, we use those statements that the Princeton participants rated as understandable and assented to as correct.

Participants: For the first experiment, Princeton undergraduates will be recruited. The students will have backgrounds in all areas of study, but pre-medical majors will be excluded. 150 participants will be asked to fill out a questionnaire, including information about their major and possible course work related to the medical domain. Each will be paid $8 for this task, which we do not anticipate will require more than one hour.

Method: We will elicit judgments in two successive stages: 1. Rating for understandability. Each of 400 statements, randomized across the questionnaires, will be rated first of all for understandability by two participants, making for a total of 120,000 (400 x 150 x 2) ratings. Rating will be done by ticking the appropriate answer box next to each statement. Raters will be asked to assign one judgment to each statement from: “I understand this statement”, “I do not understand this statement”. Raters will be encouraged not to reflect on successive statements but to pass immediately on to the next statement (if they are unsure they are to leave the corresponding checkbox blank). Only those statements which are judged as understandable by both subjects will be used as input for the second stage:

2. Rating for assent. Raters will be asked to evaluate 200 statements in terms of their correctness. Raters will be asked to assign one judgment to each statement from: “I think the statement is false;” “I am uncertain about the correctness of the statement;” “I think the statement is true.” Raters will be encouraged to reflect upon their answers if necessary.

Assent judgments will be translated into numerical values, 0.0 for a judgment that the sentence is false, 0.5 for an uncertain judgment, and 1.0 for a judgment that the statement is true. This numerical score will make it easier to record and manipulate the judgments in the database where they will be collected. Statements receiving from 2 raters a total score of at least 1.5 will be stored as Medical Belief Net (MBN).

Some of the sentences in MBN will derive from sources in D1 which have already been pre-validated by experts. These sentences will be stored as the core of Medical Fact Net (MFN) and can be used to validate both expert and non-expert raters. Individuals with very high scores … Those sentences in MBN which do not derive from such sources will be assessed for correctness by medical experts recruited from the Buffalo Medical School. The method for this experiment will be similar to that for the second stage of the Princeton experiment. However, only those sentences which receive a score of 1.0 from each of two raters will be added to the MFN database. [how will they judge? Literature resources?] In relation to those sentences which receive a score of 0.5, raters will be encouraged to propose one or more alternatives which qualify for a 1.0 score and which would be intelligible to non-experts. These proposals will then return to the first phase for assessment by non-experts.
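The two admission rules can be summarized in a short sketch: a statement enters MBN when two lay raters give it a combined assent score of at least 1.5, and enters MFN only when each of two expert raters gives it 1.0. The judgment labels and function names are assumptions made for illustration.

```python
# Numerical values for assent judgments, as specified above.
ASSENT_SCORES = {"false": 0.0, "uncertain": 0.5, "true": 1.0}

def scores(judgments):
    """Translate exactly two judgments into their numerical values."""
    if len(judgments) != 2:
        raise ValueError("each statement is rated by exactly two raters")
    return [ASSENT_SCORES[j] for j in judgments]

def admit_to_mbn(lay_judgments):
    """MBN rule: combined score from two lay raters of at least 1.5."""
    return sum(scores(lay_judgments)) >= 1.5

def admit_to_mfn(expert_judgments):
    """MFN rule: a score of 1.0 from each of two expert raters."""
    return all(s == 1.0 for s in scores(expert_judgments))

def needs_alternative(expert_judgments):
    """Experts are asked to propose rewordings for statements scored 0.5."""
    return 0.5 in scores(expert_judgments)

print(admit_to_mbn(["true", "uncertain"]))   # combined score 1.5
print(admit_to_mfn(["true", "uncertain"]))   # one expert is not certain
```

Under these rules a statement rated "true" and "uncertain" by lay raters qualifies for MBN, while the same pair of ratings from experts keeps it out of MFN and triggers a request for an alternative formulation.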

Each entry in MBN/MFN will be stored with metadata recording scores on these three axes together with: information concerning source of data (including information pertaining to evaluators, to symptoms and diagnosis for entries derived from patient records or patient utterances, and to documentation for entries derived from medical literature).

D3. Formal Architecture

The formal architecture for MBN/MFN/MWN will be constructed in three steps:

Linguistic annotation of MBN: the sentences in MBN will be linguistically processed by both automatic and manual methods. First, the corpus will be part-of-speech tagged with the Brill tagger and parsed using the Collins or the Stanford Lexicalized Parser (LKB). [references needed] Finally, all nouns, verbs, and adjectives in MBN will be linked to the appropriate synsets in Medical WordNet. This last step cannot be performed fully automatically, as many words will be polysemous. We expect this to be the case for verbs and nouns more so than for adjectives. After automatically tagging the monosemous words, polysemous words will be linked manually.

MFN Ontology: [ADD CEUSTERS STUFF] On the basis of the knowledge expressed in MFN we shall develop a detailed determination of its associated medical ontology, drawing on our own recent work in (Appendices A1-3) and also in (Smith, Papakin and Munn, 2003; Ceusters, Smith and Fielding 2004; Grenon, Smith and Goldberg 2004). [ADD STUFF ON SACHVERHALTE (states of affairs)]

Mapping to the UMLS Specialist Lexicon: It will not be possible to map non-expert to expert terms in a one-to-one fashion. First, many non-expert terms are themselves used by experts and do not have an equivalent in the expert lexicon. Where expert equivalents exist, we must proceed via the intermediary ontology through which the relevant meaning or meanings of the relevant non-expert term are specified along the lines illustrated in (Appendix B2). Typical examples are lay terms such as “brain” (which does not take account of the existence of two very distinct organs: the cerebrum and the cerebellum), or “throat” (a very fuzzy term for a region overlapping with the head and the neck, and for which no layperson can give an adequate description or definition, since he or she is not aware of the expert distinction between pharynx, hypopharynx, epipharynx, oropharynx, nasopharynx, and so forth). We do not know whether it is possible to maintain the existing WordNet synsets, which contain both lay and expert terms. This in itself is an important research question.
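The division of labor in the linguistic annotation step (automatic linking of monosemous words, manual linking of polysemous ones) can be sketched as a simple triage over a sense inventory. The inventory layout, sense labels, and function name are hypothetical; tagging and parsing are assumed to have been done already.

```python
def triage(words, sense_inventory):
    """Link monosemous words automatically; queue the rest for manual work."""
    auto_linked, manual_queue, unknown = {}, [], []
    for word in words:
        senses = sense_inventory.get(word, [])
        if len(senses) == 1:
            auto_linked[word] = senses[0]   # only one candidate synset
        elif len(senses) > 1:
            manual_queue.append(word)       # polysemous: a person must decide
        else:
            unknown.append(word)            # not yet covered by the lexicon
    return auto_linked, manual_queue, unknown

# Hypothetical sense inventory, for illustration only.
sense_inventory = {
    "forearm": ["forearm (body part)"],
    "calculus": ["calculus (mathematics)", "calculus (concretion)"],
    "felon": ["felon (criminal)", "felon (infection)"],
}

auto_linked, manual_queue, unknown = triage(
    ["forearm", "calculus", "felon", "oropharynx"], sense_inventory
)
print(auto_linked)
print(manual_queue)
print(unknown)
```

Words missing from the inventory are reported separately, since they indicate gaps in coverage rather than ambiguity.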

Polysemy in MWN is expected to be high for consumer health language. If we consider this language as the incarnation of a very coarse-grained window on medical reality, we will see that many different (though closely related) entities will be referred to by the same terms. Polysemy will be high also for expert language because of the phenomenon which allows language users to be less precise when certain terms are used in certain contexts: for example knee for knee joint (and not the surface region), diabetes for diabetes mellitus (and not diabetes insipidus). We will supply uniform definitions for the medical terms in MWN designed to support automatic inference of statements, for example concerning sub-/superordination and other semantic relations, for incorporation into MFN. [walking through a whole text example would help]

D4. Problems to be Addressed

Addressing the problem of providing interfaces for medical information systems tailored to the needs of non-expert users will require a determination of the common basis of background knowledge about medicine and health care which (as we shall argue) non-experts can be presumed to share. Unfortunately almost every term in a phrase like “generic medical knowledge of (non-expert) adults” raises problems.

Genericity: Much generic medical knowledge relates to what holds for the most part or in most cases or in a statistically significant fraction of cases (consider: smoking causes cancer). The medical experts assigning correctness scores to MBN entries will be instructed to pay special attention to such sentences, and to propose alternatives which will be assessable as correct yet still be intelligible to non-experts.

Medical knowledge is inextricably intertwined with knowledge of other domains (for example with knowledge of food and drink, fire and smoke, climate and weather, and so on), so that it is far from being a trivial matter to draw a boundary around the domain of ‘non-expert medical knowledge’ that would distinguish it from a database of facts targeting common-sense knowledge in its entirety.

Initially we will limit ourselves to statements that are understandable and acceptable as single statements. We anticipate that many sentences will be such that subjects find it difficult to judge their acceptability in isolation from a surrounding context, and we may need at a later stage to address the issue of including also multi-sentence texts in our corpus.

Knowledge is problematic not least because much of the medical knowledge of experts and non-experts alike takes the form of knowledge of specific cases (Aunt Mary’s arthritis is always worse in the winter). Many candidate facts in MFN will moreover describe regularities which hold in the typical case but which are known, even by non-experts, to have exceptions. MFN should be a repository of medical knowledge that is generic and context-independent, the counterpart of the theoretical knowledge of the sciences. Note that lexical knowledge of the sort stored in WordNet, too, is both generic and context-independent. Terms in everyday language are used in highly contextually dependent ways. WordNet’s synsets resolve this.

A related problem pertains to elementary facts. If MFN truly is to represent the knowledge of medical phenomena shared by experts and non-experts then it must include also representations of facts such as: People have two eyes. Babies are born. Arms move. Because such elementary facts are part of the implicit background knowledge which subjects do not verbalize in spontaneous discourse, the corresponding sentences are also not usually included in linguistic corpora. We shall solve this problem in two ways: first by carrying out suitable human subjects experiments based on prompting tasks and questionnaires, and second by exploiting the resources already contained in medical terminology systems (for example as illustrated in Appendix B2). WordNet, too, contains considerable coverage of elementary facts of the forms A is a type of B / A is a part of B. [work through an example]

Expertise is problematic, since it is obvious that a crisp separation of expert and non-expert sentences is impossible. Due to the influence of modern media, medical education, and personal experience of medical treatment, non-experts, too, will occasionally use expert terms in non-expert statements, just as experts will use non-expert terms in expert statements. This effect seems to be part of a general mutual accommodation of the two groups as medical information is passed back and forth across the linguistic divide. [The question is: will the non-experts use the expert terms as an expert would, and would the experts use the non-expert terms as a non-expert would? I wonder if you’ll have any way of telling in many cases …]

This general difficulty is aggravated when subjects are asked to produce non-expert statements, since then, every subject has to decide what is expert and what is non-expert. It should be expected that the subject’s dealing with these problems in a self-conscious and intentional way produces certain artifacts (e.g. an excessively simplified language).

We can address this problem in part by means of careful experimental structure. In addition we can turn to recent studies on professional and non-professional medical vocabulary (Tse and Soergel 2003, Tse 2003), which provide a methodology and an empirical basis for distinguishing between expert and non-expert medical language. [better describe it] http://lhncbc.nlm.nih.gov/cgsb/staff/tse_tony/CMV2.pdf

A final problem is one of size: the domain of medical knowledge is vast, and while the domain of non-expert medical knowledge is more restricted, its scope would until very recently have rightly been considered overwhelming. The success of WordNet gives us confidence that this problem, too, can be solved. One part of our task will be to transform (the medical sublexicon of WordNet) into a large corpus of generic beliefs about the corresponding phenomena in reality. This we do in effect by turning WordNet on its side; that is, (in the simplest possible case) we transform a relation such as {t1, …, tn} IS-A {t´1, …, t´m} into n x m sentences of the form: ti IS-A t´k. Many of these sentences will not survive the filter of acceptability to non-experts. (Those which do not survive the filter of correctness will correspond to errors in WordNet itself.) [work through an example]
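The simplest case of “turning WordNet on its side” can be sketched as follows. The two synsets are illustrative examples and are not guaranteed to match the current WordNet entries; the function name is hypothetical. A relation between a synset of n terms and a synset of m terms yields n x m candidate sentences, which would then be passed to the acceptability and correctness filters.

```python
from itertools import product


def expand_is_a(hyponym_synset, hypernym_synset):
    """Turn one synset-level IS-A relation into n x m term-level sentences."""
    return [f"{t} IS-A {u}" for t, u in product(hyponym_synset, hypernym_synset)]


# Illustrative synsets (assumed for this example, not checked against WordNet).
hyponyms = ["influenza", "flu", "grippe"]        # n = 3
hypernyms = ["contagious disease", "contagion"]  # m = 2

sentences = expand_is_a(hyponyms, hypernyms)
for sentence in sentences:
    print(sentence)
```

With these inputs the sketch produces six candidates, from “influenza IS-A contagious disease” to “grippe IS-A contagion”. A candidate such as “grippe IS-A contagion” is the kind of sentence that might fail the filter of acceptability to non-experts even if it is judged correct.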

The development of MFN should be seen as part and parcel of recent advances in the biomedical sciences of similar scope, above all in the development of large lexical resources such as SNOMED and the UMLS and of fact-repositories such as KEGG or Swiss-Prot.

Literature

Agirre E, Martinez D. Exploring automatic word sense disambiguation with decision lists and the Web. Proceedings of the Semantic Annotation and Intelligent Annotation Workshop organized by COLING, Luxembourg, 2000.

Al-Halimi R, Kazman R. Temporal indexing through lexical chaining. In: Fellbaum C (ed.), WordNet: An Electronic Lexical Database, MIT Press, Cambridge, MA, May 1998.

Artale A, Magnini B, Strapparava C. WordNet for Italian and its use for lexical discrimination. Proceedings of the 5th Congresso dell’Associazione Italiana per l'Intelligenza Artificiale, Rome, September 1997; 16-19.

Baker CF, Fillmore CJ, Cronin B. The structure of the framenet database. International Journal of Lexicography, 2003; 16.3: 281-296.

Baker CF, Fillmore CJ, Lowe JB. The Berkeley FrameNet project. Proceedings of the COLING-ACL, Montreal, Canada, 1998.

Basili R, DellaRocca M, Pazienza MT. Contextual word sense tuning and disambiguation. Applied Artificial Intelligence 1997; 11 (3): 235-262.

Beckwith R, Fellbaum C, Gross D, Miller G. WordNet: A lexical database organized on psycholinguistic principles. In: Zernik U (ed.), Using On-line Resources to Build a Lexicon. Erlbaum, Hillsdale, NJ, 1990; Chapter 9: 211-231.

Bodenreider O, Burgun A. Characterizing the definitions of anatomical concepts in WordNet and specialized sources. Proceedings of the First Global WordNet Conference, January 2002; 223-230.

Bodenreider O, Burgun A. Ontologies in the biomedical domain. Part II: examples. Journal of the American Medical Informatics Association (in press).

Bodenreider O, Burgun A, Mitchell JA. Evaluation of WordNet as a source of lay knowledge for molecular biology and genetic diseases: a feasibility study. Studies in Health Technology and Informatics 2003; 95: 379-384.

Buitelaar P, Sacaleanu B. Extending synsets with medical terms: ranking and selecting synsets by domain relevance. Proceedings of the First Global WordNet Conference, Mysore, India, January 2002.

Burg JFM, van de Riet RP. COLOR-X: using knowledge from WordNet for conceptual modeling. In: Fellbaum C (ed.), WordNet: An Electronic Lexical Database, MIT Press, Cambridge, MA, May 1998.

Burgun A, Bodenreider O. Comparing terms, concepts and semantic classes in WordNet and the Unified Medical Language System. Proceedings of the NAACL Workshop on WordNet and Other Lexical Resources, Pittsburgh, 2001; 77-82.

Ceusters W, Smith B, Fielding JM. LinkSuite™: Formally robust software tools for ontology-based data and information integration. Proceedings of DILS 2004 (Data Integration in the Life Sciences), (Lecture Notes in Computer Science), Springer, Berlin, 2004, in press.

Clark P, Porter B. Building domain representations from components, Technical Report AI96-241, Department of computer science, University of Texas at Austin, 1996.

Collins AM, Quillian MR. Retrieval time from semantic memory, Journal of Verbal Learning and Verbal Behavior 1969; 8: 240-248.

Cristianini N, Shawe-Taylor J. An introduction to support vector machines (and other kernel-based learning methods), Cambridge University Press, Cambridge, UK, 2000.

Cruse DA. Lexical semantics. Cambridge University Press, Cambridge, UK, 1986.

Cucchiarelli A, Velardi P. Automatic selection of class labels from a thesaurus for an effective semantic tagging of corpora. Proceedings of the 5th Conference on Applied Natural Language Processing, Washington, 1997; 380-387.

Ely JW, Osheroff JA, Gorman PN, Ebell MH, Chambliss ML, Pifer EA, Stavri PZ. A taxonomy of generic clinical questions: classification study. British Medical Journal, 2000; 321: 429-432.

Fellbaum C. Co-occurrence and antonymy. International Journal of Lexicography 1995; 8 (4): 281-303.

Fellbaum C. Distinguishing verb types in a lexical ontology. Proceedings of the Second International Workshop on Generative Approaches to the Lexicon. ISSCO, Geneva, 2003.

Fellbaum C. English verbs as a semantic net. International Journal of Lexicography 1990; 3 (4): 278-301.

Fellbaum C. On the semantics of troponymy. The Semantics of Relationships: an Interdisciplinary Perspective. Green R, Bean CA, Myaeng SH (eds.), Dordrecht, Kluwer, 2002; 23-34.

Fellbaum C (ed.). WordNet: an electronic lexical database. MIT Press, Cambridge, MA, 1998.

Fielding JM, Simon J, Ceusters W, Smith B. Ontological Theory for Ontology Engineering, in KR 2004. Proc Ninth Internat Conf Knowledge Representation and Reasoning.

Fillmore CJ. Frame semantics. In: Linguistics in the morning calm, Hanshin Publishing Co, Seoul, 1982; 111-137.

Fogg BJ, Soohoo C, Danielson D, Marable L, Stanford J, Tauber ER. How do people evaluate a Web site's credibility? Results from a large study. A Consumer WebWatch research report, prepared by Stanford Persuasive Technology Lab, Stanford University, Stanford, CA, 2002.

Foltz PW, Kintsch W, Landauer TK. The measurement of textual coherence with latent semantic analysis. Discourse Processes, 1998; 25: 285-307.

Gonzalo J, Verdejo F, Chugur I, Cigarran J. Indexing with WordNet synsets can improve text retrieval. Proceedings of the COLING/ACL Workshop on Usage of WordNet in Natural Language Processing Systems, Montreal, 1998.

Good BJ. Medicine, Rationality, and Experience: An Anthropological Perspective. Cambridge University Press, Cambridge, 1994.

Good BJ, DelVecchio Good MJ. "Fiction" and "historicity" in doctor's stories: social and narrative dimensions of learning medicine. In: Mattingly C, Garro LC (eds.), 2000; 50-69.

Grenon P, Smith B, Goldberg L. Towards a dynamic biomedical ontology. Pisanelli D (ed.), Ontologies in Medicine: Proceedings of the Workshop on Medical Ontologies, Rome, October 2003, IOS Press, Amsterdam, forthcoming.

Guha R, Lenat D, Pittman K, Pratt D, Shepherd M. Cyc: A midterm report. Communications of the ACM 1990; 33 (8).

Harabagiu SM, Moldovan DI. A marker propagation text understanding and inference system. Stewman JH (ed.), Proceedings of the 9th Florida Artificial Intelligence Research Symposium, Key West, 1996; 55-59.

Hayes PJ. The second naive physics manifesto. Hobbs JR, Moore RC (eds.), Formal Theories of the Common-sense World, Ablex, Norwood, 1985; 1-36.

Hersh W, Buckley C, Leone T, Hickam D. OHSUMED: An interactive retrieval evaluation and new large test collection for research. SIGIR '94, Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1994.

Horton R. Tradition and modernity revisited. Hollis M, Lukes S (eds.), Rationality and Relativism, Blackwell, Oxford, 1982; 201-260.

Jackson B, Ceusters W. A novel approach to semantic indexing combining ontology-based semantic weights and in-document concept co-occurrences. Baud R, Ruch P (eds), EFMI Workshop on Natural Language Processing in Biomedical Applications, Cyprus, March 2002; 75-80.

Jacquemart P, Zweigenbaum P. Towards a medical question-answering system: a feasibility study. Le Beux P, Baud R (eds.), Proceedings of Medical Informatics Europe, IOS Press, Amsterdam, 2003; 463-468.

Keil FC, Levin DT, Richman BA, Gutheil G. Mechanism and explanation in the development of biological thought: the case of disease. Atran S (ed.), Folkbiology, MIT Press, Cambridge, 1999; 285-343.

Kingsbury P, Palmer M. From TreeBank to PropBank. Proceedings 3rd Int Conf Language Resources and Evaluation (LREC-2002), Las Palmas, Spain.

Kirmayer LJ. Broken narratives: clinical encounters and the poetics of illness experience. Mattingly, Garro (eds.), 2000; 153-180.

Kleinman A. The illness narratives. Basic Books, New York, 1988.

Lenat D. Cyc: a large-scale investment in knowledge infrastructure. Communications of the ACM 1995; 38 (11).

Lewalle P. Terminology: harmonizing health-related terminology – a task of many methods. ISO (International Organization for Standardization) Bulletin, October 2000; 14-16.

Magnini B, Strapparava C. Using WordNet to improve user modelling in a web document recommender system. Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001.

Masolo C, Borgo S, Gangemi A, Guarino N, Oltramari A. Ontology Library, WonderWeb Deliverable D18 Ontology Infrastructure for the Semantic Web IST Project 2001-33052, 2003, http://wonderweb.semanticweb.org/deliverables/documents/D18.pdf.

Mattingly C, Garro LC (eds.) Narrative and the Cultural Construction of Illness and Healing. University of California Press, Berkeley, Los Angeles, London, 2000.

McCray AT, Srinivasan S, Browne AC. Lexical Methods for Managing Variation in Biomedical Terminologies. SCAMC ‘94, 1994; 235-239.

Miller GA. WordNet: a lexical database for English. Communications of the ACM 1995; 38 (11): 39-41.

Patel VL, Arocha JF, Kushniruk A. Patients' and physicians' understanding of health and biomedical concepts: relationship to the design of EMR systems. Journal of Biomedical Informatics, 2002; 35(1): 8-16.

Pustejovsky J. The generative lexicon. MIT Press, Cambridge, 1995.

Rehder B, Schreiner ME, Wolfe MB, Laham D, Landauer TK, Kintsch W. Using latent semantic analysis to assess knowledge: some technical considerations. Discourse Processes 1998; 25: 337-354.

Rosch E. Cognitive representations of semantic categories. Journal of Experimental Psychology, General 1975; 104: 192-253.

Rosch E. On the internal structure of perceptual and semantic categories. Cognitive Development and the Acquisition of Language, Moore TE (ed.), Academic Press, New York, 1973.

Rosch E. Principles of categorization. Cognition and Categorization. Rosch E, Lloyd BB (eds.), Erlbaum, Hillsdale, NJ, 1978.

Schulze-Kremer S, Smith B, Kumar A. Revising the UMLS Semantic Network (manuscript) (See Appendix A3).

Slaughter L. Semantic relationships in health consumer questions and physicians’ answers: a basis for representing medical knowledge and for concept exploration interfaces. Doctoral dissertation, University of Maryland at College Park, 2002.

Smith B, Köhler J, Kumar A. On the application of formal principles to life science data: a case study in the Gene Ontology. Proceedings of DILS 2004 (Data Integration in the Life Sciences), (Lecture Notes in Computer Science), Berlin, Springer, 2004, in press (See Appendix A4).

Smith B, Papakin I, Munn K. Bodily systems and the modular structure of the human body. Proceedings of the 9th Conference on Artificial Intelligence in Medicine (Lecture Notes in Artificial Intelligence 2780), Springer, Berlin, 2003; 86-90.

Smith B, Rosse C. The role of foundational relations in the alignment of biomedical ontologies. http://ontology.buffalo.edu/medo/isa.pdf (manuscript) (See Appendix A5).

Stanford J, Tauber ER, Fogg BJ, Marable L. Experts vs. online consumers: a comparative credibility study of health and finance Web sites. Consumer WebWatch Research Report, Sliced Bread Design, Stanford, 2002.

Streiter O. Corpus-based parsing and treebank development, ICCPOL, 19th International Conference on Computer Processing of Oriental Languages, 2001.

Tse AY. Identifying and characterizing a "consumer medical vocabulary." Doctoral dissertation, College of Information Studies, University of Maryland, College Park, Maryland, March 2003.

Tse T, Soergel D. Procedures for mapping vocabularies from non-professional discourse: a case study: "consumer medical vocabulary". Proceedings of the Annual Meeting of the American Society for Information, 2003.

Turcato D, Fass D, Tisher G, Popowich F. Fully automatic bilingual lexical acquisition from EuroWordNet. Proceedings of NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001.

Vasconcelos N, Lippman A. A bayesian framework for semantic content characterization. Computer Vision and Pattern Recognition, 1998; 566-571.

Vintar S, Buitelaar P, Volk M. Semantic relations in concept-based cross-language medical information retrieval. Proceedings of the ECML/PKDD Workshop on Adaptive Text Extraction and Mining (ATEM), Cavtat-Dubrovnik, Croatia, September 22, 2003.

Zeng Q, Kogan S, Ash N, Greenes RA, Boxwala AA. Characteristics of consumer terminology for health information retrieval: A formal study of use of a health information service. Methods of Information in Medicine, in press.

Zhang J. Representations of health concepts: a cognitive perspective. Journal of Biomedical Informatics 2002; 35 (1): 17-24.

Appendices

A: Publications

A1. Fellbaum C. On the Semantics of Troponymy. In: The Semantics of Relationships: An Interdisciplinary Perspective. Eds. R. Green, C.A. Bean, and S.H. Myaeng. Dordrecht, Kluwer, 2002; 23-34.

A2. Fellbaum C. Distinguishing Verb Types in a Lexical Ontology. Proc Second International Workshop on Generative Approaches to the Lexicon. ISSCO, Geneva, 2003.

A3. Schulze-Kremer S, Smith B and Kumar A. Revising the UMLS Semantic Network (manuscript).

A4. Smith B, Köhler J and Kumar A, On the Application of Formal Principles to Life Science Data: A Case Study in the Gene Ontology, Proc DILS 2004 (Data Integration in the Life Sciences), (Lecture Notes in Computer Science), Berlin: Springer, 2004, in press.

A5. Smith B, Rosse C. The Role of Foundational Relations in the Alignment of Biomedical Ontologies (manuscript)

B: Data

B1. Medical Word Net 0.0: current representation of the medical lexicon in Princeton WordNet

B2. Portions of Data-Base Derived Corpus of Elementary Sentences of MFN

B3. Lexicon of Terms Derived from L&C’s Medical Natural Language Processing Software

Preliminary Work

Two ad hoc studies have been carried out so far:

1. 55 generic statements were collected in German-speaking discussion groups on medical and health-care topics (cf. section V.1, p. 25). The main source was http://www.wer-weiss-was.de.

2. Staff members of IFOMIS were asked to contribute simple sentences which they think a non-expert would believe to be true (examples were: the heart pumps blood; we breathe with our lungs, etc.). 10 people returned 111 (generic) sentences in total (cf. section V.2, p. 25).

The analysis of the two mini-corpora shows:

- a vocabulary of popular names for entities of the medical domain, mainly settled at the taxonomic base level, and a real difference between lay and expert vocabulary with respect to terminology, size, and denotation of the terms
- crucial differences between the spontaneous language found in internet discussion groups and the prompted language produced by IFOMIS staff members (e.g. people do not speak about elementary facts in spontaneous language)
- difficulties in dissociating experts from non-experts, and expert statements from non-expert statements
- a low rate of generic sentences in spontaneous language, even in advisory contexts (< 5% estimated), as an expression of the Wittgenstein problem
- the obligatory dependency of genericity on the context (e.g. the text genre, the communicative intention, etc.)
- almost no completely false (i.e. false in all contexts) statements in prompted language, and only a few completely false statements in spontaneous language
- only a few completely true statements in both prompted and spontaneous language (due to the weakness of formulation chosen, or due to opaque contextual presuppositions)

Both the people posting in discussion groups and the subjects of the prompting task preferred popular terms. The latter did so as they tried to imitate the examples given as prompts and to keep out technical terms as instructed. In spontaneous language, in contrast, the boundary between popular and technical terminology is constantly in flux, depending on the individual’s previous knowledge about medicine and his/her interaction with professionals.

Ontological vs. Semantic Relations

The two ad hoc studies already showed that we have to deal with many more ontological relations than are covered by the semantic relations of WordNet. The following list gives a rough sketch of how such ontological relations could be involved:

- causation (found in a large part of the sentences and always interconnected with a complex network of other relations and categories): Eating influences mood (Essen beeinflusst die Stimmung) / Baking soda is good against heartburn (Natron ist gut gegen Sodbrennen), subsuming cases of:
  - agent causation (with human and non-human agents): The heart pumps blood
  - event causation (with acts, states, or processes as causal triggers): Taking an aspirin a day can reduce the risk of heart attack
- inherence (bearing a property, disposition, power, role, status, etc.): Sugar is unhealthy (Zucker ist ungesund) / Constant pot smoking is very dangerous (Ständiges Kiffen ist sehr gefährlich) / Birds can transmit chlamydia (Vögel können Chlamydien übertragen)
- parthood: The skeleton consists of bones (Das Skelett besteht aus Knochen) / The body has veins and arteries throughout
- location (may but does not have to involve parthood): Memory is based in the brain / The stomach is in the abdomen
- existence (as a relation of particulars to a time period, as a relation between entities of a certain type to an ontology, or as a relation of a universal to an ontology): There is no such thing as a carbon allergy (Es gibt keine Kohlenstoffallergie)
- context-substance relations (necessarily always involves many further ontological relations such as causation, parthood, location, existence, etc.): When dieting one loses muscle mass (Bei einer Diät verliert man Muskelmasse) / No larynx, no voice (Ohne Kehlkopf keine Stimme)
- formal relations (e.g. comparison): The addictive effect of hashish is lower than that of alcohol (Die Abhängigkeitswirkung von Haschisch ist geringer als die von Alkohol) / Our skin is slightly acidic (Unsere Haut ist leicht sauer)

Sentences Collected from IFOMIS Researchers

The heart pumps blood.
We breathe with our lungs.
The brain is inside the skull.
The thigh bone is connected to the hip bone.
Bacteria cause disease.
Your organs are inside your body.
Your body can heal itself.
People need food and water to survive.
Everybody dies some day.
Bones can be broken.
Organs can be torn, swollen, and dysfunctional.
Vitamins make your body work better.
Too much of anything can kill you.
People choke because something is stuck in the throat.
The liver is important when drinking alcohol.
If you are punched in the kidneys, you may urinate.
A spinal injury can lead to paralysis.
Muscle is good for your body, and fat is not.
Smoking can cause cancer.
Liver degrades alcohol.
Whole sense of taste is located on the tongue.
Memory is based in the brain.
Sugar is bad for the teeth.
Blood flows through blood vessels.
Food ends up in stomach.
One needs sleep.
One needs food.
One needs fluid.
When drinking a lot one has to excrete more urine.
When doing physical exercise one gets thirsty, breathes more frequently and the heart rate increases.
The eyes are in the head. (Die Augen befinden sich im Kopf.)
The skeleton consists of bones. (Das Skelett besteht aus Knochen.)
The skin can breathe. (Die Haut kann atmen.)
Blood flows in the blood vessels. (Das Blut fließt in den Adern.)
The spine holds up the body. (Die Wirbelsäule hält den Körper.)
No larynx, no voice. (Ohne Kehlkopf keine Stimme.)
Starve a fever, feed a cold.
You are what you eat.
Smoking causes lung cancer.
Smoking marijuana can lead to addiction to bad drugs.
Taking an aspirin a day can reduce the risk of heart attacks.
Too much stress can make you sick.
The children of close relatives, brother-sister, first cousins, have a good chance of being deformed.
Blood transports oxygen.
Hurting the head is dangerous.
You need to eat regularly.
We cannot survive without water.
The stomach is in the abdomen.
The liver cleanses the blood.
High blood pressure is dangerous.
The knee joints are very sensitive.
Body temperature higher than 39°C is very dangerous for an adult.
The lungs are inside the chest.
When the heart stops we’ll die.
Only women can give birth.
You cannot walk without legs.
You cannot talk without a tongue.
Without water and food you will die.
Without eyes you cannot see.
It takes both sexes to make a child.
If you lose too much blood you will die.
Without air you will die.
You will die sooner or later.
Some bacteria cause disease.
Some bacteria are beneficial.
Bones give structure.
Joints allow movement.
Muscles allow movement.
The heart is a muscle.
Pains are indicators.
The immune system needs training.
Teeth need cleaning.
Smoking kills.
The pulse is an indicator.
The body is full of nerves.
The body has veins and arteries throughout.
Blood flows through the body (through veins and arteries).
There are a lot of bones in the body.
Bones provide support for the body.
Broken bones can mend.
Muscles allow movement and exerting force on things.
Muscles are attached to bones.
Muscles and ligaments can be torn.
Eating the right foods helps promote health.
Exercise helps promote health.
Blood circulates inside the body through veins.
Humans have a digestive system.
Viruses do not belong to the body.
Some bacteria are good and reside normally in the body.
Fever is not normal.
Fever causes discomfort.
High persisting fever can be dangerous.
Getting wounded may cause infections.
Food is good for your body.
Starvation is bad for your body.
Your body occasionally needs to be helped with medication.
Vaccination is intended to protect against certain diseases.
Drugs have side effects.
Certain people can be allergic to certain drugs.
Severity of a disease may vary from person to person.
Efficacy of a drug may vary from person to person.
Vitamin deficiency may cause serious diseases.
Bones may break.
We cannot live without a skin.
Fever signifies an infection.
Viruses abuse cells of a host to reproduce.
Healthy food needs to contain a balanced mixture of carbohydrates, proteins, fat, fibers, and trace elements.
Acalcinosis may damage your skeleton.
Persisting pain is a reason to consult a physician.
Too much alcohol may cause a headache.
Germs can be drug-resistant.

Data from [reference missing]

Ten most frequently occurring Consumer Medical Vocabulary terms [rank, normalized form, and frequency]

Rank  Normalized form  Frequency
1     doctor           213
2     pain             108
3     diagnose         88
4     test             82
5     symptom          80
6     surgery          67
7     cause            66
8     problem          61
9     treatment        46
10    drug             44

Concepts with the greatest expressive variability (i.e. the greatest number of form types) [word form, number of form types, rank by concept frequency, and consensus form]

Word form              Form types  Rank by concept frequency  Consensus form
Severe pain            36          53                         severe pain
good health            20          94                         healthy
Diagnosis              19          4                          diagnosis
Dyspnea                19          123                        breathing difficulty
Decrease               17          118                        lower
Fatigue                17          13                         fatigue
Feels unwell           16          166                        sick
Increased              15          150                        raise
Lassitude              15          162                        weakness
Therapeutic procedure  15          6                          treatment

POSSIBLE APPLICATIONS

- use MFN for query-answering
- developing a theory of non-expert reasoning about medical phenomena

The existence of very large ontologies of domain concepts and extensive corpora for the medical domain has grounded work toward the refinement, integration and comparison of concept-based retrieval methods and corpus-based approaches.

ON MUCHMORE

Data-mining technologies such as those illustrated by Autonomy, the MuchMore project of the European Union (REF), and the MadBoKs project of Language and Computing nv (REF) have shown the effectiveness of current tools for combining heterogeneous text-based resources for information access and management in the field of technical medical literature. Such projects have also developed sophisticated methods for performance testing.

Methods used by MuchMore include concept-based approaches (semantic annotation of terms and relations, including disambiguation and filtering), corpus-based approaches (for example similarity thesauri), as well as combinations of these.

The MuchMore prototype is a cross-lingual document retrieval system that enables users to retrieve documents which are relevant to a given query document. In the current version of the MuchMore system, query documents are assumed to be electronic patient records and documents to be retrieved are medical scientific abstracts. The MuchMore prototype is implemented as a meta-search engine that provides access to a merged/ranked list of relevant documents from three different search engines and a query construction tool that provides a user interface for extracting and refining structured queries. MuchMore’s thesaurus generation and document retrieval system uses a corpus-based statistical analysis of the distribution of terms in documents to generate a representation of terms and documents in a mathematical space. The result is a cross-lingual document retrieval system that leverages parallel training corpora in the relevant languages.

Semantic annotation based on both domain-specific (UMLS) and general-language (EuroWordNet) semantic resources, and on linguistic analysis of domain-specific corpora (i.e. the underlying document collection), is central to the concept-based CLIR approach described in this section. At the core of semantic annotation are UMLS terms and MeSH codes. For the example sentence the words w20 and w21 point to the concept with the preferred name “Space Perception”, which corresponds to the CUI code C0037744 and TUI code T041 (i.e. “Mental Process”). In addition, this concept is linked to two MeSH codes, which stand for two positions of the term “Space Perception” in the MeSH tree of concepts, the first under the node “Perception” and the second under “Visual Perception”. Finally, word w26 (optic) triggered the concept “Optics” (with one corresponding MeSH code).

    <umlsterm id="t7" from="w20" to="w21">
      <concept id="t7.1" cui="C0037744" preferred="Space Perception" tui="T041">
        <msh code="F2.463.593.778"/>
        <msh code="F2.463.593.932.869"/>
      </concept>
    </umlsterm>
    <umlsterm id="t8" from="w26" to="w26">
      <concept id="t8.1" cui="C0029144" preferred="Optics" tui="T090">
        <msh code="H1.671.606"/>
      </concept>
    </umlsterm>

The most specific information is on the semantic relations that are derived from the UMLS Semantic Network. For example, it indicates that “Space Perception” is an issue in “Optics”, which is coded in the following manner. Note that the XML attributes term1 and term2 point to the UMLS concepts introduced in the example above.

    <semrel id="r7" term1="t7.1" term2="t8.1" reltype="issue_in"/>
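To show how such annotations could be consumed, the following sketch reads the concept and relation elements quoted above with Python's standard xml.etree.ElementTree module. The enclosing <annotation> root element and the function name are assumptions made for this example; the actual MuchMore document format is not reproduced here.

```python
import xml.etree.ElementTree as ET

# The elements are copied from the example above; the <annotation> wrapper
# is a hypothetical root added so that the fragment parses as one document.
ANNOTATION = """
<annotation>
  <umlsterm id="t7" from="w20" to="w21">
    <concept id="t7.1" cui="C0037744" preferred="Space Perception" tui="T041">
      <msh code="F2.463.593.778"/>
      <msh code="F2.463.593.932.869"/>
    </concept>
  </umlsterm>
  <umlsterm id="t8" from="w26" to="w26">
    <concept id="t8.1" cui="C0029144" preferred="Optics" tui="T090">
      <msh code="H1.671.606"/>
    </concept>
  </umlsterm>
  <semrel id="r7" term1="t7.1" term2="t8.1" reltype="issue_in"/>
</annotation>
"""


def read_annotation(xml_text):
    """Return concepts keyed by id, and relations as (name, reltype, name)."""
    root = ET.fromstring(xml_text)
    concepts = {}
    for concept in root.iter("concept"):
        concepts[concept.get("id")] = {
            "cui": concept.get("cui"),
            "preferred": concept.get("preferred"),
            "tui": concept.get("tui"),
            "mesh": [msh.get("code") for msh in concept.findall("msh")],
        }
    # Resolve term1/term2, which point at concept ids, to preferred names.
    relations = [
        (concepts[rel.get("term1")]["preferred"],
         rel.get("reltype"),
         concepts[rel.get("term2")]["preferred"])
        for rel in root.iter("semrel")
    ]
    return concepts, relations


concepts, relations = read_annotation(ANNOTATION)
```

Resolving the term1 and term2 attributes through the concept table yields the single triple (Space Perception, issue_in, Optics), which is the relation described in the text.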

3.1.3 Sense Disambiguation (CSLI, DFKI)

Terms may correspond to more than one concept in the semantic resources used, which is of particular importance in the CLIR context. For instance, the English word drug, when referring to medically therapeutic drugs, would be translated as medikamente, while it would be rendered as drogen when referring to a recreationally taken narcotic substance of the kind that many governments prohibit by law. The ability to disambiguate may therefore be crucial to applications such as CLIR, since search terms entered in the language used for querying must be appropriately rendered in the language used for retrieval. Because of this potential importance to cross-lingual language and information applications, sense disambiguation has been one of the areas of focus of the MuchMore project.

Evaluation Corpora

An important aspect of sense disambiguation is the evaluation of different methods and parameters. Unfortunately, there is a lack of test sets for evaluation, specifically for languages other than English and even more so for specific domains like medicine. Given that the focus of the project is on German as well as English text in the medical domain, we had to develop a number of manually annotated evaluation corpora (lexical samples) to test the different disambiguation methods developed within the project, with EuroWordNet (or rather GermaNet) for German, and with UMLS for both German and English.

To support manual annotation we developed a lexical sample annotation tool based on the ANNOTATE tool that has been developed in the context of the NEGRA project on syntactic annotation (Plaehn and Brants, 2000). In selecting an ambiguous occurrence to be manually annotated (i.e. disambiguated), the annotator is presented with the extended context (left/right neighbor sentences) and the senses for this particular word. By selecting one or more of these, the annotator tags every occurrence of the word with the appropriate sense(s). If the lexical semantic resource does not contain an appropriate sense for the corresponding context, the annotator can choose to annotate with unspec (unspecified). To further assist the annotator, there is also access to the corresponding hierarchies (hypernymy in GermaNet or broader term in UMLS).

Selection of ambiguous terms for the GermaNet evaluation corpus proceeds by compiling a list of terms with high domain relevance, at least 100 occurrences in the medical corpus and more than one sense in GermaNet. From this list we selected 40 terms, for each of which we then automatically extracted 100 occurrences at random. Three annotators, a medical expert and two linguistics students, were assigned the task of annotating the selected occurrences for these ambiguous terms. We also employed non-experts, as they would not have much difficulty in tagging occurrences in a medical corpus, because most of the terms express rather commonly known (medical or general) concepts.
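The automatic part of this selection procedure can be sketched as follows. This is only an illustration under stated assumptions: the data structures (a mapping from terms to occurrence ids and a mapping from terms to GermaNet sense counts), the function names, and the toy terms are all hypothetical. The judgment of domain relevance and the final choice of 40 terms are manual steps and are not modelled here.

```python
import random


def candidate_terms(occurrences, sense_counts, min_occurrences=100):
    """Terms with at least min_occurrences corpus hits and more than one sense."""
    return sorted(
        term
        for term, hits in occurrences.items()
        if len(hits) >= min_occurrences and sense_counts.get(term, 0) > 1
    )


def sample_occurrences(occurrences, terms, per_term=100, seed=0):
    """Draw per_term occurrences at random, without replacement, for each term."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return {term: rng.sample(occurrences[term], per_term) for term in terms}


# Toy data with placeholder terms; the numbers are invented for illustration.
occurrences = {
    "term_a": list(range(150)),  # frequent and ambiguous: kept
    "term_b": list(range(300)),  # frequent but only one sense: dropped
    "term_c": list(range(40)),   # ambiguous but too rare: dropped
}
sense_counts = {"term_a": 3, "term_b": 1, "term_c": 2}

candidates = candidate_terms(occurrences, sense_counts)
sample = sample_occurrences(occurrences, candidates)
print(candidates, len(sample["term_a"]))
```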

The process of selecting terms for the UMLS evaluation corpora (English and German) is based on automatically generated lists of ambiguous UMLS terms. From these we selected a set of 70 frequent terms for English (token frequencies at least 28, 41 terms having token frequencies over 100). For German, only 24 terms could be selected (token frequencies at least 11, 7 terms having token frequency over 100), as the German part of UMLS (or rather MeSH) is rather small. The level of ambiguity for these UMLS terms is mostly limited to only 2 senses; only 7 English terms have 3 senses. In the case of UMLS, medical experts were involved in the manual annotation, two for the German part and three for the English part.

In order for an automatic system to decide which sense is more appropriate in a given context, it is a prerequisite that human annotators at least agree among themselves on this. We therefore computed the inter-annotator agreement (IAA) between the annotators involved in the various manual annotation tasks. The IAA for the GermaNet evaluation corpus varies from very low to very high, but is on average 70%. The agreement scores for the UMLS evaluation corpora also vary widely, with an average of 65% for German and 51% for English.

Methods

Methods for disambiguation can effectively be divided into those that require manually annotated training data (supervised methods) and those that do not (unsupervised methods). In general, supervised methods are less scalable than unsupervised methods because they rely on training data, which may be costly and unrealistic to produce, and even then might be available for only a few ambiguous terms. The goal of our work on disambiguation in the MuchMore project is to enable the correct semantic annotation of entire document collections with all terms that are potentially relevant for organisation, retrieval and summarisation of information. Therefore a decision was taken early on in the project to focus on unsupervised methods, which have the potential to be scaled up enough to meet our needs. The methods developed are: bilingual, dictionary-based, domain-specific and instance-based learning.

The presupposition of the above is that there is a coherently demarcated realm of what we might call ‘natural’ or ‘pre-theoretical’ cognition, the type of cognition engaged in by human subjects outside the realm of technical expertise. Here we are interested in natural cognition as a system of beliefs expressed in the sentences of what is commonly called ‘natural language’, and more specifically in those beliefs which pertain to medical phenomena (roughly, the phenomena demarcated by the scope of the UMLS). We are interested also in the ontology determined by these beliefs.
The non-expert ontology of the medical domain differs from the ontology embodied in, for example, the UMLS Metathesaurus in a number of ways:

1. granularity
2. issue of basic kinds

Part of our work will thus involve extending existing research on naïve physics REF and naïve geography REF to the realm of medicine. We seek not naïve theories ourselves, but rather sophisticated (scientific) theories of the beliefs shared by non-experts and of the entities and relations constituting the ontology reflected in these beliefs.

Our assumption that natural cognition constitutes a coherent object of scientific investigation is part and parcel of the assumption, now accepted by all linguists, that natural language constitutes a coherent object of scientific investigation.

Studying natural cognition, like studying natural language, is hard: non-expert beliefs are marked by a dependence on context that is highly nuanced. Linguistic and cultural diversity is also a problem, though linguistic research has pointed to the existence of universals of natural language that are common to all cultures. We shall focus here on the beliefs (and corresponding statements) of English speakers in the United States. One avenue for future research, however, is to investigate the degree to which medical knowledge is a cultural universal.

Tentatively, we hold that the medical belief-systems of non-experts in different cultures will be ontologically commensurable, in the sense that they can be conceived as systems of delineations of the same objective world (described by medical science). Ontological commensurability as thus defined is compatible with skew classification of a single subject-matter. Lakoff (1987, p. 322) lists five different ways in which two conceptual systems can be commensurable: intertranslatability, understandability (one person can understand both alternatives), common use (the same concepts are used in the same ways), framing (situations are ‘framed’ in the same way and there is a frame-by-frame correspondence between the two systems), and organization (the same concepts organized in the same way occur in both systems). Much incommensurability in any of these five senses may be compatible with ontological commensurability.


We shall presuppose that non-expert beliefs about the medium-sized reality that is given in perception and described by the sentences of natural language are in large part true to the world as it actually is. (Support for this hypothesis can be derived not least from the fact that such beliefs have arisen through interaction with this world and survived many thousands of years of empirical testing.)

Our natural cognitive experiences are of course in many cases non-veridical, and thus we must confront the fact of error, perspective bias, and the like. Even in spite of such phenomena, however, the world to which we are directed in natural cognition manifests a stability and internal coherence. Thus for example while the experiences of hearing and touching are vastly different from those of seeing and smelling, the world of what we hear and touch is nonetheless identical with the world of what we see and smell. Moreover, in virtue of what psychologists have dubbed the phenomena of constancy (with respect to colour, angle, distance, illumination, etc.), we have the capacity spontaneously to override the perspectival features which pervade our experience. Thus we have no difficulty in grasping the enduringly identical colours, shapes and sizes of material bodies even under quite radical changes in lighting conditions and under quite radical changes in perspective and distance. Material bodies themselves are similarly grasped spontaneously as retaining their identities even when quite radically deformed or occluded.

Perception as Discrimination

We hold that perception is a source of veridical information about the world. The putative information supplied by perception is always partial, and sometimes erroneous, but it can in every case be supplemented and corrected by the gathering of further information about the sides of objects we cannot see, about the future behaviour of objects, and so on.

Expert and Non-Expert Medical Ontology

What is the relation between the world of medical entities and relations that is apprehended by non-experts and the world that is described in textbooks of medical science? Here we shall assume that the two overlap substantially. Could we discover that non-expert beliefs about medical phenomena are false? As Wilfrid Sellars points out, standard physical science and the hard sciences in general have their origins in common sense. Thus as Sellars expresses it, ‘the scientific image cannot replace the manifest image without rejecting its own foundations’ (Sellars 1963, p. 21).

How to demarcate the realm of the ‘medical’

The set of non-expert beliefs about medicine is of course part of a wider totality, which includes also non-expert beliefs about behavior, diet, heredity, and much else, and it is by no means easily detachable from this wider background. The total set of non-expert beliefs is further a complex jumble of many different sorts of beliefs, ranging from transient and culture-dependent prejudices to trivial universally accepted truths (such as: ‘human beings have bodies’). We shall bring some necessary order into this jumble by means of systematic validation:

1. of acceptability (by non-experts)
2. of correctness (by medical experts)

The results of such validation will be stored as metadata, along with other facts pertaining to the source of the statements in question (including facts pertaining to associations with other statements). Those statements which score highest marks for both acceptability and correctness will form MFN itself, a core of MBN (Medical Belief Net).
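One possible way of storing the validation results as metadata, and of deriving MFN from MBN, is sketched below in Python. Everything specific here is an assumption made for illustration: the Statement record and its field names, the 0-to-1 score scale, the two threshold values, and the example scores. The text above fixes only that MFN consists of the statements scoring highest on both acceptability and correctness.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    text: str
    source: str            # where the statement was collected (assumed field)
    acceptability: float   # assumed scale: share of non-experts accepting it
    correctness: float     # assumed scale: share of experts judging it correct
    related: list = field(default_factory=list)  # associated statements


def select_mfn(mbn, min_acceptability=0.9, min_correctness=0.9):
    """Keep statements that clear both thresholds (threshold values are hypothetical)."""
    return [
        s for s in mbn
        if s.acceptability >= min_acceptability and s.correctness >= min_correctness
    ]


# Invented example records; the scores are not real validation results.
mbn = [
    Statement("Human beings have bodies.", "elicited", 0.99, 1.0),
    Statement("Antibiotics cure the common cold.", "web corpus", 0.60, 0.05),
]

mfn = select_mfn(mbn)
print([s.text for s in mfn])
```

Keeping both scores on every MBN record, rather than storing MFN separately, means the cut-off can be revised later without repeating the validation.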


Each culture has its own set of culture-specific beliefs pertaining to medical reality. Anthropologists have, however, established that there is a non-trivial core of such beliefs which is, modulo variations in emphasis and calibration referred to above, common to all societies. Such beliefs belong to what the anthropologist Robin Horton calls ‘primary’ theory, as contrasted with the ‘secondary’ theories of a religious, mythical or scientific nature which pertain to what lies beyond or behind the world that is immediately given in perception and action. As Horton puts it:

Primary theory really does not differ very much from community to community or from culture to culture. A particular version of it may be greatly developed in its coverage of one area of experience, and rather undeveloped in its coverage of another. These differences notwithstanding, however, the overall framework remains the same. In this respect, it provides the cross-cultural voyager with his intellectual bridgehead. Primary theory gives the world a foreground filled with middle-sized (say between a hundred times as large and a hundred times as small as human beings), enduring, solid objects. These objects are interrelated, indeed interdefined, in terms of a ‘push–pull’ conception of causality, in which spatial and temporal contiguity are seen as crucial to the transmission of change. They are related spatially in terms of five dichotomies: ‘left’/‘right’; ‘above’/‘below’; ‘in-front-of’/‘behind’; ‘inside’/‘outside’; ‘contiguous’/‘separate’. And temporally in terms of one trichotomy ‘before’/‘at the same time’/‘after’. Finally, primary theory makes two major distinctions amongst its objects: first, that between human beings and other objects; and second, among human beings, that between self and others. In the case of secondary theory, differences of emphasis and degree give place to startling differences in kind as between community and community, culture and culture. For example, the Western anthropologist brought up with a purely mechanistic view of the world may find the spiritualistic world-view of an African community alien in the extreme. (Horton 1982, p. 228)

From the anthropological perspective, moreover, we can understand why this universal primary theory exists: the remarkable facility which humans manifest in reasoning and acting on the level of everyday experience can be accounted for precisely by the existence of stable structures on the side of reality to which their thoughts and actions are attuned.

Developmental Psychology

This term refers to children’s development; we don’t want to use it here, I think. The idea that there is a non-trivial collection of beliefs about medical reality that is shared across cultures can be defended also by pointing to results of developmental psychology supporting the existence of a common core of beliefs that is shared by all cultures. (REF)

Primary theory is a matter of beliefs relating to the objects of direct perceptions. This means: 1. perceptions which do not involve the interpolation of any theory or interpretation, perceptions which are integrated directly (physiologically), rather than via some conceptually mediated process of deduction or inference. And it means 2. perceptions which are typical or generic, in the sense that they do not involve special instruments or apparatus or special circumstances – as contrasted e.g. with perceptual experiences in the cinema or in the psychology laboratory or under special chemical influence. Such special cases are not significant from the point of view of the specification and delineation of the common-sense world (a state of affairs parallel, in some ways, to that which obtains in the field of research into linguistic universals). For the common-sense world is delineated by our beliefs about what happens in mesoscopic reality in most cases and most of the time. It is oriented, in other words, about the focal instances of the phenomena of the everyday world, rather than about non-standard or deviant phenomena.


MFN can be used as a vehicle for research by psychologists, linguists and others, both on learning and development and on cross-cultural universals in the domain of medical knowledge and vocabulary. We also envisage the development of Fact Nets in other domains. A further set of research challenges will arise when it comes to creating associations between the different Fact Nets, so that the resources of these other Fact Nets can be exploited to enhance the abilities of MFN, for example in medical query services.

2) We carry out a linguistic pre-analysis of the contexts from which the statements originate (text genre, context, communicative sense, etc.), providing criteria e.g. for distinguishing episodic and generic readings and thus for determining the relevant syntactic forms.

3) We perform a lexical mapping relating the lexical meaning of the medical terms used in the statements to the way they are used in the associated source contexts. On this basis, it will be possible to determine a corpus of lexicalized medical concepts and a large portion of the semantic relations which link these concepts to one another (many of which can be generated automatically from larger medical ontology sources).

4) An analysis of the ontology of the medical domain underpinning the sentences in the MBN corpus enables us to determine the ontological relations that the sentences express or refer to. These ontological relations can also be regarded as the nexus between single terms and the corresponding facts, since they explicate how the concepts lexicalized by the single terms connect to the structures represented by the complement terms.

5) At the same time we can use methods to generate semi-automatically from the corpora information concerning ontological relations not explicitly represented in the text. Thus for example we can look for phrases like Xs and other Ys from which we can infer that X is a kind of Y.

6) We link the results to the semantic structure and to the ontological underlay of the medical segment of WordNet.
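The pattern mentioned in step 5, ‘Xs and other Ys’, can be illustrated with a short Python sketch. It is deliberately naive and rests on assumptions that a real system would replace: X and Y are taken to be single words, plurals are reduced by stripping a final ‘s’, and the example sentences are invented. The names PATTERN, naive_singular and extract_is_a are hypothetical.

```python
import re

# Single-word X and Y on either side of "and other" (a simplifying assumption).
PATTERN = re.compile(r"\b(\w+) and other (\w+)\b", re.IGNORECASE)


def naive_singular(word):
    """Crude plural stripping; fails on irregular plurals and on words like 'measles'."""
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def extract_is_a(text):
    """Return (X, Y) pairs, read as 'X is a kind of Y', from 'Xs and other Ys'."""
    return [
        (naive_singular(x.lower()), naive_singular(y.lower()))
        for x, y in PATTERN.findall(text)
    ]


# Invented example sentences.
text = (
    "Antibiotics and other drugs are prescribed by physicians. "
    "Headaches and other symptoms were reported."
)
pairs = extract_is_a(text)
print(pairs)
```

Pairs extracted in this way are candidates only; in a semi-automatic setting they would still be checked by a human before being entered as ontological relations.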

In this way we will filter out, for example, those items from technical medical sources which fall outside the range of statements intelligible to non-experts.

MWN should be neither too small nor too large. If it is too large (i.e. if it contains too many technical terms), then its value as a generator of text accessible to non-experts will be diminished. If it is too small (i.e. if it lacks terms which non-experts use), then its value for text understanding will be diminished. This is a difficult trade-off, and inevitably there will be problems at the border between MWN and the total medical vocabulary. Here are some examples to give an idea of how we have resolved this trade-off:

Included: diffusion, dialysis, embolism, flatulence, fungicide, nape

Not included: dilate, effusion, flavonols, furuncle, lesion

Here is a list of non-expert words which have technical meanings which differ from their non-expert meanings:

calculus, colony, donor, drainage, eruption, felon, fit, flood, girdle, globe, gripe, hammer, hem, labor, labyrinth, lap, malignance, mass, messenger, monster, oppressed, orbit, pad, pie, plate, recession, rigors, screws, shaft, section, shunt, slough/sloughing, spillage, stifle, topic/topically

Rules for MWN (= Medical Word Net): no terms containing numbers, no proper names except common remedies (Aspirin, Prozac, Viagra)
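These two rules could be applied as a simple filter, sketched here in Python. The function name admissible is hypothetical, and so is the assumption that the caller already knows whether a term is a proper name (for instance from a lexicon or a tagger); the rules themselves do not say how proper names are to be recognized. The test terms other than the three remedies are invented examples.

```python
# The remedies named in the rule; a real list would be longer (assumption).
COMMON_REMEDIES = {"Aspirin", "Prozac", "Viagra"}


def admissible(term, is_proper_name):
    """Apply the two MWN rules: no digits, no proper names except common remedies."""
    if any(ch.isdigit() for ch in term):
        return False
    if is_proper_name and term not in COMMON_REMEDIES:
        return False
    return True


checks = {
    "nape": admissible("nape", is_proper_name=False),
    "Aspirin": admissible("Aspirin", is_proper_name=True),
    "Alzheimer": admissible("Alzheimer", is_proper_name=True),
    "interleukin-2": admissible("interleukin-2", is_proper_name=False),
}
print(checks)
```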

One characteristic of MWN is that there are many Latin? families of words in expert medical vocabulary for which MWN has just one or two members, e.g.:

Does Tse deal with any of the above?

SIZE OF WORDNET

The latest version, WordNet v2.0, comprises 203,145 word-meaning pairs. The EuroWordNet project was finished in 1999. At that point it contained about 300,000 word entries.

IS LANGUAGE IN PHYSICIAN-PATIENT COMMUNICATION IN MEDICAL ANTHROPOLOGY

The last decade of research in Medical Anthropology has shown a growing interest in the role of natural language and language-based interaction in the physician-patient relationship. The focus of this research, however, still lies on the examination of episodic statements and narrative texts (Mattingly C, Garro LC (eds.) (2000), Good BJ (1994), Kleinman A (1988)). Patients use episodic statements to communicate their story of illness, medical treatment, and cure (a sample text is given in (i)).

(i) Passage from a transcript of an interview of a patient (Kirmayer LJ (2000), p. 158)

OK. I have suddenly, I have been having a tremendous amount of … of stomach and heartburn and chest pains, I've been to two gastroenterologists, I've been to St. Elizabeth's Hospital, I've been taking antibiotics, I have had all sorts of X rays, from gastroscopic to large and small intestine. They have mentioned … these gastroenterologists, they've mentioned that it's merely a case of nerves, acidity, anxiety, mainly, that wraps it up. […]

Physicians use the narrative text genre to convey expert knowledge about their experiences in the clinical domain. They do so, for instance, when they teach students by referring to exemplary cases (Good BJ, DelVecchio Good MJ (2000)). In contrast, little effort has been made within Medical Anthropology to explore the role of the non-expert background knowledge of patients in physician-patient communication. Nor has there been any systematic analysis of those sentences uttered by patients in the interaction between physicians and patients that express generic facts or generalized beliefs about medicine. (The function of narration for the patient in verbalizing his concerns, which we explicate above, rightly suggests that generic sentences are a rather marginal phenomenon among the questions and statements patients utter in their part of the physician-patient discourse.)

Criteria for inclusion in / exclusion from MWN:

Inclusion criteria

1. exceeds a certain frequency (difficult, since statistical distributions always involve artifacts, such as a common word that happens to occur only once or twice in a corpus)

Exclusion criteria

frequency factors:

1. a term is below the frequency requirement
2. a term is in the gray area with respect to the frequency requirement and there is a synonymous term which is significantly more frequent
3. comparison to frequency in other corpora (Tse's)

contextual factors:

4. a term has low frequency and is used within a context (sentence or text) in which such terms are used more often than average
5. a term has high frequency in an expert corpus (Tse has such corpora) and is used within a context (sentence or text) in which such terms are used more often than average

social factors:

6. a term has low frequency in the non-expert corpus and is used by a person who uses such terms more often than average
7. a term is frequently used in an expert corpus and is used by a person who uses such terms more often than average
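The first two frequency factors lend themselves to a small illustration in Python. This is a sketch under loudly stated assumptions: the criteria give no numbers, so the frequency requirement (min_freq), the upper edge of the gray area (gray_max) and the factor for ‘significantly more frequent’ (ratio) are hypothetical, as are the function name and the corpus counts. The contextual and social factors are not modelled.

```python
def exclude_by_frequency(term, freq, synonyms, min_freq=5, gray_max=20, ratio=10):
    """Return True if the term is excluded by frequency factor 1 or 2.

    min_freq, gray_max and ratio are hypothetical values chosen for illustration.
    """
    count = freq.get(term, 0)
    if count < min_freq:
        return True  # factor 1: below the frequency requirement
    if count <= gray_max:
        # factor 2: gray area and a synonym that is much more frequent
        best_synonym = max((freq.get(s, 0) for s in synonyms.get(term, ())), default=0)
        if best_synonym >= ratio * count:
            return True
    return False


# Invented corpus counts and synonym pairs.
FREQ = {"furuncle": 2, "boil": 400, "emesis": 8, "vomiting": 350, "nape": 30}
SYNONYMS = {"furuncle": ["boil"], "emesis": ["vomiting"]}

decisions = {term: exclude_by_frequency(term, FREQ, SYNONYMS) for term in FREQ}
print(decisions)
```

With these invented counts, furuncle falls under factor 1 and emesis under factor 2, while the remaining terms pass the frequency check.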
