GSC Associates
EDCS Concepts
Steve Carson
GSC Associates
for JTC 1/SC 24 WG 8, Palm Springs, CA, USA
August 2001
Some thoughts
There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy.
William Shakespeare, Hamlet
The task of classifying all the words of language, or what's the same thing, all the ideas that seek expression, is the most stupendous of logical tasks. Anybody but the most accomplished logician must break down in it utterly; and even for the strongest man, it is the severest possible tax on the logical equipment and faculty.
Charles Sanders Peirce, letter to editor B. E. Smith of the Century Dictionary
More Thoughts
We must be systematic, but we should keep our systems open.
Alfred North Whitehead, Modes of Thought
The art of ranking things in genera and species is of no small importance and very much assists our judgment as well as our memory. You know how much it matters in botany, not to mention animals and other substances, or again moral and notional entities as some call them. Order largely depends on it, and many good authors write in such a way that their whole account could be divided and subdivided according to a procedure related to genera and species. This helps one not merely to retain things, but also to find them. And those who have laid out all sorts of notions under certain headings or categories have done something very useful.
Gottfried Wilhelm Leibniz, New Essays on Human Understanding
Outline
• Challenges
• Foundations
• Alternative approaches to formalizing ECs
• Dealing with concepts with many values
Challenges
• Some concepts are easy to define; others are hard to define
• Removing the UML model while retaining meaningful definitions
Concepts close to implementation
• Observation: EDCS concepts that are closer to design or implementation are easier to define
EAs:
FARMING_METHOD
FLOODLIT
FREEZING_HEIGHT
YOUNG_MODULUS
EEs:
FARMING_METHOD of GRAZING
FENCE_TYPE of CHAIN_LINK
PARTICLE_ENERGY_RANGE of gt1r0_MEV
EUs:
AMPERE
JOULE_METRE_SQD
RADIAN
Abstract concepts
• More abstract concepts (such as sets of characteristics) are harder to define.
Easier:
CABIN
CULVERT
HEDGEROW
HOTEL
RACE_TRACK
Harder:
CHARACTERISTIC_SET
ATM_ANALYSIS_CHARSET
OC_COLUMN_SCATSTRN_CHARSET
Definitions in the absence of a model
• If we based the EDCS explicitly on a well-defined model (for example, a UML model such as the SEDRIS DRM), then sub-typing and other abstraction mechanisms would make consistency in EDCS definitions easier to achieve.
• Current definitions have implied data models. For example:
COVERAGE: The type of mixture of environmental objects that have been grouped.
AIRSPACE_BOUNDARY_SGMT: A single segment of an airspace boundary.
Background on definition technology
• Analysis and design
• Concepts
• Words, objects and concepts
• Classification
• Generalization and specialization
• Aggregation
Using language to express a mental model
Analysis and design
• Analysis is a process that maps from perception of the real world to a representation
• Design is a process that maps from an analysis representation to an expression of implementation, that is, from a problem to a solution.
• Analysis and design are different because perception and its eventual implementation may, and usually do, differ.
Concepts
• A concept is an idea or notion that we apply to classify those things around us.
• For instance, if we were to list all those things to which the concept of “being mortal” applies, we would form the set of all mortals.
• Concepts have two important aspects:
– The intension of a concept is its meaning, or its complete definition.
– The extension of a concept is the collection, or set, of things to which the concept applies.
“We are consciously and deliberately abstracting the form from all the things which have it. Such an abstracted form is called a concept.” -- Susanne K. Langer, An Introduction to Symbolic Logic, 1967
The meaning triangle
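[Figure: the meaning triangle, with corners Concept, Symbol and Referent. The Symbol symbolizes the Concept, the Concept refers to the Referent, and the Symbol stands for the Referent.]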
The meaning triangle
[Figure: the meaning triangle with synonyms at each corner. Concept: intension, thought, idea. Symbol: word, label, code. Referent: object, extension. Edges: symbolizes, refers to, stands for.]
Words and objects
“Spoken words are symbols of experiences in the psyche; written words are symbols of the spoken. As writing, so is speech not the same for all peoples. But the experiences themselves, of which these words are the primary signs, are the same for everyone, and so are the objects of which those experiences are likenesses.” – Aristotle, On Interpretation
• Perception maps extensional objects into intensional objects.
• Speech maps concepts to words.
• The relationship between a word and an object is an indirect mapping derived from the two direct mappings of perception and speech.
Example and notation
• The intension of the concept BARN is “A farm building used to store hay, grain, and farm implements, or to house farm animals.”
• The extension of the concept BARN is the set of all (models of) barns.
• If C is a concept, then let int(C) denote the intension of C and ext(C) denote the extension of C.
• For example, if C is MORTAL then:
– int(MORTAL) = { x | x must die }
– ext(MORTAL) = {Socrates, Plato, …}
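The int/ext notation can be sketched in code by treating an intension as a predicate and an extension as the set of things that satisfy it. This is only an illustration under stated assumptions: the small universe of discourse (including Zeus) and the names `UNIVERSE`, `int_mortal` and `ext` are hypothetical and do not come from the EDCS.

```python
from typing import Callable, Iterable

# Hypothetical universe of discourse: name -> whether the individual must die.
UNIVERSE = {"Socrates": True, "Plato": True, "Zeus": False}


def int_mortal(x: str) -> bool:
    """Intension of MORTAL: the defining condition 'x must die'."""
    return UNIVERSE[x]


def ext(intension: Callable[[str], bool], universe: Iterable[str]) -> set:
    """Extension: the set of things in the universe that satisfy the intension."""
    return {x for x in universe if intension(x)}


ext_mortal = ext(int_mortal, UNIVERSE)
```

Here `ext_mortal` contains Socrates and Plato but not Zeus, because only the first two satisfy the defining condition.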
Objects and concepts
• To say that an object O is classified as a concept C means that both:
– the intension of C is true for O
– the object O is a member of the extension of C.
• If we use the symbol ∈ to represent the classification relationship, O ∈ C means that “C classifies O” or “O is an instance of C.”
• We can write Socrates ∈ ext(MORTAL) or just Socrates ∈ MORTAL (said less formally, “Socrates is mortal”).
• The concept that an object is an instance of is called its type. (Type is sometimes called class.)
Generalization and specialization
• If a type S has the property that each object instance of S is also an object instance of type A, we say S is a specialization (or subtype) of A and write S ⊂ A. In this case we also say A is a generalization (or supertype) of S.
• For example, HUMAN ⊂ MORTAL.
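Reading S ⊂ A as "every instance of S is also an instance of A", specialization reduces to a subset test on extensions. The following is a minimal sketch; the example extensions (including Fido) and the function name `is_specialization` are assumptions made for illustration.

```python
# Hypothetical extensions, modelled as plain Python sets.
ext_human = {"Socrates", "Plato"}
ext_mortal = {"Socrates", "Plato", "Fido"}


def is_specialization(ext_s: set, ext_a: set) -> bool:
    """S is a subtype of A when every instance of S is also an instance of A."""
    return ext_s <= ext_a


human_is_mortal = is_specialization(ext_human, ext_mortal)
mortal_is_human = is_specialization(ext_mortal, ext_human)
```

With these extensions HUMAN is a specialization of MORTAL, but not the other way round, since Fido is mortal without being human.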
Classification
• Classification is the act or result of removing certain distinctions between objects so that we may see the commonalities.
Generalization and specialization
• Generalization is the act or result of distinguishing one type of object that is more encompassing than another.
• Specialization is the opposite of generalization.
Aggregation
• Aggregation is the act or result of forming an object configuration from its component parts.
• Composition is the act or result of forming an object that is immutably configured from its component parts.
• There are many different types of aggregation based on:
– Configuration: do the parts bear a particular functional or structural relationship to one another or to the object they constitute?
– Homeomerous: are the parts of the same kind of thing as the whole?
– Invariance: can the parts be separated from the whole?
Classification, generalization and aggregation
Example: EC Definitions
• History of EC definition
• Special words
• ECs for special words
This motivates the slides that follow on alternatives for improving definitions.
EC definition cleanup history
• In producing WD5 the editors sought to make the definitions more consistent by uniformly using only one word for a given meaning rather than many synonyms.
• This avoided potential confusion about “synonyms”.
• The next slide lists the special words now defined in EDCS Clause 3 Definitions and acronyms.
• Following that slide, we propose some further enhancements that can lead to even more consistency in the EDCS.
Special “words”
Name Definition
building a fixed structure built for human use and occupancy
expanse a wide and open extent of surface, water, land, or atmosphere
facility the physical means, including the buildings and equipment, required to accomplish something
parcel a portion or piece of land, usually a division of a larger area
region a portion of the earth's surface that is distinguished from adjacent lands by one or more characteristics or features
site the location of an object or structure, where a structure will eventually be built, or where a structure once stood
station a place where a phenomenon can be observed and studied in its normal or natural environment
structure a combination of elements that are fabricated and interconnected in accordance with a design and intended to support vertical and horizontal loads
tract any definable region or stretch of land
ECs for special words
• Only BUILDING is now an EC in its own right.
• But the current definitions imply that the rest of the special words are fundamental concepts useful for defining ECs.
• Consistency will be enhanced and enforced if the other eight special words are also ECs in their own right.
• Therefore we ask that the following eight new ECs be added: EXPANSE, FACILITY, PARCEL, REGION, SITE, STATION, STRUCTURE, TRACT.
• Alternatively, if EXPANSE, PARCEL, REGION, SITE, and TRACT are really the same concept (differing only by the extent associated with them), use only one word (e.g. REGION) and make only that word an EC.
Choices for EC definitions
• Add a formal semantic model to the EDCS. This might be defined in UML.
• Use concepts from mathematics and computer science to improve the definitions.
• Use structured English language.
• Use normal English language.
(The choices are listed in order of decreasing rigor.)
English definitions
• If an English word is just used in a definition, then nothing further can be assumed beyond its dictionary meaning.
• But the English language is imprecise and words may have many meanings.
• Consider this definition: “A set of characteristics describing an object.”
– What does “set” mean? Is it just the first OED definition “a group or collection of things belonging or used together or resembling one another”?
– Or do we intend a more rigorous mathematical meaning for set, implying for example, that the set may be empty and that a member may not be included twice?
– If the words “set”, “characteristic” or “object” are used in another definition, do they have the same meaning as in this definition?
Some observations and history
• Reasoning about intensions depends on logic.
• Two concepts with the same intension have the same extension.
• Intensions should include concepts and relationships from some domain of discourse:
– 2 + 2 = 4
– A dog is an animal with …
– Everything green is extended.
• Intensions of complex concepts are often defined in terms of simpler concepts.
Example: Man is a rational animal. -- Aristotle
Primitive concepts
• Aristotle defined a set of primitives called categories: substance, quality, relation, time, position, state, activity and passivity.
• Some have thought that everything can be defined in terms of irreducible primitives.
• Leibniz (1679): Universal Characteristic, representing primitive concepts by prime numbers.
• Wittgenstein (1921): proposed that compound propositions were made up of simpler ones and eventually atomic facts.
• Masterman (1961): semantic networks, a dictionary of 15,000 words defined in terms of 100 primitives.
Problems with full reduction of intensions to primitives
• Wittgenstein (1953) repudiated his earlier work; he never found a truly un-analyzable atomic fact.
• Dictionaries are not limited to fixed categories.
• Dictionaries contain circular definitions.
• No linguistic or psychological basis found for a universal set of primitives.
• Languages contain families of synonyms, each with a slightly different meaning.
• Ordinary words such as “game” have no properties that characterize all their uses.
Biological science approach
• In biological science, concepts are often defined by a genus and differentia approach.
• Species are defined by describing a typical member, genera by describing a typical species.
• For example: A rose is a tree where the ovules are very rarely erect and the stigmata are usually simple, …
Three views on definitions
For most concepts, meaning is determined not by definition, but by family resemblance, characteristic prototype, or actual use. Three views on definitions are:
• Classical: A concept is defined by a supertype (genus) and a set of necessary and sufficient conditions that differentiate it from other subtypes (species of the same genus).
• Probabilistic: A concept is defined by a collection of features – everything that has a preponderance of those features is an instance of the concept.
• Prototype: A concept is defined by an example or prototype. An object O is an instance of C if it resembles the characteristic prototype of C more closely than the characteristic prototypes of other concepts.
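The classical, probabilistic and prototype views can give different answers for the same object. The sketch below illustrates this under several assumptions that are not in the slides: objects are modelled as sets of feature names, "preponderance" is read as "more than half of the listed features", resemblance is measured with the Jaccard index, and all feature names and prototypes are invented for illustration.

```python
# Hypothetical feature sets, for illustration only.
BIRD_CONDITIONS = {"has_feathers", "lays_eggs"}  # classical: necessary and sufficient
BIRD_FEATURES = {"has_feathers", "lays_eggs", "flies", "sings"}  # probabilistic
PROTOTYPES = {
    "BIRD": {"has_feathers", "lays_eggs", "flies", "sings"},
    "FISH": {"has_scales", "lays_eggs", "swims"},
}


def classical(obj: set) -> bool:
    """Instance if and only if every defining condition holds."""
    return BIRD_CONDITIONS <= obj


def probabilistic(obj: set) -> bool:
    """Instance if the object has more than half of the listed features (assumed threshold)."""
    return len(obj & BIRD_FEATURES) > len(BIRD_FEATURES) / 2


def prototype(obj: set) -> str:
    """Return the concept whose prototype the object resembles most (Jaccard index)."""
    def similarity(proto: set) -> float:
        return len(obj & proto) / len(obj | proto)
    return max(PROTOTYPES, key=lambda name: similarity(PROTOTYPES[name]))


penguin = {"has_feathers", "lays_eggs", "swims"}
```

For this hypothetical penguin the classical test accepts BIRD, the probabilistic test rejects it (two of four features is not a majority), and the prototype test picks FISH (similarity 0.5 against 0.4), which shows how much the result depends on the view chosen.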
Structured English definitions
• One way to improve the consistency of a set of definitions is to build them up in a logical, structured way from simpler concepts.
• The simpler concepts must be somehow identified in a recognizable way in the definitions.
• The obvious way to accomplish recognizability is to define the simpler concepts themselves in a set of definitions and then refer to them by their formal names or labels.
• Example:
– Define SET and OBJECT as ECs
– Then define OBJECT_ATTRIBUTE_SET as “A SET of EDCS Attributes describing an OBJECT.”
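One benefit of referring to simpler concepts by their formal labels is that the references become checkable by machine. The sketch below assumes that formal labels are written in upper case with underscores and that EDCS itself is not a concept label; the definition texts for SET and OBJECT are placeholders invented for illustration, and `undefined_labels` is a hypothetical helper.

```python
import re

# Hypothetical mini-dictionary; only the OBJECT_ATTRIBUTE_SET wording is from the slide.
definitions = {
    "SET": "A collection of distinct members.",
    "OBJECT": "A uniquely identifiable thing.",
    "OBJECT_ATTRIBUTE_SET": "A SET of EDCS Attributes describing an OBJECT.",
}

# Assumed convention: a formal label is two or more upper-case letters or underscores.
LABEL = re.compile(r"\b[A-Z][A-Z_]+\b")


def undefined_labels(defs: dict, ignore: frozenset = frozenset({"EDCS"})) -> dict:
    """Map each concept to the labels its definition uses but the dictionary lacks."""
    problems = {}
    for name, text in defs.items():
        missing = set(LABEL.findall(text)) - set(defs) - ignore
        if missing:
            problems[name] = missing
    return problems


report = undefined_labels(definitions)
broken = undefined_labels({"GANTRY": "A frame STRUCTURE raised on side supports."})
```

`report` is empty because every label used is itself defined, while `broken` flags STRUCTURE as a label that is used but not defined.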
Some examples
• BOARDWALK: A walkway made of wooden planks, e.g., a boardwalk.
• GANTRY: A frame structure raised on side supports so as to span over or around something.
• GRANDSTAND: A usually roofed structure for special viewing of outdoor events and having tiers of seats or standing room for spectators.
Conceptual graphs
• A conceptual graph is a finite, connected, bipartite graph. The two sets of nodes are called concepts and relationships.
• Example: A BOUNDARY is a line of demarcation between two REGIONs.
[Figure: conceptual graph with concept nodes REGION, REGION and LOD joined by the relationship node BETW.]
BETW = “between”
LOD = “line of demarcation”
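The BOUNDARY example can be sketched as a small data structure that keeps concept nodes and relationship nodes in separate sets and only allows edges between the two. This is an illustrative sketch, not a standard representation: the class name, the method names and the node identifiers `REGION#1` and `REGION#2` (used to tell the two REGION nodes apart) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptualGraph:
    concepts: set = field(default_factory=set)
    relationships: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (relationship node, concept node) pairs

    def link(self, relationship: str, concept: str) -> None:
        """Connect a relationship node to a concept node."""
        self.relationships.add(relationship)
        self.concepts.add(concept)
        self.edges.add((relationship, concept))

    def is_bipartite(self) -> bool:
        """No node is both concept and relationship; edges only join the two sets."""
        disjoint = not (self.concepts & self.relationships)
        return disjoint and all(
            r in self.relationships and c in self.concepts for r, c in self.edges
        )

    def is_connected(self) -> bool:
        """Every node is reachable from every other node along the edges."""
        nodes = self.concepts | self.relationships
        if not nodes:
            return True
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            for r, c in self.edges:
                if node == r:
                    stack.append(c)
                elif node == c:
                    stack.append(r)
        return seen == nodes


# "A BOUNDARY is a line of demarcation between two REGIONs."
boundary = ConceptualGraph()
boundary.link("BETW", "LOD")
boundary.link("BETW", "REGION#1")
boundary.link("BETW", "REGION#2")
```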
Example of a conceptual graph: UML
• UML and similar OOA&D modelling languages are special cases of conceptual graphs.
• The concepts are object types (called classes).
• The relationships are:
– is-a (subtyping) (a nurse is a kind of person)
– has-a (aggregation) (a polygon has vertices and edges)
– association or role (a customer may rent videos)
– dependency
– realization
– etc.
Data types
• A data type is an interpretation applied to a string of bits.
– Examples: real, integer, logical, character
• More complex data is constructed of primitive data elements called atoms and construction operators.
– Examples: integer array, set of real
• Structured data types are a collection of data values, the relationships among them, and the functions and operations that can be applied to the data.
– Examples: array, set, list, tree, graph
Abstract data types
• The purpose of abstract data types is to allow the modular construction of systems from simpler components whose behavior is understood.
• A specification gives an abstract description of the behavior of instances of the type, independent of any implementation.
• A key principle is information hiding.
Object concepts
• Active object: a uniquely identifiable collection of data (attributes or properties) defining its state and optional functional methods defining its behavior
• Data object: a uniquely identifiable collection of data (attributes or properties) defining the state of an active object
[Figure: ActiveObject and DataObject, linked by the Externalize and Internalize operations.]
EDCS concepts in OO terms
EDCS concept | Corresponding object technology term
Classification | Type (class) of an object
Attribute | (Part of the) state of an object
No corresponding concept | Operations on an object
Supporting legacy concepts
• There are legacy coding systems with some poorly designed codes (e.g. X_OR_Y codes)
• There are at least three alternatives:
– Propagate the mistakes: Add all the concepts from all legacy systems into the EDCS.
– Carry the mistakes: Add a system of “legacy concepts” with identification of the legacy system (perhaps as an enumeration), and then just wrap up and carry the legacy codes and labels.
– Fix the mistakes: add a concept that is one of a set of alternative concepts.
Pseudo-coded types for legacy data
Carry the mistakes:
LEGACY_EC {
  legacy_system LEGACY_SYSTEM_ID,
  legacy_system_data {code, label}
}
Fix the mistakes:
ONE_OF_A_SET_OF_ECS SET of PROBABLE_EC
PROBABLE_EC {
  classification EC,
  probability REAL
}
Condition on the type: the sum of the probabilities over the whole set must be 1.
Example
EC BARN_OR_SHED is modelled as:
probable_barn PROBABLE_EC {BARN, 0.5}
probable_shed PROBABLE_EC {SHED, 0.5}
BARN_OR_SHED ONE_OF_A_SET_OF_ECS {probable_barn, probable_shed}
ONE_OF_A_SET_OF_ECS might be defined informally as: “An object that can be classified as being exactly one of a non-empty SET of ECs. The probability that the object is of each EC is also specified.”
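The ONE_OF_A_SET_OF_ECS and PROBABLE_EC pseudo-types, together with the condition that the probabilities sum to 1, might be sketched as follows. The Python class names and the choice to represent an EC by its label string are assumptions made for illustration; the EDCS does not prescribe this form.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbableEC:
    classification: str  # EC label, for example "BARN"
    probability: float


@dataclass(frozen=True)
class OneOfASetOfECs:
    alternatives: frozenset  # frozenset of ProbableEC

    def __post_init__(self) -> None:
        # The informal definition requires a non-empty SET of ECs.
        if not self.alternatives:
            raise ValueError("the SET of ECs must be non-empty")
        # Condition on the type: probabilities over the whole set sum to 1.
        total = sum(p.probability for p in self.alternatives)
        if not math.isclose(total, 1.0):
            raise ValueError(f"probabilities sum to {total}, expected 1")


probable_barn = ProbableEC("BARN", 0.5)
probable_shed = ProbableEC("SHED", 0.5)
barn_or_shed = OneOfASetOfECs(frozenset({probable_barn, probable_shed}))
```

Constructing a value whose probabilities do not sum to 1, such as a single alternative with probability 0.7, raises `ValueError` in this sketch.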
When are two definitions “the same”?
Aluminum:
• A light silvery ductile and malleable metal, not readily tarnished by air, which is a chemical element, atomic no. 13 (Symbol Al) - The New Shorter Oxford English Dictionary
• A silvery-white, ductile metallic element, the most abundant in the earth's crust but found only in combination, chiefly in bauxite. Having good conductive and thermal properties, it is used to form many hard, light, corrosion-resistant alloys. Atomic number 13; atomic weight 26.98; melting point 660.2°C; boiling point 2,467°C; specific gravity 2.69; valence 3. - The American Heritage® Dictionary of the English Language, Fourth Edition
• A lightweight silvery metal used either pure or as an alloy for making utensils or fittings where lightness is an advantage. - Oxford American Dictionary
• A silvery light and malleable metal, not tarnished by air. - The Pocket Oxford Dictionary
• A silvery lightweight metal, that is a chemical element. It does not rust. - Webster's New World Dictionary for Young Readers
Suggestions for NR and QR
• If the EDCS definition is intended to represent the same concept as a definition from an external reference, then we can make a NR to that definition.
• If the EDCS definition is derived from but does NOT represent the same concept as the external definition then we can make a QR to that definition.
• Examples: A_OR_B mapping to A and to B separately
The wrong way to integrate concepts
• Include all of the concepts from different domains in a single standard, so that the standard can “express” all these concepts.
[Figure: an integrating standard that contains its own copies A, B, C and D of Concept A, Concept B, Concept C and Concept D.]
The right way to integrate concepts
• Define a small set of abstract concepts and add value by mapping the concepts in individual domains to them.
[Figure: Concept A, Concept B, Concept C and Concept D each mapped to a single concept I in the integrating standard.]
The right way to integrate the wrong way
• Rather than duplicating the set of concepts from each of several domains, provide a single concept that “carries” a concept designation from an external domain.
type Carried_Concept {
Domain domain_choice;
Concept_designation code;
}
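A possible rendering of Carried_Concept in a programming language is sketched below. The domain names in the enumeration and the example code value are hypothetical; the point is only that the designation is carried unchanged and is meaningful only together with the domain that owns it.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    # Hypothetical external domains; a real list would come from a registry.
    LEGACY_SYSTEM_X = "legacy-system-x"
    LEGACY_SYSTEM_Y = "legacy-system-y"


@dataclass(frozen=True)
class CarriedConcept:
    domain_choice: Domain  # which external domain owns the designation
    code: str              # the concept designation, carried as-is


carried = CarriedConcept(Domain.LEGACY_SYSTEM_X, "BARN_OR_SHED")
same = CarriedConcept(Domain.LEGACY_SYSTEM_X, "BARN_OR_SHED")
other = CarriedConcept(Domain.LEGACY_SYSTEM_Y, "BARN_OR_SHED")
```

In this sketch `carried` equals `same`, but not `other`: the same code string in a different domain designates a different carried concept.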
Concepts with many values
• There are many environmental concepts where the identification of specific instances uses values from a large name space.
• Some examples:
– Personal identification numbers for individuals (e.g., a US Social Security Number, 645-93-7395; UK National Insurance Number, ZX 13 02 06 A)
– Motor vehicle identification numbers (e.g., 1B3EJ46X9XN596822)
– Motor vehicle license plates (e.g., US, State of Kentucky, 461FZH)
– Aircraft serial numbers (e.g., 1967 Bell 206A, serial number 8)
– Aircraft registration numbers (e.g., C-GQKV)
– Watercraft hull identification numbers (e.g., EXC317060378)
– Watercraft registration numbers (e.g., US, State of New Mexico, NM 4002 AZ)
– Religions and denominations (over 10,000 distinct religious groups, including 33,830 Christian denominations)
Approaches to incorporating concepts with many values
• Two extreme approaches:
– Use Enumerated type with one value per entity.
– Use String type with the specific identification encoded in the string.
• Subtleties:
– The name space owner or naming authority is needed to make sense of identification numbers, so a structured type is actually needed.
– Identification can be encoded in strings in many different ways.
• Compromise approaches:
– Let the standard itself state how the information is encoded into a string.
– Let a registered profile specify the encoding. If we allowed Bit String as a type, such a profile could even take a long string and code it in very few bits.
Abstract specification vs. coding
• We must be careful not to confuse an abstract specification, which may appear to use many bits to represent something, with a more compact coded representation of that concept used in interchange.
type Motor_vehicle_ID {
  country string;
  division string;
  license_plate string
}
Instance of type: {USA, KY, 461FZH}
More compact (bit string) representation: 01101111
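The distinction between the abstract type and its interchange coding can be sketched with a code table shared by sender and receiver. Everything about the coding here is an assumption for illustration: the table `PROFILE` stands in for a registered profile, its second entry is invented, the width of 8 bits is arbitrary, and the resulting bit strings are not meant to reproduce the 01101111 shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotorVehicleID:
    """Abstract specification: three strings, however many bits they take."""
    country: str
    division: str
    license_plate: str


# Hypothetical registered profile: an ordered code table agreed by both parties.
PROFILE = [
    MotorVehicleID("USA", "KY", "461FZH"),
    MotorVehicleID("USA", "KY", "EXAMPLE1"),  # invented entry
]
BITS = 8  # assumed fixed width of the coded representation


def encode(vehicle: MotorVehicleID) -> str:
    """Code an instance as the fixed-width binary index of its table entry."""
    return format(PROFILE.index(vehicle), f"0{BITS}b")


def decode(bits: str) -> MotorVehicleID:
    """Recover the abstract instance from its coded representation."""
    return PROFILE[int(bits, 2)]


plate = MotorVehicleID("USA", "KY", "461FZH")
coded = encode(plate)
```

The abstract instance and the 8-bit code carry the same information only because both parties share the profile; the specification itself is unchanged by the choice of coding.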
References
1. Conceptual Structures: Information Processing in Mind and Machine, John F. Sowa, Addison Wesley, 1984
2. Advanced Object-Oriented Analysis and Design Using UML, James J. Odell, Cambridge University Press, 1998
3. UML Distilled, Second Edition, Fowler and Scott, Addison Wesley, 2000
4. Object Oriented Software Engineering: The Professional Developer's Guide, George Wilke, Addison Wesley, 1993
5. Software Engineering: A Practitioner's Approach, Roger S. Pressman, McGraw Hill, 1997
6. The Art of Computer Programming, Vol. 1: Fundamental Algorithms, Donald Knuth, Addison Wesley, 1975
7. Fundamentals of Data Structures, Horowitz and Sahni, Computer Science Press, 1975
8. Data Structures and Algorithms, Aho, Hopcroft and Ullman, Addison Wesley, 1983
9. Encyclopedia of Computer Science, Ralston, Reilly and Hemmendinger, editors, Nature Publishing Group, 2000