Copyright © John Owens 2009. All Rights Reserved.

IMM™ INTEGRATED MODELLING METHOD
Data Structure Modelling
John Owens

The development of IMM™ has brought Business Modelling into the 21st Century.
A business modelling method for professional analysts and business personnel alike.

Data Structure Modeling eBook Extract


Data Structure Modeling eBook extract. Describes in detail how to identify and model the structure of the data used by Business Functions. An essential activity before building any computer system, no matter how big or small. This is an extract of the eBook found at http://www.integratedmodelling.co.nz/imm-bpm-business-process-modeling-store/data-modeling-ebook


Copyright © John Owens 2009

No part of this document may be reproduced, photocopied, stored for retrieval by electronic means or made available to (or transferred to) any third party without the express written permission of the author.

Trademarks

The term IMM™ and the IMM™ Logo are both registered trademarks.


1.1 IMM

The Integrated Modelling Method is an approach to business modelling that I have developed over many years as a means of empowering analysts and business managers alike to develop models that bring real business benefits. The method brings together the best practices in business systems modelling across a whole range of practical techniques.

The purpose of IMM™ is to enable elegant, accurate, integrated models to be produced for all or part of a business quickly, with accuracy and rigour, and at the same time to avoid the shortcomings and pitfalls of conventional modelling methods. Because each of the models in IMM™ is built using elements from the Function Catalogue, all of the models are fully interrelated, providing a richness, rigour and consistency that is not offered by any other modelling technique.

<Break in Extract>

1.3 FIRST THINGS FIRST

The starting point for all modelling in IMM™ is the Function Catalogue, as it acts as the unique catalogue of business functions that will be used in all other models. Does this mean that you have to model all of your business and build the Function Catalogue in its entirety before you can start any other models?

The answer to this is a very big, emphatic NO! The Function Catalogue lies at the core of IMM™ but it does not have to be created in its entirety in advance of all other models. The more you can do on the Function Catalogue prior to starting other modelling, the easier your task will be. But the ‘I’ in IMM™ stands not only for ‘Integrated’ but also for ‘Interactive’. Whatever facet of IMM™ you are using, you will always be interacting with the Function Catalogue. This interaction will not be limited simply to using functions from it in your models but will include adding to and modifying it.

<Break in Extract>

4 BASIC DEFINITIONS

This section will give some basic definitions that you need to know before you set out on the activities described in Section 4. More detailed definitions will be given throughout the book for each element when they are required.

4.1 WHAT IS DATA?

A value in a particular format is called a datum. However, we seldom use this singular form of the word but instead use the plural data. Examples of items of data are:


Datum       Description
1           Integer with one significant figure.
red         Character string, three characters in length, lower case.
3.9         Number with two significant figures and one place of decimals.
22 Oct 96   Date with numeric day number, first three characters of the month name (initial letter capitalised) and last two digits of the year.

In business and in business modelling, data by itself has no intrinsic value. It is only of significance if it supports business functions.

4.2 WHAT IS INFORMATION?

Data on its own has little meaning. For example, K3P3 is a datum, but what is it? Is it a cipher in a secret code or a foreign car registration? If we know the context of the datum then it can make sense. So, if we are reading a knitting pattern, K3P3 is very obviously Knit 3, Purl 3!! (Well, obvious to knitters!)

Data in a context is INFORMATION!

4.3 WHAT IS DATA STRUCTURE MODELLING?

Data Structure Modelling is the process of identifying and describing all the elements of data that are required to support the business functions performed by a business and the relationships between these elements. This will entail identifying and describing all of the following:

• data entity types (normally called ‘entities’)
• attributes of entities
• relationships between entities
• volumes for entities
• how entities are used by functions
• how attributes are used by functions

Contrary to what many analysts would have you believe, data structure modelling is not a dark science. It is a simple craft and, like all crafts, if practiced with the right set of rules and tools, can be easily understood and practiced by any intelligent person. What we call Data Structure Modelling in IMM™ is often referred to simply as ‘data modelling’ outside of IMM™. But, because in IMM™ we also have Data State Modelling, we use the term Data Structure Modelling to differentiate between the two; it is also a more accurate term for what the practice entails.


4.4 WHY DO DATA STRUCTURE MODELLING?

If you are going to build any type of automated information system of any size, from a Microsoft Access database to a mainframe accounting system for a multinational company, you need to do data structure modelling. Without knowing what elements of information need to be held, the facets of each element that need to be held and the way in which the various elements are associated, it is impossible to build any effective database.

Data structure modelling in IMM™ allows you to define all that you need to know about information and data in order to be able to build effective, efficient, robust and resilient databases for any size of computer application.

4.5 DATA ENTITY TYPE

A data entity type is anything (real or abstract) of significance to the business about which information must be known or held in order to support the business functions. Data entity types are usually called ‘entities’, which is the term we shall use throughout this book. Typical entities for a company might be ‘Customer’, ‘Product’, ‘Sales Transaction’.

Entities are always created, used or transformed by business functions.

4.6 ATTRIBUTES OF AN ENTITY

An attribute of an entity is any piece of information that identifies or describes the entity. Typical attributes for the entity PERSON would be:

First name, Surname, Date of Birth

We will expand on this definition of attribute in a later Section.

4.7 OCCURRENCE OF AN ENTITY

An occurrence of an entity, also called an ‘instance’ of an entity, is best described by an example. If the entity is EMPLOYEE then occurrences of EMPLOYEE might be the employees John Jones, Mary Pollard, Andrew Spinx, etc. For the entity PART in an engineering company, typical occurrences might be:

• 22 mm diameter shaft, 6 metres in length
• 10 mm diameter shaft, 2 metres in length
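The distinction between an entity, its attributes and its occurrences can be sketched in code. The following is only an illustration, assuming Python dataclasses as the notation: the class name `Person` and its field names follow the PERSON attributes listed above, the two people are borrowed from the EMPLOYEE example, and the dates of birth are invented.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class Person:
    """The entity PERSON: each field is one attribute of the entity."""
    first_name: str
    surname: str
    date_of_birth: date


# Each object below is one occurrence (instance) of the entity PERSON.
# The dates of birth are invented purely for illustration.
occurrences = [
    Person("John", "Jones", date(1970, 1, 15)),
    Person("Mary", "Pollard", date(1982, 6, 3)),
]

# The attribute names belong to the entity, not to any single occurrence.
attribute_names = [f.name for f in fields(Person)]
```

The class plays the part of the entity, while the list holds occurrences of it; adding a third person changes the occurrences but not the entity or its attributes.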

4.8 SYNONYMS

Sometimes the same thing can be known by different names in different parts of the business. For example, what is known as EMPLOYEE in Head Office might be called WORKER on the factory floor and CONTRACTOR out on the construction site.


If these really are the same thing then the names EMPLOYEE, WORKER and CONTRACTOR are said to be ‘synonyms’. We will return to synonyms in more detail in a later Section.
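One way to picture synonyms is as a lookup from each local name to a single agreed entity name. The sketch below is a hypothetical illustration only: the `SYNONYMS` table and the `canonical_entity` helper are not part of IMM™, and the mapping assumes the analyst has already confirmed that WORKER and CONTRACTOR really are the same thing as EMPLOYEE.

```python
# Hypothetical synonym table: each local name maps to one agreed entity name.
SYNONYMS = {
    "WORKER": "EMPLOYEE",
    "CONTRACTOR": "EMPLOYEE",
}


def canonical_entity(name: str) -> str:
    """Return the agreed entity name for a local name.

    Names with no recorded synonym are returned unchanged (in capitals).
    """
    key = name.strip().upper()
    return SYNONYMS.get(key, key)
```

With this table, `canonical_entity("worker")` and `canonical_entity("Contractor")` both resolve to `EMPLOYEE`, so a candidate list gathered from different parts of the business collapses to one entity.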

5 IDENTIFYING ENTITIES

The first step in Data Structure Modelling is to identify all possible entities that are needed to support the business functions in the business area being modelled. In IMM™ we have developed a technique for doing this that gives consistent high quality results and that removes the mystique and confusion from this activity.

5.1 SOURCES OF ENTITIES

There are many sources from which potential entities can be identified and all of these sources should be used in order to ensure that no entities are missed. The sources used for identifying entities are (and always should be) the same as those used for identifying functions when building the Function Catalogue. Entities are inextricably linked to functions as they are created, used and transformed by them. The sources for entities include:

• Transcripts of taped analysis interviews with senior business managers.
• Typed up notes of supplementary information from these interviews. These should be merged with the transcript of the taped interview where one exists.
• Function titles and descriptions developed during function modelling.
• Data flow diagrams produced in analysis workshops.

<Break in Extract>

8 ENTITY RELATIONSHIP DIAGRAM

The most effective way of displaying the structure of data required by a business is the Entity Relationship Diagram. Commonly referred to as an ‘ERD’, this is a diagram showing all of the entities needed to support business functions and the associations between these entities.

ERD’s are most easily drawn using CASE (Computer Aided Systems Engineering) tools. These are computer applications designed to aid in documenting and modelling all of the elements of a business and, if required, to aid in designing and building computer systems to support the business functions.

ERD’s can also be drawn manually or using a diagramming package. However, these are time consuming ways of doing it and are fraught with many shortcomings that can seriously impair accuracy and quality. We strongly recommend that you use a recognised CASE tool whenever you are doing any serious business modelling.


9 ENTITY RELATIONSHIPS

When we were building our lists of candidate entities we talked about ‘associations’ between the candidate entities and we gave them names. Now that we are going to build an entity relationship diagram we must convert these ‘associations’ to ‘relationships’. These relationships will give us more information about the ways in which entities are linked; they represent business rules. These rules tell us that one entity is associated in a particular way with another entity and what that association is. They might, for example, tell us that:

• a delegate must always be booked on a course.
• an employee must always be assigned to a department.
• a sale must always be of a product.
• an employee may be manager of an employee.

9.1 DEFINING RELATIONSHIPS

In order to define a relationship fully all of the following elements must be specified:

• the name of the relationship
• its optionality
• its degree

NAME

Relationship names describe the associations between entities. Relationships must be named in both directions because, like associations, they are always two way: they say how entity A is related to entity B and how entity B is related to entity A. Each relationship name must be in a form that can be preceded by the term ‘must’ or ‘may be’ (the optionality of the relationship) and still make sense.

OPTIONALITY

Relationships must be defined as being either mandatory or optional. Must the relationship always be created? Each time you create an occurrence of one entity must you always associate it with an occurrence of the other entity? If the answer is ‘yes’ then the relationship is mandatory, otherwise it is optional.


DEGREE

Degree (sometimes called Cardinality, but not in IMM™) is the term used when defining the number of occurrences of one entity associated with each occurrence of another entity. The ‘number’ does not refer to a specific number but is limited to answering the question: “Is each occurrence of the first entity linked to ‘one and only one’ occurrence of the second entity or to ‘one or more’ occurrences of that entity?”

Because relationships are always two-way and because the degree may be different in both directions, there are three ways in which degree can be combined:

One-to-one Sometimes written as 1:1. This is where the degree is ‘one and only one’ at both ends of the relationship.

One-to-many Sometimes written as 1:m. This is where the degree is ‘one and only one’ at one end of the relationship and ‘one-or-more’ at the other end. The reverse, but equivalent, of this is the many-to-one (m:1) where the degree is ‘one-or-more’ on one end of the relationship and ‘one and only one’ at the other end.

Many-to-many Sometimes written as m:m. This is where the degree is ‘one-or-more’ on both ends of the relationship.

These are all described in more detail in later Sections.
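The three elements of a relationship definition (name, optionality and degree) can be combined mechanically into the sentence form used in this method. The sketch below is illustrative only and is not an IMM™ tool: the `RelationshipEnd` class, its field names and the `degree_label` helper are assumptions, and the default plural (adding ‘S’) is a simplification. The example uses the employee and department rule mentioned earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipEnd:
    """One direction of a relationship, read from `source` to `target`."""
    source: str              # entity the sentence starts from
    name: str                # relationship name, e.g. "assigned to"
    mandatory: bool          # optionality: True = 'must be', False = 'may be'
    many: bool               # degree: True = 'one or more', False = 'one and only one'
    target: str              # entity at the other end
    target_plural: str = ""  # plural used for 'one or more'; defaults to target + "S"

    def read(self) -> str:
        """Build the sentence 'Each A must be/may be <name> <degree> B'."""
        optionality = "must be" if self.mandatory else "may be"
        if self.many:
            degree = "one or more"
            target = self.target_plural or self.target + "S"
        else:
            degree = "one and only one"
            target = self.target
        return f"Each {self.source} {optionality} {self.name} {degree} {target}"


def degree_label(a_to_b: RelationshipEnd, b_to_a: RelationshipEnd) -> str:
    """Combine the degree of both directions into 1:1, 1:m or m:m."""
    many_ends = int(a_to_b.many) + int(b_to_a.many)
    return {0: "1:1", 1: "1:m", 2: "m:m"}[many_ends]


employee_to_department = RelationshipEnd(
    "EMPLOYEE", "assigned to", mandatory=True, many=False, target="DEPARTMENT"
)
department_to_employee = RelationshipEnd(
    "DEPARTMENT", "place of work for", mandatory=False, many=True, target="EMPLOYEE"
)
```

Calling `read()` on each direction gives one sentence per direction, which is a quick check that a name still makes sense after ‘must be’ or ‘may be’; `degree_label` reports `1:m` for this pair because only one direction has the ‘one or more’ degree.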

9.3 SHOWING RELATIONSHIPS ON ERD’S

We have built relationships in words and these are useful but the most powerful way of demonstrating relationships is drawing an Entity Relationship Diagram (ERD). In this Section we will look at the conventions for doing this.

ENTITIES

Entities are represented on an ERD by ‘soft boxes’ (boxes with rounded corners) as shown below.

[Diagram: a soft box containing the entity name STUDENT]

Entity names are shown as singular nouns and are written in CAPITALS inside the soft box.

RELATIONSHIP LINES

A mandatory relationship is represented on an ERD by a solid line and an optional relationship by a dashed line.

Mandatory, i.e. ‘must be’, is shown like this: [solid line]

Optional, i.e. ‘may be’, is shown like this: [dashed line]


EXAMPLE

Let’s take the relationship we built in an earlier Section and see how it would be represented on an ERD.

[Diagram: VEHICLE and EMPLOYEE joined by a single relationship line that is part solid (‘must be’) and part dashed (‘may be’). The name going from VEHICLE to EMPLOYEE is ‘assigned to’; the name going from EMPLOYEE to VEHICLE is ‘responsible for’.]

The above diagram represents two relationships, one from VEHICLE to EMPLOYEE, which is read as:

Each VEHICLE must be assigned to one and only one EMPLOYEE

and the other, its reverse, from EMPLOYEE to VEHICLE, which is read as:

Each EMPLOYEE may be responsible for one and only one VEHICLE

This is an example of a ‘one-to-one’ relationship. Each occurrence of EMPLOYEE is associated with ‘one and only one’ occurrence of VEHICLE and vice versa.

9.4 ONE TO MANY RELATIONSHIPS

In a one-to-many (1:m) or many-to-one (m:1) relationship each occurrence of one entity may be associated with one or more occurrences of another entity. This ‘one or more’ degree is represented on an ERD by a symbol known as a ‘crows foot’. This is placed at that end of the relationship connected to the entity of which there can be ‘one or more’ occurrences. The ‘one and only one’ degree is shown by the absence of a ‘crows foot’ at the end of the relationship attached to the entity that will have just the one occurrence. A diagram is the best way of showing this.

[Diagram: EMPLOYEE and DEPARTMENT joined by a relationship line named ‘assigned to’ (from EMPLOYEE) and ‘place of work for’ (from DEPARTMENT). There is a crows foot (‘one or more’) at the EMPLOYEE end and no crows foot (‘one and only one’) at the DEPARTMENT end.]

In the above example we have a 'many-to-one' relationship from EMPLOYEE to DEPARTMENT and a 'one-to-many' relationship from DEPARTMENT to EMPLOYEE. The relationship going from EMPLOYEE to DEPARTMENT is read as:

Each EMPLOYEE must be assigned to one and only one DEPARTMENT


Element             Explanation
must be             This tells us the optionality for the relationship with regard to EMPLOYEE; it is mandatory. An EMPLOYEE must be assigned to a DEPARTMENT. Another way of saying this is that each time an occurrence of EMPLOYEE is created it must be linked to an occurrence of DEPARTMENT.
assigned to         This is the name of the relationship going from EMPLOYEE to DEPARTMENT. The name is placed close to the relationship line and alongside EMPLOYEE.
one and only one    This tells us the degree of the relationship at the DEPARTMENT end of the relationship. For each occurrence of EMPLOYEE there will be ‘one and only one’ occurrence of DEPARTMENT. Another way of saying this is that an EMPLOYEE can only be shown as being ‘assigned to’ one DEPARTMENT.

The relationship from DEPARTMENT to EMPLOYEE is read as:

Each DEPARTMENT may be place of work for one or more EMPLOYEES

Element             Explanation
may be              This tells us the optionality for the relationship with regard to DEPARTMENT. It is optional. A DEPARTMENT may be the place of work for one or more EMPLOYEES, but an occurrence of a DEPARTMENT can be created and exist without being linked to an occurrence of EMPLOYEE.
place of work for   This is the name of the relationship going from DEPARTMENT to EMPLOYEE. The name is placed close to the relationship line and alongside DEPARTMENT.
one or more         This tells us the degree of the relationship at the EMPLOYEE end of the relationship. Each occurrence of DEPARTMENT may be associated with ‘one or more’ occurrences of EMPLOYEE.
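To see how these two readings behave as rules, here is a minimal sketch of one possible relational implementation using Python's built-in `sqlite3` module. It is an assumption about how a database designer might later realise the model, not part of IMM™ itself: the table names, the `employee_no` and `surname` columns and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE department (
        name TEXT PRIMARY KEY
    );
    CREATE TABLE employee (
        employee_no     INTEGER PRIMARY KEY,   -- hypothetical attribute
        surname         TEXT NOT NULL,         -- hypothetical attribute
        -- 'must be assigned to one and only one DEPARTMENT':
        -- a single column that may not be empty and must match a department.
        department_name TEXT NOT NULL REFERENCES department(name)
    );
""")

# 'may be place of work for one or more EMPLOYEES':
# Sales exists with no employees, Finance has two.
conn.execute("INSERT INTO department VALUES ('Finance')")
conn.execute("INSERT INTO department VALUES ('Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Jones', 'Finance')")
conn.execute("INSERT INTO employee VALUES (2, 'Pollard', 'Finance')")


def try_insert(sql: str) -> bool:
    """Return True if the statement is accepted, False if a rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# An employee with no department, or with an unknown one, breaks 'must be'.
no_department = try_insert("INSERT INTO employee VALUES (3, 'Spinx', NULL)")
unknown_department = try_insert("INSERT INTO employee VALUES (4, 'Spinx', 'Marketing')")
```

Both rejected inserts correspond to the mandatory end of the relationship, while the empty Sales department shows the optional end: nothing forces a DEPARTMENT to have EMPLOYEES.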

Notice that when read from a ‘one’ end to a ‘many’ end the name of the entity at the ‘many’ end is given in the plural, e.g. EMPLOYEES. The ‘one-to-many’ (or 'many-to-one') is the most common type of relationship.

<Break in Extract>


9.6 DEAD CROWS RULE

When drawing ERD’s it is a good convention to arrange the entities in such a way on the diagram that the open ends of the crows feet are to the top or to the left on each relationship. Using this convention has two main advantages:

• It prevents people, including analysts, thinking of ‘one-to-many’ relationships as hierarchies, a common mistake. (How to model hierarchies is described in a separate Section.)
• It brings all high volume, ‘volatile’ entities (those that are most often created, read and updated) to the top and left of the diagram (this is where we would normally start reading in European languages) and puts the low volume, stable entities at the bottom and to the right.

When the ‘one or more’ relationships point to the top of the diagram they look like upside down ‘crows feet’: the feet of ‘dead crows’ (lying on their backs of course!). When they point to the left they look like this [crows foot symbol, not reproduced in this extract] and are referred to as ‘dead crows flying east’!! So the rule for a good layout of an ERD is ‘dead crows flying east’!!!

Dead Crows Rule

The optimum layout for an ERD is achieved by having the open ends of all ‘crows feet’ pointing up or to the left. Remember: ‘dead crows’ and ‘dead crows flying east’!
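The rule lends itself to a simple layout check. The sketch below is a hypothetical helper, not a feature of IMM™ or of any particular CASE tool. It assumes each entity has an (x, y) position with x growing to the right and y growing downwards, and it simplifies the rule to: the entity at the ‘many’ end must sit above or to the left of the entity at the ‘one’ end.

```python
from typing import Dict, List, Tuple

Position = Tuple[float, float]  # (x, y): x grows rightwards, y grows downwards


def dead_crows_violations(
    positions: Dict[str, Position],
    one_to_many: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the (one_entity, many_entity) pairs that break the rule.

    A relationship passes if the entity at the 'many' end (where the crows
    foot sits) is above or to the left of the entity at the 'one' end.
    """
    violations = []
    for one_entity, many_entity in one_to_many:
        one_x, one_y = positions[one_entity]
        many_x, many_y = positions[many_entity]
        is_above = many_y < one_y
        is_left = many_x < one_x
        if not (is_above or is_left):
            violations.append((one_entity, many_entity))
    return violations


relationships = [("DEPARTMENT", "EMPLOYEE")]  # one DEPARTMENT, many EMPLOYEES

good_layout = {"EMPLOYEE": (0, 0), "DEPARTMENT": (0, 100)}   # EMPLOYEE on top
bad_layout = {"EMPLOYEE": (0, 100), "DEPARTMENT": (0, 0)}    # EMPLOYEE underneath
```

Here `dead_crows_violations(good_layout, relationships)` returns an empty list, while the second layout reports the DEPARTMENT to EMPLOYEE relationship, because the high volume entity has been drawn underneath the stable one.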

<Break in Extract>


18 UNIQUE IDENTIFIERS

An essential step in data structure modelling is to identify and define for each entity the elements that make each occurrence of it uniquely identifiable from both a business and human perspective. This is known as its unique identifier (UID). The elements that might make each occurrence of an entity unique could be:

• one or more of its attributes
• one or more of its relationships with other entities
• a combination of its attributes and relationships

Making sure that each occurrence of an entity is unique is vital in a business in order to prevent the replication of data. Many businesses hold such things as CUSTOMER multiple times, which results in the customer being billed several times for a single event or not at all! The same thing can occur with PRODUCT, where a single product can be held under many different codes (this is discussed in more detail elsewhere in the book). This results in the product seeming to be out of stock when there are perhaps thousands in the warehouse, but under a different name or code!

So making the effort to define the UID’s for each entity is not an abstract modelling practice but an essential business activity that all businesses should do.

18.1 UID OF ONE OR MORE ATTRIBUTES

Many entities can be uniquely identified by an attribute or a combination of attributes. Typical of these are:

Entity       UID Attribute   Examples
DEPARTMENT   Name            Finance, Production, Sales
COUNTRY      Name            England, France, Portugal
PLANET       Name            Jupiter, Saturn, Venus

Other entities require more than one attribute to identify them uniquely:


Entity           UID Attributes   Examples
PRODUCT          Name             Jack-in-the-Box
                 Version          12
BOOK             Title            IMM Process Modelling
                 Edition          3rd
HOLIDAY PERIOD   Start Date       12th January 2003
                 End Date         16th February 2003
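Once the UID attributes of an entity have been agreed, replicated occurrences can be found by grouping on exactly those attributes. The sketch below is illustrative only: the `find_duplicates` helper, the `code` attribute and the sample rows are hypothetical, and it uses the PRODUCT identifier from the table above (Name plus Version).

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def find_duplicates(
    rows: List[dict], uid_attributes: Sequence[str]
) -> Dict[Tuple, List[dict]]:
    """Group occurrences by their UID values and keep groups with more than one row."""
    groups: Dict[Tuple, List[dict]] = defaultdict(list)
    for row in rows:
        uid = tuple(row[attribute] for attribute in uid_attributes)
        groups[uid].append(row)
    return {uid: group for uid, group in groups.items() if len(group) > 1}


# Hypothetical product records: the same product held under two different codes.
products = [
    {"code": "JB-012", "name": "Jack-in-the-Box", "version": 12},
    {"code": "TOY-77", "name": "Jack-in-the-Box", "version": 12},
    {"code": "JB-013", "name": "Jack-in-the-Box", "version": 13},
]

duplicates = find_duplicates(products, ("name", "version"))
```

The result contains one group, keyed by `("Jack-in-the-Box", 12)`, holding the two records that share a UID despite their different codes. Version 13 is a separate occurrence because Version is part of the identifier.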

18.2 UID COMBINING ATTRIBUTES AND RELATIONSHIPS

Sometimes it takes more than attributes to identify an entity uniquely. In the diagram below there are two entities, TEAM and DEPARTMENT.

[Diagram: TEAM (# Name) and DEPARTMENT (# Name) joined by a relationship named ‘part of’ (from TEAM) and ‘made up from’ (from DEPARTMENT). A short cross line on the relationship indicates that the relationship from TEAM to DEPARTMENT is part of the UID of TEAM.]

The above structure tells us that the UID of DEPARTMENT is the attribute Name. It also tells us that the UID of TEAM is the attribute Name plus the relationship to DEPARTMENT. In IMM™ the short bar across the relationship indicates this.

But what does this mean? It means that the name of the team is not unique across the business. For example, the TEAM.Name might be ‘Business Improvement’ and several DEPARTMENTS might have TEAMS with this name. So the UID of TEAM cannot be the attribute TEAM.Name on its own. However, the name of the TEAM is unique within the DEPARTMENT, as it would make no sense to have two TEAMS of the same name within the one DEPARTMENT. This tells us that the Name of the TEAM combined with the Name of the DEPARTMENT is unique. Another way of saying this is that the UID of TEAM is the attribute Name plus the UID of the entity at the other end of the relationship, i.e. the UID of DEPARTMENT.

The short bar on the relationship showing that it is included in the UID is a very powerful structure in IMM™ because it means that you do not have to ‘move’ attributes from DEPARTMENT into TEAM as some of the more traditional modelling methods used to do.
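The idea that a UID can include a relationship, without copying attributes across, can be sketched with object references. This is only an illustration under assumed names: the `Department` and `Team` classes and their `uid()` methods are hypothetical, and the department names are taken from the earlier examples.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Department:
    name: str  # '# Name' on the diagram: the whole UID of DEPARTMENT

    def uid(self) -> Tuple[str, ...]:
        return (self.name,)


@dataclass(frozen=True)
class Team:
    name: str               # '# Name': unique only within one department
    department: Department  # the 'part of' relationship, barred on the diagram

    def uid(self) -> Tuple[str, ...]:
        # Own attribute plus the UID of the entity at the other end.
        return (self.name,) + self.department.uid()


finance = Department("Finance")
sales = Department("Sales")

finance_team = Team("Business Improvement", finance)
sales_team = Team("Business Improvement", sales)
```

The two teams share a name but have different UIDs, `("Business Improvement", "Finance")` and `("Business Improvement", "Sales")`. `Team` holds no department name of its own; it reaches the department's identifier through the relationship, which mirrors the point made about the UID bar.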


18.3 UID OF ONE OR MORE RELATIONSHIPS

Some entities are not uniquely identified by their attributes at all but solely by their relationships to other entities. To be more precise, they are identified by the unique identifiers of the entities to which they are related. This is especially true for intersection entities.

[Diagram: three entities, SALES AREA (# Name), PRODUCT (# Name) and the intersection entity PRODUCT AREA DISCOUNT (* Percentage Discount). SALES AREA is ‘subject of discount for a product via’ PRODUCT AREA DISCOUNT, which is ‘definition of discount for a product in’ SALES AREA. PRODUCT is ‘discounted in an area according to’ PRODUCT AREA DISCOUNT, which is ‘definition of discount in an area for’ PRODUCT.]

The diagram above indicates that the entity PRODUCT AREA DISCOUNT is uniquely identified by its relationships to PRODUCT and to SALES AREA. This is indicated by the short bars across the ‘many’ ends of the relationships. This means that PRODUCT AREA DISCOUNT is uniquely identified by the combination of the unique identifiers of PRODUCT and of SALES AREA. The order in which these are stated does not matter; they will still uniquely identify the entity.

The main thing to notice is that PRODUCT AREA DISCOUNT does not itself need to hold attributes representing the unique identifier of PRODUCT or the unique identifier of SALES AREA. This demonstrates the power and simplicity of the structure of the ‘UID bar’ across the relationships, as no false attributes need to move into the intersection entity.

It is important to check that any CASE tool you are considering using to support you in data structure modelling supports the convention of the bar across the relationship to show that it is part of the UID. Older CASE tools move false attributes into the intersection entity. This is a highly undesirable modelling practice and is not part of IMM™.
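As a rough illustration of an intersection entity identified purely by its relationships, the sketch below keeps a register of discounts keyed by the UIDs of the related PRODUCT and SALES AREA. All class and method names are hypothetical, the sample area names and percentages are invented, and PRODUCT is identified by Name alone, as on the diagram in this Section.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Product:
    name: str  # '# Name'


@dataclass(frozen=True)
class SalesArea:
    name: str  # '# Name'


@dataclass(frozen=True)
class ProductAreaDiscount:
    product: Product            # relationship that is part of the UID
    sales_area: SalesArea       # relationship that is part of the UID
    percentage_discount: float  # '* Percentage Discount': not part of the UID

    def uid(self) -> Tuple[str, str]:
        # Built only from the UIDs of the related entities.
        return (self.product.name, self.sales_area.name)


class DiscountRegister:
    """Holds discounts and refuses a second occurrence with the same UID."""

    def __init__(self) -> None:
        self._by_uid: Dict[Tuple[str, str], ProductAreaDiscount] = {}

    def add(self, discount: ProductAreaDiscount) -> None:
        if discount.uid() in self._by_uid:
            raise ValueError(f"discount already defined for {discount.uid()}")
        self._by_uid[discount.uid()] = discount

    def __len__(self) -> int:
        return len(self._by_uid)


toy = Product("Jack-in-the-Box")
north, south = SalesArea("North"), SalesArea("South")  # invented area names

register = DiscountRegister()
register.add(ProductAreaDiscount(toy, north, 10.0))
register.add(ProductAreaDiscount(toy, south, 12.5))

try:
    register.add(ProductAreaDiscount(toy, north, 20.0))  # same product and area
    rejected = False
except ValueError:
    rejected = True
```

The third discount is rejected even though its percentage differs, because the percentage is a describing attribute and plays no part in the identifier.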