Nitor Data Modeling Data Standards, Guidelines and Approach

Nitor Infotech - Data Modeling Best Practices


Data Modeling Process and approach

Business Analysis

Conceptual Data Model

Logical Data Model

Physical Data Model

Implementation

Maintenance

Inputs, Features, and Design Steps by Category

CDM
Inputs: use cases, process flow diagrams, business meetings
Features:
• Important entities and relationships
• No attributes
Design steps:
• Understand the business
• Identify subject areas
• Identify entities and relationships

LDM
Inputs: CDM, use cases, process flow diagrams, business meetings
Features:
• Includes all entities and relationships
• Attributes for each entity
• Primary key for each entity
• Identifying relationships
• Normalization
Design steps:
• Create subject areas
• Find the attributes of each entity
• Define the primary key of each entity
• Identifying relationships
• Resolve relationships
• Normalization
• Naming standard

PDM
Inputs: LDM
Features:
• Specification of all tables and columns
• Denormalization based on user requirements
• PDM considerations
• PDM as per the target RDBMS
Design steps:
• Convert entities into tables
• Convert relationships into foreign keys
• Convert attributes into columns
• Assign column data types as per the RDBMS
• Modify the PDM for performance
• Naming standard
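The LDM-to-PDM design steps (entities into tables, attributes into columns, relationships into foreign keys) can be sketched with Python's built-in sqlite3 module. The customer and customer_order entities, their columns, and the SQLite data types are hypothetical examples chosen for illustration, not part of the original material.

```python
import sqlite3

# Hypothetical logical entities mapped to physical tables (SQLite types assumed).
DDL = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,      -- primary key of the Customer entity
    customer_name TEXT    NOT NULL          -- attribute -> column
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
        REFERENCES customer (customer_id),  -- relationship -> foreign key
    order_date  TEXT    NOT NULL            -- dates kept as ISO-8601 text here
);
"""

def build_physical_model() -> sqlite3.Connection:
    """Create the tables in an in-memory database and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(DDL)
    return conn

conn = build_physical_model()
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO customer_order VALUES (10, 1, '2024-01-15')")
```

With foreign keys enabled, an order that points at a non-existent customer is rejected by the database, which is the physical counterpart of the relationship drawn in the logical model.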

Data Modeling Standards And Guidelines

Category Standards and Guidelines

Naming Standards

• Be correct, that is, both functionally and technically accurate.
• Be clear, avoiding the use of vague terms such as “handle” or “process.”
• Be concise, using the fewest number of words possible.
• Be unique, avoiding wording similar to that of any other name.
• Be atomic, representing only a single concept.
• Contain only letters of the alphabet, numbers, and word separators.
• Use complete names wherever possible instead of abbreviations or acronyms.
• Use only approved abbreviations or acronyms when the data modeling tool restricts the length of the name.
• Standard data dictionary.
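Some of the naming standards above are mechanical enough to check automatically. The following is a minimal sketch, assuming that the word separators are spaces and underscores and that the list of vague terms contains only the two examples given in the standard; the function name and messages are hypothetical.

```python
import re

# Assumed rule: letters and digits, with single spaces or underscores as separators.
ALLOWED = re.compile(r"^[A-Za-z0-9]+(?:[ _][A-Za-z0-9]+)*$")
VAGUE_TERMS = {"handle", "process"}  # the two vague terms named in the standard

def check_name(name, existing):
    """Return a list of naming-standard violations for a proposed name."""
    problems = []
    if not ALLOWED.match(name):
        problems.append("contains characters other than letters, numbers, and separators")
    words = {w.lower() for w in re.split(r"[ _]", name)}
    if words & VAGUE_TERMS:
        problems.append("uses a vague term")
    if name.lower() in {e.lower() for e in existing}:
        problems.append("is not unique")
    return problems
```

Rules such as "be correct" or "be atomic" still need a human reviewer; a check like this only covers the character set, the vague-term list, and exact-duplicate names.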

CDM
• The CDM consists of data entities and their relationships.
• The CDM describes key business information by subject area from a data perspective.
• The CDM should be divided into subject areas of manageable size. In practical terms, this means a model usually has between 4 and 15 entities per subject area.
• Every subject area must have a unique title.
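The two measurable CDM guidelines (4 to 15 entities per subject area, unique subject area titles) can be checked with a few lines of Python. This is only a sketch: the input shape, a list of (title, entity names) pairs, and the wording of the findings are assumptions.

```python
def review_subject_areas(subject_areas):
    """subject_areas: list of (title, entity_names) pairs. Returns a list of findings."""
    findings = []
    seen = set()
    for title, entities in subject_areas:
        key = title.strip().lower()
        if key in seen:
            findings.append(f"duplicate subject area title: {title}")
        seen.add(key)
        # The 4-to-15 range is the guideline stated for a manageable subject area.
        if not 4 <= len(entities) <= 15:
            findings.append(f"{title}: {len(entities)} entities (guideline is 4 to 15)")
    return findings
```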

LDM
• The LDM resolves many-to-many relationships, is fully attributed, and is normalized to Third Normal Form (3NF).
• Each non-specific relationship line between entities in the CDM is replaced with an identifying or non-identifying relationship.
• The LDM shows all primary key attributes and non-key attributes in the attribute area.
• A fully attributed logical data model will be in Third Normal Form (3NF).
• IE (Information Engineering) standard notation is used for the LDM.
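Resolving a many-to-many relationship means introducing an associative entity between the two parents. The sketch below shows this with sqlite3; the student, course, and enrollment entities are hypothetical, and the composite primary key reflects the assumption that both relationships are identifying.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, course_title TEXT NOT NULL);
-- Associative entity: replaces the many-to-many line between student and course.
-- Both relationships are identifying, so the parent keys form the primary key.
CREATE TABLE enrollment (
    student_id      INTEGER NOT NULL REFERENCES student (student_id),
    course_id       INTEGER NOT NULL REFERENCES course (course_id),
    enrollment_date TEXT    NOT NULL,
    PRIMARY KEY (student_id, course_id)
);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO course VALUES (?, ?)", [(10, "SQL"), (20, "Modeling")])
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [(1, 10, "2024-01-08"), (1, 20, "2024-01-08"), (2, 10, "2024-01-09")],
)
# One student, many courses; the same course can appear for many students.
courses_for_asha = [
    row[0]
    for row in conn.execute(
        "SELECT c.course_title FROM enrollment e "
        "JOIN course c ON c.course_id = e.course_id "
        "WHERE e.student_id = ? ORDER BY c.course_title",
        (1,),
    )
]
```

The non-key attribute enrollment_date depends on the whole composite key, which keeps the associative entity in 3NF.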

PDM
• Designate a unique primary key column for every table.
• Each column name should contain all of the elements of the logical attribute from which it was derived, abbreviated where necessary to fit within the maximum length.
• Do not use hyphens in table or column names, because some programming languages interpret hyphens as subtraction operators.
• Table and column names should be supported by all target DBMS tools.
• The physical model assigns lengths and data types to all columns. Data types should be specific to the target DBMS.
• The physical data model will, at a minimum, provide examples of possible values for identifier, indicator, and code columns.
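The column naming guideline (keep every element of the logical attribute name, abbreviate only to fit the maximum length, avoid hyphens) can be sketched as a small helper. The abbreviation list, the 30-character limit, and the choice to abbreviate from the last word backwards are all assumptions made for illustration; a real project would take them from its own approved abbreviation list and target DBMS.

```python
# Hypothetical approved-abbreviation list and length limit (both are assumptions).
APPROVED_ABBREVIATIONS = {
    "number": "nbr",
    "identifier": "id",
    "description": "desc",
    "customer": "cust",
}
MAX_LENGTH = 30

def physical_column_name(logical_name, max_length=MAX_LENGTH):
    """Derive a column name that keeps every element of the logical attribute name."""
    # Hyphens become separators, so the result never contains one.
    words = logical_name.lower().replace("-", " ").split()
    name = "_".join(words)
    # Abbreviate from the last word backwards, and only while the name is too long.
    for i in range(len(words) - 1, -1, -1):
        if len(name) <= max_length:
            break
        words[i] = APPROVED_ABBREVIATIONS.get(words[i], words[i])
        name = "_".join(words)
    if len(name) > max_length:
        raise ValueError(f"{name!r} still exceeds {max_length} characters")
    return name
```

For example, "Customer Identifier" fits within 30 characters and stays complete as customer_identifier, while "Customer Account Description" with a 24-character limit becomes customer_account_desc.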


PDM (continued)
• A certain amount of denormalization is usually necessary when implementing the physical data model.
• Estimate the expected storage requirements for each table based on the size of each row, the number of rows, and expected growth.
• Performance improvements may be realized by taking advantage of features such as partitions, storage properties, and index optimization.
• IE standard notation is used for the LDM.
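The storage estimate in the list above is simple arithmetic on row size, row count, and growth. The sketch below assumes compound annual growth and an optional overhead factor for indexes and block overhead; both the growth model and the parameter names are assumptions, since the guideline does not prescribe a formula.

```python
def estimate_table_storage_bytes(row_size_bytes, initial_rows,
                                 annual_growth_rate, years, overhead_factor=1.0):
    """Projected table size in bytes after compound growth in the number of rows."""
    projected_rows = initial_rows * (1 + annual_growth_rate) ** years
    return projected_rows * row_size_bytes * overhead_factor

# 200-byte rows, 1,000,000 rows today, 20% growth per year, 3-year horizon.
size = estimate_table_storage_bytes(200, 1_000_000, 0.20, 3)
```

With these sample figures the table grows to about 1.73 million rows, or roughly 346 MB before any index or block overhead is added.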

Data Model Review Framework – Objectives

Category What Needs to Be Reviewed

Model Correctness
• Does the model accurately capture what it represents?
• Ensures the design represents the data requirements.
• Data elements with formats that differ from industry standards.
• Incorrect cardinality.
• Incorrectly defined keys.

Model Completeness
• Does the scope of the model match the requirements exactly?
• Can a model be complete yet incorrect? Incomplete yet correct?
• Identify relationships that are not shown, and clarify any ambiguously defined terms.

Model Structure
• Ensures the model follows standard modeling practices, independent of content.
• Entity structure review.
• Data element review.
• Relationship review.

Model Flexibility
• Ensures the correct level of abstraction is applied to capture new requirements.
• Achieve the right level of flexibility.
• Prove there is value in every abstraction situation.

Modeling Standards & Guidelines

• Ensures the enterprise conceptual, logical, and physical models are correct and consistent with the standards and guidelines.
• Names and abbreviations.

Model Representation
• Placement of parent and child entities.
• Use of color to group or highlight entities.
• Relationship lines that cross each other or pass through unrelated entities.
• Correct use of subject areas.
• Ensures the model is arranged to maximize readability and understanding.

Physical Design Accuracy
• Ensures that the design works in the real world and is specific to the application.
• Consideration of null values.
• Use of partitioning.
• Missing or improper indexing and space allocation.
• Consideration of denormalization.

Data Quality
• Ensures the design and the actual data are in sync with each other.
• Determines how well the data elements and their rules match reality.
• Avoids costly surprises later in development.

Data Modeling Design Consideration for MongoDB

Embedded Data Models: With MongoDB, you may embed related data in a single structure or document. These schemas are generally known as “denormalized” models, and take advantage of MongoDB’s rich documents.

Normalized Data Models: Normalized data models describe relationships using references between documents.
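The difference between the two models is easiest to see side by side. The sketch below uses plain Python dictionaries as stand-ins for MongoDB documents; the customer and address fields are hypothetical.

```python
# Embedded (denormalized): the address lives inside the customer document.
customer_embedded = {
    "_id": 1,
    "name": "Asha",
    "addresses": [{"city": "Pune", "zip": "411001"}],
}

# Normalized: the address is its own document and refers back to the customer.
customer = {"_id": 1, "name": "Asha"}
address = {"_id": 100, "customer_id": 1, "city": "Pune", "zip": "411001"}

def addresses_for(customer_doc, address_docs):
    """Application-side resolution of the reference (a second lookup in practice)."""
    return [a for a in address_docs if a["customer_id"] == customer_doc["_id"]]
```

The embedded form returns the customer and its addresses in one read. The normalized form avoids duplicating address data but needs a second lookup to follow the reference.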

Atomicity of Write Operations: No single write operation can atomically affect more than one document.

Document Growth: Some updates, such as pushing elements to an array or adding new fields, increase a document’s size.

Data Use and Performance: When designing a data model, consider how applications will use your database. For instance, if your application only uses recently inserted documents, consider using capped collections. Or, if your application mainly performs read operations on a collection, adding indexes to support common queries can improve performance.

Sharding: MongoDB uses sharding to provide horizontal scaling. To distribute data and application traffic in a sharded collection, MongoDB uses the shard key. Selecting the proper shard key has significant implications for performance, and can enable or prevent query isolation and increased write capacity.
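The effect of the shard key on write capacity can be illustrated with a small simulation. This is a deliberately simplified model, not MongoDB's actual placement logic: it assumes four shards, fixed range boundaries with no chunk splitting or balancing, and a hash-modulo rule in place of real hashed sharding.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size

def ranged_shard(key, boundaries):
    """Ranged placement: the shard is the first range whose upper bound exceeds the key."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)  # keys beyond the last boundary land on the last shard

def hashed_shard(key):
    """Simplified hashed placement: hash the key, then take it modulo the shard count."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Boundaries were set when keys ranged over 0..2999; new inserts use 3000..3999.
boundaries = [1000, 2000, 3000]
new_keys = range(3000, 4000)
ranged_load = Counter(ranged_shard(k, boundaries) for k in new_keys)
hashed_load = Counter(hashed_shard(k) for k in new_keys)
```

In this model every new insert with a monotonically increasing ranged key lands on the last shard, while the hashed key spreads the same inserts across all four shards. The trade-off is that range queries on a hashed key can no longer be isolated to one shard.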

Atomicity: No single write operation can change more than one document. Ensure that your application stores all fields with atomic dependency requirements in the same document.
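As an illustration of keeping atomically dependent fields together, the sketch below builds the filter and update documents for a stock reservation in which two counters must change together. The product document layout (available and reserved fields) and the function name are hypothetical; `$inc` and `$gte` are standard MongoDB operators.

```python
def reserve_stock_update(product_id, quantity):
    """Build the filter and update documents for one atomic single-document write."""
    # Match the product only if enough stock is available.
    filter_doc = {"_id": product_id, "available": {"$gte": quantity}}
    # Both counters live in the same document, so they change together or not at all.
    update_doc = {"$inc": {"available": -quantity, "reserved": quantity}}
    return filter_doc, update_doc

filter_doc, update_doc = reserve_stock_update("sku-42", 3)
```

With PyMongo, the pair would be passed as `collection.update_one(filter_doc, update_doc)`, where `collection` is an assumed handle to the products collection. If the two counters were stored in separate documents, one write could succeed while the other failed.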

Indexes: Use indexes to improve performance for common queries. Build indexes on fields that appear often in queries and for all operations that return sorted results. MongoDB automatically creates a unique index on the _id field.
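An index that supports both a common filter and a sorted result can be described as an ordered list of (field, direction) pairs. The helper below is a sketch that places equality-filter fields first and sort fields after them, which is an assumption about a typical query shape; the field names are hypothetical.

```python
ASCENDING, DESCENDING = 1, -1  # the direction values MongoDB uses in index key patterns

def index_keys_for(filter_fields, sort_spec):
    """Equality-filter fields first, then the sort fields with their directions."""
    keys = [(field, ASCENDING) for field in filter_fields]
    keys += [(f, d) for f, d in sort_spec if f not in filter_fields]
    return keys

# Query shape: orders for one customer, newest first.
keys = index_keys_for(["customer_id"], [("order_date", DESCENDING)])
```

With PyMongo, a list in this form can be passed to `Collection.create_index`. No index needs to be declared for `_id`, since MongoDB creates that one automatically.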