Building the Agile Database
Larry Burns
Consultant
PACCAR Data Services
2
What does “agility” mean?
The ability to respond quickly and effectively to changes in business requirements and new technology.
The agile approach is characterized by an emphasis on personal interaction and collaboration, determining people’s needs, and working quickly to meet those needs.
3
How does AD work?
The goal of Agile Development (AD) is to quickly produce solutions that are “good enough” (meeting 80% of the requirements).
Software development occurs continuously and iteratively, with new releases taking place every 2-6 weeks.
Continuous testing is built into the development process.
4
5
Essential concepts of AD
Agile Development emphasizes:
– Individuals and interactions over processes and tools (collaboration and teamwork).
– Working software over comprehensive documentation (“just enough” process and documentation to get the job done).
– Customer collaboration over contract negotiation.
– Flexible response to change over fixed plans.
6
Benefits of AD
Allows you to speed up “time to market”, and take advantage of narrower “windows of opportunity”.
Allows requirements to be adjusted as the product is developed
Eliminates the waste of developing features that aren’t needed
7
Benefits of AD
Reduces the risk of project failure
Problems and risks can be surfaced early
Testing is integrated into the development process
Final product is more in line with current user requirements
Reduces the risk of outsourcing
8
What does “agility” imply?
The ability to reuse application components (and data) is essential to AD.
The ability to design and build loosely-coupled systems is essential to AD.
The ability to automate routine tasks is essential to AD.
9
What does “agility” imply?
The ability to create policy-based (rule-based) components is essential to AD.
The ability to enhance processes based on experience is essential to AD.
The ability to recognize that problems and exceptions will occur, and to empower people to handle them is essential to AD.
10
What does “agility” imply?
The ability to generalize (i.e., understand areas outside your particular domain) as well as specialize (within your domain) is essential to AD.
A customer service mindset and positive (“can do”) attitude are essential to AD.
11
Critical Issues for AD
Reusability. Coding for reuse takes 50-100% longer, and most developers don’t do this. But DBAs have to – it’s our job!
Quality. Application errors are easier to detect and fix (or work around) than data errors. Quality directly affects reusability. It also affects ROI (“time to money”)!
Waste. AD methods can generate large amounts of “scrap and rework”.
12
Critical Issues for AD
Resources. AD works best when resources are 100% dedicated to a single project, but DBAs have to support multiple projects. Also, AD is more resource-intensive than other methodologies.
Focus. Development focus is on a single application; DBA focus is on designing and building an infrastructure that meets current and future data needs.
13
Critical Issues for AD
Maintainability. Somebody has to maintain the application after it’s written; maintenance expense far exceeds development expense.
Personnel. AD projects involve long hours, frequent requirements changes, intensive collaboration and lots of stress. This may be a difficult adjustment for data professionals not used to this sort of work environment.
14
Principles of data management
Reusability – Ability to reuse data for multiple applications and multiple business purposes (e.g., quality improvement, business process improvement, customer relationship management, strategic planning, etc.).
Important: Organizations have data needs outside of the data requirements of particular applications!
15
Principles of data management
Integrity – ensuring that data always has a valid business meaning and value, and always reflects a valid state of the business. Data should also be, as much as possible, self-monitoring and self-correcting.
16
Principles of data management
Security – Ensure true and accurate data is always available to authorized persons, but only to authorized persons. We also want to make sure that the privacy concerns of all our stakeholders – including our customers, partners, and government regulators – are met.
17
Principles of data management
Performance and Ease of Use: ensuring quick and easy access to data by approved users in a usable and business-relevant form, maximizing the business value of both our applications and our data, and improving our relationships with our customers and business users.
18
Principles of data management
Low Cost of Maintenance: ensure that all data work is done at a cost that yields value; that the cost of creating, using, and disposing of data doesn’t exceed its value to the business. We also want to ensure the fastest possible response to changes in business processes and new business requirements.
19
Principles of data management
Performance and Ease of Use
Reusability
Integrity
Security
Maintainability
20
Agile Data Management
Design and build highly-cohesive, loosely coupled (i.e., normalized) data structures
Make data available in application-friendly, non-normalized forms (e.g., views)
Abstract and encapsulate database functionality – eliminate coupling
Refactor at a virtual level, not at the database schema level
21
Agile Data Management
Learn to manage non-relational data
Make the database do the data work (the n-tier approach)
Automate as much of the database development process as possible
Learn to collaborate
Learn to work iteratively (within reason!)
Develop a customer service mindset
22
Abstraction and Encapsulation
Abstraction
– Identifying the “what”
Encapsulation
– Packaging the “how”
Present an easy-to-use interface that enables the “what” and hides the “how”
Examples: light switch, car dashboard
23
Database Abstraction
– Fundamental Stored Procedures (FSPs)
– Data Access Components / Layers
– Data Integration Web Services
– Views
– Work Tables
24
Database Abstraction
– ADO.NET Datasets
– Stored Procedures
– Triggers
– User-defined datatypes (UDTs)
– User-defined functions (UDFs)
25
Database Issues
Performance
Maintainability
Portability
Refactoring
26
Database Issues
Performance
– Limit the scope of views
– Wrap views in parameterized procedures or functions
– Use work tables in the database for denormalization (read-only)
– Use replication as necessary
– Create reporting databases or data marts
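As a minimal sketch of "wrap views in parameterized procedures or functions", the Python example below uses the standard-library sqlite3 module as a stand-in for the DBMS. SQLite has no stored procedures, so a Python function plays the wrapper role; the view name `ProjectTasks` and the function names are hypothetical and chosen only for illustration.

```python
import sqlite3

def open_demo_db():
    """Build a tiny in-memory database with one base table and one view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Task (
            TaskIdentifier    INTEGER PRIMARY KEY,
            ProjectIdentifier INTEGER NOT NULL,
            TaskDescription   TEXT NOT NULL
        );
        -- Hypothetical application view with friendly column names.
        CREATE VIEW ProjectTasks AS
            SELECT ProjectIdentifier AS Project, TaskDescription AS Task
            FROM Task;
        INSERT INTO Task (ProjectIdentifier, TaskDescription) VALUES
            (1, 'Design schema'), (1, 'Write views'), (2, 'Load data');
    """)
    return conn

def get_project_tasks(conn, project_id):
    """Wrapper around the view: callers pass a parameter, never raw SQL."""
    rows = conn.execute(
        "SELECT Task FROM ProjectTasks WHERE Project = ? ORDER BY Task",
        (project_id,),
    )
    return [task for (task,) in rows]

conn = open_demo_db()
print(get_project_tasks(conn, 1))
```

The wrapper limits what each call reads from the view to one project, which is the "limit the scope" idea applied at call time; in SQL Server the same role would be played by a stored procedure or table-valued function.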
27
Database Issues
Maintainability
– Views and wrapper procedures/functions don’t require much (if any) maintenance
– New code can be written as needed
– Good idea to document all code, including the application it was written for, and all known applications that use it
– Make sure all procedure code is testable
28
Database Issues
Portability
– Usually not an issue (except for commercial software packages)
– The database code has to go somewhere!
– Database code performs better in the DBMS
– Migration is usually not that difficult (and lots of help is available from vendors and user groups!)
– Cost of migrating is far exceeded by the economic benefit of data reuse and data quality
29
Database Issues
Refactoring
– Much easier (and less costly) to refactor at the virtual level, not the base schema level.
– Denormalizing too early can mask key data and lead to data corruption, making future refactoring impossible!
– Denormalizing can complicate queries and lead to performance problems!
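To make "refactor at the virtual level" concrete, here is a small sketch in Python using sqlite3 as a stand-in DBMS. The view name `AppTasks` is hypothetical. When the application's needs change between iterations, only the view is redefined; the normalized base table and its data are left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Task (
        TaskIdentifier            INTEGER PRIMARY KEY,
        TaskDescription           TEXT NOT NULL,
        OvertimeApprovedIndicator INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO Task (TaskDescription, OvertimeApprovedIndicator)
    VALUES ('Build views', 1), ('Load data', 0);

    -- Iteration 1: the application only needs task names.
    CREATE VIEW AppTasks AS
        SELECT TaskDescription AS Task FROM Task;
""")

def base_table_ddl(conn):
    """Return the stored DDL of the base table, to show it never changes."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'Task'"
    ).fetchone()[0]

ddl_before = base_table_ddl(conn)

# Iteration 2: the application now also wants a Yes/No overtime flag.
# Only the view is redefined; the base table and its data are untouched.
conn.executescript("""
    DROP VIEW AppTasks;
    CREATE VIEW AppTasks AS
        SELECT TaskDescription AS Task,
               CASE OvertimeApprovedIndicator WHEN 1 THEN 'Yes' ELSE 'No' END
                   AS OTApproved
        FROM Task;
""")

ddl_after = base_table_ddl(conn)
rows = conn.execute(
    "SELECT Task, OTApproved FROM AppTasks ORDER BY Task"
).fetchall()
print(ddl_before == ddl_after, rows)
```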
30
Tasks
TaskID          IDENTITY  not null
TaskTitle       varchar   null
TaskDesc        varchar   null
ProjectNo       int       null
ProjectMgr      varchar   null
EmployeeNo      int       null
EmployeeName    varchar   null
WeekNo          tinyint   null
EstHours1       smallint  null
ActHours1       smallint  null
EstHours2       smallint  null
ActHours2       smallint  null
etc, etc, etc…  smallint  null
EstHours7       smallint  null
ActHours7       smallint  null
OverTimeHours   smallint  null
31
Event
EventID IDENTITY
EventTypeCode smallint
EventType1Key int
EventType2Key int
EventType3Key int
…etc. etc. etc.
EventDateTime datetime
EventDesc varchar
32
The parameter list for your access procedure will have to look like this:
CREATE PROCEDURE csEventProcedure
    (@EventTypeCode smallint,
     @EventType1Key int = null,
     @EventType2Key int = null,
     @EventType3Key int = null…)
And the WHERE clause for the SELECT will have to look something like this:
WHERE (@EventType1Key IS NOT NULL AND @EventType1Key = Event.EventType1Key)
   OR (@EventType2Key IS NOT NULL AND @EventType2Key = Event.EventType2Key)
   OR (@EventType3Key IS NOT NULL AND @EventType3Key = Event.EventType3Key) …
Or perhaps like this:
WHERE (@EventTypeCode = 1 AND @EventType1Key = Event.EventType1Key)
   OR (@EventTypeCode = 2 AND @EventType2Key = Event.EventType2Key)
   OR (@EventTypeCode = 3 AND @EventType3Key = Event.EventType3Key)
33
Non-Database Issues
Preferred approach (Engineer vs. Artist)
Perspective (Enterprise vs. Application)
– It’s about the business!
Architectural Myopia
– Process-only view
– Data-only view
– Information Systems view
34
Resolving the Conflicts
Have a system of checks and balances in place:
– Architecture group
– Project Management group
– Quality group
– Data group
– Application groups (dev. & maint.)
– Business and IT management
35
Resolving the Conflicts
Commit to finding a workable solution:
– Understand each group’s concerns
– Accept the inevitable “trial and error”
– Maintain an “agile attitude”
– Focus on maintaining positive working relationships
36
Resolving the Conflicts
Negotiate compromises:
– Data group involvement in requirements gathering and analysis
– Physical database design review (to promote opportunities for data virtualization)
– Data group commits to supporting an iterative approach
37
Bonus Slides
Approaches for data virtualization
Examples
Developing an “Agile Attitude”
38
Fundamental Stored Procedures
– Handle transaction control
– Perform error handling
– Enforce security
– Maintain supertype/subtype relationships
39
Fundamental Stored Procedures
– Handle concurrency control via timestamp checking
– Provide multi-language text support
– Automatically generated from DB schema
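The "concurrency control via timestamp checking" bullet describes optimistic concurrency: an update succeeds only if the row is unchanged since it was read. The sketch below illustrates the idea in Python with sqlite3. It is not the deck's FSP code; as an assumption, a manually incremented `RowVersion` integer stands in for SQL Server's automatically maintained timestamp column, and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (
        AccountingCode     TEXT PRIMARY KEY,
        AccountDescription TEXT NOT NULL,
        RowVersion         INTEGER NOT NULL DEFAULT 1  -- stand-in for a timestamp column
    );
    INSERT INTO Account (AccountingCode, AccountDescription)
    VALUES ('A100', 'Travel');
""")

def read_account(conn, code):
    """Return (description, version) so the caller can echo the version back."""
    return conn.execute(
        "SELECT AccountDescription, RowVersion FROM Account "
        "WHERE AccountingCode = ?",
        (code,),
    ).fetchone()

def update_account(conn, code, new_description, version_read):
    """Update only if the row still carries the version the caller read."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE Account "
            "SET AccountDescription = ?, RowVersion = RowVersion + 1 "
            "WHERE AccountingCode = ? AND RowVersion = ?",
            (new_description, code, version_read),
        )
    return cur.rowcount == 1  # False means someone else changed the row first

_, version = read_account(conn, "A100")
first_ok = update_account(conn, "A100", "Travel and lodging", version)
stale_ok = update_account(conn, "A100", "Stale edit", version)  # reuses old version
print(first_ok, stale_ok, read_account(conn, "A100"))
```

The second call fails because it presents a version that the first update already replaced, so the stale edit never overwrites newer data.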
40
Fundamental Stored Procedures
FSP Example – Table Definition
Examples of FSPs
41
Data Access Component
– Automatically handles dataset and datatable updates using FSPs
– Creates, populates and executes ADO.NET objects for queries, procedures, typed datasets, connections, transactions, etc.
42
Data Access Component
– Maintains database timestamps and uses them for updates
– Uses a few simple overloaded functions (CreateConnection, GetData, UpdateData, etc.)
– Supports parameterized SQL queries
– Works with any RDBMS
43
Data Access Component
DAC Methods
44
Data Integration Web Services
– Abstracts data combined from multiple sources
– Decouples applications from data sources
– Makes data more easily transportable and consumable
45
Data Synchronization Integration Diagram
[Diagram. Labeled components: Mainframe Databases (includes event queue tables); CICS; CICS Trans; CICS Client; Reformat as XML; Application Server; Web Service 1; Web Service 2; Web Methods A through J; Integration Server; CICS Listener; TCP to MSMQ; MSMQ; Message Broker; SQL Database; Metadata]
46
Joined Views
– Create an application-specific view of data, enabling a database to support multiple applications
– Developers don’t have to code complex SQL joins
– Results in greatly improved performance
– Views can be optimized and indexed
47
Joined Views
– Views can map directly to application objects
– Can join relational and XML data
– Can enforce security
– Allows different users to have different views of the data
– Can be used to support encryption
48
Joined Views
– Can support customized application trigger code via “Instead-Of” triggers
– Data fields can be given user-friendly names
– Column widths and datatypes can be changed from standard classword format
– Supports data conversion and reformatting
49
Joined Views
– Encapsulates data access without introducing coupling (denormalization) or diminishing cohesion in the database
– Decouples application from database schema
– Can be developed incrementally as the application develops, supporting a true “agile” approach!
50
Normalized application tables
Task
TaskIdentifier [IDENTITY]
TaskDescription [varchar(2000)]
ProjectIdentifier [int – FK]
AccountingCode [char(4) – FK]
OvertimeApprovedIndicator [bit]
TaskEnteredDateTime [datetime]
Timestamp [timestamp]
TaskStartDateTime [datetime]
TaskEndDateTime [datetime]
Account
AccountingCode [char(4)]
AccountDescription [varchar(75)]
Timestamp [timestamp]
Employee
EmployeeIdentifier [IDENTITY]
EmployeeLastName [varchar(75)]
EmployeeFirstName [varchar(75)]
Timestamp [timestamp]
EmployeePhoneNo [varchar(12)]
EmployeeEmail [varchar(255)]
OTHoursToDate [decimal]
ProjectDescription [varchar(2000)]
Project
ProjectIdentifier [IDENTITY]
ProjectMgrEmployeeID [int – FK]
Timestamp [timestamp]
ProjectDescription [varchar(75)]
TaskAssignment
AssignmentStartDate [datetime]
Timestamp [timestamp]
TaskIdentifier [int – FK]
EmployeeIdentifier [int – FK]
ScheduledEndDate [datetime]
HoursWorkedToDate [decimal]
OTHoursToDate [decimal]
51
Customized application view
EmployeeTasks
EmpName [varchar(120)]
Project [varchar(75)]
ProjectMgr [varchar(120)]
Task [varchar(75)]
Account [varchar(75)]
OTApproved [char(3)]
StartDate [char(10)]
EndDate [char(10)]
HoursToDate [decimal]
OverTime [decimal]
52
SQL code to create the view
CREATE VIEW EmployeeTasks
    (EmpName, Project, ProjectMgr, Task, Account, OTApproved,
     StartDate, EndDate, HoursToDate, OverTime)
AS
SELECT CONVERT(varchar(120), emp.EmployeeFirstName + ' ' + emp.EmployeeLastName),
       proj.ProjectDescription,
       CONVERT(varchar(120), emp2.EmployeeFirstName + ' ' + emp2.EmployeeLastName),
       CONVERT(varchar(75), task.TaskDescription),
       acct.AccountDescription,
       CASE task.OvertimeApprovedIndicator WHEN 1 THEN 'Yes' ELSE 'No' END,
       CONVERT(char(10), ta.AssignmentStartDate, 101),  -- StartDate as mm/dd/yyyy
       CONVERT(char(10), ta.ScheduledEndDate, 101),     -- EndDate as mm/dd/yyyy
       ta.HoursWorkedToDate,                            -- HoursToDate
       ta.OTHoursToDate                                 -- OverTime
FROM TaskAssignment ta
INNER JOIN Task task ON ta.TaskIdentifier = task.TaskIdentifier
INNER JOIN Employee emp ON ta.EmployeeIdentifier = emp.EmployeeIdentifier
INNER JOIN Project proj ON task.ProjectIdentifier = proj.ProjectIdentifier
INNER JOIN Employee emp2 ON proj.ProjectMgrEmployeeID = emp2.EmployeeIdentifier  -- project manager
INNER JOIN Account acct ON task.AccountingCode = acct.AccountingCode
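For readers without SQL Server at hand, the following is a rough, runnable analog of the EmployeeTasks view in Python with sqlite3. It is a sketch under assumptions: SQLite types and functions (`||`, `strftime`) replace T-SQL's `CONVERT`, the tables keep only the columns the view needs, the column names follow the normalized tables slide (`ScheduledEndDate`, `HoursWorkedToDate`), and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmployeeIdentifier INTEGER PRIMARY KEY,
                           EmployeeFirstName TEXT, EmployeeLastName TEXT);
    CREATE TABLE Account  (AccountingCode TEXT PRIMARY KEY,
                           AccountDescription TEXT);
    CREATE TABLE Project  (ProjectIdentifier INTEGER PRIMARY KEY,
                           ProjectDescription TEXT,
                           ProjectMgrEmployeeID INTEGER);
    CREATE TABLE Task     (TaskIdentifier INTEGER PRIMARY KEY,
                           TaskDescription TEXT,
                           ProjectIdentifier INTEGER,
                           AccountingCode TEXT,
                           OvertimeApprovedIndicator INTEGER);
    CREATE TABLE TaskAssignment (TaskIdentifier INTEGER,
                           EmployeeIdentifier INTEGER,
                           AssignmentStartDate TEXT, ScheduledEndDate TEXT,
                           HoursWorkedToDate REAL, OTHoursToDate REAL);

    CREATE VIEW EmployeeTasks AS
    SELECT e.EmployeeFirstName || ' ' || e.EmployeeLastName AS EmpName,
           p.ProjectDescription                             AS Project,
           m.EmployeeFirstName || ' ' || m.EmployeeLastName AS ProjectMgr,
           t.TaskDescription                                AS Task,
           a.AccountDescription                             AS Account,
           CASE t.OvertimeApprovedIndicator
                WHEN 1 THEN 'Yes' ELSE 'No' END             AS OTApproved,
           strftime('%m/%d/%Y', ta.AssignmentStartDate)     AS StartDate,
           strftime('%m/%d/%Y', ta.ScheduledEndDate)        AS EndDate,
           ta.HoursWorkedToDate                             AS HoursToDate,
           ta.OTHoursToDate                                 AS OverTime
    FROM TaskAssignment ta
    JOIN Task t     ON ta.TaskIdentifier = t.TaskIdentifier
    JOIN Employee e ON ta.EmployeeIdentifier = e.EmployeeIdentifier
    JOIN Project p  ON t.ProjectIdentifier = p.ProjectIdentifier
    JOIN Employee m ON p.ProjectMgrEmployeeID = m.EmployeeIdentifier
    JOIN Account a  ON t.AccountingCode = a.AccountingCode;

    -- Invented sample data.
    INSERT INTO Employee VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO Account  VALUES ('A100', 'Travel');
    INSERT INTO Project  VALUES (1, 'Data mart', 2);
    INSERT INTO Task     VALUES (1, 'Build views', 1, 'A100', 1);
    INSERT INTO TaskAssignment
        VALUES (1, 1, '2008-03-01', '2008-03-15', 12.5, 2.0);
""")

# The application reads one flat row; the five-table join stays in the view.
row = conn.execute("SELECT * FROM EmployeeTasks").fetchone()
print(row)
```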
53
Mapping Object to View
EmployeeTasks (view)
EmpName [varchar(120)]
Project [varchar(75)]
ProjectMgr [varchar(120)]
Task [varchar(75)]
Account [varchar(75)]
OTApproved [char(3)]
StartDate [char(10)]
EndDate [char(10)]
HoursToDate [decimal]
OverTime [decimal]
EmployeeTask (object)
EmpName [varchar(120)]
Project [varchar(75)]
ProjectMgr [varchar(120)]
Task [varchar(75)]
Account [varchar(75)]
OTApproved [char(3)]
StartDate [char(10)]
EndDate [char(10)]
HoursToDate [decimal]
OverTime [decimal]
AssignTask (EmpName, Project, Task, StartDate, EndDate)
CompleteTask (EmpName, Project, Task)
ApproveOT (EmpName, Project, Task, ProjectMgr)
54
Work Tables
– Allow pre-joining of data without denormalizing base tables
– Can improve application performance
– Useful for unpacking recursive data to support applications (and views)
– Impose a maintenance burden, so use sparingly and carefully!
55
Work Tables
– Are generally application-specific
– Need to manage redundancy; base tables contain the “data of record”
– Can be updated transactionally (from application procedure call) or periodically (via a scheduled process)
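The work-table idea, including "unpacking recursive data", can be sketched as follows in Python with sqlite3. Everything here is an illustrative assumption: the `ManagerIdentifier` column, the `EmployeeChainWork` table, and the `refresh_work_table` function do not come from the deck. The base table remains the data of record, and the work table is rebuilt from it inside one transaction, as a scheduled process might do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table (data of record) with a hypothetical recursive relationship.
    CREATE TABLE Employee (
        EmployeeIdentifier INTEGER PRIMARY KEY,
        EmployeeLastName   TEXT NOT NULL,
        ManagerIdentifier  INTEGER
    );
    INSERT INTO Employee VALUES (1, 'Lee', NULL), (2, 'Ray', 1), (3, 'Kim', 2);

    -- Work table: one row per (employee, manager at any level above).
    CREATE TABLE EmployeeChainWork (
        EmployeeIdentifier INTEGER NOT NULL,
        ManagerIdentifier  INTEGER NOT NULL,
        LevelsUp           INTEGER NOT NULL
    );
""")

def refresh_work_table(conn):
    """Rebuild the work table from the base table in a single transaction."""
    with conn:
        conn.execute("DELETE FROM EmployeeChainWork")
        conn.execute("""
            WITH RECURSIVE chain(emp, mgr, depth) AS (
                SELECT EmployeeIdentifier, ManagerIdentifier, 1
                FROM Employee
                WHERE ManagerIdentifier IS NOT NULL
                UNION ALL
                SELECT c.emp, e.ManagerIdentifier, c.depth + 1
                FROM chain c
                JOIN Employee e ON e.EmployeeIdentifier = c.mgr
                WHERE e.ManagerIdentifier IS NOT NULL
            )
            INSERT INTO EmployeeChainWork
            SELECT emp, mgr, depth FROM chain
        """)

refresh_work_table(conn)
chain_rows = conn.execute(
    "SELECT EmployeeIdentifier, ManagerIdentifier, LevelsUp "
    "FROM EmployeeChainWork ORDER BY EmployeeIdentifier, LevelsUp"
).fetchall()
print(chain_rows)
```

Applications can then answer "who is above this employee?" with a plain lookup on the work table instead of a recursive query, at the cost of keeping the redundant copy refreshed.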
56
ADO.NET
– ADO.NET datasets can be updated using stored procedures
– XML can easily be converted to dataset form for updating
– LINQ will provide the ability to create updateable encapsulation objects in .NET
57
ADO.NET
Example of .NET Dataset Updating
58
Stored Procedures
– Can encapsulate data-specific application or business processes
– Result in greatly improved performance
– Reduce network traffic
59
Stored Procedures
– Make debugging, performance tuning and maintenance easier
– Can be used to enforce security
– Should be testable and reusable!
60
Stored Procedures
Sample Application View
Sample Wrapper Procedure
61
Triggers
– Can encapsulate data-specific application or business processes
– Useful for complex and cross-database RI checking and updating
– “Instead Of” triggers can be used to map updates on views to underlying base tables
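The last point can be illustrated with SQLite, which also supports INSTEAD OF triggers on views. This is a sketch, not the deck's sample: the view `EmployeeContacts` and the trigger name are hypothetical. The application updates the view by its friendly column names, and the trigger maps the change onto the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeIdentifier INTEGER PRIMARY KEY,
        EmployeeFirstName  TEXT NOT NULL,
        EmployeeLastName   TEXT NOT NULL,
        EmployeeEmail      TEXT
    );
    INSERT INTO Employee (EmployeeFirstName, EmployeeLastName, EmployeeEmail)
    VALUES ('Ann', 'Lee', 'ann@example.com');

    -- Hypothetical application view with user-friendly column names.
    CREATE VIEW EmployeeContacts AS
        SELECT EmployeeIdentifier AS EmpId,
               EmployeeFirstName || ' ' || EmployeeLastName AS EmpName,
               EmployeeEmail AS Email
        FROM Employee;

    -- Map an UPDATE against the view onto the underlying base table.
    CREATE TRIGGER EmployeeContacts_Update
    INSTEAD OF UPDATE ON EmployeeContacts
    BEGIN
        UPDATE Employee
        SET EmployeeEmail = NEW.Email
        WHERE EmployeeIdentifier = OLD.EmpId;
    END;
""")

# The application only knows the view's column names.
with conn:
    conn.execute(
        "UPDATE EmployeeContacts SET Email = ? WHERE EmpName = ?",
        ("ann.lee@example.com", "Ann Lee"),
    )

stored_email = conn.execute(
    "SELECT EmployeeEmail FROM Employee WHERE EmployeeIdentifier = 1"
).fetchone()[0]
print(stored_email)
```

Without the trigger, SQLite rejects the UPDATE because a view is not directly modifiable; with it, the view behaves like an updatable table from the application's point of view.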
62
Triggers
– Can be used to support auditing of database updates
– Can send messages to applications and invoke application objects
– Can use CLR code in triggers to replace extended stored procedures and OLE Automation
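As a small illustration of the auditing bullet, the sketch below uses an AFTER UPDATE trigger in SQLite (through Python's sqlite3 module) to record old and new values in an audit table. The `AccountAudit` table and the trigger name are assumptions made for this example, not the deck's audit trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (
        AccountingCode     TEXT PRIMARY KEY,
        AccountDescription TEXT NOT NULL
    );

    -- Hypothetical audit table: one row per change to a description.
    CREATE TABLE AccountAudit (
        AuditId        INTEGER PRIMARY KEY,
        AccountingCode TEXT NOT NULL,
        OldDescription TEXT,
        NewDescription TEXT,
        ChangedAt      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TRIGGER Account_AuditUpdate
    AFTER UPDATE OF AccountDescription ON Account
    BEGIN
        INSERT INTO AccountAudit (AccountingCode, OldDescription, NewDescription)
        VALUES (OLD.AccountingCode, OLD.AccountDescription, NEW.AccountDescription);
    END;

    INSERT INTO Account VALUES ('A100', 'Travel');
""")

# The application issues an ordinary update; the trigger does the auditing.
with conn:
    conn.execute(
        "UPDATE Account SET AccountDescription = ? WHERE AccountingCode = ?",
        ("Travel and lodging", "A100"),
    )

audit_rows = conn.execute(
    "SELECT AccountingCode, OldDescription, NewDescription FROM AccountAudit"
).fetchall()
print(audit_rows)
```

Because the audit logic lives in the database, every application that updates the table is audited the same way without extra application code.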
63
Triggers
Example: sample application view
Instead-Of trigger on application view
Example: database audit trigger
64
User-defined Datatypes
– Scalar UDTs can be used to help enforce domain constraints
– Object UDTs can be used to create complex data structures that map more readily to application objects (Oracle Jpublisher; MS LINQ)
– XML UDTs support hierarchical data and can enable relational data to be more easily accessed by web services
65
User-defined Functions
– Useful for managing UDTs
– Useful for datatype conversion and data reformatting
– Are a useful wrapper for views
– Cannot be used for updating
– Cannot display or print in functions
66
User-defined Functions
– Can make relational data look like XML (and vice-versa)
– Can only call other functions and extended stored procedures from functions
– SQL code in functions is NOT optimized; may cause performance problems in joins
67
User-defined Functions
Sample application view
Function to get application data
Procedure to update application data
Sample execution
Output from sample execution
68
User-defined Functions (cont’d)
Function to consolidate data records
Function to return consolidated records
Function to parse character map
Sample execution of parsing functions
Output from execution
69
User-defined Functions (cont’d)
Function to return data as XML
Procedure to parse XML to table
Sample execution
Output from sample execution
70
Merging SQL and XML
Merging SQL and XML Example 1
Output from Example 1
Merging SQL and XML Example 2
Output from Example 2
71
Developing an “Agile Attitude”
Make using the database (and developing applications for databases) as quick, easy, and painless as possible.
Stay business-focused; the objective is meeting the business requirements and deriving the maximum business value from the project.
72
Developing an “Agile Attitude”
Adopt a “can do” attitude, and be as helpful as possible.
Don’t let database standards become a threat to the success of a project.
Accept any defeats and failures encountered during a project as “lessons learned” that can be applied to future projects.
73
Developing an “Agile Attitude”
Communicate with people on their level, and in their terms.
Concentrate on solving other people’s problems, not your own.
74
Developing an “Agile Attitude”
Learn as much as possible about what your developers and business users do, and how and why they do it. Learn to become more of a “generalist”; this adds to your value.
Be flexible, and open to new ideas and new ways of doing things. But make sure that the things that need doing get done.
75
Bio and Contact Information
Larry Burns has worked in IT for more than 25 years as a database administrator, application developer, consultant and teacher. He holds a B.S. in Mathematics from the University of Washington and a Master’s degree in Software Engineering from Seattle University. He currently works for Paccar ITD Data Services as a database consultant on numerous application development projects, and teaches a series of data management classes for application developers. He has been an instructor and advisor in the certificate program for Data Resource Management at the University of Washington in Seattle. You can contact him at [email protected].