View
225
Download
0
Embed Size (px)
Citation preview
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Databasesystemer
Managing Organizational Memory
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Learning objectives
• Kunne forklare begreberne transaktioner, concurrency control, deadlock and recovery.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
DBMS environments
• Distinctions– Single user versus multi-user– Multiprogramming– Multiprocessing
• The more complex the environment the more effort that must be expended on maintaining integrity
• The desired outcome is a consistent database for which all integrity constraints hold
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Management of organizational memories
Update
Query
Create
Ensuringconfidentiality
Maintainingquality
Protectingexistence
MaintainingIntegrity
MakingAvailable
Goals
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Strategies for data integrity• Protecting existence
– Preventative (isolation)– Remedial (database backup and recovery)
• Maintaining quality– Update authorization– Integrity constraints– Data validation– Concurrent update control
• Ensuring confidentiality– Data access control– Encryption
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Strategies for data integrity
• Legal– Privacy laws
• Administrative– Storing database backups in a locked vault
• Technical– Using the DBMS to enforce referential integrity
constraint
Type of constraint
Explanation Example
TYPE Validating a data item value against a specified data type.
Supplier number is numeric.
SIZE Defining and validating the minimum and maximum size of a data item.
Delivery number must be at least 3 digits and at most 5.
VALUES Providing a list of acceptable values for a data item.
Item colors must match the list provided.
RANGE Providing one or more ranges within which the data item must fall or must NOT fall.
Employee numbers must be in the range 1-100.
PATTERN Providing a pattern of allowable characters which define permissible formats for data values.
Department phone number must be of the form 542-nnnn (stands for exactly four decimal digits).
PROCEDURE
Providing a procedure to be invoked to validate data items.
A delivery must have valid itemname, department, and supplier values before it can be added to the database. (Tables are checked for valid entries.)
CONDITIONAL
Providing one or more conditions to apply against data values.
If item type is ‘Y’, then color is null.
NOT NULL(MANDATORY)
Indicating whether the data item value is mandatory (not null) or optional. The not null option is required for primary keys.
Employee number is mandatory.
UNIQUE Indicating whether stored values for this data item must be unique (unique compared to other values of the item within the same table or record type). The unique option is also required for identifiers.
Supplier number is unique.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
The concept of a transaction
• A user/user program access/updates data using a high-level query and update language supported by the DBMS.
• Execution of access/update requests (transactions) can be seen as a series of read and writes of database objects.
• You could also consider a transaction a unit of work.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Four important properties • Atomic: either all actions are carried out or none
are.
• Consistency: A transaction run by itself with no concurrent execution of other transactions must preserve the consistency of the database.
• Isolation: Transactions are isolated, or protected, form the effects of other concurrent transactions.
• Durability: Once the user is informed that a transaction has been successfully completed, its effects should persist even if the system crashes.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Concurrent update
• The lost data problemTime Action Database record
Part# QuantityP10 40
T1 User A receives paperworkfor a delivery of 80 units of P10
T2 User A reads P10 P10 40
T3 User B sells 20 units of P10
T4 User B reads P10 P10 40
T5 User A processes the delivery(40 + 80 = 120)
T6 User A updates the file P10 120
T7 User B processes the sales(40 - 20 = 20)
T8 User B updates the file P10 20
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Concurrent update
• Avoiding the lost data problemTime Action Database record
Part# QuantityP10 40
T1 User A receives paperworkfor a delivery of 80 units of P10
T2 User A reads P10 P10 40
T3 User B sells 20 units of P10
T4 User B attempts to read P10 denied P10 40
T5 User A processes the delivery(40 + 80 = 120)
T6 User A updates the file P10 120
T7 User B reads P10 P10 120
T8 User B processes the sales(120 - 20 = 100)
T9 User B updates the file P10 100
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Interleaved Execution of Transactions
• The DBMS interleaves the actions of different transactions to improve performance – system throughput and response time.
• Not all interleaving should be allowed.• Two actions on the same data object
conflict if at least one of the actions i a write.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Serializability
• Executing transactions serially in different orders may produce different results.
• All results are presumed to be acceptable (consistent database state).
• If interleaved execution of transactions are used the DBMS makes no guarantees about which result will be the outcome.
• A serializable schedule is a schedule whose effect is identical to a serial schedule.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Simple notation
T1 T2
R(A)W(A)
R(B)W(B)
R(C)W(C)
Transaction
Reading A
Commit
Writing A
Abort
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Reading uncommitted Data
T1 T2R(A)W(A)
R(A)W(A) R(B)W(B)Commit
R(B)W(B)Commit
T1 transfers 1000 kr. from A to B.
T2 increments both A and B by 10% interest.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Unrepeatable Reads
T1 T2
R(A)
R(B)
W(A)
W(A)
T1 wants to increment A by 1.
T2 wants to decrement A by 1.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Overwriting Uncommitted Data
T1 T2
W(A)
W(B)
The value of A and B must be the same.
T1 write 20.000 to A.
T2 write 30.000 to B.”A blind write”
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Aborted Transactions
• A recoverable schedule is a schedule in which transactions commit only after all transactions whose changes they read have committed.
• A schedule avoids cascading aborts if it only reads data written by already committed transactions.
• A strict schedule only reads or writes data written by already committed transactions.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Unrecoverable Schedule
T1 T2R(A)W(A)
R(A)W(A) R(B)W(B)Commit
Abort
T1 transfers 1000 kr. from A to B.
T2 increments both A and B by 10% interest.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Lock-based Concurrency Control• A DBMS must ensure that only serializable, recoverable
schedules are allowed, and that no result of committed transactions are lost undoing aborted transactions.
• A locking protocol is used to achieve this. It is a set of rules to be followed by transactions, to ensure that the net effect of interleaved transactions is identical to a serial order of the transactions.
• A lock manager is part of the DBMS and maintains a lock table with lock table entries, and a transaction table with pointers to a list of locks held by the transactions.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Strict Two-Phase Locking
1. If a transaction wants to read/modify an object, it first requests a shared/exclusive lock on the object.
2. All locks held by a transaction are released when the transaction are completed.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Serial/Interleaved Execution
T1 T2X(A)R(A)W(A)
X(A)R(A)W(A) X(B)R(B)W(B)Commit
Commit
T1 T2S(A)R(A)W(A)
S(A)R(A)X(B)R(B)W(B)Commit
Commit
X(C)R(C)W(C)
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Concurrent update• The deadly embrace
– User A’s update transaction locks record 1
– User B’s update transaction locks record 2
– User A attempts to read record 2 for update
– User B attempts to read record 1 for updateUpdate transaction
(User A)
Update transaction(User B)
Record 1
Record 2
Lock record 11
Lock record 22
Attempt to lock record 13
4 Attempt to lock record 2
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Deadlocks
• Deadlock Prevention– Requires users to lock in advance all records
they will require.
• Deadlock Detection– DBMS must periodically check for deadlocks. – A structure called a wait-for graph is used.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Transaction failure and recovery
• Program error
• Action by the transaction manager
• Self-abort
• System failure
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Recovery strategies
• Switch to a duplicate database– RAID technology approach
• Backup recovery or rollback– Return to prior state by applying before-images
• Forward recovery or rollforward– Recreate by applying after-images to prior backup
• Reprocess transactions
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Crash Recovery• Analysis phase
– Determines: the point in the log to start the redo, pages in the buffer pool at the time of the crash and transactions active at the time of the crash.
• Redo phase– Repeats history by transforming the database into its state before
the crash.• Undo phase
– Undo transactions that are aborted since they were active at the time of the crash.
• Checkpoints– To be able to recover without reading the complete log,
periodically snapshots of the database is made.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Potential backup procedures
Getrecord
Retrievedrecord
Processrecord
Updatedrecord
CPU Log updatetransaction
Log before imageof record
Log after imageof record
Outputmessage
Updatetransaction
8
9
4
5
2
1
7
Writeupdatedrecord
Obtainrecord
6
3
Periodicdatabasebackup
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Integrity constraints
Example Explanation
CREATE TABLE STOCK
(STKCODE CHAR(3) NOT NULL,
…,
NATCODE CHAR(3),
PRIMARY KEY(STKCODE),
FOREIGN KEY FKNATION(NATCODE)
REFERENCES NATION
ON DELETE RESRICT)
STKCODE must always be assigned a value of 3 or less alphanumeric characters. STKCODE must be unique because it is a primary key.NATCODE must be assigned a value of 3 or less alphanumeric characters. NATCODE must exist as the primary key of the table NATION.Do not allow the deletion of a row in NATION while there still exist rows in STOCK containing the corresponding value of NATCODE.
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
A general model of data security
Identificationchecked
Authorizationchecked
Dataretrieved
Encryptionprocessing Database
User profilesand
authorizationtables
User
Userid
DBMS access denied
Identification data
User privilegesdata
DBMS access approved
Retrieval request
Request denied
Results of request
Request approved
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Authenticating mechanisms
• Information remembered by the person– Name– Account number– Password
• Object possessed by the person– Badge– Plastic card– Key
• Personal characteristic– Fingerprint– Signature– Voiceprint– Hand size
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Authorization tables
• Indicate authority of each user or groupSubject/Client Action Object Constraint
Accounting department Insert Supplier record None
Purchase department clerk Insert Supplier record If quantity < 200
Purchase department supervisor
Insert Delivery record If quantity ≥ 200
Production department Read Delivery record None
Todd Modify Item record Type and color only
Order processing program Modify Sale record None
Brier Delete Supplier record None
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
SQL authorization
• Grant– Giving privileges to users
• Revoke– Removing privileges
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Firewall
• A device placed between an organization’s network and the Internet
• Monitors and controls traffic between the Internet and Intranet
• Approaches– Restrict packets to those with designated IP
addresses– Restrict access to applications
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Encryption
• Encryption is as old as writing• Sensitive information needs to be remain
secure• Critical to electronic commerce• Encryption hides the meaning of a message• Decryption reveals the meaning of an
encrypted message
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Public key encryption
DecryptEncrypt
Receiver’spublic key
Receiver’sprivate key
Sender Receiver
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Signing
• Message authentication
VerifySign
Sender’sprivate key
Sender’spublic key
Sender Receiver
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Monitoring activity
• Audit trail analysis– Time and date stamp all transactions
• Monitor a sequence of queries– Tracker queries
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Data administration is the common term applied to the task of managing
organizational memory.
Data Administration
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Data management problems
• Redundancy
• Inconsistent representations
• Multiple definitions of data items
• Essential data missing
• Inaccurate or incomplete data
• Uncaptured data
• Data that cannot be located
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Goals of data management
• Enable users to access the data they need in the most suitable format
• Maintain data integrity
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Management of the database environment
Data Administration(project level support)
Data Administration(system level support)
DataDictionary/DirectorySystem(DD/DS)
DatabaseManagement
System(DBMS)
User-Systeminterfaces
Multipledatabases
ExternalDBMS
Database development
cycle
Feasibility analysisDevelop a project implementation planDevelop data standardsEstablish data stewards
Project planning
1
Identify data requirements
Requirementsdefinition
2
Develop a data modelSpecify data integrity controlsSpecify test proceduresDesign
3
Map the data model to the DBMSEstablish and test data integrity controlsAccess control and securityIntegrity constraints and data validation rulesBackup and recovery proceduresCreate and load the test databaseTest database operation and integrity controls
Testing
4
Implement data integrity controlsCreate and load databaseTrain usersImplementation
5
Monitor database performanceTune and reorganize database as neededEnforce data standards and policiesSupport users: consulting, informing, and training
Use
6
Plan and implement growthImplement change control procedures
Evolution
7
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Development roles
Database Development Phase
Database Developer
Data Administrator
User
Project planning Does Consults Provides information
Requirements definition
Does Consults Provides requirements
Database design Does Consults
Data integrity
Validates data models
Database testing System and user testing
Consults
Data integrity
Does user testing
Database implementation
System related activities
Consults
Data integrity
Does user activities
Database usage Consults Data integrity monitoring
Uses
Database evolution Does Change control Provides additional requirements
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Data administration tools
Database development phase
Data Dictionary (DD) Database Management System (DBMS)
Performance monitoring
Case tools
1. Project planning Document Data mapDesign aid
Estimation tools
2. Requirements definition
Document Design aid DocumentDesign aid
3. Database design DocumentDesign aidData mapSchema generator
DocumentDesign aidData map
4.Database testing Data mapDesign aidSchema generator
Define, create, test, data integrity
Impact analysis Test data generatorDesign aid
5.Database implementation
DocumentChange control
Data integrityImplement Design
MonitorTune
6. Database use Document Data mapSchema generatorChange control
Provide tools for retrieval and updateEnforce integrity controls and procedures
Monitor Tune
7. Database evolution DocumentData mapChange control
Redefine Impact analysis
Lene Pries-Heje Managing Organizational memory
Databasesystemer E2002
Use of the data dictionary
• Documentation support
• Data maps
• Design aid
• Schema generation
• Change control