
Databases and SQLite

Steve Brown
Melbourne Linux Users Group

Programming SIG
http://www.mlinux.org

17 February 2009


Outline

• What are databases and why?

• SQL

• SQLite

• Java bindings for SQLite

• Python bindings for SQLite



Definition of Database

• A database is a structured, persistent collection of data records stored on a computer system

• The software that implements and controls a database is called a database management system (DBMS)

• The organization of data in a database is described by a database model or database schema



Examples of Database Models

Source: http://en.wikipedia.org/wiki/File:Database_models.jpg



Some Types of Databases

• Flat (or table)

– Defines data as a two-dimensional array of records (rows) and columns

– Each column has a name and a data type

• Hierarchical

– Defines data in a tree-like structure of branches and leaves

– Examples: filesystems, XML documents

• Object

– These attempt to match the database model more closely to object-oriented programming styles

– There are several approaches (e.g. object-relational databases, object-oriented databases) which are not as yet well standardized

• Relational

– The most common type of database in use (more on these coming up)



Ancient History

• Databases began to be used in the 1960s, as computers came to be applied to business applications

• Most early DBMSs followed the network or hierarchical model; most were tightly coupled to the underlying data, to make efficient use of scarce resources

• In 1970, Edgar Codd at IBM developed the concept of the relational database, which represented data as a set of tables of fixed-length records, together with relationships between the tables implemented as keys (values appearing in multiple tables)

• The concept was very general, and quickly caught on

• The first popular RDBMS (relational DBMS) for microcomputers was dBase II for CP/M and later DOS

– This was followed by many others



Relational Databases

• The best way to store data for efficient searching and retrieval is to use fixed-length records

• But fixed-length records aren’t efficient for data with optional fields, or variable-length records

• Codd realized that tables could be tied together with keys, common fields in multiple tables, so that the database could be both efficiently accessed and efficiently stored

• Codd used tuple calculus to show that this kind of database model could do insertions, updates, etc. and still allow efficient lookups



Structured English Query Language

• IBM worked on a prototype RDBMS (“System R”) soon after Codd published his paper

• In the course of this, they came up with a language for describing data retrieval and insertion operations

• This was initially called SEQUEL, but the name was later changed to SQL

• Eventually, this became a standard language for interacting with databases

– Strictly speaking, an SQL database is not exactly a relational database

– But now, when most people say “relational database”, they mean an SQL database



How It Works

• A relational database comprises a set of tables with named columns (formally, a table with its set of fields [i.e. columns] is sometimes called a relvar)

• Some of the columns [fields] belong to more than one table [relvar]

• These columns [fields] are called keys

• Every key must be unique in at least one table [relvar], i.e. no two rows of the table can have the same value for this field

– This is sometimes called a candidate key in the table for which it is unique, and a foreign key in other tables

• Candidate/foreign key columns implement the relationship between tables



Database Integrity

An RDBMS will usually enforce rules on the database model to help ensure data integrity and efficient access

• Every table has at least one unique key, called a primary key (enforcing the rule that every table row is unique)

• Every foreign key value must occur as a candidate key value in some other table (enforcing a one-to-many relationship)

• Foreign keys may also be specified as unique (enforcing a one-to-one-or-none relationship)

• Not all keys implement relationships; some keys are used for efficient access
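
As a rough illustration of these rules, the sketch below uses Python's sqlite3 module with an in-memory database. The Recipes and Ingredients tables borrow names used elsewhere in this talk, but the exact schema and the try_insert helper are assumptions made for the example. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Assumed schema: RecID is the primary key of Recipes and a
# foreign key in Ingredients.
conn.execute("CREATE TABLE Recipes (RecID INTEGER PRIMARY KEY, Title TEXT)")
conn.execute(
    "CREATE TABLE Ingredients ("
    " Stuff TEXT,"
    " RecID INTEGER REFERENCES Recipes(RecID))"
)

conn.execute("INSERT INTO Recipes (RecID, Title) VALUES (1, 'lasagna')")
conn.execute("INSERT INTO Ingredients (Stuff, RecID) VALUES ('pasta', 1)")


def try_insert(sql):
    """Hypothetical helper: run one statement, report whether it was accepted."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second recipe with RecID 1 breaks the primary key rule.
duplicate_key_ok = try_insert(
    "INSERT INTO Recipes (RecID, Title) VALUES (1, 'soup')"
)
# An ingredient pointing at a recipe that does not exist breaks
# the foreign key rule.
orphan_row_ok = try_insert(
    "INSERT INTO Ingredients (Stuff, RecID) VALUES ('salt', 99)"
)
print(duplicate_key_ok, orphan_row_ok)
```

Both inserts should be rejected, so the database never holds two recipes with the same key or an ingredient without a recipe.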



Database Relations

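[Figure: one-to-many relation. One table has columns Recipe_ID, Title, Serves, Preparation; a second table has columns Recipe_ID, Quantity, Unit, Ingredient. The tables are linked through Recipe_ID.]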



Database Relations

[Figure: one-to-one linkage. Two tables are linked through a shared Cat_id column; the other columns shown are Name, Model, Link.]



Database Relations

[Figure: many-to-one linkage. Two tables are linked through a shared Lnk_ID column; the other columns shown are Link, URL, Tag.]



Database Relations

[Figure: many-to-many linkage. One table has columns RecID, Name, Link; a second has columns TagID, Tag; a third linking table holds RecID, TagID pairs.]
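
A many-to-many linkage of this kind can be sketched with Python's sqlite3 module. The column names follow the figure, but the table names (Bookmarks, Tags, BookmarkTags), the sample rows, and the names_tagged helper are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Bookmarks (RecID INTEGER PRIMARY KEY, Name TEXT, Link TEXT);
    CREATE TABLE Tags (TagID INTEGER PRIMARY KEY, Tag TEXT);
    -- Linking table: one row per (bookmark, tag) pairing.
    CREATE TABLE BookmarkTags (RecID INTEGER, TagID INTEGER);

    -- Made-up sample data.
    INSERT INTO Bookmarks VALUES (1, 'SQLite', 'http://www.sqlite.org/');
    INSERT INTO Bookmarks VALUES (2, 'MLUG', 'http://www.mlinux.org');
    INSERT INTO Tags VALUES (1, 'database');
    INSERT INTO Tags VALUES (2, 'linux');
    INSERT INTO BookmarkTags VALUES (1, 1);
    INSERT INTO BookmarkTags VALUES (1, 2);
    INSERT INTO BookmarkTags VALUES (2, 2);
""")


def names_tagged(tag):
    """Hypothetical helper: names of all bookmarks carrying the given tag."""
    rows = conn.execute(
        "SELECT Bookmarks.Name FROM Bookmarks, BookmarkTags, Tags"
        " WHERE Tags.Tag = ?"
        " AND Tags.TagID = BookmarkTags.TagID"
        " AND BookmarkTags.RecID = Bookmarks.RecID"
        " ORDER BY Bookmarks.Name",
        (tag,),
    )
    return [name for (name,) in rows]


linux_names = names_tagged("linux")        # two bookmarks share this tag
database_names = names_tagged("database")  # one bookmark has this tag
print(linux_names, database_names)
```

One bookmark can carry several tags and one tag can belong to several bookmarks, because each pairing is its own row in the linking table.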



SQL

SQL expressions fall into several different categories:

• Queries, for retrieving data

• Data Manipulation Language (DML) statements, for inserting, deleting, or modifying the data

• Data Description Language (DDL) statements, for creating, destroying, or modifying tables

• Data Control Language (DCL) statements for controlling access rights

• Comments



SQL Queries

• SQL queries return data from the database

• Queries begin with the SELECT keyword and can include a wide range of qualifiers

    SELECT FirstName, LastName FROM Patients WHERE Age > 35;

• Queries can also perform joins, which are data sets spanning two or more tables which have some relationship

    SELECT Ingredients.Quantity, Ingredients.Stuff
    FROM Ingredients, Recipes
    WHERE Recipes.Name = 'lasagna' AND Recipes.RecID = Ingredients.RecID;

• Queries can become very complex
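
To see the join at work, the sketch below runs the same query through Python's sqlite3 module against an in-memory database. The table layouts and sample rows are assumptions made for the example; only the query itself comes from the slide, with an ORDER BY added so the row order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layouts and made-up sample data.
    CREATE TABLE Recipes (RecID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Ingredients (RecID INTEGER, Quantity TEXT, Stuff TEXT);
    INSERT INTO Recipes VALUES (1, 'lasagna');
    INSERT INTO Recipes VALUES (2, 'toast');
    INSERT INTO Ingredients VALUES (1, '500 g', 'mince');
    INSERT INTO Ingredients VALUES (1, '12 sheets', 'pasta');
    INSERT INTO Ingredients VALUES (2, '2 slices', 'bread');
""")

# The join matches each ingredient row to its recipe through RecID,
# then keeps only the rows belonging to the lasagna recipe.
lasagna = conn.execute(
    "SELECT Ingredients.Quantity, Ingredients.Stuff"
    " FROM Ingredients, Recipes"
    " WHERE Recipes.Name = 'lasagna'"
    " AND Recipes.RecID = Ingredients.RecID"
    " ORDER BY Ingredients.Stuff"
).fetchall()

for quantity, stuff in lasagna:
    print(quantity, stuff)
```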



Result Sets

• Queries return result sets, which are like tables (they have columns and rows)

• Depending on the RDBMS implementation and the language binding, a result set can be returned as a temporary table or as a view

• A view is a structure which can be iterated with a cursor, allowing access to a single row of the result set at a time

• Some views are updatable: changes to the view will propagate back to the underlying table
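
In Python's sqlite3 binding, a result set is reached through a cursor object. The sketch below is one way to step through it a row at a time; the Patients table layout and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (FirstName TEXT, LastName TEXT, Age INT)")
# Made-up sample rows.
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?, ?)",
    [
        ("Fred", "Flintstone", 29),
        ("Wilma", "Flintstone", 36),
        ("Barney", "Rubble", 40),
    ],
)

cur = conn.cursor()
cur.execute(
    "SELECT FirstName, LastName FROM Patients"
    " WHERE Age > 35 ORDER BY FirstName"
)

# The result set has named columns, like a table.
column_names = [d[0] for d in cur.description]

# fetchone() hands back a single row and advances the cursor.
first_row = cur.fetchone()

# Iterating the cursor yields the rows that have not been fetched yet.
remaining = [row for row in cur]

print(column_names, first_row, remaining)
```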



Data Manipulation Language

• DML instructions can add rows to a table

    INSERT INTO Patients (FirstName, LastName, Age)
    VALUES ('Fred', 'Flintstone', 29);

• update rows in a table

    UPDATE Patients SET Age = 29
    WHERE FirstName = 'Fred' AND LastName = 'Flintstone';

• and delete rows from a table

    DELETE FROM Patients
    WHERE FirstName = 'Fred' AND LastName = 'Flintstone';
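
From a host language, the same three statements are usually issued with placeholders rather than values pasted into the SQL text. The sketch below does this with Python's sqlite3 module; the Patients table layout and the updated age of 30 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (FirstName TEXT, LastName TEXT, Age INT)")

# INSERT: the ? placeholders are filled from the tuple that follows.
conn.execute(
    "INSERT INTO Patients (FirstName, LastName, Age) VALUES (?, ?, ?)",
    ("Fred", "Flintstone", 29),
)

# UPDATE: rowcount reports how many rows the statement changed.
cur = conn.execute(
    "UPDATE Patients SET Age = ? WHERE FirstName = ? AND LastName = ?",
    (30, "Fred", "Flintstone"),
)
updated = cur.rowcount
age_after_update = conn.execute("SELECT Age FROM Patients").fetchone()[0]

# DELETE: removes every row matching the WHERE clause.
cur = conn.execute(
    "DELETE FROM Patients WHERE FirstName = ? AND LastName = ?",
    ("Fred", "Flintstone"),
)
deleted = cur.rowcount
rows_left = conn.execute("SELECT COUNT(*) FROM Patients").fetchone()[0]

conn.commit()
print(updated, age_after_update, deleted, rows_left)
```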



DML Transactions

• There is no “undo” command for the DML, but some complicated databases require that changes in many places all be made together to keep the database consistent

• To support this, SQL databases usually implement transaction control:

    BEGIN TRANSACTION; -- dialect warning

    UPDATE Accounts SET Balance = Balance + 50
    WHERE AccountNumber = 342;

    UPDATE Accounts SET Balance = Balance - 50
    WHERE AccountNumber = 117;

    COMMIT TRANSACTION; -- or ROLLBACK TRANSACTION

• Grouping DML into transactions can also be more efficient in some cases
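
The all-or-nothing behaviour can be sketched with Python's sqlite3 module, where using the connection as a context manager commits on success and rolls back if an exception escapes. The CHECK constraint, the starting balances, and the transfer and balances helpers are assumptions added so that one transfer fails partway through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the CHECK constraint forbids overdrawn accounts.
conn.execute(
    "CREATE TABLE Accounts ("
    " AccountNumber INTEGER PRIMARY KEY,"
    " Balance INTEGER CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO Accounts VALUES (342, 100)")
conn.execute("INSERT INTO Accounts VALUES (117, 30)")
conn.commit()


def transfer(amount, src, dst):
    """Hypothetical helper: move money, or change nothing at all."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ?"
                " WHERE AccountNumber = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ?"
                " WHERE AccountNumber = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Hypothetical helper: current balances keyed by account number."""
    return dict(conn.execute("SELECT AccountNumber, Balance FROM Accounts"))


# Account 117 holds only 30, so the second UPDATE fails and the
# credit already applied to account 342 is rolled back with it.
first_ok = transfer(50, 117, 342)
after_failed = balances()

second_ok = transfer(20, 117, 342)
after_success = balances()
print(first_ok, after_failed, second_ok, after_success)
```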



Data Description Language

• DDL commands create databases and tables

    CREATE TABLE Recipes (Title TEXT, Serves INT,
                          Prep TEXT, RecID INTEGER PRIMARY KEY);

• delete tables

    DROP TABLE Recipes;

• change the data model

    ALTER TABLE Recipes DROP COLUMN Serves;

• and other operations depending on implementation
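
The sketch below runs a create, alter, and drop cycle through Python's sqlite3 module and inspects the schema after each step. It uses ADD COLUMN rather than DROP COLUMN, because SQLite releases before 3.35.0 do not accept DROP COLUMN; the Source column and the two helper functions are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """Hypothetical helper: tables currently defined in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def column_names(table):
    """Hypothetical helper: column names of one table, in order."""
    # PRAGMA table_info returns one row per column; field 1 is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn.execute(
    "CREATE TABLE Recipes ("
    " Title TEXT, Serves INT, Prep TEXT, RecID INTEGER PRIMARY KEY)"
)
created = table_names()

# Change the data model by adding a (made-up) column.
conn.execute("ALTER TABLE Recipes ADD COLUMN Source TEXT")
columns = column_names("Recipes")

conn.execute("DROP TABLE Recipes")
dropped = table_names()
print(created, columns, dropped)
```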



Data Control Language

• DCL commands control access to the table

• Depending on implementation, access control can usually apply to specific commands, users, and hosts



Client/Server Database Implementations



Advantages to Client/Server Paradigm

• Scalability

• Portability

• Separates application from database schema

• Facilitates unit testing

• Enables WWW applications



SQLite

• Flat-file implementation

• Implements large subset of SQL

• NOT a client/server-type implementation



Advantages to SQLite

• Low overhead, lightweight implementation

• Persistent, sophisticated data storage

• Environment for application prototyping, testing database schema



Java Binding for SQLite

• Java provides a standard object interface for talking to RDBMSes called JDBC™

• JDBC requires a driver object specific to the RDBMS

• SQLiteJDBC is a SQLite driver for JDBC™



Python Binding for SQLite

• sqlite3 has been part of the Python library since 2.5

• sqlite3 provides connection and cursor objects to manage databases
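
A minimal sketch of the connection and cursor objects follows. The database file name, the temporary directory, and the sample recipe are assumptions made for the example; the file is reopened at the end to show that the data persists in a single file on disk.

```python
import os
import sqlite3
import tempfile

# Hypothetical location: a database file inside a fresh temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "recipes.db")

# A connection represents the database file; a cursor runs statements on it.
conn = sqlite3.connect(db_path)
cur = conn.cursor()
cur.execute("CREATE TABLE Recipes (Title TEXT, Serves INT)")
cur.execute("INSERT INTO Recipes VALUES (?, ?)", ("lasagna", 6))
conn.commit()  # write the pending changes to the file
conn.close()

# Reopening the same file shows that the data has persisted.
conn = sqlite3.connect(db_path)
stored = conn.execute("SELECT Title, Serves FROM Recipes").fetchall()
conn.close()
print(stored)
```

Passing ":memory:" instead of a file path gives a throwaway database, which can be handy for prototyping and tests.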



Potential Problems with SQLite

• Scaling

• Database locking

• Dialect differences with other RDBMS



SQL Shortcomings

• Lack of standardization

• APIs don’t insulate the programmer from SQL syntax

– Note that both the Java and Python RDBMS APIs still require the programmer to write SQL

• Complex queries



Pitfalls with Databases

• Optimizing data models

• Overhead

• Poor fit with application language



References

• SQLite: http://www.sqlite.org/

• Java JDBC: http://java.sun.com/javase/6/docs/api/java/sql/package-summary.html

• Java JDBC tutorial: http://java.sun.com/docs/books/tutorial/jdbc/index.html

• SQLiteJDBC: http://zentus.com/sqlitejdbc/

• Python/sqlite: http://docs.python.org/library/sqlite3.html

• pysqlite code snippets: http://oss.itsystementwicklung.de/trac/pysqlite/wiki/CodeSnippets
