course:
Database Applications (NDBI026) WS2018/19
RNDr. Michal Kopecký, Ph.D. Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague
M. Kopecký SQL Language - Introduction (NDBI026, Lect. 1) 2
Student duties: final DB application
▪ DB layers
▪ Application layer in DB (procedures/functions/triggers, etc.)
▪ Either in Oracle or MS SQL database
Attendance recommended (but not mandatory)
▪ the slides alone are not comprehensive
Other sources
▪ manuals:
  http://docs.oracle.com/cd/E11882_01/index.htm
  http://www.orafaq.com/
  http://technet.microsoft.com/en-us/library/bb545450.aspx
▪ web: http://www.ms.mff.cuni.cz/~kopecky/teaching/ndbi026/
What the course is about (knowledge of the theory from course NDBI025 is assumed)
Practical database development against a given database server
What to take into account
▪ During creation of the DB schema
▪ When SQL queries are written
▪ Optimization
▪ Indexes
▪ Execution plans
▪ In multi-user environment
▪ Locking
▪ Transaction processing
▪ For data security
Other topics are the subject of follow-up courses
Database languages I, II
Datalog
Oracle and MS SQL Server administration
Transactions
Stochastic methods in databases
Searching the web and multimedia databases
Retrieval of multimedia content on the web
XML technology
NoSQL databases
Relational Model
Currently the main platform for OLTP/OLAP
Query optimization
Indexing and correct query formulation can change the execution time by many orders of magnitude
Multi-user environment
An incorrectly implemented application can cause incorrect data processing and unexpected results
Procedural extension
Triggers - extended integrity constraint checking
Procedures and functions – application logic
Object-oriented extensions
User-defined types
Nested tables
Full-text extensions
XML data processing
RDBMS Oracle 11g
Object-relational database
Support for server-side code execution in languages:
▪ PL/SQL
▪ Java
▪ C/C++ (any .dll/.so library)
XML support, multi-media support, …
RDBMS MS SQL 2008 R2
Object-relational database
Support for server-side code execution in languages:
▪ T-SQL
▪ C#
XML support, text-search support, …
SQL standards and implementations
SELECT statement
Embedded functions
Structured query language
Standard language for access to (relational) databases
Originally intended to resemble “natural language” (that’s why, e.g., SELECT is so complex – a single phrase)
Different subsets of statements
Data definition language (DDL)
▪ CREATE/ALTER – creation and altering of relational (table) schemas
▪ Definition of integrity constraints
Data manipulation language (DML)
▪ Querying
▪ Data insertion, Deletion, Updating
Transaction management
Administration
Standards ANSI/ISO SQL 86, 89, 92, 1999, 2003 (backwards compatible)
Commercial systems implement SQL at different standard levels (most often SQL 99, 2003)
Unfortunately, the implementations are not strict
▪ Many extra nonstandard features are supported
▪ Some standard ones are not supported
Specific extensions for procedural, transactional and other functionality
▪ TRANSACT-SQL (Microsoft SQL Server)
▪ PL/SQL (Oracle)
SQL 86 – first “shot”, intersection of IBM SQL implementations
SQL 89 – small revision triggered by industry, many details left for 3rd parties
SQL 92 – stronger language, specification 6x longer than for SQL 86/89
schema modification, tables with metadata, inner joins, cascade deletes/updates based on foreign keys, set operations, transactions, cursors, exceptions
four subversions – Entry, Transitional, Intermediate, Full
SQL 1999 – many new features, e.g.,
object-relational extensions
types STRING, BOOLEAN, REF, ARRAY, types for full-text, images, spatial data
triggers, roles, programming language, regular expressions, recursive queries, etc.
SQL 2003 – further extensions, e.g., XML management, autonumbers, std. sequences, but also type BIT removed
[Figure: SQL standards]
SQL-86
SQL-92 (ANSI SQL2; ISO/IEC 9075:1992), levels Entry, Intermediate, Full
SQL-99 (ANSI/ISO/IEC 9075:1999)
SQL-2003 (ISO/IEC 9075:2003)
…
Individual database servers do not strictly follow the standards
Usually SQL-92 Entry
Many non-portable extensions
▪ Strong vendor lock-in
Not all features are implemented according to ANSI
▪ Newer versions have better compatibility
▪ Native and ANSI versions usually exist side by side
[Figure: SQL-86, SQL-92 and SQL-99 compared with a common SQL-92 compatible RDBMS]
The more features beyond SQL-92 Entry the application uses, the lower the probability that it will run on an RDBMS from a different vendor
Many high-level features are available only in proprietary form and cannot be easily ported
The platform has to be chosen before application development starts
Changing the platform during development is complicated and expensive
What to do if the RDBMS doesn’t understand my query?
Is the statement (in)correct?
Does the RDBMS (not) understand/support the given feature?
In the end, the SQL statement has to be rewritten
SELECT [DISTINCT] expr_c1 [[AS] c_alias1] [, …]
FROM source1 [[AS] t_alias1] [, …]
[WHERE row_cond]
[GROUP BY expr_g1 [, …] [HAVING group_cond]]
[ORDER BY expr_o1 [, …]]
First, all data sources (tables, views, sub-queries) are combined together
If sources are delimited by commas, a Cartesian product is computed
ANSI SQL-92 introduced JOIN ON, NATURAL JOIN, OUTER JOIN, …
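A minimal sketch of the difference between comma-separated sources and the SQL-92 JOIN ON syntax. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Oracle or MS SQL; the tables Person and City and their rows are hypothetical and serve only this illustration.

```python
import sqlite3

# Hypothetical tables used only for this illustration (SQLite stand-in).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE City   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT, city_id INTEGER);
    INSERT INTO City   VALUES (1, 'Prague'), (2, 'Brno');
    INSERT INTO Person VALUES (1, 'Anna', 1), (2, 'Petr', 2), (3, 'Jana', 1);
""")

# Comma-separated sources: every person is paired with every city.
cartesian = conn.execute(
    "SELECT p.name, c.name FROM Person p, City c"
).fetchall()

# Old style: the Cartesian product is restricted by a WHERE condition.
comma_join = conn.execute(
    "SELECT p.name, c.name FROM Person p, City c "
    "WHERE p.city_id = c.id ORDER BY p.id"
).fetchall()

# SQL-92 style: the join condition is part of the FROM clause.
ansi_join = conn.execute(
    "SELECT p.name, c.name FROM Person p JOIN City c ON p.city_id = c.id "
    "ORDER BY p.id"
).fetchall()

print(len(cartesian))            # 3 persons x 2 cities = 6 rows
print(comma_join == ansi_join)   # both formulations return the same rows
```

Both join formulations return the same rows; the JOIN ON form only keeps the join condition apart from the row filter.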
Second, rows that don’t satisfy the condition are eliminated
Remaining rows are grouped according to equality of grouping expressions (SORT/HASH)
Every resulting row – group – contains atomic columns with values of grouping expressions and set columns with sets of values from all rows that form the group
Groups that don’t satisfy the group condition are eliminated
Rows/groups are ordered according to required expression values
Remaining (ordered) rows/groups are produced on the output
In case of SELECT DISTINCT, all duplicates are removed (before ORDER BY)
This requires an additional SORT/HASH operation
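The logical evaluation steps described on the preceding slides can be followed on one small query. This is only a sketch: it assumes SQLite (via Python's sqlite3) instead of Oracle or MS SQL, and the Citizen table with its columns and rows is hypothetical. The numbered comments mirror the order used on the slides, not the physical execution plan of any particular server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Citizen (name TEXT, city TEXT, street TEXT, age INTEGER);
    INSERT INTO Citizen VALUES
        ('Anna',  'Prague', 'Dlouha', 30),
        ('Petr',  'Prague', 'Dlouha', 17),
        ('Jana',  'Prague', 'Dlouha', 45),
        ('Karel', 'Prague', 'Kratka', 52),
        ('Eva',   'Brno',   'Ceska',  28),
        ('Ivan',  'Brno',   'Ceska',  33);
""")

rows = conn.execute("""
    SELECT city, street, COUNT(*) AS adults  -- 6. output columns are produced
    FROM Citizen                             -- 1. sources are combined
    WHERE age >= 18                          -- 2. rows failing the condition are dropped
    GROUP BY city, street                    -- 3. remaining rows are grouped
    HAVING COUNT(*) >= 2                     -- 4. groups failing the condition are dropped
    ORDER BY city, street                    -- 5. remaining groups are ordered
""").fetchall()

# DISTINCT removes duplicate output rows.
cities = conn.execute(
    "SELECT DISTINCT city FROM Citizen ORDER BY city"
).fetchall()

print(rows)
print(cities)
```

Petr is removed by WHERE before grouping, and the single-row group (Prague, Kratka) is removed by HAVING after grouping.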
GROUP BY has to sort/hash all rows to put the rows of one group together
It is useful to group as few rows as possible
If rows can be filtered out by the WHERE clause before grouping, the query will be more efficient than if unwanted groups are eliminated later
Filtering before grouping:
SELECT Street, COUNT(*)
FROM Citizen
WHERE City='Prague'
GROUP BY City, Street;
Only one million rows are sorted/hashed
Filtering after grouping:
SELECT Street, COUNT(*)
FROM Citizen
GROUP BY City, Street
HAVING City='Prague';
Ten million rows are sorted/hashed, most of the groups are dropped in the next step
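A small sketch showing that the two formulations return the same result and differ only in how many groups have to be built. It assumes SQLite (via Python's sqlite3) as a stand-in server; the sample rows are hypothetical and far smaller than the millions of rows mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Citizen (name TEXT, City TEXT, Street TEXT)")
conn.executemany(
    "INSERT INTO Citizen VALUES (?, ?, ?)",
    [
        ("Anna", "Prague", "Dlouha"),
        ("Jana", "Prague", "Dlouha"),
        ("Karel", "Prague", "Kratka"),
        ("Eva", "Brno", "Ceska"),
        ("Ivan", "Brno", "Ceska"),
    ],
)

# Rows are filtered first, only Prague rows reach the grouping step.
where_first = conn.execute("""
    SELECT Street, COUNT(*) FROM Citizen
    WHERE City = 'Prague'
    GROUP BY City, Street
    ORDER BY Street
""").fetchall()

# All rows are grouped, non-Prague groups are dropped afterwards.
having_later = conn.execute("""
    SELECT Street, COUNT(*) FROM Citizen
    GROUP BY City, Street
    HAVING City = 'Prague'
    ORDER BY Street
""").fetchall()

# Number of groups that exist before HAVING is applied.
groups_without_filter = conn.execute(
    "SELECT COUNT(*) FROM (SELECT 1 FROM Citizen GROUP BY City, Street)"
).fetchone()[0]

print(where_first == having_later)
print(len(where_first), groups_without_filter)
```

An optimizer may rewrite the second form into the first one, but this should not be relied on; that is an assumption to verify in the execution plan of the target server.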
The DISTINCT clause sorts (hashes) the resulting rows (even before the ORDER BY operation) to find and eliminate duplicate rows
If possible, it is better to write the query without the DISTINCT clause
The ORDER BY clause should be used only when necessary
It is not a good idea to use it in view definitions, because a view is often used as a source for further querying
CREATE TABLE tab_name (
col_name data_type [(maxsize[,prec])] [col_constr],
…,
row_constraint,
…
);
CREATE TABLE Person (
id numeric(11,0)
CONSTRAINT Person_PK PRIMARY KEY,
name character(50) NOT NULL
);
SQL-92 distinguishes two server-side encodings of characters
▪ Due to UTF-8 (UTF-16) support
▪ Able to store and manipulate characters from any language
▪ Less efficient multi-byte storage for national language alphabets
1. Global character set
▪ Can use a single-byte encoding (CP-1250, ISO-8859-2, …) or UTF
2. National character set
▪ For texts in any language, can use UTF
SQL-92 further distinguishes two string representations
1. Fixed length
– Simpler data updates
– Less space-efficient representation
2. Variable length, only the used characters are stored (plus the length)
– More space-efficient representation
– More complicated data updates due to the varying number of bytes needed
CHARACTER(n)
  text of fixed length, n bytes/chars
CHARACTER VARYING(n), CHAR VARYING(n)
  text of variable length, max. n bytes/chars
NATIONAL CHARACTER(n)
  text of fixed length, n bytes/chars in the national alphabet
NATIONAL CHARACTER VARYING(n), NATIONAL CHAR VARYING(n), NCHAR VARYING(n)
  text of variable length, max. n bytes/chars in the national alphabet
String constants are enclosed in single quotes
A single quote inside a string has to be doubled
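A short sketch of the quote-doubling rule, assuming SQLite (via Python's sqlite3) as a stand-in server; the sample name is hypothetical. The second query passes the same value as a bound parameter, a common way to avoid writing the escaping by hand in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The SQL literal 'O''Brien' denotes the seven-character string O'Brien.
literal = conn.execute("SELECT 'O''Brien'").fetchone()[0]

# The same value passed as a bound parameter needs no doubling.
bound = conn.execute("SELECT ?", ("O'Brien",)).fetchone()[0]

print(literal)
print(literal == bound)
```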
NUMERIC(p[,s]) general numeric type with p digits in total and a fixed decimal point, with s digits after the decimal point
INTEGER, INT, SMALLINT integer
FLOAT(b) real with b-bit precision
REAL real
DOUBLE PRECISION real number with double precision
DATE date (YYYY-MM-DD), precision at least days, maybe more
TIME time (HH:MM:SS.MMMM), precision at least seconds
TIMESTAMP date plus time (YYYY-MM-DD HH:MM:SS.MMMM)
TIMESTAMP(p) WITH TIME ZONE p denotes the precision of fractional seconds, time zone as +HH:MM, resp. –HH:MM at the end
Constants are enclosed in single quotes in the shown format
Databases
Do not necessarily support all the mentioned types
Sometimes do not support them natively; the data type is “translated” to a similar natively supported type
SQL-92 type → Oracle type
CHARACTER(n) → CHAR(n)
CHARACTER VARYING(n) → VARCHAR2(n)
CHAR VARYING(n) → VARCHAR2(n)
NATIONAL CHARACTER(n) → NCHAR(n)
NATIONAL CHARACTER VARYING(n) → NVARCHAR2(n)
NATIONAL CHAR VARYING(n) → NVARCHAR2(n)
NCHAR VARYING(n) → NVARCHAR2(n)
NUMERIC(p,s) → NUMBER(p,s)
INTEGER, INT, SMALLINT → NUMBER(38)
FLOAT(b) → NUMBER
DOUBLE PRECISION → NUMBER
REAL → NUMBER
DATE
Precision in seconds, i.e. corresponds to the minimal requirements for TIMESTAMP in SQL-92
Default (American) format DD-MON-YY, for example 01-JAN-15
VARCHAR2(size) (recommended), VARCHAR(size)
String in variable-length representation
▪ The size is max. 4000 chars (recommended max. 2000 chars)
[CONSTRAINT cons_name] constraint_definition
[INITIALLY {DEFERRED|IMMEDIATE}] [[NOT] DEFERRABLE]
If a constraint is not explicitly named, it usually obtains an artificial name (in Oracle, e.g., SYS_Cnnnnnn)
Therefore it is recommended to name constraints explicitly
Column constraints are separated from each other by a space
NULL, resp. NOT NULL
The column can, resp. cannot, contain the undefined value NULL
UNIQUE
All non-NULL values in the column have to be different
PRIMARY KEY
The column forms the primary key of the table; it is automatically understood as both NOT NULL and UNIQUE
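A minimal sketch of these column constraints in action, assuming SQLite (via Python's sqlite3) as a stand-in server. The table layout, the constraint name Person_U_Email and the helper try_insert are hypothetical. How UNIQUE treats repeated NULLs is shown here for SQLite only and should be checked on the target server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        id    INTEGER     CONSTRAINT Person_PK PRIMARY KEY,
        name  VARCHAR(50) NOT NULL,
        email VARCHAR(50) CONSTRAINT Person_U_Email UNIQUE
    )
""")

def try_insert(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

first_row    = try_insert("INSERT INTO Person VALUES (1, 'Anna', 'anna@example.com')")
dup_key      = try_insert("INSERT INTO Person VALUES (1, 'Petr', 'petr@example.com')")
null_name    = try_insert("INSERT INTO Person VALUES (2, NULL, 'x@example.com')")
dup_email    = try_insert("INSERT INTO Person VALUES (3, 'Jana', 'anna@example.com')")
null_email_1 = try_insert("INSERT INTO Person VALUES (4, 'Karel', NULL)")
null_email_2 = try_insert("INSERT INTO Person VALUES (5, 'Eva', NULL)")

print(first_row, dup_key, null_name, dup_email, null_email_1, null_email_2)
```

Only the duplicate key, the NULL name and the duplicate e-mail are rejected; the two rows with a NULL e-mail are both accepted in this setup.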
CHECK (condition)
The column value has to fulfill the given condition
REFERENCES table_name(column) [ON DELETE {CASCADE|SET NULL}]
The column value references the primary key or a candidate key (UNIQUE column) of the given table
With the ON DELETE clause, deletion of the master row is allowed: when it is deleted, the referencing row is either deleted as well (CASCADE) or its referencing column is set to NULL (SET NULL)
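The effect of the ON DELETE variants can be sketched as follows. The example assumes SQLite (via Python's sqlite3), where foreign keys are checked only after PRAGMA foreign_keys = ON; the tables Department, Employee, Laptop and Invoice are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
    CREATE TABLE Department (id INTEGER PRIMARY KEY, name VARCHAR(30));
    CREATE TABLE Employee (
        id      INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES Department(id) ON DELETE CASCADE
    );
    CREATE TABLE Laptop (
        id      INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES Department(id) ON DELETE SET NULL
    );
    CREATE TABLE Invoice (
        id      INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES Department(id)   -- no ON DELETE clause
    );
    INSERT INTO Department VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO Employee VALUES (10, 1);
    INSERT INTO Laptop   VALUES (20, 1);
    INSERT INTO Invoice  VALUES (30, 2);
""")

# Deleting the master row 1 is allowed by both ON DELETE variants.
conn.execute("DELETE FROM Department WHERE id = 1")
employees_left = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
laptop_dept = conn.execute("SELECT dept_id FROM Laptop WHERE id = 20").fetchone()[0]

# Without ON DELETE, a referenced master row cannot be deleted.
try:
    conn.execute("DELETE FROM Department WHERE id = 2")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

print(employees_left, laptop_dept, delete_blocked)
```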
DEFAULT value
Not exactly an integrity constraint; it cannot be named and cannot be deferred
Default value, used if the INSERT does not provide a value for this column explicitly
By default, a column is defined as DEFAULT NULL
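A small sketch of DEFAULT, assuming SQLite (via Python's sqlite3) as a stand-in server; the Account table and its columns are hypothetical. It also shows that an explicitly inserted NULL is kept and not replaced by the default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Account (
        id     INTEGER PRIMARY KEY,
        status VARCHAR(10) DEFAULT 'new',
        note   VARCHAR(50)              -- implicitly DEFAULT NULL
    )
""")

conn.execute("INSERT INTO Account (id) VALUES (1)")                   # defaults are used
conn.execute("INSERT INTO Account (id, status) VALUES (2, 'closed')")  # explicit value
conn.execute("INSERT INTO Account (id, status) VALUES (3, NULL)")      # explicit NULL

rows = conn.execute("SELECT id, status, note FROM Account ORDER BY id").fetchall()
print(rows)
```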
Example
CREATE TABLE Person(
RC NUMERIC(11,0) CONSTRAINT Person_PK PRIMARY KEY,
NAME CHAR VARYING(30) CONSTRAINT Person_U_Name UNIQUE NOT NULL,
EMAIL CHAR VARYING(30) CONSTRAINT Person_C_Email CHECK (EMAIL LIKE '_%@_%._%')
);
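The Person example can be tried out as sketched below, with the CHECK parenthesis closed. SQLite (via Python's sqlite3) is assumed as a stand-in server, and the helper try_insert and the sample rows are hypothetical. In the LIKE pattern, `_` matches exactly one character and `%` any sequence of characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        RC    NUMERIC(11,0)    CONSTRAINT Person_PK PRIMARY KEY,
        NAME  CHAR VARYING(30) CONSTRAINT Person_U_Name UNIQUE NOT NULL,
        EMAIL CHAR VARYING(30) CONSTRAINT Person_C_Email
                               CHECK (EMAIL LIKE '_%@_%._%')
    )
""")

def try_insert(row):
    """Return True if the row is accepted, False on a constraint violation."""
    try:
        conn.execute("INSERT INTO Person VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

valid_email   = try_insert((1, "Anna", "anna@example.com"))
invalid_email = try_insert((2, "Petr", "not-an-email"))
missing_dot   = try_insert((3, "Jana", "jana@example"))
no_email      = try_insert((4, "Karel", None))  # NULL passes a CHECK constraint

print(valid_email, invalid_email, missing_dot, no_email)
```

The pattern is only a coarse syntactic filter; it does not validate e-mail addresses in any stricter sense.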
In Oracle, information about tables is available in the views
– USER_TABLES
– USER_TAB_COLUMNS
– USER_CONSTRAINTS
In MS SQL, information about tables is available in the views
– INFORMATION_SCHEMA.TABLES
– INFORMATION_SCHEMA.COLUMNS
– INFORMATION_SCHEMA.TABLE_CONSTRAINTS
Row (table-level) constraints can be applied to several columns of the same row
CHECK (event_begin <= event_end)
Can define multi-column primary, candidate and foreign keys
PRIMARY KEY (event_begin, event_end)
FOREIGN KEY (event_begin, event_end) REFERENCES Parent (x, y)
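A sketch of multi-column constraints, assuming SQLite (via Python's sqlite3) as a stand-in server. The tables Event and Booking, the room column, the constraint names and the helper try_insert are hypothetical; dates are stored as ISO strings, which compare correctly as text in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
    CREATE TABLE Event (
        room        INTEGER NOT NULL,
        event_begin DATE    NOT NULL,
        event_end   DATE    NOT NULL,
        CONSTRAINT Event_PK      PRIMARY KEY (room, event_begin),
        CONSTRAINT Event_C_Order CHECK (event_begin <= event_end)
    );
    CREATE TABLE Booking (
        id          INTEGER PRIMARY KEY,
        room        INTEGER,
        event_begin DATE,
        CONSTRAINT Booking_FK_Event FOREIGN KEY (room, event_begin)
            REFERENCES Event (room, event_begin)
    );
""")

def try_insert(sql, row):
    """Return True if the row is accepted, False on a constraint violation."""
    try:
        conn.execute(sql, row)
        return True
    except sqlite3.IntegrityError:
        return False

EVENT = "INSERT INTO Event VALUES (?, ?, ?)"
BOOKING = "INSERT INTO Booking VALUES (?, ?, ?)"

event_ok       = try_insert(EVENT, (1, "2019-01-10", "2019-01-12"))
end_before     = try_insert(EVENT, (1, "2019-02-05", "2019-02-01"))  # CHECK fails
same_key       = try_insert(EVENT, (1, "2019-01-10", "2019-01-15"))  # PK fails
booking_ok     = try_insert(BOOKING, (1, 1, "2019-01-10"))
booking_orphan = try_insert(BOOKING, (2, 9, "2019-01-10"))           # FK fails

print(event_ok, end_before, same_key, booking_ok, booking_orphan)
```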
ENABLED / DISABLED
The constraint is (is not) active and its validity is (is not) checked
ALTER TABLE tab_name {ENABLE|DISABLE} CONSTRAINT cons_name;
DEFERRED / IMMEDIATE
Constraint checking is deferred to the end of the transaction; by default it is performed immediately after every data change
DEFERRABLE / NOT DEFERRABLE
The constraint can be / cannot be deferred
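The difference between immediate and deferred checking can be sketched on a foreign key. The example assumes SQLite (via Python's sqlite3), which supports DEFERRABLE INITIALLY DEFERRED for foreign keys only and has no ENABLE/DISABLE CONSTRAINT statement; the tables Parent, ChildImmediate and ChildDeferred are hypothetical.

```python
import sqlite3

# isolation_level=None: the module does not open transactions on its own,
# so BEGIN/COMMIT/ROLLBACK below are fully explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE Parent (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE ChildImmediate (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES Parent(id)
    )
""")
conn.execute("""
    CREATE TABLE ChildDeferred (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES Parent(id) DEFERRABLE INITIALLY DEFERRED
    )
""")

# Immediate checking: the child row is rejected by the INSERT itself.
conn.execute("BEGIN")
try:
    conn.execute("INSERT INTO ChildImmediate VALUES (1, 100)")
    immediate_rejected = False
except sqlite3.IntegrityError:
    immediate_rejected = True
conn.execute("ROLLBACK")

# Deferred checking: the parent may be inserted later in the same transaction.
conn.execute("BEGIN")
conn.execute("INSERT INTO ChildDeferred VALUES (1, 100)")
conn.execute("INSERT INTO Parent VALUES (100)")
conn.execute("COMMIT")
deferred_rows = conn.execute("SELECT COUNT(*) FROM ChildDeferred").fetchone()[0]

# Deferred checking still fails at COMMIT if the parent never appears.
conn.execute("BEGIN")
conn.execute("INSERT INTO ChildDeferred VALUES (2, 200)")
try:
    conn.execute("COMMIT")
    commit_rejected = False
except sqlite3.IntegrityError:
    commit_rejected = True
    conn.execute("ROLLBACK")

print(immediate_rejected, deferred_rows, commit_rejected)
```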
If possible, check all data changes at the moment they occur and can be checked
Whatever the user can insert in a wrong place/format/… will be inserted wrongly
▪ Integrity constraints, resp. triggers
Cleaning inconsistent data later is time-consuming and often not fully possible
It is better to check everything in the database than to hope that the input will be tested in every application working with the data
Check the uniqueness of data
Every table should have the primary key
Even if the primary key is artificial, individual instances (rows) usually have some natural single- or multi-column identifier, which should be declared as a candidate key of the table (UNIQUE)
Sometimes more candidate keys can be found