Preliminary Definitions

• MySQL: An open source, enterprise-level, multi-threaded relational database management system that stores and retrieves data using the Structured Query Language.
    o Licensed under the GNU General Public License: http://www.gnu.org/

• Structured Query Language (SQL): A standardized query language for getting information from a relational database.

• Relational Database: A database that stores data in the form of relational tables as opposed to flat files.

• Database Management System (DBMS): A system that manages relational databases; a collection of programs that enables the storage, modification, and extraction of information from a database.

Main Features

• Fully multi-threaded using kernel threads.
• Works on many different platforms.
• Many column types.
• Very fast joins using an optimized one-sweep multi-join.
• Full operator and function support in the SELECT and WHERE parts of queries.
• You can mix tables from different databases in the same query.
• A privilege and password system that is very flexible and secure.
• Handles large databases.
• Tested with a broad range of different compilers (C/C++).
• No memory leaks.
• Full support for several different character sets.

Cells, Rows, Tables and Databases

• Cell -- a single (scalar) value.

12134

• Row -- a group of scalar values representing a single instance of an object or event.

12135   1310391314   Letter: July 23,1842

• Table -- a series of rows describing separate objects or events.

ID      METSID       LABEL
12134   1090313313   Letter: November 18, 1838
12135   1310391314   Letter: July 23,1842
12136   1313020414   Waterloo at Sunset

• Database -- a collection of related tables describing various facets of a group of objects or events.

OBJECTS    CLINKS   COLS
ID         METSID   ID
METSID     COLID    NAME
LABEL               URL
ABSTRACT

Relations -- One to One

Each record in Table1 is linked to at most one record in Table2, and each record in Table2 to at most one record in Table1.

Relations -- One to Many

Each record in Table1 can be linked to many records in Table2, but each record in Table2 is linked to only one record in Table1.

Relations -- Many to Many

Each record in Table1 can be linked to many records in Table2, and each record in Table2 can be linked to many records in Table1.
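A many-to-many relation is usually stored with a third "link" table. The sketch below assumes that is the role of CLINKS between OBJECTS and COLS in the earlier database example; the grouping of columns per table and the collection names are assumptions, and SQLite (bundled with Python) stands in for MySQL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the SQL used here is generic.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, metsid INTEGER UNIQUE, label TEXT);
    CREATE TABLE cols    (id INTEGER PRIMARY KEY, name TEXT);
    -- Link table: one row per (object, collection) pairing.
    CREATE TABLE clinks  (metsid INTEGER REFERENCES objects(metsid),
                          colid  INTEGER REFERENCES cols(id),
                          PRIMARY KEY (metsid, colid));
""")
conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", [
    (12134, 1090313313, "Letter: November 18, 1838"),
    (12135, 1310391314, "Letter: July 23,1842"),
    (12136, 1313020414, "Waterloo at Sunset"),
])
# Hypothetical collections, invented for this example.
conn.executemany("INSERT INTO cols VALUES (?, ?)", [(1, "Letters"), (2, "Highlights")])
conn.executemany("INSERT INTO clinks VALUES (?, ?)", [
    (1090313313, 1), (1310391314, 1),   # both letters are in "Letters"
    (1310391314, 2), (1313020414, 2),   # one letter is also in "Highlights"
])

def collections_for(label):
    """Names of every collection containing the object with this label."""
    rows = conn.execute(
        """SELECT cols.name FROM objects
           JOIN clinks ON clinks.metsid = objects.metsid
           JOIN cols   ON cols.id = clinks.colid
           WHERE objects.label = ? ORDER BY cols.name""", (label,))
    return [name for (name,) in rows]

def objects_in(collection):
    """Labels of every object in the named collection."""
    rows = conn.execute(
        """SELECT objects.label FROM cols
           JOIN clinks  ON clinks.colid = cols.id
           JOIN objects ON objects.metsid = clinks.metsid
           WHERE cols.name = ? ORDER BY objects.id""", (collection,))
    return [label for (label,) in rows]

print(collections_for("Letter: July 23,1842"))
print(objects_in("Highlights"))
```

One object can sit in many collections and one collection can hold many objects, yet neither table repeats any data; only the link rows grow.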

Relational Databases

• A database is a collection of tables
• Columns define attributes of the data
    o All data in a column must have the same data type
• A record is stored in a row

Employees

First Name   Last Name   Phone
Nadia        Li          2687
Madhu        Charu       7856
Ajuma        Kinsaka     4489
Wade         Randal      5257
Helen        Clark       2147

Here "Employees" is the table name, each heading (for example, Phone) is a column, and each line beneath the headings is a row.

Use a Relational Database When…

• You have a very large dataset
• There is redundant data
    o Wastes disk space
    o Increases errors
        • Information must be updated in multiple locations
• Security is important
    o Different users can be granted different permissions
• Strict enforcement of data types is important

Spreadsheet Example

Title                             Author             Borrower   Phone
A House for Mr. Biswas            VS Naipaul         Sarah      646.555.1234
Midnight's Children               Salman Rushdie
On the Road                       Jack Kerouac
One Flew Over the Cuckoo's Nest   Ken Kesey          Sarah      646.555.1244
Sula                              Toni Morrison
Villette                          Charlotte Bronte   Jim        646.555.4586

Data is inconsistent! Sarah appears with two different phone numbers.

Now imagine you are designing the New York Public Library database, which has tens of millions of books and well over a million cardholders.
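One way to avoid this kind of inconsistency is to store each borrower once and have books point to that single record. The sketch below is illustrative only: the table and column names are hypothetical, it assumes the first phone number is the correct one, and it uses SQLite (bundled with Python) in place of MySQL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and column names are invented here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE borrower (
        borrower_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        phone       TEXT NOT NULL);
    CREATE TABLE book (
        book_id     INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        author      TEXT NOT NULL,
        -- NULL means the book is not on loan
        borrower_id INTEGER REFERENCES borrower(borrower_id));
""")
# Sarah's phone number is stored exactly once.
conn.execute("INSERT INTO borrower VALUES (1, 'Sarah', '646.555.1234')")
conn.executemany(
    "INSERT INTO book (title, author, borrower_id) VALUES (?, ?, ?)",
    [("A House for Mr. Biswas", "VS Naipaul", 1),
     ("Midnight's Children", "Salman Rushdie", None),
     ("One Flew Over the Cuckoo's Nest", "Ken Kesey", 1)])

def loans():
    """(title, borrower, phone) for every book currently on loan."""
    return conn.execute(
        """SELECT book.title, borrower.name, borrower.phone
           FROM book
           JOIN borrower ON borrower.borrower_id = book.borrower_id
           ORDER BY book.book_id""").fetchall()

for row in loans():
    print(row)
```

Because the phone number lives in one row, both of Sarah's loans necessarily show the same number, and correcting it takes a single UPDATE.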

Database Design

Entity Relationship Design

Entity (“thing”, “object”)      ->  Table
Attributes (describe entity)    ->  Columns
Entity Instance                 ->  Row

Relationships between entities are preserved in relationships between tables.

If you are interested in learning more formal design methods, look up “normalization” and/or “third normal form”.

Our data

Ensembl Gene ID       Symbol / Name  Chromosome  Start Position (bp)  End Position (bp)  LocusLink ID  Taxonomy ID  Common Name  Species
ENSG00000186891.3     TNFRSF18       1           1044947              1048147            8784          9606         human        Homo sapiens
ENSG00000078808.4     CAB45          1           1058370              1073469            51150         9606         human        Homo sapiens
ENSG00000176022.1     B3GALT6        1           1073703              1076476            126792        9606         human        Homo sapiens
ENSG00000160087.5     UBE2J2         1           1095352              1115292            118424        9606         human        Homo sapiens
ENSG00000162572.4     SCNN1D         1           1123634              1133467            6339          9606         human        Homo sapiens
ENSG00000162576.4     MGC3047        1           1194130              1199973            84308         9606         human        Homo sapiens
ENSG00000175756.3     AKIP           1           1215168              1216641            54998         9606         human        Homo sapiens
ENSG00000131586.2     MRPL20         1           1288703              1294063            55052         9606         human        Homo sapiens
ENSG00000179403.2     WARP           1           1322311              1327547            64856         9606         human        Homo sapiens
ENSG00000160072.5     ATAD3B         1           1358611              1396091            83858         9606         human        Homo sapiens
ENSG00000008128.5     CDC2L2         1           1582617              1604060            985           9606         human        Homo sapiens
ENSG00000169911.4     SLC35E2        1           1611978              1625728            9906          9606         human        Homo sapiens
ENSG00000008130.3     FLJ13052       1           1630975              1659805            65220         9606         human        Homo sapiens
ENSG00000078369.3     GNB1           1           1665027              1770792            2782          9606         human        Homo sapiens
ENSMUSG00000041954.1  TNFRSF18       4           154139702            154142251          21936         10090        mouse        Mus musculus
ENSMUSG00000023286.1  UBE2J2         4           154057210            1540722964         140499        10090        mouse        Mus musculus

What entities or “objects” are defined here?
Is there any redundant data?
What happens if we want to add another species attribute (e.g. genus)?

Our tables

How do we know which organism a gene belongs to?

Do we have unique identifiers?

Can the organism have more than one gene?

Can the gene have more than one organism?

The connector between the gene and organism tables in the diagram means each gene has “one and only one” organism, but each organism can have more than one gene.

This is an example of an entity relationship diagram.

• A primary key is a unique identifier for a record in a table.

• A foreign key in one table refers to the primary key of another.

• The gene table has a foreign key, Organism_ID. Each record in the gene table will have an organism id to link it to the correct species record in the organism table.
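The primary key / foreign key link can be sketched as follows. This is an illustration under assumptions, not the deck's actual schema: SQLite (bundled with Python) stands in for MySQL, the gene table is cut down to two columns, and the organism_id values 1 and 2 are invented. In MySQL itself, foreign key constraints are enforced by the InnoDB storage engine.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; organism_id values 1 and 2 are invented.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked to
conn.executescript("""
    CREATE TABLE organism (
        organism_id INTEGER PRIMARY KEY,
        taxonomy_id INTEGER NOT NULL UNIQUE,
        common_name TEXT NOT NULL,
        species     TEXT NOT NULL);
    CREATE TABLE gene (
        gene_id     INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        organism_id INTEGER NOT NULL REFERENCES organism(organism_id));
""")
conn.executemany("INSERT INTO organism VALUES (?, ?, ?, ?)",
                 [(1, 9606, "human", "Homo sapiens"),
                  (2, 10090, "mouse", "Mus musculus")])
conn.executemany("INSERT INTO gene (name, organism_id) VALUES (?, ?)",
                 [("TNFRSF18", 1), ("UBE2J2", 1), ("TNFRSF18", 2)])

def genes_for(common_name):
    """Follow the foreign key from gene to organism to list one organism's genes."""
    rows = conn.execute(
        """SELECT gene.name FROM gene
           JOIN organism ON organism.organism_id = gene.organism_id
           WHERE organism.common_name = ? ORDER BY gene.name""", (common_name,))
    return [name for (name,) in rows]

def rejects_unknown_organism():
    """True if the database refuses a gene whose organism_id matches no organism."""
    try:
        conn.execute("INSERT INTO gene (name, organism_id) VALUES ('ORPHAN', 99)")
    except sqlite3.IntegrityError:
        return True
    return False

print(genes_for("human"))
print(rejects_unknown_organism())
```

The species name is stored once per organism; every gene row carries only the small integer key that points to it.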

Database Design Caveat

• Sometimes design is “sacrificed” for speed.

Data Types

• float
• integer
• tinyint
• varchar(size)
    o stores strings
    o size can be between 0 and 255, inclusive (MySQL 5.0.3 and later allow up to 65,535)
• datetime
• + more

What data types should our attributes (columns) be?

Complete Design

Gene

Column            Data Type
gene_id           integer
ensembl_gene_id   varchar(50)
organism_id       integer
name              varchar(35)
locuslink         varchar(10)
chromosome        tinyint
chromo_start      integer
chromo_end        integer
description       varchar(255)

Organism

Column        Data Type
organism_id   integer
taxonomy_id   integer
common_name   varchar(35)
species       varchar(35)

Database name: ensmartdb
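The design above can be written out as table definitions and then read back to check the column types. This is a rough sketch that uses SQLite (bundled with Python) in place of MySQL: SQLite records the declared types but does not enforce varchar lengths, and which gene columns should be NOT NULL is left open because the design does not say.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL. It keeps the declared type names
# but does not enforce VARCHAR lengths the way MySQL does.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organism (
        -- in MySQL this column would be INTEGER NOT NULL AUTO_INCREMENT
        organism_id INTEGER PRIMARY KEY,
        taxonomy_id INTEGER NOT NULL UNIQUE,
        common_name VARCHAR(35) NOT NULL,
        species     VARCHAR(35) NOT NULL);
    CREATE TABLE gene (
        gene_id         INTEGER PRIMARY KEY,
        ensembl_gene_id VARCHAR(50),
        organism_id     INTEGER REFERENCES organism(organism_id),
        name            VARCHAR(35),
        locuslink       VARCHAR(10),
        chromosome      TINYINT,
        chromo_start    INTEGER,
        chromo_end      INTEGER,
        description     VARCHAR(255));
""")

def column_types(table):
    """Map each column name to its declared type, in definition order."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}

for table in ("gene", "organism"):
    print(table, column_types(table))
```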

Connecting to MySQL from the Command Line

mysql -uusername -p

Example:
> mysql -uroot

To exit MySQL:
EXIT;

Basic SQL Commands

• SQL statements end with a semicolon
• View databases
    SHOW DATABASES;

START A MYSQL CLIENT

• Without using passwords (when the password for the specified user is empty)
    mysql -u <user> -h <Host>
• Using passwords
    mysql -u <user> -h <Host> -p
• Example:
    mysql -u root -h localhost
• Exit with the command quit or exit.

DATA MANAGEMENT

• SHOW DATABASES;
• USE databaseName;
• SHOW TABLES;
• DESCRIBE tableName;
• SELECT * FROM tableName;
• SELECT * FROM tableName \G
• CREATE DATABASE databaseName;
• DROP DATABASE databaseName;
• CREATE TABLE tableName (name1 type1, name2 type2, ...);
• DROP TABLE tableName;
• INSERT INTO tableName VALUES (value1, value2, ...);
• SELECT field1, field2, ... FROM tableName;
• SELECT * INTO OUTFILE 'C:/tmp/skr.txt' FROM skr;
• LOAD DATA INFILE '/path/file.txt' INTO TABLE skr;

Importing a Database

• Creating a database
    CREATE DATABASE trii;

• From the command line:
    mysql -uusername -ppassword databasename < filename.sql

• Example:
    o mysql -uroot trii < trii.sql

• Use database databasename
    USE databasename;

• Display all tables in a database
    SHOW TABLES;

Create Table

CREATE TABLE organism (
    organism_id INTEGER NOT NULL AUTO_INCREMENT,
    taxonomy_id INTEGER NOT NULL,
    common_name VARCHAR(35) NOT NULL,
    species VARCHAR(35) NOT NULL,
    PRIMARY KEY (organism_id),
    UNIQUE (taxonomy_id)
);

Here organism is the table name, and organism_id, taxonomy_id, common_name, and species are the column names.

View column details for a table

DESC tablename;

Selecting all data

SELECT * FROM tablename;

Select only the columns you need (it will be faster)

SELECT common_name, species FROM organism;

Limiting your data

• Get the species name for a specific organism (you have the id)
    SELECT species FROM organism WHERE organism_id=1;

How do we select all the gene names for chromosome 1?
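One possible answer is a WHERE clause on the chromosome column. The sketch below runs it against a cut-down gene table in SQLite (bundled with Python, standing in for MySQL); the organism_id values are assumptions (1 for human, 2 for mouse). Filtering on chromosome alone mixes species, so the function can optionally restrict by organism too.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; organism_id 1 = human, 2 = mouse (invented).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY, organism_id INTEGER, name TEXT, chromosome INTEGER)""")
conn.executemany(
    "INSERT INTO gene (organism_id, name, chromosome) VALUES (?, ?, ?)",
    [(1, "TNFRSF18", 1), (1, "CAB45", 1), (1, "B3GALT6", 1),
     (2, "TNFRSF18", 4), (2, "UBE2J2", 4)])

def gene_names(chromosome, organism_id=None):
    """Gene names on a chromosome, optionally limited to one organism."""
    sql = "SELECT name FROM gene WHERE chromosome = ?"
    params = [chromosome]
    if organism_id is not None:
        sql += " AND organism_id = ?"
        params.append(organism_id)
    rows = conn.execute(sql + " ORDER BY name", params)
    return [name for (name,) in rows]

print(gene_names(1))                  # SELECT name FROM gene WHERE chromosome = 1
print(gene_names(4, organism_id=2))   # the same filter, limited to one organism
```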

Insert

• Inserting a gene
    INSERT INTO gene (ensembl_gene_id, organism_id, name, chromosome, chromo_start, chromo_end)
    VALUES ('MY_NEW_GENE', 1, 'MY GENE', 1, 12345, 67890);

• Get the id of the gene:
    SELECT gene_id FROM gene WHERE name='MY GENE';
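Looking the id up by name works only while names are unique. A hedged alternative is to ask the database for the id it just generated: in MySQL that is SELECT LAST_INSERT_ID(); in the sketch below, which uses SQLite (bundled with Python) as a stand-in, it is the cursor's lastrowid. The second inserted row is hypothetical and exists only to show two genes sharing a name.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL. INTEGER PRIMARY KEY is filled in
# automatically here, playing the role of MySQL's AUTO_INCREMENT.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    ensembl_gene_id TEXT, organism_id INTEGER, name TEXT,
    chromosome INTEGER, chromo_start INTEGER, chromo_end INTEGER)""")

def insert_gene(ensembl_gene_id, organism_id, name, chromosome, start, end):
    """Insert one gene using bound parameters and return its generated gene_id."""
    cur = conn.execute(
        """INSERT INTO gene (ensembl_gene_id, organism_id, name,
                             chromosome, chromo_start, chromo_end)
           VALUES (?, ?, ?, ?, ?, ?)""",
        (ensembl_gene_id, organism_id, name, chromosome, start, end))
    conn.commit()
    return cur.lastrowid

first_id = insert_gene("MY_NEW_GENE", 1, "MY GENE", 1, 12345, 67890)
# Hypothetical second gene with the same name: a lookup by name is now ambiguous.
second_id = insert_gene("MY_OTHER_GENE", 1, "MY GENE", 1, 23456, 78901)

matches = conn.execute("SELECT gene_id FROM gene WHERE name = 'MY GENE'").fetchall()
print(first_id, second_id, matches)
```

Passing values as bound parameters (the ? placeholders) also avoids quoting mistakes in hand-built SQL strings.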

Delete/Update

• Deleting a gene
    DELETE FROM gene WHERE gene_id=19;

• Updating a gene
    UPDATE gene SET name='NEW NAME' WHERE name='AKIP';
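Both statements change every row that matches the WHERE clause, so it can be worth checking how many rows were affected. The sketch below shows one way to do that; SQLite (bundled with Python) stands in for MySQL, and the three sample rows and their ids are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample gene_id values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO gene VALUES (?, ?)",
                 [(7, "AKIP"), (19, "WARP"), (20, "GNB1")])

def rename_gene(old_name, new_name):
    """Rename every gene called old_name; return the number of rows changed."""
    cur = conn.execute("UPDATE gene SET name = ? WHERE name = ?", (new_name, old_name))
    conn.commit()
    return cur.rowcount

def delete_gene(gene_id):
    """Delete one gene by primary key; return the number of rows removed."""
    cur = conn.execute("DELETE FROM gene WHERE gene_id = ?", (gene_id,))
    conn.commit()
    return cur.rowcount

print(rename_gene("AKIP", "NEW NAME"))  # rows updated
print(delete_gene(19))                  # rows deleted
print(delete_gene(19))                  # nothing left to delete the second time
```

A count of 0 signals that the WHERE clause matched nothing, which is easy to miss because the statement itself still succeeds.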

Documentation

• http://www.mysql.com/documentation/
• http://www.mysql.com/documentation/manual.php
    o As text: manual.txt
    o As HTML: manual_toc.html
    o As GNU Info: mysql.info
    o As PostScript: manual.ps
• http://www.turbolift.com/mysql

PHPMyAdmin

• Web application
• Makes it easier to use MySQL
• To launch: http://localhost/phpmyadmin/
• Download it here: http://www.phpmyadmin.net/home_page/downloads.php

Create Database

Enter name of database here and click “Create”

You are logged in as “root”

To get back to this page at any time select “Databases”

Create Table

SQL query used to create “ensmartdb”

Enter table name and number of fields then click “Go”

We are currently using this database

Define Columns

We are defining the columns for table “gene”

Select “auto_increment” here

Select “Primary” here since this is the primary key

Don’t forget to click “Save”!

View Database

Select “ensmartdb” to view tables in database

Click “gene” to view “gene” table

Add another table here

Insert Data

Click on “insert” to add data to table

Don’t forget to save by clicking “Go”!

View/Delete/Edit Data

Click on “Browse” to view/edit/delete data

Check rows to edit or delete

Click here to edit checked rows

Click here to delete checked rows

Database Users

• To use MySQL you must have a username and password
• A user in MySQL has permissions set regarding
    o MySQL itself (ex. whether or not the user can create a database)
    o Specific databases within MySQL
        • For example, user “guest” may have permission to view database “x” but not database “y”
• Multiple users can access a MySQL database simultaneously

Granting a User Privileges

• PHPMyAdmin is logged in as user “root”, and has permission to do anything
• You should NOT make a habit of connecting to your database as root
• Create a user with restricted permissions to your database instead

Click “Add new User”

Granting a User Privileges (Continued)

Enter “User name”, “Host”, and “Password”

These are the privileges the user will have on ALL databases

After you save the “global” permissions, you may add database-specific permissions

Select “ensmartdb” to edit user permissions for database “ensmartdb”

On the next screen select ONLY the permissions the user must have