
Los Angeles R users group - Dec 14 2010 - Part 2


Page 1: Los Angeles R users group - Dec 14 2010 - Part 2

Database Access through R

Krishna Bhogaonker

December 14, 2010

Page 2: Los Angeles R users group - Dec 14 2010 - Part 2

14/12/10 Database Access Through R 2

Where is this Going?

● Introduction to database connection methods
● Examples from some common R packages (cheat sheets, a.k.a. eye charts)
● Introduction to the sqldf package

Page 3: Los Angeles R users group - Dec 14 2010 - Part 2


Database access is about fitting round pegs into square holes.

Page 4: Los Angeles R users group - Dec 14 2010 - Part 2


Issues to Consider when Choosing a Data Access Method for Basic Analysis

● How much work does it take to set up?
  ● Lazy ways – GUIs like R Commander, Deducer, JGR, Revolution, Red-R . . .
  ● Diligent ways – database- or protocol-specific R packages
● Speed
● Stability
● Platform

Page 5: Los Angeles R users group - Dec 14 2010 - Part 2


High-Level Database Connection Procedure

● Open a database connection object using the appropriate driver (ODBC, JDBC, etc.)
● Authenticate the user and confirm the connection
● Execute database tasks by calling the appropriate methods on the connection object
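The three steps above can be sketched with DBI and an in-memory SQLite database. Using RSQLite is an assumption made here so that the sketch runs without a server or credentials, which also makes the authentication step trivial. The table name `quakes_demo` is hypothetical.

```r
library(DBI)

# Step 1: open a connection object with the appropriate driver.
# RSQLite with an in-memory database is an assumption for illustration;
# a server DBMS would also take host, user, and password arguments here.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Step 2: confirm that the connection is usable.
stopifnot(dbIsValid(con))

# Step 3: run database tasks through methods on the connection.
dbWriteTable(con, "quakes_demo", head(quakes, 10))  # hypothetical table name
tables <- dbListTables(con)
n_rows <- dbGetQuery(con, "select count(*) as n from quakes_demo")$n

# Release the connection when finished.
dbDisconnect(con)
```

With a server-based DBMS, only the `dbConnect()` call should need to change; the remaining calls go through the same DBI generics.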

Page 6: Los Angeles R users group - Dec 14 2010 - Part 2


DBI Package

● Common interface package with connections to various databases and protocols, including Oracle, PostgreSQL, ODBC, SQLite, and MySQL

## choose the proper DBMS driver and connect to the server
library(DBI)
drv <- dbDriver("ODBC")
con <- dbConnect(drv, "dsn", "usr", "pwd")

## the interface can work at a higher level, importing tables as data.frames
## and exporting data.frames as DBMS tables
dbListTables(con)
dbListFields(con, "quakes")

## replace the table if it already exists (new.output is an existing data.frame)
if (dbExistsTable(con, "new_results")) {
  dbRemoveTable(con, "new_results")
}
dbWriteTable(con, "new_results", new.output)

Page 7: Los Angeles R users group - Dec 14 2010 - Part 2


RODBC

● Provides access to ODBC-compliant databases, including MSSQL, MS Access, and others

# connect to database
library(RODBC)
myconn <- odbcConnect("mydsn", uid = "Rob", pwd = "aardvark")

# query data from the database
crimedat <- sqlFetch(myconn, "Crime")  # the table name is passed as a string
pundat <- sqlQuery(myconn, "select * from Punishment")

# close database connection
close(myconn)

Page 8: Los Angeles R users group - Dec 14 2010 - Part 2


RJDBC

● Uses the DBI interface for the front end and a JDBC driver on the back end

# connect to the database
library(RJDBC)
drv <- JDBC("com.mysql.jdbc.Driver",
            "/etc/jdbc/mysql-connector-java-3.1.14-bin.jar", "`")
conn <- dbConnect(drv, "jdbc:mysql://localhost/test")

# access database tables
dbListTables(conn)
data(iris)

# write to and query tables
dbWriteTable(conn, "iris", iris)
dbGetQuery(conn, "select count(*) from iris")
d <- dbReadTable(conn, "iris")

Page 9: Los Angeles R users group - Dec 14 2010 - Part 2


RMySQL

● Database interface and MySQL driver using the DBI standard

## connect and authenticate to a MySQL database
library(RMySQL)
con <- dbConnect(MySQL(), group = "lasers")
con2 <- dbConnect(MySQL(), user = "opto", password = "pure-light",
                  dbname = "lasers", host = "merced")

## list tables and fields in a table
dbListTables(con)
dbListFields(con, "table_name")

## import and export data frames
d <- dbReadTable(con, "WL")
dbWriteTable(con, "WL2", a.data.frame)          ## table from a data.frame
dbWriteTable(con, "test2", "~/data/test2.csv")  ## table from file

Page 10: Los Angeles R users group - Dec 14 2010 - Part 2


RpgSQL

● PostgreSQL interface to R via RJDBC

# the user/password/dbname used here are actually the defaults

con <- dbConnect(pgSQL(), user = "postgres", password = "", dbname = "test")

# create table, populate it and display it

s <- 'create table tt("id" int primary key, "name" varchar(255))'

dbSendUpdate(con, s)

dbSendUpdate(con, "insert into tt values(1, 'Hello')")

dbSendUpdate(con, "insert into tt values(2, 'World')")

dbGetQuery(con, "select * from tt")

# transfer a data frame to pgSQL and then display it from the database

# dbWriteTable is case sensitive

dbWriteTable(con, "BOD", BOD)

# table names are lower cased unless double quoted

dbGetQuery(con, 'select * from "BOD"')

Page 11: Los Angeles R users group - Dec 14 2010 - Part 2


RMongo

● Access to MongoDB through R. Modeled on RMySQL. Still in alpha as of Nov 3, 2010.

# connect to a database

mongo <- mongoDbConnect("eat2treat_development")

# show the collections

dbShowCollections(mongo)

# perform an 'all' query with a document limit of 2 and offset of 0.

# the result is a data.frame object. Nested documents are not supported at the moment; they are returned as their string output.

results <- dbGetQuery(mongo, "nutrient_metadatas", "{}", 0, 2)

names(results)

results <- dbGetQuery(mongo, "nutrient_metadatas", '{"nutrient_definition_id": 307}')

Page 12: Los Angeles R users group - Dec 14 2010 - Part 2


A Few Words about the sqldf Package

● sqldf provides a way to run SQL statements on R data frames.

● sqldf works with the SQLite, H2, and PostgreSQL databases.

● This package lets you run most SQL commands against an R data frame: selects, joins, ordering, grouping, averaging, etc.

Page 13: Los Angeles R users group - Dec 14 2010 - Part 2


sqldf Example

# load sqldf into workspace and execute SELECT queries
library(sqldf)
sqldf("select * from iris limit 5")
sqldf("select count(*) from iris")
sqldf("select Species, count(*) from iris group by Species")

# example of a JOIN
Abbr <- data.frame(Species = levels(iris$Species),
                   Abbr = c("S", "Ve", "Vi"))

# current sqldf releases keep the dot in column names, so the name is quoted;
# older releases exposed this column as Sepal_Length
sqldf('select Abbr, avg("Sepal.Length")
       from iris natural join Abbr group by Species')
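As a cross-check on the JOIN above, the same result can be computed in base R without a database. This sketch is an addition for comparison and is not part of the original slides; `joined` and `result` are hypothetical variable names.

```r
# Lookup table from the slide: one abbreviation per species.
Abbr <- data.frame(Species = levels(iris$Species),
                   Abbr = c("S", "Ve", "Vi"))

# Natural join: merge() matches rows on the shared column, Species.
joined <- merge(iris, Abbr, by = "Species")

# Group by abbreviation and average Sepal.Length within each group.
result <- aggregate(Sepal.Length ~ Abbr, data = joined, FUN = mean)
print(result)
```

Because each species maps to exactly one abbreviation, grouping by `Abbr` here gives the same groups as `group by Species` in the SQL version.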

Page 14: Los Angeles R users group - Dec 14 2010 - Part 2


Thank You

“How are you going to run the universe if you can't answer a few unsolvable problems?”