Database Access through R
Krishna Bhogaonker
December 14, 2010
14/12/10 Database Access Through R 2
Where is this Going?
● Introduction to database connection methods
● Examples from some common R packages (cheat sheets, a.k.a. eye charts)
● Introduction to the sqldf package
Database access is about fitting round pegs into square holes.
Issues to Consider when Choosing a Data Access Method for Basic Analysis
● How much work does it take to set up?
  ● Lazy ways – GUIs like RCommander, Deducer, JGR, Revolutions, RedR ...
  ● Diligent ways – database- or protocol-specific R packages
● Speed
● Stability
● Platform
High-Level Database Connection Procedure
● Open a database connection object using the appropriate driver (ODBC, JDBC, etc.)
● Authenticate the user and confirm the connection
● Execute database tasks by calling the appropriate methods on the connection object
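The three steps above can be sketched end to end with a throwaway database. This is a minimal sketch, not taken from the slides: it assumes the DBI and RSQLite packages are installed, uses an in-memory SQLite database so that no server or credentials are needed, and the table name `pegs` is purely illustrative.

```r
library(DBI)

# Step 1: open a connection object with the appropriate driver.
# Assumption: RSQLite is installed; ":memory:" creates a temporary database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Step 2: confirm the connection is live before doing any work.
# (SQLite needs no user name or password; a server DBMS would take them here.)
stopifnot(dbIsValid(con))

# Step 3: run database tasks through methods that take the connection object.
dbExecute(con, "create table pegs (id integer, shape text)")
dbExecute(con, "insert into pegs values (1, 'round'), (2, 'square')")
pegs <- dbGetQuery(con, "select * from pegs order by id")
print(pegs)

# Always release the connection when finished.
dbDisconnect(con)
```

With a server-based DBMS only the `dbConnect()` call should need to change; the remaining calls go through the same DBI generics.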
DBI Package
● Defines a common interface, with driver packages for various database systems and protocols including Oracle, PostgreSQL, ODBC, SQLite, and MySQL
## choose the proper DBMS driver and connect to the server
drv <- dbDriver("ODBC")
con <- dbConnect(drv, "dsn", "usr", "pwd")
## the interface can work at a higher level importing tables as data.frames and exporting data.frames as DBMS tables.
dbListTables(con)
dbListFields(con, "quakes")
if (dbExistsTable(con, "new_results"))
  dbRemoveTable(con, "new_results")
dbWriteTable(con, "new_results", new.output)
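The same table-level calls can be tried without a DSN. The following is a hedged sketch that swaps the ODBC driver on the slide for an in-memory SQLite database (an assumption, requiring the RSQLite package); `quakes` is the dataset shipped with R, and `new.output` is a stand-in data frame invented for illustration.

```r
library(DBI)

# Assumption: RSQLite stands in for the ODBC driver used on the slide.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Export a data.frame as a DBMS table, then inspect it.
dbWriteTable(con, "quakes", quakes)
tables <- dbListTables(con)
fields <- dbListFields(con, "quakes")

# Hypothetical result set standing in for the slide's new.output.
new.output <- head(quakes, 3)

# Replace the results table if it already exists.
if (dbExistsTable(con, "new_results"))
  dbRemoveTable(con, "new_results")
dbWriteTable(con, "new_results", new.output)

# Import the table back as a data.frame.
round_trip <- dbReadTable(con, "new_results")

dbDisconnect(con)
```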
RODBC
● Provides access to ODBC-compliant databases, including MSSQL, MS Access, and others
# connect to database
library(RODBC)
myconn <- odbcConnect("mydsn", uid="Rob", pwd="aardvark")
# query data from the database (sqlFetch takes the table name as a string)
crimedat <- sqlFetch(myconn, "Crime")
pundat <- sqlQuery(myconn, "select * from Punishment")
# close database connection
close(myconn)
RJDBC
● Uses the DBI interface on the front end and a JDBC driver on the back end
# connect to the database
library(RJDBC)
drv <- JDBC("com.mysql.jdbc.Driver",
            "/etc/jdbc/mysql-connector-java-3.1.14-bin.jar", "`")
conn <- dbConnect(drv, "jdbc:mysql://localhost/test")
# access database tables
dbListTables(conn)
data(iris)
# write to and query tables
dbWriteTable(conn, "iris", iris)
dbGetQuery(conn, "select count(*) from iris")
d <- dbReadTable(conn, "iris")
RMySQL
● Database interface for MySQL using the DBI standard.
## connect and authenticate to a MySQL database
library(RMySQL)
con <- dbConnect(MySQL(), group = "lasers")
con2 <- dbConnect(MySQL(), user="opto", password="pure-light",
                  dbname="lasers", host="merced")
## list tables and fields in a table
dbListTables(con)
dbListFields(con, "table_name")
## import and export data frames
d <- dbReadTable(con, "WL")
dbWriteTable(con, "WL2", a.data.frame) ## table from a data.frame
dbWriteTable(con, "test2", "~/data/test2.csv") ## table from file
RpgSQL
● PostgreSQL interface to R via RJDBC
library(RpgSQL)
# the user/password/dbname used here are actually the defaults
con <- dbConnect(pgSQL(), user = "postgres", password = "", dbname = "test")
# create table, populate it and display it
s <- 'create table tt("id" int primary key, "name" varchar(255))'
dbSendUpdate(con, s)
dbSendUpdate(con, "insert into tt values(1, 'Hello')")
dbSendUpdate(con, "insert into tt values(2, 'World')")
dbGetQuery(con, "select * from tt")
# transfer a data frame to pgSQL and then display it from the database
# dbWriteTable is case sensitive
dbWriteTable(con, "BOD", BOD)
# table names are lower cased unless double quoted
dbGetQuery(con, 'select * from "BOD"')
RMongo
● Access to MongoDB through R. Modeled on RMySQL. Still in alpha as of Nov 3, 2010.
library(RMongo)
# connect to a database
mongo <- mongoDbConnect("eat2treat_development")
# show the collections
dbShowCollections(mongo)
# perform an 'all' query with a document limit of 2 and an offset of 0.
# the result is a data.frame object. Nested documents are not supported at the moment; they are returned as their string output.
results <- dbGetQuery(mongo, "nutrient_metadatas", "{}", 0, 2)
names(results)
results <- dbGetQuery(mongo, "nutrient_metadatas", '{"nutrient_definition_id": 307}')
A Few Words about the sqldf Package
● sqldf provides a way to run SQL statements on R data frames.
● sqldf works with the SQLite, H2, and PostgreSQL databases as back ends.
● The package lets you run most SQL commands against an R data frame: selects, joins, ordering, grouping, averaging, etc.
Sqldf Example
# load sqldf into workspace and execute SELECT queries
library(sqldf)
sqldf("select * from iris limit 5")
sqldf("select count(*) from iris")
sqldf("select Species, count(*) from iris group by Species")
# example of a JOIN
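Abbr <- data.frame(Species = levels(iris$Species),
                   Abbr = c("S", "Ve", "Vi"))
# recent sqldf versions keep the dot in column names, so quote "Sepal.Length";
# older versions translated it to Sepal_Length
sqldf('select Abbr, avg("Sepal.Length")
       from iris natural join Abbr group by Species')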
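As a quick check on the grouping, averaging, and ordering features listed earlier, an sqldf result can be compared with a base R computation. This is a sketch under assumptions: it relies on the default SQLite back end and on a recent sqldf version that keeps dots in column names (hence the quoted `"Sepal.Length"`); the aliases `n` and `mean_len` are made up for illustration.

```r
library(sqldf)

# Group, average, and order in one SQL statement against the iris data frame.
by_species <- sqldf('select Species, count(*) as n,
                            avg("Sepal.Length") as mean_len
                     from iris
                     group by Species
                     order by mean_len desc')
print(by_species)

# The same per-species means computed in base R, for comparison.
base_means <- tapply(iris$Sepal.Length, iris$Species, mean)
print(sort(as.vector(base_means), decreasing = TRUE))
```

The two printed sets of means should agree, which is a handy sanity check when translating between SQL and R idioms.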
Thank You
“How are you going to run the universe if you can't answer a few unsolvable problems?”