Access dashDB (or DB2) using RODBC from R notebook
To get started you will need credentials for a dashDB database. If you don't already have an instance of dashDB you can get one for free by following steps in this article.
Next, create a new Jupyter notebook in Data Scientist Workbench. Set it to use R.
The RODBC package and the dashDB/DB2 CLI (ODBC) driver is already installed and available in your R notebook in Data Scientist Workbench. Create a new cell and load the RODBC package:
library(RODBC)Enter the other connection details for your instance of dashDB (or DB2 database).
#Enter the values for you database connectionRun the cell above. Create a new cell to create the connection:
dsn_driver <- "{IBM DB2 ODBC Driver}"
dsn_database <- "BLUDB" # e.g. "BLUDB"
dsn_hostname <- "<Enter Hostname>" # e.g.: "awh-yp-small03.services.dal.bluemix.net"
dsn_port <- "50000" # e.g. "50000"
dsn_protocol <- "TCPIP" # i.e. "TCPIP"
dsn_uid <- "<Enter UserID>" # e.g. "dash104434"
dsn_pwd <- "<Enter Password>" # e.g. "7dBZ39xN6$o0JiX!m"
conn_path <- paste("DRIVER=",dsn_driver,
";DATABASE=",dsn_database,
";HOSTNAME=",dsn_hostname,
";PORT=",dsn_port,
";PROTOCOL=",dsn_protocol,
";UID=",dsn_uid,
";PWD=",dsn_pwd,sep="")
conn <- odbcDriverConnect(conn_path)
conn
Let's get a list of tables in the samples schema:
table.list <- sqlTables(conn,tableType="TABLE", schema="SAMPLES")Let's get a list of columns in a sample table:
cat("There are", nrow(table.list), "tables in the", schema, "schema.\n")
table.name <- "SAMPLES.COW"You can use the very versatile sqlQuery() function to issue SQL Data Manipulation Language (DML) statements such as SELECT, INSERT, UPDATE, or DELETE, and SQL Data Definition Language (DDL) statements such as CREATE TABLE. Let's use sqlFetch to fetch a row from a sample table:
col.list <- sqlColumns(conn,table.name)
cat("There are", nrow(col.list), "columns defined in", table.name,"\n")
cows <- sqlFetch(conn, table.name)We can use the sqlSave() function to create and populate a new table. In this scenario, the safer parameter allows the script to replace any existing table with the same name.
print (cows[1,1:4], row.names=FALSE)
tab.name <- "CLASSMARKS"
NAMES <- c("Bob","Mary","Fred")
MARKS <- c(78,88,91)
# Create a data frame of test scores and names
CLASSMARKS <- data.frame (NAMES,MARKS,stringsAsFactors=FALSE)
# Create a new table and populate it with the data frame CLASSMARKS
sqlSave(conn, CLASSMARKS, rownames=FALSE,safer=FALSE)
NEWCLASS <- sqlFetch(conn,tab.name)
cat( "Mean mark for the class is", mean(NEWCLASS[,"MARKS"]),"\n")
Cool! You've accessed data in a dashDB data warehouse using RODBC from a R notebook. Finally, as a best practice we should close the database connection once we're done with it.
odbcCloseAll()