Skip to content

Access PostgreSQL from R using RPostgreSQL

PostgreSQL is a very popular open source relational databases that has sophisticated features such as Multi-Version Concurrency Control, point in time recovery, tablespaces, etc. In this article we will explore how to connect to a PostgreSQL database from a R notebook and perform data access operations using RPostgreSQL.

RPostgreSQL is a fairly mature driver for interacting with PostgreSQL from the R scripting language. It provides to efficiently perform the full range of SQL operations against Postgres databases. This package is already pre-installed in your Workbench.

Open the Jupyter notebooks on your Data Scientist Workbench and choose R in the top right corner to start a new notebook. Create a new cell, enter the following cell and execute it (Ctrl+Enter).
Next, enter your database credentials.
#Enter the values for you database connection
dsn_database = "<database name>"           # e.g. "compose"
dsn_hostname = "<your host name>" # e.g.: ""
dsn_port = "<port>"               # e.g. 11101
dsn_uid = "<your user id>"       # e.g. "admin"
dsn_pwd = "<your password>"     # e.g. "xxx"
Now establish the database connection.
    drv <- dbDriver("PostgreSQL")
    print("Connecting to database")
    conn <- dbConnect(drv,
                dbname = dsn_database,
                host = dsn_hostname,
                port = dsn_port,
                user = dsn_uid,
                password = dsn_pwd)
    error=function(cond) {
            print("Unable to connect to database.")
Let's create a test table called Cars and insert some data.

dbSendQuery(conn, "DROP TABLE IF EXISTS Cars")
dbSendQuery(conn, "CREATE TABLE Cars(Id INTEGER PRIMARY KEY, Name VARCHAR(20), Price INT)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(1,'Audi',52642)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(2,'Mercedes',57127)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(3,'Skoda',9000)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(4,'Volvo',29000)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(5,'Bentley',350000)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(6,'Citroen',21000)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(7,'Hummer',41400)")
dbSendQuery(conn, "INSERT INTO Cars VALUES(8,'Volkswagen',21600)")
Now you can run a query to select data from the newly created table into a datafram and print contents of the dataframe.
df <- dbGetQuery(conn, "SELECT * FROM Cars")
It is a good practice to close the connection when you are done.
Good luck!

Feedback and Knowledge Base