postgresql

working with R, postgresql + SSL, and MSSQL

I’ve been able to take a break from my regularly scheduled duties and spend some time working with R.  This is a short log of what I did to get it working.

The main things I’m looking to do is regression modelling from a large dataset I have in postgresql and various stats calculations on some business data I have in SQL Server.  Today I got to the stage in my R learning where I wanted to hook up the databases.

My setup:

  • R version 2.12.0 on windows 7
  • postgresql 8.4.5 on ubuntu server, requiring SSL
  • MS SQL Server 2005 on Windows 2003

R connects to databases via RJDBC, which (surprise) uses JDBC.  You need to download JDBC drivers for each server, and then can load those up inside R.

  1. Install RJDBC
    1. Open R
    2. Packages -> Install package(s)
    3. pick a mirror near you
    4. select RJDBC
  2. install JDBC driver for MSSQL
    1. I used jtds: http://jtds.sourceforge.net/ (there is also a Microsoft provided driver I didn’t hear about until I was done)
    2. download and unzip
    3. note the path to the jtds jar file (hereafter referred to as $JTDS and the jar filename
    4. open http://jtds.sourceforge.net/faq.html#driverImplementation, which has some magic strings JDBC wants
    5. optional – copy $JTDS/(x64|x86)/SSO/ntlmauth.dll into your %PATH% if you want to use windows authentication with SQL Server
  3. install JDBC driver for Postgresql
    1. Download from http://jdbc.postgresql.org/
    2. note the path to the jar file (hereafter referred to as $PG) and the jar file name
    3. open http://jdbc.postgresql.org/documentation/head/load.html, which has some magic strings JDBC wants

Then, to connect with MSSQL:

> library(RJDBC)
> mssql <- JDBC("net.sourceforge.jtds.jdbc.Driver", "$JTDS/jtds-1.2.5.jar", "`")
> testdb <- dbConnect(mssql, "jdbc:jtds:sqlserver://host/dbname")
> typeof(dbGetQuery(testdb, "SELECT whathaveyou FROM whither"))
[1] "list"

And you’re off and running with a list of your results in a list and can do whatever you like.

Now for postgresql+ssl:

> pgsql <- JDBC("org.postgresql.Driver", "$PG/postgresql-9.0-801.jdbc3.jar", "`")
> testdb <- dbConnect(pgsql, "jdbc:postgresql://host/dbname?ssl=true", password="password")
> typeof(dbGetQuery(testdb, "SELECT whathaveyou FROM whither"))
[1] "list"

The connection here has a lot more options, and depends highly on your server’s pg_hba.conf.  It took a little while figure out the “?ssl=true” bit.  Luckily you get pretty descriptive error messages if you can’t connect, and the PostgreSQL JDBC docs are pretty good.

Now to re-learn everything I once knew about regression modeling!

code snippet
linux
mssql
open source
postgresql
R
windows

Comments (0)

Permalink

latest postgres docs bookmarklet

When using google to find things in the excellent Postgresql documentation, I often end up on pages showing old postgres versions.  For example, googling for “postgresql create index”, the first hit is for the postgresql 8.2 docs, and I’m running 8.4 now.  My co-workers made a greasemonkey script to automatically redirect to the current version, and I adapted that into a bookmarklet.

Drag this link into your address bar to to use it in your browser:

pg-docs

When you find yourself on a old postgres docs page, click the bookmarklet to redirect to the latest version of that page.   This should work as long as the postgres folks keep their URL naming scheme.

javascript
postgresql

Comments (0)

Permalink