
coroutines in common lisp

After spending a while in python land, I wanted to have “yield” in lisp.  After a month or so of stewing, I decided to dive in tonight.  My first stab uses threads, not continuations, to accomplish this.  I made that choice partially because I find the arnesi library intimidating (arnesi has continuations in there somewhere), and partially because I wanted more practice with threads.

I ended up futzing with bordeaux-threads for a few hours, and eventually punted and used a library that had already solved these problems.  My basic test function was:

In retrospect, this may have been a bit pathological.  Virtually no time was spent anywhere, and so everything was happening pretty much at once.

My basic threading approach was to make a reader/writer pair:

  1. run the coroutine (writer) in a thread, and lexically bind yield using a flet such that calling yield sets shared memory (with the appropriate locks)
  2. build a lambda (reader) that, when funcalled, waits for the thread to have a value ready, pulls it from shared memory, and returns it (with the appropriate locks); the pairing is sketched just below
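In code, that pairing looks something like the following sketch (bordeaux-threads; a cleaned-up illustration, not the code I actually ran that night).  The discipline that eventually matters: every CONDITION-WAIT sits inside a loop that re-checks a flag while holding the lock, so a missed or early notify can’t strand either side.

(defun make-coroutine (writer-fn)
  ;; WRITER-FN is called with a YIELD function; we hand back a reader
  ;; closure that returns the next yielded value per call, or :done.
  (let ((lock (bt:make-lock))
        (cvar (bt:make-condition-variable))
        (value nil) (full nil) (done nil))
    (flet ((yield (v)
             (bt:with-lock-held (lock)
               ;; wait (in a loop!) until the reader takes the last value
               (loop while full do (bt:condition-wait cvar lock))
               (setf value v full t)
               (bt:condition-notify cvar))))
      (bt:make-thread
       (lambda ()
         (funcall writer-fn #'yield)
         (bt:with-lock-held (lock)
           (setf done t)
           (bt:condition-notify cvar)))))
    (lambda ()
      (bt:with-lock-held (lock)
        ;; wait (in a loop!) until a value is ready or the writer is done
        (loop until (or full done) do (bt:condition-wait cvar lock))
        (if full
            (prog1 value
              (setf full nil)
              (bt:condition-notify cvar))
            :done)))))

Calling (make-coroutine (lambda (yield) (dotimes (i 5) (funcall yield i)))) returns a closure that produces 0 through 4 on successive funcalls, then :done.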

The “with the appropriate locks” bit killed me.  I spent a lot of time in deadlock, and had race conditions everywhere.  I ran into these issues:

  • race condition during startup, where the writer thread would start too slowly, miss the notify from the reader asking it for a value, and then get stuck waiting for a notify that had already come and gone
  • race condition at the end of the coroutine, where the writer thread wouldn’t die fast enough, and the reader would get stuck waiting for the writer to notify
  • many cases where I wanted to CONDITION-WAIT in one thread before I CONDITION-NOTIFY in another, but kept getting it backward.  Adding more layers of locks/condition variables seemed to just defer the problem to another level.

My initial bordeaux-threads version worked great if I ran it from the REPL (with 1+ second pauses for me to run the commands), but the race conditions screwed me when I put it all together.

After a few hours (and a few beers) of debugging, I decided to look at how chanl did it, which rapidly degraded into a chanl-based implementation.  This, of course, took 10m to write and worked great:
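It boiled down to something like this sketch (a reconstruction rather than the exact code).  An unbuffered channel gives you the yield/resume handshake for free, since SEND blocks until somebody RECVs:

(defun make-coroutine (writer-fn)
  ;; same interface as before: hand WRITER-FN a yield function, return a
  ;; reader closure producing one value per call, then :done
  (let ((chan (make-instance 'chanl:channel)))  ; unbuffered
    (chanl:pexec ()
      (funcall writer-fn (lambda (v) (chanl:send chan v)))
      (chanl:send chan :done))
    (lambda () (chanl:recv chan))))

All the locks and condition variables collapse into the channel.  (One caveat: if the reader gives up early, the task stays parked on SEND, which is fine for a late-night hack.)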

For reference, my last broken bordeaux-threads version was:

Fun stuff! Good to know I suck at threads; maybe I’ll take another try with less beer later. At least now I can browse the simpy source with less envy.

working with R, postgresql + SSL, and MSSQL

I’ve been able to take a break from my regularly scheduled duties and spend some time working with R.  This is a short log of what I did to get it working.

The main things I’m looking to do are regression modelling from a large dataset I have in postgresql and various stats calculations on some business data I have in SQL Server.  Today I got to the stage in my R learning where I wanted to hook up the databases.

My setup:

  • R version 2.12.0 on windows 7
  • postgresql 8.4.5 on ubuntu server, requiring SSL
  • MS SQL Server 2005 on Windows 2003

R connects to databases via RJDBC, which (surprise) uses JDBC.  You need to download JDBC drivers for each server, which you can then load up inside R.

  1. Install RJDBC
    1. Open R
    2. Packages -> Install package(s)
    3. pick a mirror near you
    4. select RJDBC
  2. install JDBC driver for MSSQL
    1. I used jtds: http://jtds.sourceforge.net/ (there is also a Microsoft-provided driver I didn’t hear about until I was done)
    2. download and unzip
    3. note the path to the jtds jar file (hereafter referred to as $JTDS) and the jar filename
    4. open http://jtds.sourceforge.net/faq.html#driverImplementation, which has some magic strings JDBC wants
    5. optional – copy $JTDS/(x64|x86)/SSO/ntlmauth.dll into your %PATH% if you want to use windows authentication with SQL Server
  3. install JDBC driver for Postgresql
    1. Download from http://jdbc.postgresql.org/
    2. note the path to the jar file (hereafter referred to as $PG) and the jar file name
    3. open http://jdbc.postgresql.org/documentation/head/load.html, which has some magic strings JDBC wants

Then, to connect with MSSQL:

> library(RJDBC)
> mssql <- JDBC("net.sourceforge.jtds.jdbc.Driver", "$JTDS/jtds-1.2.5.jar", "`")
> testdb <- dbConnect(mssql, "jdbc:jtds:sqlserver://host/dbname")
> typeof(dbGetQuery(testdb, "SELECT whathaveyou FROM whither"))
[1] "list"

And you’re off and running, with your results in a list that you can do whatever you like with.

Now for postgresql+ssl:

> pgsql <- JDBC("org.postgresql.Driver", "$PG/postgresql-9.0-801.jdbc3.jar", "`")
> testdb <- dbConnect(pgsql, "jdbc:postgresql://host/dbname?ssl=true", password="password")
> typeof(dbGetQuery(testdb, "SELECT whathaveyou FROM whither"))
[1] "list"

The connection here has a lot more options, and depends highly on your server’s pg_hba.conf.  It took a little while to figure out the “?ssl=true” bit.  Luckily you get pretty descriptive error messages if you can’t connect, and the PostgreSQL JDBC docs are pretty good.

Now to re-learn everything I once knew about regression modeling!

making SQL Server backups using python and pyodbc

I have a set of python scripts to help me manage a few SQL Servers at work, and one of the things I do is take database backups using BACKUP DATABASE and BACKUP LOG.  I’ve been using pymssql to connect, but today tried switching to pyodbc.  pymssql seems to be having momentum problems, so I figured I’d try the alternative and see how well it works.

I ran into two issues:

  • pymssql supports “%s” as a parameter placeholder, while pyodbc wants “?”
  • BACKUP and RESTORE operations on SQL Server don’t run like normal queries

The first was trivial to fix; the second took some digging.  If you try to run a BACKUP via pyodbc, the cursor.execute() call starts and finishes with no error, but the backup doesn’t get made.  With help from CubicWeb‘s post MS SQL Server Backuping gotcha, I learned that BACKUP and RESTORE operations over ODBC trigger some kind of asynchronous / multiple result set mode.  To get around this, you can poll for file size (as CubicWeb did), but that gets ugly when making a backup on a remote server.

In a backup, I think each “X percent processed.” message is considered a separate result set by ODBC. Instead of polling the file size, you can call cursor.nextset in a loop to consume all the “percent processed” sets:
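Here’s a sketch of the whole fix (a reconstruction rather than my exact script; server, database, and path names are made up):

import pyodbc

# A trusted connection; autocommit matters because BACKUP refuses to run
# inside a transaction, and pyodbc opens one by default.
conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=myserver;DATABASE=master;Trusted_Connection=yes",
    autocommit=True)
cursor = conn.cursor()

# "?" is pyodbc's placeholder style (vs pymssql's "%s")
cursor.execute("BACKUP DATABASE [mydb] TO DISK = ?", r"c:\backups\mydb.bak")

# each "X percent processed." message comes back as its own result set;
# drain them all or the backup never actually finishes
while cursor.nextset():
    pass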

After adding that while loop, backups of small and medium-sized databases worked like a charm.

Installing VS 2008 and SQL 2008 Express on Windows 7

A new decade means time for a fresh windows install at work.  I ran into some trouble with windows 7, visual studio 2008, and SQL 2008 Express; here’s how I resolved it.  Contrary to most things I found on the web, I’m not using betas or release candidates.

First off, installing SQL 2008 Express.  I only wanted the management tools, and this was a little hard to come by.  I downloaded various EXE files from MSDN, but none of them worked (they would error out, bring up a seemingly unrelated installer, or exhibit whatever other confusing behavior may have led you here).  Here’s what worked for me:

  1. Be sure any previous installation attempts have been purged via Add / Remove Programs
  2. Go to the “other install options” page for SQL express: http://www.microsoft.com/express/Database/default.aspx#Installation_Options
  3. Click the “Management Tools” install button (for me that’s: http://www.microsoft.com/web/gallery/install.aspx?appsxml=www.microsoft.com%2Fweb%2Fwebpi%2F2.0%2FWebProductList.xml%3Bwww.microsoft.com%2Fweb%2Fwebpi%2F2.0%2FWebProductList.xml&appid=134%3B135)
  4. Install the “Microsoft Web Platform Installer” (MWPI) if it asks you to
  5. Should be straightforward from here on

The funny thing here is that the MWPI seems to download an installer that looks a lot like the Microsoft® SQL Server® 2008 Management Studio Express download that didn’t work for me in the first place.

Next up, Visual Studio 2008 (VS2008).  My company has an MSDN subscription, so we downloaded an ISO (named en_visual_studio_2008_professional_x86_x64wow_dvd_X14-26326.iso), and I used the freeware MagicISO to mount it, then ran “setup.exe”.  The install failed on the “Microsoft Visual Studio Web Authoring Component” (MVSWAC).  Here’s what worked for me:

  1. IF YOU WANT SQL2008, DO THAT FIRST
  2. Be sure any previous installation attempts have been purged via Add / Remove Programs
  3. Download WebDesignerCore.EXE from microsoft
  4. Run it
  5. Install VS2008 from disc/iso as normal.

Digging into the ISO with 7zip shows the problem: /WCU/WebDesignerCore/WebDesignerCore.EXE is corrupt.  To get VS2008 to install cleanly, we first need to install MVSWAC ourselves, at which point the VS2008 installer happily skips past the corrupt file.  I ran across several blog/forum posts with horror stories about VS2008 installing SQL2005, and needing to uninstall half the planet to get things working right.

As always, be sure to hit up windows update, and change your update settings so you get fixes for VS2008 and SQL2008.


Testing


This is a test of the wordpress android app. I am likely too lazy to delete this test later.

Is programming all marshmallows and toothpicks, or is it just web apps?

I’ve been doing some maintenance programming for a few days solid (rare for me to get to program that much), and I again find myself amazed that any software works at all.  I’ve only been programming seriously for about a decade (mostly web apps), but it feels like I’m building rickety crap on top of other people’s horrible hacks.

The bar for quality software seems so abysmally low.  When coding around some bizarre behavior I’m seeing out of the .NET framework, I know I’m introducing weird brittle bits.  It feels wrong, but I don’t see any other option.  And this is new code, written for the latest released version of a very popular system!  It seems like everyone else is doing the same thing in every programming environment I’ve seen.

My best guess is I’m working at maybe the 1000th layer of abstraction over the bare metal, and that sounds low.  That’s a lot of cruft, hacks, bugs, security holes, late-night fixes, bad compromises and coffee.

Maybe my sense of “clean code” is just OCD?  Sometimes I wonder if writing good code is just a waste of time.  Is shoddy copy/paste winning the evolutionary battle for the software base that will drive humanity for the next millennium?

more heat-maps using vecto and ch-image

This is a follow-up to my post last year about simplistic heat-maps using Vecto. To recap, I’m trying to make heat maps for google maps overlays.

Here’s how it works in a nutshell:

  1. From javascript I pass to the server the lat/lng region currently shown on the google map, and what size heat map to generate, in pixels.
  2. lisp pulls weights from my database within the given lat/lng region
  3. lisp iterates over the db results, mapping lat/lng to x/y coordinates for the final heat map image (a sketch of this mapping follows the list)
  4. lisp uses the list of mapped (x y weight) tuples to draw the heat map as a PNG
  5. javascript throws the png on top of the google map
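Step 3 is mostly bookkeeping; here’s a sketch of the mapping (the helper and its argument names are mine, not from the original code).  It ignores Mercator distortion, which is visible over large regions but tolerable at city zoom levels:

(defun lat-lng-to-xy (lat lng north south east west width height)
  ;; linearly map LAT/LNG within the given bounds to pixel coordinates
  ;; in a WIDTH x HEIGHT image, with y growing downward from the top edge
  (values (round (* (1- width) (/ (- lng west) (- east west))))
          (round (* (1- height) (/ (- north lat) (- north south))))))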

I tried a few things based upon the comments I got back from the helpful lisp community.

  • used zpng to get direct pixel access, and calculated each pixel’s color as a distance-weighted average of nearby points.  This didn’t produce good images, and was pretty slow.
  • used zpng to get direct pixel access, and calculated each pixel’s color using the gravity formula against nearby points.  This didn’t produce good images, and was very slow.

I did some more research and learned about the Generic Mapping Tools and bicubic interpolation. The GMT is a set of C programs, similar to the Imagemagick suite.  GMT showed one way to draw heat maps in the Image Presentations tutorial.  It spoke of gridded data sets, and that gave me one more vecto-based idea: split the desired heat-map into a grid and color each square in the grid based upon an average of the weights mapped in that square.  This is a neat effect, but not what I was going for:

This is reasonably fast, taking about 1 second on my dev server.  To quickly find what weights belong in which grid square, I make a spatial index of all the weights, using an r-tree from the spatial-trees library.
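The drawing code boils down to something like this sketch (a reconstruction rather than the exact code: it bins into a plain 2D array instead of consulting the r-tree, and it assumes the FIND-COLOR weight-to-color helper from last year’s post, reproduced further down this page):

(defun draw-grid-heat-map (pts width height grid-size file)
  ;; PTS is a list of (x y weight) tuples in image coordinates; color
  ;; each GRID-SIZE square by the average weight that landed in it
  (let* ((cols (ceiling width grid-size))
         (rows (ceiling height grid-size))
         (sums (make-array (list cols rows) :initial-element 0))
         (counts (make-array (list cols rows) :initial-element 0)))
    (loop for (x y w) in pts
          for c = (min (1- cols) (floor x grid-size))
          for r = (min (1- rows) (floor y grid-size))
          do (incf (aref sums c r) w)
             (incf (aref counts c r)))
    (vecto:with-canvas (:width width :height height)
      (dotimes (c cols)
        (dotimes (r rows)
          (when (plusp (aref counts c r))
            (let ((color (find-color (/ (aref sums c r) (aref counts c r)))))
              (vecto:set-rgb-fill (cl-colors:red color)
                                  (cl-colors:green color)
                                  (cl-colors:blue color))
              (vecto:rectangle (* c grid-size) (* r grid-size)
                               grid-size grid-size)
              (vecto:fill-path)))))
      (vecto:save-png file))))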

The next method I tried was to use interpolation to get a smooth look.  I found Cyrus Harmon‘s ch-image library supports image interpolation, and got to it.  As Patrick Stein noted elsewhere, ch-image isn’t easy to install.  It’s not asdf-installable, and the project page doesn’t list all its dependencies.  For future reference, here’s what I think I needed to install it:

(asdf-install:install "http://cyrusharmon.org/static/releases/ch-asdf_0.2.14.tar.gz")
(asdf-install:install "http://cyrusharmon.org/static/releases/ch-util_0.3.10.tar.gz")
(asdf-install:install "http://cyrusharmon.org/static/releases/smarkup_0.4.2.tar.gz")
(asdf-install:install "http://mirror.its.uidaho.edu/pub/savannah/cl-bibtex/cl-bibtex-1.0.1.tar.gz")
(asdf-install:install "http://cyrusharmon.org/static/releases/clem_0.4.1.tar.gz")
(asdf-install:install "http://cyrusharmon.org/static/releases/ch-image_0.4.1.tar.gz")

Armed with ch-image, now the drawing process becomes:

  1. draw a small image, coloring pixels based upon weights
  2. enlarge the small image with interpolation

The first step is very similar to the code I wrote to make the grid version above.   Instead of drawing a rectangle, I draw a pixel using ch-image’s pixel access functions.  This was a little weird because ch-image’s coordinate system has 0,0 at the top left of the image.  I’m still not sure how to best choose the size of this smaller image, but ultimately it should depend on my data.  For now I just have it hard-coded to be 20x smaller than the desired size:

Yep, that’s pretty small.  Applying a transform to scale it up to the desired size using bilinear interpolation yields:

It looks pretty good and takes about a half-second to draw.  If you click into the larger version, you can see some discontinuities in there, which is a well-known result of bilinear interpolation.  However, based upon other graphics I’ve seen, what I really want is bicubic interpolation.  Luckily, ch-image has this built in:

Oops, maybe not so luckily.  I can certainly see the kind of look I’m after in all the garbled stuff, but ch-image is freaking out somewhere in there.

Bilinear it is!  Here’s a screenshot of the overlay in place on the map:

It’s pretty fast, and looks pretty nice, and is fairly close to the look I wanted.  I probably still have some off-by-one errors somewhere, and need to check the git repos for the ch-* libs to see if there might be newer versions than the tarballs I installed.  I still count this as great progress for 5 hours of coding and research.  Huzzah for the much-maligned lisp libraries!

latest postgres docs bookmarklet

When using google to find things in the excellent Postgresql documentation, I often end up on pages showing old postgres versions.  For example, googling for “postgresql create index”, the first hit is for the postgresql 8.2 docs, and I’m running 8.4 now.  My co-workers made a greasemonkey script to automatically redirect to the current version, and I adapted that into a bookmarklet.

Drag this link into your bookmarks bar to use it in your browser:

pg-docs

When you find yourself on an old postgres docs page, click the bookmarklet to redirect to the latest version of that page.   This should work as long as the postgres folks keep their URL naming scheme.
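For the curious, the whole trick is one line of javascript; this sketch captures the idea (not necessarily the exact code behind the link above): rewrite the version segment of a postgresql.org docs URL to “current” and reload:

// e.g. .../docs/8.2/static/sql-createindex.html -> .../docs/current/static/sql-createindex.html
javascript:location.href=location.href.replace(/\/docs\/[^\/]+\//,'/docs/current/');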

git-like line counts in svn using bash

I really like how git tells me how many lines were inserted/removed when I commit, and wanted to get something similar from Subversion.  I’m working on a refactoring of an older system, and I wanted to know how my refactorings were affecting the code.  I think I’m going to remove a lot more code than I add, but why wonder when svn has all this info?

Using my horrible bash skills and this post on SVN Line Output Totals, I came up with an inefficient bash program to do what I want:
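Here’s a sketch of it (a reconstruction in the same spirit rather than the exact script): pipe svn diff through awk, skipping the +++/--- file-header lines so they don’t count as changes.

#!/bin/bash
# svn_line_changes: count added/removed lines for the given svn diff args
echo "Scanning $@"
svn diff "$@" | awk '
  /^\+\+\+/ || /^---/ { next }   # file header lines, not changes
  /^\+/ { added++ }
  /^-/  { removed++ }
  END {
    printf "Removed: %d\n", removed
    printf "Added: %d\n", added
    printf "Difference: %d\n", added - removed
  }'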

Example:

 > svn_line_changes -r 264:265
Scanning -r 264:265
Removed: 287
Added: 141
Difference: -146

simplistic heat-maps using Vecto

I stole some time from my increasingly non-technical workload to play with generating heat-maps of residential energy consumption in my http://gainesville-green.com project.  The initial results are promising:

There are a few neat things going on here.  I’ve got a url handler in my lisp that looks to the query string for lat-lng bounds, image size, and some other variables to generate a PNG file.   I pass that URL to a Google Maps API GGroundOverlay to put the image onto the map.  Add some javascript event glue and I can do cool things like automatically regenerate the heat map overlay when you zoom/pan the map around, and display an animated heat map showing consumption over the course of the year.  There’s still a lot of UI interaction to sort out, but I think it’s a nice approach.
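The javascript side is mostly glue, roughly like this sketch (the function name, URL, and parameters are mine, and it doesn’t bother removing the previous overlay):

// rebuild the heat map overlay for the current viewport (Maps API v2)
function refreshHeatMap(map) {
  var bounds = map.getBounds();
  var url = "/heat-map.png?sw=" + bounds.getSouthWest().toUrlValue()
          + "&ne=" + bounds.getNorthEast().toUrlValue()
          + "&width=512&height=512";
  map.addOverlay(new GGroundOverlay(url, bounds));
}
GEvent.addListener(map, "moveend", function () { refreshHeatMap(map); });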

The heat map itself is generated using Vecto, and I think I’m doing it wrong.  I jump through some hoops to map lat-lng to image pixel coordinates, pull from the database, and end up with a list of (x y weight) tuples, with the weight being a number between 0.0 and 1.0 representing the relative consumption of the home that should be at pixel x,y in the result image.  Then I start painting, which is where I think I should be doing more math.  For each point, I pick a color between green and red based on the weight, using the handy cl-colors library to interpolate:

(defun find-color (percent)
  (if (> .5 percent)
      (cl-colors:rgb-combination cl-colors:+green+ cl-colors:+yellow+ (* 2 percent))
      (cl-colors:rgb-combination cl-colors:+yellow+ cl-colors:+red+ (* (- percent .5) 2))))

I actually have to go from green->yellow, then yellow->red, with some goofy adjustments to the percent to make the interpolation work out.  Once I have that, I have my color and my pixel, so I can start drawing.  To get a smoother look, for each point I draw concentric circles with different radii and opacities, so each individual data point is rendered like this:

This is enlarged to show some of the blockiness; it ends up looking pretty nice when they are small.  Here’s the actual function:

(defun draw-point (x y color max-radius)
  ;; draw concentric filled circles from max-radius down to 1, each at
  ;; low alpha, so the overlapping fills accumulate into a soft blob
  (iterate (for r from max-radius downto 1 by (max 2 (round (/ max-radius 6))))
           (for alpha = (/ 1 r)) ; smaller circles are more opaque
           (vecto:set-rgba-fill (cl-colors:red color)
                                (cl-colors:green color)
                                (cl-colors:blue color)
                                alpha)
           (vecto:centered-circle-path x y r)
           (vecto:fill-path)))

Max-radius determines how large the largest circle is, and is calculated based on how many points I’m drawing.

There are a few drawbacks to this approach.  First, it’s slow.  Drawing operations aren’t exactly cheap, especially when messing with alpha channels.  It takes me around 5s for 578 data points, which is fine for offline tasks, but on a web-app it needs to be super zippy or you fickle internet folk will close the tab. I also want it to be easy to show animations, so generating a bunch of them quickly would be nice.  The time spent increases fairly linearly with data points, and I’d like to be able to render heat maps for large areas with tens of thousands of data points.  Profiling shows practically all of my time and bytes consed are spent in the draw-point function. UPDATE: after more profiling, vecto:fill-path is most of my time, which makes sense.

Second, I have to be really careful to draw these points from lowest weight to highest, because I want red dots painted on top of green dots.  It seems like I should decide what color each pixel should be and draw it once, rather than accumulating the right color in the image canvas.  Right now there’s also some bug with drawing lots of data points: I just get a big green image when I would expect some reds or yellows.

Another issue: for apartments I have coordinates for the apartment complex, but not for each individual unit.  This makes for some funny results, like the big orange blob on the right side of the screenshot above, where I’ve painted a few dozen points on top of each other.

I did some googling on heat-map algorithms, and found some actionscript and java code, but the actionscript was using a similar approach and the java was incomprehensible.  I think I’ll try making a big array for the result image, and calculating an average weight for each pixel, then loop through that and draw once.  I’m also going to try calculating the weights using magnetic field strength or gravity math.  I think that approach will end up faster, look nicer, and should be a fun problem.