Here at WhitePages we are fond of Postgres. We use it to store most of our data. When we recently decided to rebuild our entire Email and SMS system, Postgres was the natural choice for storing user contact information in that system. What was not an easy choice, however, was the Ruby ORM to talk to the database. In the end we went with Sequel, but not without a few modifications to a Ruby Gem in order to allow for testing without a database connection. Here’s how it all came together:
Why ActiveRecord wasn’t the solution
Being (predominantly) a Ruby on Rails shop, the first place we looked was ActiveRecord. Though it has become a very mature and full-featured ORM, we thought that it was packed with far more functionality than we needed, especially considering that we were not building this new Email/SMS system on Rails. Had we decided to build on Rails we would have taken a much more serious look at ActiveRecord, but we had enough experience using it across several other products to know that it was more bloated than what we wanted to deal with for this project. There has also been quite a bit of discussion about performance issues in recent ActiveRecord versions, and that certainly played a part in our skepticism about using it.
What we needed was an ORM that gave us a reasonably well-organized model representation of our data, was lightweight, supported multiple connection configurations (to quickly switch between read-only and read-write), included basic migration tools, and was easy to test. We found everything we needed with Sequel… well, almost everything. Though the Sequel ORM is simple, elegant, and very easy to tie to Postgres, it was not obvious how to test against anything other than a live, connected database.
That wasn’t good enough for us. We wanted developers and automated test systems to be able to run service tests that quickly generated their own test data, threw it away when they were done with it, and did it all without a connection to a database. Basically, we wanted in-memory fixtures.
Fortunately, we came across a public Ruby Gem called ‘Sequel-Fixture’. It was built to hook into Sequel and allow data to be created and destroyed easily from rspec tests. Unfortunately, that Gem had not been maintained for several months, was not Ruby 2.0 compatible, and lacked one key feature: ‘in memory’. It still depended on a connection to the database, so it was only part of the way there.
Not wanting to start from scratch, we decided that there was enough useful functionality in the Gem for us to extend it to do what we needed. With the original author’s permission (thank you Xavier) we did just that.
What We Updated
Dynamically Defined Schema
A key part of our automated database testing is being able to respond easily to schema changes. As we iterate on our database, especially early in the project, we want to be able to extend and maintain our tests with as little work as possible. More importantly, for the fixtures to be managed fully in memory, we need to be able to specify exactly what the database tables are supposed to look like on every test run.
So we extended Sequel-Fixture to support a “schema” section right inside of the fixture file.
In the extended format, the data definition moves into a new “data” section, while the new “schema” section is a YAML array of column definitions, in this case for a “users” table. Each entry in the schema specifies the name and type of a column.
There is also support for specifying which column is the “primary_key”. This is not that interesting for most testing, but it is something that the Sequel ORM needs when creating a table. If the primary_key attribute is not specified then we just pick the first column (entry).
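To make the layout concrete, here is a minimal sketch of what a fixture with “schema” and “data” sections might look like, parsed with nothing but Ruby’s standard YAML library. The column names, the sample rows, and the helper `primary_key_for` are assumptions made up for illustration; this is not the Gem’s own code, and the exact key spelling in a real fixture file may differ.

```ruby
require "yaml"

# Hypothetical fixture for a "users" table. The "schema" section lists the
# columns; the "data" section lists the rows. All values are invented.
FIXTURE_YAML = <<~YAML
  schema:
    - { name: "id", type: "integer", primary_key: true }
    - { name: "name", type: "string" }
    - { name: "email", type: "string" }
  data:
    - { id: 1, name: "John Doe", email: "john@example.com" }
    - { id: 2, name: "Jane Roe", email: "jane@example.com" }
YAML

# Hypothetical helper: returns the name of the primary key column. It uses
# the first column flagged with primary_key and falls back to the first
# column in the schema when none is flagged.
def primary_key_for(schema)
  flagged = schema.find { |column| column["primary_key"] }
  (flagged || schema.first)["name"]
end

fixture = YAML.safe_load(FIXTURE_YAML)
schema  = fixture["schema"]
rows    = fixture["data"]

column_names = schema.map { |column| column["name"] }
primary_key  = primary_key_for(schema)
```

Keeping the schema next to the data means a schema change is a one-file edit: add or rename a column in “schema” and update the affected rows in “data”.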
Rows in the Data Section
The “data” section of a fixture file is an array of entries, and each entry in the array represents an individual row of fixture data in the ‘users’ table. In the original version of the Gem each data entry was a Hash value, made up of a key and a blob of data for the value.
This allowed for a neat way of referencing the data in your tests, since each row could be looked up by its key.
However, we felt that this was not the most accurate way to represent a table of rows, so we updated the internal Table definition to look more like an array.
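The difference between the two representations can be sketched with plain Ruby structures. This only illustrates the shape of the data; the keys “john” and “jane” and the column values are invented, and the Gem’s actual Table class is not shown here.

```ruby
# Original style, as described above: each row is stored under a key.
keyed_rows = {
  "john" => { "name" => "John", "email" => "john@example.com" },
  "jane" => { "name" => "Jane", "email" => "jane@example.com" }
}

# Array style: rows are positional, like rows in a table. Ruby hashes keep
# insertion order, so the rows come out in the order they were defined.
array_rows = keyed_rows.values

# Looking up the same row in each representation.
by_key   = keyed_rows["john"]["name"]
by_index = array_rows[0]["name"]
```

The keyed form reads nicely in a test, but it adds a label that does not exist in the table itself; the array form mirrors what the database actually holds.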
Today we have a Gem which represents the database layer of our Email/SMS Service, and its spec folder contains several fixture files. Our test coverage is great, updates to the tests have been painless as we have iterated on the database, and best of all, our Sequel-driven service code is tested daily on our Jenkins server without a connection to a real database… just what we were aiming for.
Taking a snippet from the Gem’s sample tests, you can see how nice and clean this turned out.