versioning databases

December 20th, 2006

Keeping track of, and applying, database changes is probably the part of web development that I find to be the most difficult and annoying.

At the moment, we have a folder in the svn repository called ‘sql’, into which are placed numbered files containing a set of database changes. When the code that needs the changes is checked in, the appropriate file is included along with it. When an update is done elsewhere, it’s a matter of running each .sql file in turn against the database to bring it up to the current version.

The system is working well for us so far, with 3 (ish) developers working on the project, but it seems so brittle. The best approach seems to be to make sure that absolutely every line of sql gets typed into a text file before it is pasted into/run against the development database; this makes it more difficult to forget to include a change, but it can still happen. Performing the update is a laborious task at present- it could probably be scripted, but it seems wrong to put any more effort into this system when there must be better ways of doing things.

Keeping track of, and applying, database changes is probably the part of web development that I find to be the most difficult and annoying.

At the moment, we have a folder in the svn repository called ‘sql’, into which are placed numbered files containing a set of database changes. When the code that needs the changes is checked in, the appropriate file is included along with it. When an update is done elsewhere, it’s a matter of running each .sql file in turn against the database to bring it up to the current version.

The system is working well for us so far, with 3 (ish) developers working on the project, but it seems so brittle. The best approach seems to be to make sure that absolutely every line of sql gets typed into a text file before it is pasted into/run against the development database; this makes it more difficult to forget to include a change, but it can still happen. Performing the update is a laborious task at present- it could probably be scripted, but it seems wrong to put any more effort into this system when there must be better ways of doing things.

Rails includes a rather nifty system it calls ‘Migrations’ to handle this problem. A Migration is basically a class with pair of ruby functions, ‘up’ and ‘down’. The ‘up’ function contains the change(s) you want to make, while ‘down’ contains the steps (if there can be any) to reverse the change. Migrations live in numbered (and also named) files, and are generated (as stubs) when models are generated.

Cake, which is a PHP Framework commonly considered to be a Rails ‘clone’, has a similar system; rather than writing Migrations in code (PHP), they are written using YAML. I can see the benefit of both approaches: being able to use the full power of the language allows you to do much more, while restricting Migrations to a set of common operations, accessed through a YAML specification makes things a lot easier to maintain.

I began work a while ago on a PHP-based Migrations system that will integrate with the framework we’re using, but will also hopefully be usable elsewhere. I decided that rather than follow either Cake or Rails, I’d accomodate both methods. Migrations can be either PHP classes that extend a base Migration, or they can be YAML definitions that are parsed and translated into Migration objects. As we’ve got rather a lot of SQL files at the moment, it seems sensible to also allow for a Migration to consist just of SQL, if you so desire.

The operations that the Rails Migration class supports are quite basic:
  • create table / drop table
  • add column / drop column
  • rename table
  • rename column
  • change column
  • add index
  • remove index

They are a good place to start, but I can see potential for expansion to combine common sequences of these basic actions into named operations. As an example: The system we’re making makes uses of Views (in the database, not MVC views) for the presentation of related entities. Such a view will commonly be something like


SELECT p.*, c.name AS company FROM projects p JOIN company c ON (p.company_id=c.id);
Adding a column to the table doesn’t cascade to the view; the ‘SELECT *’ is expanded when the View is created. Having migrations take care of such things would make life easier.

I think it will probably be the case that keeping it simple, with the option to fall back to code should something require it, is how I proceed, producing reusable complex operations as and when there’s a need for them.

1 Response to “versioning databases”

  1. David Goodwin Says:
    Moodle uses the approach of having a version number stored in the DB, and a number of appropriately numbered files.... when it then comes to upgrading, it knows which to apply. Obviously this doesn't help with rolling back to a previous version - which would have to be 'restore from backup'. It's not something we've had to worry about yet at work - adding extra fields into the database occurs through Propel's schema.xml - and as long as we never remove or rename existing fields there should be no problem (the problem occurs when the database dump is reloaded; and obviously it fails if fields are no longer there or names change). For one project we have a more complex database setup, with which we write small .sql files which are added to svn and applied manually on the customers system and our own.

Leave a Reply