Over the last few months I’ve been testing Red Hat’s new PaaS offering, OpenShift. The testing so far has gone extremely smoothly, but what about preparing for when disaster strikes? The rhc command-line client provides snapshot functionality, but it unfortunately carries the caveat of needing to shut down and restart the application in the process. For a PHP or Python application this is fine, as restarts are lightning fast, but redeploying a Java EE application is… cough… not so lightning fast.

If you can live with the need to restart, this command will give you a very complete tarball:

rhc app snapshot save -a ApplicationName

The other option is to do it via a cron job that will back up the data into your OpenShift data directory on the server (the data in this directory is not deleted each time the application is redeployed).  To do this, you will need to have the cron cartridge installed in your application.

If you don’t already have the cron cartridge installed, you can easily enable it with this command:

rhc app cartridge add -a ApplicationName -c cron-1.4

With cron set up, we can schedule jobs to run minutely, hourly, daily, weekly, or monthly by dropping scripts into the matching folder in our application’s .openshift/cron directory. Create a file named backup_postgres.sh in whichever of those folders matches how often you want a backup, and paste in the following script. (Also available on GitHub)

#!/bin/bash
# Backs up the OpenShift PostgreSQL database for this application
# by Skye Book <skye at skyebook.net>

NOW="$(date +"%Y-%m-%d")"
FILENAME="$OPENSHIFT_DATA_DIR/$OPENSHIFT_APP_NAME.$NOW.backup.sql.gz"
pg_dump "$OPENSHIFT_APP_NAME" | gzip > "$FILENAME"
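
One thing to watch: the filename in the script only includes the date, so if it lives in the minutely or hourly folder, every run on the same day overwrites the previous file. Here is a minimal sketch of a variant that picks the timestamp granularity from the job frequency. The function name backup_path and its three arguments are my own for illustration, not anything OpenShift provides.

```shell
#!/bin/bash
# Sketch: build a backup path whose timestamp matches how often the job runs.
# backup_path and its arguments are illustrative names, not OpenShift APIs.

backup_path() {
    local data_dir="$1"   # e.g. "$OPENSHIFT_DATA_DIR"
    local app_name="$2"   # e.g. "$OPENSHIFT_APP_NAME"
    local frequency="$3"  # name of the cron folder the script lives in
    local format

    case "$frequency" in
        minutely) format="%Y-%m-%d_%H%M" ;;
        hourly)   format="%Y-%m-%d_%H" ;;
        *)        format="%Y-%m-%d" ;;
    esac

    # Strip a trailing slash so the path never contains "//"
    printf '%s/%s.%s.backup.sql.gz\n' "${data_dir%/}" "$app_name" "$(date +"$format")"
}

# Possible use inside the cron script (hourly folder assumed):
#   pg_dump "$OPENSHIFT_APP_NAME" | gzip > "$(backup_path "$OPENSHIFT_DATA_DIR" "$OPENSHIFT_APP_NAME" hourly)"
```

Anything other than minutely or hourly falls back to the date-only name, which matches the behaviour of the original script.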

With the backup script created, simply commit it and push the changes to your application.

git add backup_postgres.sh
git commit -m "Added PostgreSQL backup script"
git push

Once the changes are pushed and the job has run at least once, you can retrieve your application’s database backups using rsync:

rsync -avz -e ssh loginhash@appname-you.rhcloud.com:app-root/data ./
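
A backup is only useful if it can be restored, so here is a rough sketch of checking a downloaded file and loading it back into a database. Since pg_dump writes plain SQL by default, the dump can be piped straight into psql. The helper names verify_backup and restore_backup are hypothetical, and the sketch assumes psql is installed locally and the target database already exists.

```shell
#!/bin/bash
# Sketch: check a downloaded backup and restore it into a database.
# verify_backup and restore_backup are illustrative helper names.

# Succeeds only if the file exists, is non-empty, and is a valid gzip stream.
verify_backup() {
    local file="$1"
    [ -s "$file" ] && gzip -t "$file" 2>/dev/null
}

# Decompresses the dump and feeds the SQL to psql.
# Assumes psql is on the PATH and the target database already exists.
restore_backup() {
    local file="$1"
    local database="$2"

    if ! verify_backup "$file"; then
        echo "Refusing to restore: $file failed the gzip check" >&2
        return 1
    fi

    gunzip -c "$file" | psql "$database"
}
```

With the rsync command above, the files land in ./data, so a call might look like restore_backup ./data/appname.2013-01-01.backup.sql.gz appname (both the filename and the database name here are made up for the example). Note that the gzip check only proves the download is intact, not that the dump itself is complete.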

This admittedly isn’t the prettiest solution to the problem, but it does the job. I may end up patching the rhc CLI app to do something similar and automatically download the results.