10 years of overt.org

Ten years ago today, I sat in my dorm room at UT Austin with a dictionary in my lap, hunting for a simple English word, five letters or less, that hadn’t already been claimed as a domain name. I must have tried a hundred things before I came across “overt.” I was a college freshman and totally digging all this new freedom over my life and interests, meeting new friends every day, trying on new ideas all the time, so overt seemed like a perfect fit. Although overt.com, overt.net, and overt.org were all available (these being the only three top-level domains open for public registration at the time, not counting country-code ones), my anti-corporate and alliterative (see?) tendencies led me to choose overt.org. In retrospect, I should have just registered all three, but the $75/year it cost to register a domain in 2000 seemed like a lot of money to me and I definitely wasn’t going to spend it three times over.

I worked part-time doing web and database development at the career center for the college of engineering and had access to several internet-connected computers. The same night that I registered overt.org, I set up an illicit web server (embarrassingly, Windows 2000/IIS) on my publicly accessible workstation in the career center and pointed my new domain to it. I stayed up late creating a web page with dated, journal-like entries that ran in reverse chronological order on the front page, just like my favorite website at the time, Slashdot. Later I would learn that web sites like this had been called “web logs” since at least 1997. I was way behind the curve! And so, overt.org was born.

A week or two later, I set up an SMTP/POP3 email server on the same machine, assumed the email address I’ve used since then, and started handing out accounts to my friends.

The hardware behind overt.org took many forms over the next four years at UT, moving from my workstation to a dedicated machine in ENS, the electrical engineering building, but it always remained hidden in a corner or under a table, leeching off of UT’s excellent and pretty much unmonitored internet connection. As we approached graduation, Ali, George, Drew and I pooled our money together to pay for a dedicated server based out of San Francisco: overt.org was legit and has been ever since.

Over the last ten years, overt has grown quite a bit. It now hosts over three dozen web sites, blogs, and photo galleries. It’s a labor of love for Ali and me to maintain the server, and it’s been a lot of fun to watch it become a home for us, our friends, and our families on the internet.

Here’s to the next ten years of overt.org!

off-site backup for $0.10/GB using dirvish and Amazon EC2 and EBS

I’ve been using dirvish, an rsync-based snapshotting backup system, for years to manage local and off-site backups. It’s simple to set up, automatic, creates daily snapshots of entire systems (or just specific directories), and it’s a breeze to browse and restore: all the files are right there in a tree, organized by date. Think of it like Apple’s Time Machine, but better because you can actually make it do what you want.

I recently needed to set up off-site backup for a few hundred gigabytes of data. My first thought was S3, but the HTTP interface meant that I couldn’t use a simple tool like rsync (or dirvish) to automate the snapshotting, and that browsing and restoring entire filesystems from backup would be cumbersome. Then I remembered that Amazon recently announced support for booting EC2 instances from persistent EBS volumes. This lets you “save” an instance by stopping it and starting it up again later with its disk intact, and you only pay for compute hours while the instance is running. Storage on EBS volumes is cheaper than on S3 ($0.10/GB per month instead of $0.15). Also, EBS volumes are just normal block devices that can be mounted by EC2 instances as though they were hard drives.
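
To make the storage math concrete, here is a quick sketch of the comparison. The 300 GB figure is an assumption standing in for “a few hundred gigabytes”; the per-GB prices are the two quoted above, treated as monthly rates. Compute hours for the backup runs are left out, since no hourly rate is given here.

```python
# Prices in whole cents per GB per month, to keep the arithmetic exact.
EBS_CENTS_PER_GB = 10   # $0.10/GB, as quoted in the post
S3_CENTS_PER_GB = 15    # $0.15/GB, as quoted in the post


def monthly_storage_cents(gigabytes, cents_per_gb):
    """Return the monthly storage bill in cents for the given amount of data."""
    return gigabytes * cents_per_gb


def dollars(cents):
    """Format an integer number of cents as a dollar string."""
    return "$%d.%02d" % (cents // 100, cents % 100)


if __name__ == "__main__":
    size_gb = 300  # assumed size, not a figure from the post
    ebs = monthly_storage_cents(size_gb, EBS_CENTS_PER_GB)
    s3 = monthly_storage_cents(size_gb, S3_CENTS_PER_GB)
    print("EBS: %s/month, S3: %s/month, difference: %s"
          % (dollars(ebs), dollars(s3), dollars(s3 - ebs)))
```

Under those assumptions the storage bill comes to $30.00 a month on EBS against $45.00 on S3.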

So here’s the idea: create an EC2 instance that boots from a big, dedicated EBS volume. Every night (or week, or whatever), start up that instance, run dirvish for the off-site backup, and then shut it down again. I only pay for the instance during the short periods it runs to perform the backup, and my data is saved off-site on the durable EBS volume. I implemented this system and it has been working great for several weeks. I just launch a Python script (as a cron job) that starts the instance, runs dirvish, and then shuts it down when it’s complete. For those interested, here’s the (quick, dirty) Python source (which uses the excellent boto library for manipulating the EC2 instance):

#!/usr/bin/env python
# encoding: utf-8
"""
run_offsite_backups.py

Wake up the EC2 backup server, run dirvish backup, then shut it down

Created by Bryan Klingner (code.b@overt.org) on 2010-02-02.
Feel free to use this code yourself. Maybe email me if you do :)
"""

import sys
import os
import boto
import time
import subprocess

BACKUP_INSTANCE_ID = 'YOUR_INSTANCE_ID'

def main():

    conn = boto.connect_ec2()

    # get the backup instance object
    instance = conn.get_all_instances(instance_ids=(BACKUP_INSTANCE_ID,))[0].instances[0]

    # if the instance is stopped, start it up
    if instance.state != 'running':
        conn.start_instances(instance_ids=(BACKUP_INSTANCE_ID,))
        waited = 0
        while instance.state != 'running':
            instance.update()
            sys.stdout.write("\rInstance starting up (%d sec)..." % (waited))
            sys.stdout.flush()
            time.sleep(1)
            waited += 1

    print "\n"
    print "Backup instance running:"
    print "    ID:       ", instance.id
    print "    State:    ", instance.state
    print "    DNS name: ", instance.dns_name

    # chill for a few seconds so the SSH server is listening
    time.sleep(10)

    print ""
    print "Initiating backup..."
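    retcode = ssh_cmd('dirvish-expire; dirvish-runall', instance.dns_name, user='username')
    if retcode != 0:
        # report a failed backup, but still shut the instance down below
        print "WARNING: backup command exited with status %d" % retcode
    print ""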

    # backup is done; shut down the instance
    conn.stop_instances(instance_ids=(BACKUP_INSTANCE_ID,))
    waited = 0
    while instance.state != 'stopped':
        instance.update()
        sys.stdout.write("\rInstance shutting down (%d sec)..." % (waited))
        sys.stdout.flush()
        time.sleep(1)
        waited += 1
    print ""

def ssh_cmd(cmd, host, user='root'):
    """ Run a shell command on a remote server via ssh """

    # build the full command line; cmd is wrapped in single quotes,
    # so it must not contain single quotes itself
    full_cmd = 'ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no ' + user + '@' + host + " '%s'" % (cmd)
    print "Running SSH command: %s" % full_cmd

    # returns the exit status of ssh (nonzero on failure)
    return subprocess.call(full_cmd, shell=True)

if __name__ == '__main__':
    main()
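
One weakness of the wait loops in the script is that they spin forever if the instance never reaches the expected state. Below is a minimal sketch of a bounded version. The names wait_for_state, StateTimeout, and FakeInstance are hypothetical and not part of boto; the helper only assumes an object with a state attribute and an update() method, which is how the script uses its instance object. The fake instance is there so the sketch runs without AWS credentials, and the code is written to run under both Python 2 and Python 3.

```python
import time


class StateTimeout(Exception):
    """Raised when the instance does not reach the wanted state in time."""


def wait_for_state(instance, wanted, timeout=300, poll=1.0, sleep=time.sleep):
    """Poll instance.update() until instance.state == wanted.

    Returns the number of polls performed. Raises StateTimeout once
    `timeout` seconds of waiting have gone by without success.
    """
    waited = 0.0
    polls = 0
    while instance.state != wanted:
        if waited >= timeout:
            raise StateTimeout("still %r after %d sec, wanted %r"
                               % (instance.state, waited, wanted))
        sleep(poll)
        waited += poll
        instance.update()
        polls += 1
    return polls


class FakeInstance(object):
    """Stand-in for an EC2 instance: steps through a list of states."""

    def __init__(self, states):
        self._states = list(states)
        self.state = self._states.pop(0)

    def update(self):
        # move to the next scripted state, if any are left
        if self._states:
            self.state = self._states.pop(0)


if __name__ == "__main__":
    fake = FakeInstance(["pending", "pending", "running"])
    # sleep is swapped for a no-op so the demo finishes instantly
    polls = wait_for_state(fake, "running", timeout=10, poll=1,
                           sleep=lambda seconds: None)
    print("reached %r after %d polls" % (fake.state, polls))
```

In the script, a call like wait_for_state(instance, 'running') could stand in for the start-up loop, and wait_for_state(instance, 'stopped') for the shutdown loop. Catching StateTimeout would let the cron job report a stuck instance instead of hanging.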