From The Beginning

Alright. So you’ve just completed your first Rails app, and now it’s time to put it somewhere public. You’ve gone and purchased a domain name, found hosting, and done some reading on deploying Rails applications. In the process you discover that most people recommend something called “Capistrano”, which seems to promise turn-key deployment of Rails applications.

Cool!

Installation of Capistrano is easy enough: just run `gem install capistrano`. But what next?

Well, here. Let’s walk through this together.

Decisions

Before we can effectively configure Capistrano, we have quite a few deployment-related decisions to make. Capistrano won’t make them for you, but if you do things a certain way, Capistrano’s defaults will do most of the work for you.

Let’s look at our options. Think of this as your pre-flight checklist. No deploy will succeed until you’ve got all of these taken care of. It’s tedious and (unless you enjoy configuring servers) not too fun, but once done, you shouldn’t have to deal with it again.

Web Server

First, you need to decide what web server software you’re going to use. Apache, Lighttpd, and nginx are the three big players in this arena (as of this writing). If you’re on a shared host, this decision may already be made for you. If you have no preference and are in a position to install your own web server software, you might want to look at nginx.

Database

Next, you need to decide what kind of database you want your production environment to use. MySQL and PostgreSQL seem to be the biggest players in the open source community, though big companies might also use databases like Oracle.

Some people are using SQLite in production, but that’s not recommended. It’s a great database for single-user and embedded scenarios, but it really doesn’t scale well in a web environment.

Of the options available, MySQL is generally the easiest to set up and install, so unless you’ve experience with PostgreSQL or otherwise have a reason not to use MySQL, I’d recommend MySQL.

Ruby

Ruby is pretty critical to any Rails application. You should decide what version of Ruby you need, and what implementation. JRuby can run Rails these days, so if you decide you want to deploy to JRuby instead of MRI Ruby, that will significantly alter your deployment configuration.

Application Layer

The “application layer” is the part of the stack that actually runs your application. Mongrel is a very popular option for this layer, though Thin and Passenger (a.k.a. mod_rails) are gaining in popularity.

Some people have tried to deploy to WEBrick (the web server that is bundled with Ruby), but this is definitely not recommended. WEBrick suffices as an environment in which to develop and test your application, but does not scale nearly well enough in a production environment.

Also, note that if you choose to go with something like Passenger, you’ll need to make sure you are using a supported web server. (Passenger only works with Apache 2, for example.)

Source Control

It might seem strange to think of source control coming into play in a deployment configuration, but Capistrano is (at least by default) highly dependent on your source control system.

Capistrano supports quite a few SCMs right out of the box, including Subversion and Git, to name two of the most popular among Rails programmers these days. Just choose one that you’re comfortable with and go with it. Unless it is a pretty esoteric system, odds are Capistrano supports it.

This Tutorial

This tutorial will focus on the following setup. If your setup is different, you’ll need to adapt in a few places.

  • nginx
  • MySQL
  • MRI Ruby 1.8.6
  • Mongrel
  • Subversion

Capification

The first thing you need to do (after installing Capistrano) is to “capify” your application. This is the process of configuring Capistrano to deploy your application. It’s easy enough; just make sure you’re in your application’s root directory and type “capify .” (that’s “capify”, followed by a space, followed by a dot).

This will create two files:

  • Capfile: this is the “main” file that Capistrano needs. Just as “make” automatically loads “Makefile”, and “rake” automatically loads “Rakefile”, Capistrano looks for and loads “Capfile” by default. The default Capfile generated by capify is pretty minimal. All it does is load “config/deploy.rb”.
  • config/deploy.rb: this is the file that contains your Rails application’s deployment configuration. Rails prefers that all configuration go under the config directory, and so Capistrano tries to abide by that as much as it can by just pointing the Capfile at config/deploy.rb.

In general, you’ll leave the Capfile alone and do all your tweaking in config/deploy.rb. If you were using Capistrano in a non-Rails environment, though, you’d probably have only a Capfile, with no config/deploy.rb at all. (We’ll leave that for a later tutorial, though. We’re focusing on getting your new Rails application deployed, after all!)

Configuration

Now that we’ve got the files we need, the next step is to tell Capistrano about our server. Because we’re starting small, we have only one server, which will host our web server, our Rails application, and our database, all in one.

The configuration process is actually pretty painless. Capistrano mostly just requires a few little bits of information.

If you look at the default config/deploy.rb, you’ll see there’s not much to it. We just need to tell Capistrano a few things. First, we need to tell it what our application is called. For this tutorial, I’m going to call it “tutorial”, but you should obviously choose something more appropriate for your own deployment:

set :application, "tutorial"

Then, we need to tell Capistrano where our source code resides. This is the repository address for your application, and by default it must be accessible both by your local machine (where you will be deploying from) and your production servers (where you will be deploying to). The following URL is bogus, but it should give you an idea of how to specify your own:

set :repository, "svn+ssh://code.capify.org/repos/tutorial"

Now, if you aren’t using Subversion, you’d next want to tell Capistrano what SCM you are using:

set :scm, :git

In our case, though, we’re using Subversion, which is the default, so we don’t need to specify the SCM at all.

Next, we should tell Capistrano where on our server we are deploying to. To really understand what this means, we need to detour a bit and talk about the directory structure Capistrano will use to deploy your application.

Deployment Directory Structure

A successful deployment with Capistrano will result in a directory structure like the following (where [deploy_to] is where you want Capistrano to be deploying to on your server):

[deploy_to]
[deploy_to]/releases
[deploy_to]/releases/20080819001122
[deploy_to]/releases/...
[deploy_to]/shared
[deploy_to]/shared/log
[deploy_to]/shared/pids
[deploy_to]/shared/system
[deploy_to]/current -> [deploy_to]/releases/20080819001122

Each time you deploy, a new directory will be created under the “releases” directory, and the new version placed there. Then, the “current” symbolic link will be updated to point to that directory. (For now, don’t worry about the shared directory; we’ll get to that eventually.)
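If it helps to see the release-and-symlink dance in action, here is a minimal sketch that mimics it in a scratch directory. It is only an illustration of the layout described above, not how Capistrano itself is implemented; simulate_deploy is a hypothetical helper, and the second release name is made up.

```ruby
require "fileutils"
require "tmpdir"

# Illustration only: mimics the releases/current layout in a scratch directory.
# simulate_deploy is a hypothetical helper, not part of Capistrano.
def simulate_deploy(deploy_to, release_name)
  release = File.join(deploy_to, "releases", release_name)
  current = File.join(deploy_to, "current")

  # Each deploy gets its own directory under releases/.
  FileUtils.mkdir_p(release)

  # Repoint the "current" symlink at the new release.
  File.unlink(current) if File.symlink?(current)
  File.symlink(release, current)
  release
end

deploy_to = Dir.mktmpdir("deploy_to")
simulate_deploy(deploy_to, "20080819001122")
simulate_deploy(deploy_to, "20080820093000") # made-up second release

puts Dir.children(File.join(deploy_to, "releases")).sort
puts File.readlink(File.join(deploy_to, "current"))
```

After the second call, both release directories still exist, and “current” points at the newer one.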

You’ll need to make sure you’ve configured your web server (whichever one you choose) to look at [deploy_to]/current/public as the root web directory for your application.

Back to Configuration

So, back to configuring things: we need to tell Capistrano where on the server we want our application to live. It defaults to “/u/apps/#{application}” (where #{application} is the name you set application to, above), but you might prefer (or need) to deploy to something like “/var/www”. To do so, just set the :deploy_to variable:

set :deploy_to, "/var/www"

Lastly, we need to tell Capistrano where our servers are (we only have one), and what roles they each play (again, we only have one, which will fill all of the roles). This is done via the “role” declaration.

role :app, "tutorial.com"
role :web, "tutorial.com"
role :db, "tutorial.com", :primary => true

Because we have only a single server, our role declarations look a bit redundant. An alternative syntax uses the “server” keyword:

server "tutorial.com", :app, :web, :db, :primary => true

Both are (in this case) functionally identical, but the former scales to multiple servers a bit more maintainably than the latter.

Those three roles (app, web, and db) are the three that Capistrano needs for Rails deployment. You can always add your own roles, too, for tasks of your own, but make sure you declare at least one server in each of app, web, and db when you’re deploying Rails applications.

  • web: this is where your web server software runs. Requests from the web go straight here. For sites with lots of traffic, you might have two or three (or more!) web servers running behind a load balancer.
  • app: this is where your application layer runs. You might need only a single web server, and have it send requests to any of four or five different app servers, via a load balancer.
  • db: this is where migrations are run. Some prefer not to actually run code on the server where the database lives, so you can just set this to one of the app servers if you prefer. Capistrano will deploy your application to servers in this role. The ":primary => true" bit just says that this is our “primary” database server—you can have as many as you want (for database slaves and so forth), but Capistrano will only run migrations on the one you designate as “primary”.

The addresses you place in each role are the addresses that Capistrano will SSH to, in order to execute the commands you give.
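To give a feel for how the role syntax grows beyond one machine, here is a sketch of what a multi-server setup might look like. Every host name below is invented for illustration; substitute your own.

```ruby
# config/deploy.rb (fragment). All host names here are made up.
role :web, "www1.example.com", "www2.example.com"
role :app, "app1.example.com", "app2.example.com", "app3.example.com"

# Only the server flagged as primary runs migrations.
role :db, "db1.example.com", :primary => true
role :db, "db2.example.com"
```

Each role just lists the addresses Capistrano should SSH to for tasks in that role.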

Miscellaneous Configuration

The above should suffice for many people. But here are some additional variables you might find yourself wanting to set.

  • set :user, "foo". If your user name on the server differs from the one you are logged into your local machine with, you’ll need to tell Capistrano about that user name.
  • set :scm_username, "foo". If your user name on the source repository differs from your local one, Capistrano needs to know. Note that not all SCMs support the scm_username variable; you might need to embed the scm_username into the repository URL, e.g. "svn+ssh://#{scm_username}@foo.bar.com/path/to/repo".
  • set :use_sudo, false. By default, Capistrano will try to use sudo to do certain operations (setting up your servers, restarting your application, etc.). If you are on a shared host, sudo might be unavailable to you, or maybe you just want to avoid using sudo.
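Pulling the pieces together, a config/deploy.rb for this tutorial’s single-server setup might look roughly like the sketch below. The values are the ones used throughout this tutorial (including the bogus repository URL), and the optional settings are left commented out with “foo” as a placeholder.

```ruby
# config/deploy.rb: the settings from this tutorial in one place.
set :application, "tutorial"
set :repository,  "svn+ssh://code.capify.org/repos/tutorial" # bogus URL

# Subversion is the default SCM, so :scm stays unset here.
# set :scm, :git

set :deploy_to, "/var/www"

# Optional settings; "foo" is a placeholder.
# set :user, "foo"
# set :scm_username, "foo"
# set :use_sudo, false

# One server filling all three roles.
role :app, "tutorial.com"
role :web, "tutorial.com"
role :db,  "tutorial.com", :primary => true
```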

Talking to Capistrano

Alright, now that we’re all configured, we can start asking Capistrano some questions. This is the “getting to know each other” phase.

First, try asking Capistrano what it knows about itself:

$ cap -h

It’ll spit out a list of all the options it accepts. Let’s try another one of those:

$ cap -H

Giving it a capital H will give you a verbose description of what each option does.

Next, let’s ask Capistrano what “tasks” it is aware of. “Tasks” are Capistrano’s unit of work. Though Capistrano comes bundled with several built-in tasks, you can also write your own to automate workflows of your own. For now, let’s see what tasks Capistrano knows:

$ cap -T

This tutorial will only touch on a few of those, but you can get more in-depth help on any of them with the “-e” switch:

$ cap -e deploy:web_disable

It’s a great way to learn about tasks you’ve not encountered before.

Setting Up the Server

Next, we can actually start using Capistrano to talk to the server. We’ll first need to ask Capistrano to set up the basic directory tree it will use:

$ cap deploy:setup

This will log into your server and do a series of “mkdir” calls. Note, though, that if you aren’t using sudo (e.g., you set use_sudo to false, in your deploy.rb), you’ll need to make sure the permissions on your deploy_to directory are okay. If they aren’t, deploy:setup will fail. To fix the permissions, you’ll need to log into your server and do a few invocations of chown to set the permissions. (What they need to be set to depends greatly on your configuration, so if you aren’t sure how to do this, feel free to ask on the Capistrano mailing list.)

Even if you are using sudo (or, maybe especially if you are using sudo), you may need to fix the permissions on the directories that deploy:setup creates. Make sure the directories have the right permissions for whichever user you are deploying as to write to them.

Check Dependencies

Now that we’ve got the skeleton of the directories in place, we can ask Capistrano whether we’ve got all the dependencies in place. This includes directory permissions, as well as programs that will be needed (both locally and remotely). It’s easy and painless to do:

$ cap deploy:check

Capistrano will check both your local and remote servers for a variety of things. If any of the dependencies are not met, you’ll get an error message indicating what the problems are. For things like “you do not have permissions”, you’ll need to log into the server and manually chmod or chown directories until the permissions are sufficient. For others (like the “svn” executable missing, if you’re using Subversion) you’ll need to make sure you have those tools installed.

A quick aside: you might have subversion (or whatever SCM you’re using) installed, but in a place that’s not in the standard path. (The standard path is typically /bin:/usr/bin:/usr/sbin.) If this is the case, Capistrano won’t be able to find your svn executable, and you’ll need to tell Capistrano explicitly where it is. To do so, set :scm_command to the path on the remote servers where it is located. If you do this, though, you might discover that Capistrano can no longer find the command on the local server; in that case, set :local_scm_command to :default (or to the explicit path on your local server).
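As a sketch, those two settings might look like this in config/deploy.rb. The remote path is an invented example; use wherever svn actually lives on your servers.

```ruby
# config/deploy.rb (fragment). The svn path below is a made-up example.
set :scm_command, "/opt/local/bin/svn" # where svn lives on the remote servers
set :local_scm_command, :default       # locally, find svn on the normal path
```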

Keep running “deploy:check” after your tweaks until Capistrano says that you pass. It’s a hassle, I know, but you only need to do it once! Once a server is configured, subsequent deploys are painless.

Database Initialization

Alright, we’re getting closer to that first deployment! Before we get there, though, we need to make sure our database is ready. You’ve got it installed, but you also need to make sure of three things:

  1. Have you created a database for your application? Aside from the database software (the DBMS, or Database Management System), there is also the concept of a logical database. A single instance of MySQL, for example, can serve multiple logical databases. For many DBMSs, this is as simple as logging in as a superuser and issuing a “CREATE DATABASE” call (you’ll want to read the manual for your chosen database system, though).
  2. Have you added a “production” section to your config/database.yml file? Some people are uncomfortable adding production login credentials to their source code repository, but this tutorial (for simplicity’s sake) will assume that’s what you’re doing. (If you’d rather not, you can search Google for examples of alternatives, or ask on the mailing list. Also, Mike Clark’s “Advanced Rails Recipes” has a good example you can practically copy and paste.)
  3. Have you set adequate permissions on that database for your application? Whichever user you configured for production access in database.yml must have adequate permissions for your database. This means (at least) read and write access, but the exact permissions (and how to set them) will vary from application to application, and DBMS to DBMS.

Spinners and Spawners

One last thing and then we’ll be ready to push our code out to the server. We need to tell Capistrano how to “spin up” (start) our application layer. The precise way this works will vary depending on how you’re running your application layer. If you’re using mod_rails, for instance, it will be very different than if you’re using mongrel. Here, I’ll assume you’re using mongrel.

By default, when Capistrano needs to start your application layer, it will try to execute a script called “spin”, in the “script” directory of your application, on each remote server. We’ll need to write that script, and then check it into the source repository.

Rails makes it pretty easy to do this. There is already a script built into rails, in your script/process directory, called “spawner”. What it does is try to start up your application layer. We’ll use it in our script/spin script, like this:

#!/bin/sh
#{deploy_to}/current/script/process/spawner \
  mongrel \
  --environment=production \
  --instances=1 \
  --address=127.0.0.1 \
  --port=#{port}

Be sure to replace #{deploy_to} with your actual (full) deployment path, and #{port} with the port you want your mongrel instance to listen on. (If you haven’t a preference, 8000 is as good a port as any.) Make sure script/spin is executable, and then check it into your repository. (You Windows folks will need to work harder to make sure things are executable; if you’re using Subversion, you can run “svn propset svn:executable ON script/spin” and then commit again.)

Now, looking at the script you just wrote, notice that we’re only starting a single mongrel instance (--instances=1). For a personal blog or other low-traffic site, this is entirely sufficient. It is only if you expect your site to serve more than one request per second over a sustained period of time that you’d want to consider adding more mongrel instances. If you increase that number, subsequent mongrels will listen on the next port number (e.g., if port is 8000 and you specify that you want 5 mongrel instances, they will listen on ports 8000, 8001, 8002, 8003, and 8004).

Lastly here, make sure your web server is configured to proxy all requests for your application to the port in question. If you are using multiple mongrel instances, you’ll need to configure your web server to round robin (or otherwise load balance) across the multiple ports. (If your web server doesn’t make it easy, or even possible, you might consider using something like HAProxy.)
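The port arithmetic and the round-robin idea are simple enough to sketch in a few lines of Ruby. Both helpers below are hypothetical, written only to illustrate the numbers; your web server or HAProxy does the real balancing.

```ruby
# Illustration only: these helpers are hypothetical, not part of Mongrel or Capistrano.

# Ports used by `instances` mongrels, counting up from `base_port`.
def mongrel_ports(base_port, instances)
  (0...instances).map { |i| base_port + i }
end

# The first `count` backend ports a simple round-robin balancer would pick.
def round_robin(ports, count)
  ports.cycle.first(count)
end

ports = mongrel_ports(8000, 5)
p ports                 # => [8000, 8001, 8002, 8003, 8004]
p round_robin(ports, 7) # => [8000, 8001, 8002, 8003, 8004, 8000, 8001]
```

Those are the ports your web server’s proxy configuration needs to list.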

Pushing the Code, Kicking the Tires

Alright. We’re not going to do an actual “deploy” yet, but we’re going to push the code out to the servers to make sure things are all kosher thus far. It’s easier to fix the problems that arise this way, than if you were trying to do a full-blown deploy.

So, let’s push things out to the server:

$ cap deploy:update

This will copy your source code to the server, and update the “current” symlink to point to it, but it doesn’t actually try to start your application layer.

If any of that fails, you might need to look closely at the output to determine what went wrong. Maybe it was a permissions error? Maybe it can’t find the svn executable? A common error at this point is when subversion is unable to talk to the repository server, due perhaps to a missing public key on the server. Most of those things should have been caught at the deploy:check stage, but it seems like there are always little edge cases that slip through the cracks. Fix those up and keep trying “deploy:update” until it succeeds.

Once it succeeds, you’ll log into your server and change to the directory of your new release (e.g., [deploy_to]/current). First, we need to load up the schema into your database. (NOTE: do NOT do this if you already have an existing database with data in it, because it will WIPE ANY EXISTING DATA. You have been warned!) This will also test that our production database configuration is correct.

$ rake RAILS_ENV=production db:schema:load

If this succeeds, you’ll see a bunch of text scroll quickly by as Rails sends “create table” and “alter table” commands to your database. If it fails, you’ll need to figure out why—usually it will be because your production database configuration is incorrect (e.g., wrong database name, wrong user name, wrong password, or similar). It might also be because Rails is not installed. (People typically check a copy of Rails into their vendor directory, so it is always available whether the server has Rails installed or not. Search Google for “rake rails:freeze:edge” and “rake rails:freeze:gems”.) Whatever fixes it, be sure you commit the fix to your source repository, too!

Once that’s done, we can test to see that our application will actually load by starting up a console instance:

$ script/console production

If all goes well, it will start up and present you with a prompt. Mimic an HTTP request now by using the “app” helper in the console. For now, just grab the root URL and see what happens:

>> app.get("/")

That will return the HTTP status code. If it returns 200 or 302 (or any other 2xx or 3xx code, or even 4xx if you haven’t configured the “/” URL for your application), you’re probably set. If it returns something in the 500s, you’ll want to check your log/production.log and see why it blew up.
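If you like your rules of thumb spelled out, the one above can be sketched as a tiny Ruby method. deploy_verdict is a hypothetical helper and its symbols are just labels I made up for illustration.

```ruby
# Illustration only: deploy_verdict is a hypothetical helper that encodes
# the rule of thumb for reading the status code from app.get("/").
def deploy_verdict(status)
  case status
  when 200..399 then :probably_fine
  when 400..499 then :fine_if_root_route_is_missing
  when 500..599 then :check_production_log
  else :unexpected
  end
end

p deploy_verdict(200) # => :probably_fine
p deploy_verdict(302) # => :probably_fine
p deploy_verdict(404) # => :fine_if_root_route_is_missing
p deploy_verdict(500) # => :check_production_log
```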

Next, let’s check that your web server is configured right to serve up the static assets. Choose one of your javascript or stylesheet files. (We’ll go with prototype.js, since it’s pretty standard in most Rails applications, but you can choose whichever you want.) Open up a web browser and browse to your application’s URL, adding “/javascripts/prototype.js” to the end. If you see the contents of prototype.js in your browser, then you should be set. If not, you’ll need to dig around in your web server’s configuration, restarting your web server after each change, until it does. (This part can be frustrating, especially if you’re new to configuring a web server. Feel free to ask questions on the rails-deploy or Capistrano mailing lists if you need assistance.)

Lastly, let’s try firing up the application layer and seeing if the dynamic content is being served correctly. From your local machine again, try this task:

$ cap deploy:start

This will start up the application layer from a cold (non-running) state, using the script/spin script we wrote earlier. If you get an error, look closely at the output and see if you can figure out what went wrong. Maybe the script/spin script isn’t executable? Maybe there’s a typo in the script? Perhaps mongrel isn’t installed?

Once deploy:start finishes successfully, we reach the moment of truth. By this point, if everything is configured correctly, you should be able to pull up your application in your browser. Let’s try it now and see what happens!

If it works, you’ll know it: you’ll see exactly what you would expect to see. On the other hand, if things aren’t set up right, you’ll either get an error, or see something otherwise unexpected. Typically, here you’ll see some kind of “proxy” error. This means either the web server is trying to proxy to the wrong port, or that your mongrels aren’t running (or are running on the wrong port). You can check the mongrels easily enough, at least. Just log into your server and try doing “curl http://localhost:8000/blah” (or whatever port you chose). Replace “/blah” with a route applicable for your application. If you see the HTML for the page you expected, then the mongrel is running on the expected port. If curl complains, then it isn’t. A quick “ps wwaux | grep mongrel” should tell you whether a mongrel process is currently running.
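If curl isn’t handy on the server, a few lines of Ruby can at least tell you whether anything is listening on the port. This is a rough stand-in for the curl check, not a replacement: port_open? is a hypothetical helper, and it can’t tell a mongrel from any other process bound to that port.

```ruby
require "socket"

# Illustration only: port_open? is a hypothetical helper.
# It reports whether something accepts TCP connections on host:port.
def port_open?(host, port)
  TCPSocket.new(host, port).close
  true
rescue SystemCallError
  # Connection refused, host unreachable, and similar errors land here.
  false
end

# 8000 is the port suggested earlier; use whichever port you chose.
puts port_open?("127.0.0.1", 8000)
```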

Regardless, if it isn’t working at this point, you need to look at your web server logs, your Rails logs, and similar to diagnose the issue. Feel free to ask questions on the rails-deploy and capistrano mailing lists, but if you do, give as much information about the issue as you can!

If it does work here, then congratulations! You’ve got your application configured and running on your server! From here on out, things are much smoother sailing.

Restarting and re-deploying

We’ve got two more tests to do. First, let’s make sure Capistrano can successfully restart our application layer:

$ cap deploy:restart

Run that, and then go view your application in a browser again. If it worked, you should still be able to see your application. If it didn’t, then you’ll need to troubleshoot.

For the last test, let’s run a full deploy. By this point, we’ve tested our configuration, and the various parts of a deploy: pushing the code out, restarting the application layer, and so forth. So theoretically, nothing should go wrong! Here we go:

$ cap deploy

It should work! If it doesn’t, hunt down the errors and fix them, running “cap deploy” until it finally works. Once it does, subsequent deploys should “just work” as well.

And once that works, you’re ready for the next tutorial, which will cover slightly more advanced topics. You’ll learn about rolling back bad deploys, cleaning up old deployments, seeing what changes will be deployed, and more.

About deploy:cold

“Hold on a minute,” you say. “I was reading this other blog and they just used cap deploy:cold to do all this stuff.”

First of all, deploy:cold was a bad idea. It attempted to do what I have described in this tutorial, but without enough opinions, and without enough options for configuration. I’ve come to the conclusion that setting up a new app on a new machine is not something that can be generically automated. Individual setups on specific OSes might be possible (consider Mike Bailey’s Deprec recipes for setting up Ubuntu servers), but to try for a general solution to setting up any arbitrary Rails application… that would be called a “fool’s errand”, in my book.

So, the bad news: setting up a new application isn’t turn-key. At least not with Capistrano. The good news is that Capistrano can make it easier than doing it all manually. The best news is that once it is set up, subsequent deployments truly are turn-key: just “cap deploy” and you’re all set.

In the end, though, “deploy:cold” remains. If you want to use it, you are certainly welcome to. There may even be certain cases where it is reliable and useful. But in general, I discourage its use.