Chapter 15. Scaling Gracefully

Scaling has always been a controversial issue for Rails applications, yet there are many examples of apps that have been able to scale successfully. The key is always being able to identify bottlenecks and distribute the load across different services to handle different tasks efficiently. In this chapter, we are going to see how we can break down the complex task of scaling an application, and how this doesn’t simply mean scaling a specific framework (in this case, Rails). We are going to introduce the concept of middleware and see how we can use it to distribute the load across the different server instances and APIs that we are going to use, in order to scale horizontally. We will take a look at Nginx and the Lua programming language to see how these can be used to create a front-facing HTTP server and load balancer for our APIs.

Scaling Rails

As mentioned, scaling is a controversial issue in Rails. But let’s tackle this issue with the appropriate preparation.

First of all, there are a couple of concepts about scaling that we should be clear on. Scaling a service isn’t just about the framework used to implement it. It is about the architecture, the databases, how you use caching, how events are queued, disk I/O, content distribution networks, and a variety of other things.

So to answer the question “Does Rails scale?” as straightforwardly as possible, the response is definitely “Yes!”

To expand on that answer, Rails is among the most efficient frameworks available right now for building an application quickly, so you shouldn’t worry about the scalability of the framework you are using; instead, design your architecture for scalability independently of Rails.

Deploying an application can technically mean a range of different things, and the processes involved might take place at very different levels. Rails can in fact be deployed with different servers, and the deployment process can be automated using different tools.

If we were to consider a very simple scalable application, we would identify the following two levels in our service architecture:

  • Application server (Unicorn, Puma)
  • Front-facing HTTP server/load balancer (Nginx)

There are different application servers that can be used for Rails, but I have mentioned Unicorn and Puma specifically. Both Unicorn and Puma are designed for running Rack applications only. Rack is a middleware providing a minimal interface between web servers that support Ruby and Ruby frameworks. Rack considers an application to be any object that responds to the call method, which takes the environment hash as a parameter and returns an array with three elements:

  • The HTTP response code
  • A hash of headers
  • The response body, which must respond to each

The environment hash is passed from the web server to the app, and the app returns this three-element array to the server as its response.
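
To make the Rack contract concrete, here is a minimal sketch of a Rack application that both Unicorn and Puma could serve; the class name, file, and message are purely illustrative:

# config.ru -- a minimal Rack application (illustrative)
class HelloApp
  def call(env)
    # env is the Rack environment hash describing the incoming request
    [200, { 'Content-Type' => 'text/plain' }, ['Hello from Rack!']]
  end
end

run HelloApp.new

You can serve it locally with either puma or unicorn from the same directory, since both look for config.ru by default.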

Unicorn is a very mature web application server for Ruby, “designed to only serve fast clients on low-latency, high-bandwidth connections and take advantage of features in Unix/Unix-like kernels.” It uses forked worker processes to handle multiple requests concurrently, while the operating system kernel takes care of load balancing among them. Unicorn is fully featured, but by design it is a web application server and nothing more.

Puma is a web server built specifically for Ruby and based on Mongrel, designed with speed and parallelism in mind. It is described as “a small library that provides a very fast and concurrent HTTP 1.1 server for Ruby web applications.” Puma has several working modes: you can set the minimum and maximum number of threads it can use to do its job, but it can also work in a clustered mode whereby you can use forked processes to handle requests concurrently.
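
As a sketch of what this looks like in practice, a config/puma.rb for clustered mode might contain something like the following (the numbers are illustrative, not recommendations):

# config/puma.rb (illustrative values)
workers 2        # clustered mode: fork two worker processes
threads 1, 5     # each worker runs between 1 and 5 threads
preload_app!     # load the app before forking so workers can share memory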

Depending on your service requirements, you could choose to use either Puma or Unicorn.

Past the application server we find the HTTP server. We will now focus on preparing a production environment and configuring an HTTP server, using Nginx as our example. With Nginx, either Unicorn, Puma, or another web app server can be used.

There are many reasons behind the choice of Nginx over another HTTP server, such as Apache. Notably, Nginx is usually faster than Apache, and quicker to set up and configure. The Nginx HTTP server has been designed from the ground up to act as a multipurpose, front-facing web server. In addition to serving static files (e.g., images, text files, etc.), it can balance connections and deal with some exploit attempts. It acts as the first entry point for all requests, distributing them to web application servers for processing.

We will use Nginx as a middleware enabling our different APIs to communicate. This will allow us to eventually distribute the traffic to all our endpoints. More generally, Nginx can act as a proxy and load balancer, helping integrate different APIs into a product or service, or even allowing you to wrap different API requests in code snippets written in a language similar to JavaScript.

So, the scaling of our Rails app can be split into different subtasks:

  1. Preparing, maintaining, and deploying the Rails application servers
  2. Preparing the Nginx-based front-facing server with the load balancer and the reverse proxy to distribute the load across the Rails application servers
  3. Scaling the storage options and databases
  4. Scaling the infrastructure hardware (if you actually own this)

As you can see, scaling a Rails application involves much more than just scaling the Rails framework. On the contrary, you are scaling a whole set of different services.

In general terms, there are two main scaling strategies:

  • Horizontal scaling
  • Vertical scaling

Scaling horizontally means you are adding more servers, while when you are scaling vertically you are altering and tweaking a server’s resources, for example by increasing its size or adding more memory to it. A reasonable scaling strategy often involves scaling both vertically (up) and horizontally (out). To scale horizontally you will have to add a middleware to route traffic to your different servers according to a set of defined rules. We are going to see how this is accomplished in the next sections.

Using Processes Efficiently

A problem with scaling is managing the background workers and jobs that an application needs to handle and schedule. Foreman is a tool that can be used to configure the various processes that your application needs to run, using a Procfile.
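
A Procfile simply maps process names to the commands used to start them, in a file named Procfile at the root of the application. A minimal sketch, assuming Puma and Sidekiq are the server and worker in use:

web: bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq

Running foreman start then launches every process listed in the Procfile.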

More information about Foreman can be found at David Dollar’s blog.

Creating a Middleware for Different APIs to Communicate

To understand what an API middleware is, we need to borrow some concepts from the field of distributed systems. After all, web applications using different APIs to communicate are de facto distributed systems.

The simplest definition of a middleware is that of a component that lies in between two systems. If you think of a client/server application, the middleware is the slash between the client and the server, or in other words the component making the communication between the two possible.

There are different advantages to using such a solution when dealing with and integrating APIs. First of all, a middleware really facilitates the development of distributed systems by accommodating the heterogeneity of the different parts (in our case, the different APIs). It does this by hiding the individual details of the single services and providing a set of common and domain-specific endpoints.

The different services and applications in our architecture will communicate directly with the middleware, and this will be responsible for distributing the traffic to our different APIs according to the methods and the resources requested.

It follows that when an API changes, you do not have to modify all the applications and services that use that API. You just have to apply the change in the middleware, modifying the logic that handles the new calls and methods; all the different applications can keep calling the same methods they were using before. 

Another advantage of using a middleware is that you will be able to monitor how your applications are using the different APIs from a single point in your architecture. You will have a clear picture of the traffic load on each component, since all the calls and requests will go through the middleware. This means you will not have to develop and maintain different analytic solutions on a per-service basis.

Within RESTful applications the middleware can also function as a REST connector with non-RESTful components. These could be legacy services or even external services that are being integrated into your architecture, like WebSockets or streaming services.

Configuring a Reverse Proxy with Nginx

In this section we are going to set up Nginx on OpenShift, so we can start using it with our APIs.

To create an OpenShift application, you can use the client tools that you installed in Chapter 10. We’ll name the application nginx and use the “do-it-yourself” cartridge type, diy-0.1:

$ rhc app create nginx diy-0.1

We can then show the newly created application’s information:

$ rhc app show -a nginx

When you go to your application page in OpenShift, on the righthand side you will find a Remote Access section. You will have to copy the ssh command shown there to open a secure shell session to your application. The command will look something like this:

$ ssh <random-strings>@nginx-<your-namespace>.rhcloud.com
Note

OpenShift offers a whole set of environment variables that you can use to configure your application to run properly. You can check them all out at the OpenShift website.
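
For example, from within the ssh session you can list the variables that concern the DIY cartridge:

$ env | grep OPENSHIFT_DIY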

You can now proceed to install Nginx. Navigate to the tmp directory and download the Nginx source files:

$ cd $OPENSHIFT_TMP_DIR 
$ wget http://nginx.org/download/nginx-1.7.8.tar.gz
$ tar zxf nginx-1.7.8.tar.gz 
$ cd nginx-1.7.8

You may need to install some libraries from source. If you run:

$ ./configure --prefix=$OPENSHIFT_DATA_DIR

you will get the following errors:

checking for PCRE library ... not found 
checking for PCRE library in /usr/local/ ... not found 
checking for PCRE library in /usr/include/pcre/ ... not found 
checking for PCRE library in /usr/pkg/ ... not found 
checking for PCRE library in /opt/local/ ... not found   
./configure: error: the HTTP rewrite module requires the PCRE 
library. 
You can either disable the module by using 
--without-http_rewrite_module option, or  install the PCRE library 
into the system, or build the PCRE library statically from the 
source with Nginx by using --with-pcre=<path> option.

Since we cannot install the PCRE library into the system, we have to build it from source directly:

$ cd $OPENSHIFT_TMP_DIR 
$ wget ftp://ftp.csx.cam.ac.uk/pub/software/programming/
  pcre/pcre-8.36.tar.bz2 
$ tar jxf pcre-8.36.tar.bz2

We can now configure the build to suit our needs. At this stage we can enable any of the standard and optional HTTP modules supported by Nginx; the full list can be found in the Nginx documentation. We will keep the defaults for now and just run the configure command:

$ cd nginx-1.7.8 
$ ./configure --prefix=$OPENSHIFT_DATA_DIR 
  --with-pcre=$OPENSHIFT_TMP_DIR/pcre-8.36 

If the configuration runs successfully you should see the following output:

Configuration summary 
  + using PCRE library: /tmp//pcre-8.36 
  + OpenSSL library is not used 
  + md5: using system crypto library 
  + sha1: using system crypto library 
  + using system zlib library   
nginx path prefix: "/var/lib/stickshift/
      c45cdc9a27944dc5b1cd7cb9e5c9f8c7/nginx/runtime/" 
nginx binary file: "/var/lib/stickshift/
      c45cdc9a27944dc5b1cd7cb9e5c9f8c7/nginx/runtime//sbin/nginx" 
nginx configuration prefix: "/var/lib/stickshift/
      c45cdc9a27944dc5b1cd7cb9e5c9f8c7/nginx/runtime//conf" 
nginx configuration file: "/var/lib/stickshift/
      c45cdc9a27944dc5b1cd7cb9e5c9f8c7/nginx/runtime//
      conf/nginx.conf" 
nginx pid file: "/var/lib/stickshift/
      c45cdc9a27944dc5b1cd7cb9e5c9f8c7/nginx/runtime//
      logs/nginx.pid" 
nginx error log file: "/var/lib/stickshift/
      c45cdc9a27944dc5b1cd7cb9e5c9f8c7/nginx/runtime//
      logs/error.log" 
nginx http access log file: "/var/lib/stickshift/
      c45cdc9a27944dc5b1cd7cb9e5c9f8c7/nginx/runtime//
      logs/access.log" 
nginx http client request body temporary files: "client_body_temp" 
nginx http proxy temporary files: "proxy_temp" 
nginx http fastcgi temporary files: "fastcgi_temp" 
nginx http uwsgi temporary files: "uwsgi_temp" 
nginx http scgi temporary files: "scgi_temp"

This information will be needed to configure Nginx. Now we can compile and install:

$ make install

Once the installation has finished, you can navigate to $OPENSHIFT_DATA_DIR, where your Nginx is installed.

OpenShift currently allows one internal IP address and port for your application; these are available through the $OPENSHIFT_DIY_IP and $OPENSHIFT_DIY_PORT environment variables. These values may change, so you will want to include these environment variables directly in the nginx.conf file by using the env directive. Please note that these env variables can only be referred to in the main block of the config, not the http, server, or location blocks.

Let’s then edit the Nginx configuration file:

$ nano $OPENSHIFT_DATA_DIR/conf/nginx.conf

Change the listen value to:

http {
    ...
    server {
        listen       $OPENSHIFT_IP:$OPENSHIFT_PORT;
        server_name  localhost;
        ...
    }
    ...
}

Then rename the modified configuration file so that it can serve as a template:

$ mv $OPENSHIFT_DATA_DIR/conf/nginx.conf \
  $OPENSHIFT_DATA_DIR/conf/nginx.conf.template

We have just turned the Nginx configuration into a template with placeholders for the internal IP address and port. We now need to substitute the actual $OPENSHIFT_<cartridge-name>_IP and $OPENSHIFT_<cartridge-name>_PORT values when the start action hook is called.

To start up your application automatically, you’ll need to edit the local .openshift/action_hooks/start file so that it launches the Nginx binary installed under $OPENSHIFT_DATA_DIR/sbin. Exit from your ssh session, then run these commands on your machine:

$ cd nginx 
$ nano .openshift/action_hooks/start
#!/bin/bash
# The logic to start up your application should be put in this
# script. The application will work only if it binds to
# $OPENSHIFT_DIY_IP:8080
# nohup $OPENSHIFT_REPO_DIR/diy/testrubyserver.rb 
# $OPENSHIFT_DIY_IP $OPENSHIFT_REPO_DIR/diy |
# & /usr/bin/logshifter -tag diy &
#
# Substitute the placeholders in the template with the gear's actual
# internal IP and port, then start Nginx in the background.
sed -e "s/`echo '$OPENSHIFT_IP:$OPENSHIFT_PORT'`/`echo $OPENSHIFT_DIY_IP:$OPENSHIFT_DIY_PORT`/" \
  $OPENSHIFT_DATA_DIR/conf/nginx.conf.template > \
  $OPENSHIFT_DATA_DIR/conf/nginx.conf
nohup $OPENSHIFT_DATA_DIR/sbin/nginx > \
  $OPENSHIFT_DIY_LOG_DIR/server.log 2>&1 &

And finally, execute the following commands:

$ git add .
$ git commit -a -m "start nginx when starting up the app" 
$ git push

Then use your browser to navigate to http://nginx-<your-namespace>.rhcloud.com. The welcome page will be displayed (Figure 15-1).

Figure 15-1. Nginx welcome page
Tip

Use the rhc tail -a nginx command to troubleshoot if you are having problems with the start script.

Also, you might have to run:

$ $OPENSHIFT_DATA_DIR/sbin/nginx -s reload

to force Nginx to reload the configuration every time you make some changes.

We now need to configure Nginx to act as a reverse proxy. Specifically, we will configure it to do the following:

  • Pass requests to a proxied server
  • Distribute content from different services
  • Configure responses from different services

In particular, we will configure Nginx to proxy calls to the Wikipin API and distribute its responses.

According to the Nginx documentation:

When NGINX proxies a request, it sends the request to a specified proxied server, fetches the response, and sends it back to the client. It is possible to proxy requests to an HTTP server (another NGINX server or any other server) or a non-HTTP server (which can run an application developed with a specific framework, such as PHP or Python) using a specified protocol. NGINX supported protocols include FastCGI, uwsgi, SCGI, and memcached.

To pass a request to an HTTP proxied server, the proxy_pass directive is specified inside a location block:

location /some/path/ { 
  proxy_pass http://www.example.com/link/; 
}

We can set up a location for our Wikipin API as follows:

location /api/v1/pins/ {
  proxy_pass                       
    http://wikipin-<openshift-namespace>.rhcloud.com/api/v1/pins/;
  proxy_intercept_errors           on;
  proxy_redirect                   off;
  proxy_set_header X-Real-IP       $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

Meet Lua

As described on its project page, “Lua is a powerful, fast, lightweight, embeddable scripting language[, combining] simple procedural syntax with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, runs by interpreting bytecode for a register-based virtual machine, and has automatic memory management with incremental garbage collection, making it ideal for configuration, scripting, and rapid prototyping.”

Lua was designed to be embedded within larger systems written in other languages, and since it has remained minimal and easy to integrate it is a popular choice in certain fields, like video game development (World of Warcraft used it as a scripting language), security and monitoring applications (Wireshark used it for prototyping and scripting), and the Web (Wikipedia has been using it as a template scripting language since 2012).

Lua can be used to extend Nginx into a self-contained web server. Being a scripting language, Lua can be used to write powerful applications directly inside Nginx without the need to use CGI, FastCGI, or uWSGI. Small features can also be implemented easily, just by adding a bit of Lua to an Nginx config file.

To extend Nginx, we need to add a bundle that supports Lua. There are bundles that already have Lua built in for your convenience, such as OpenResty and Tengine.

OpenResty (aka ngx_openresty) is a web application server that bundles together the standard Nginx core with a set of third-party Nginx modules (and their external dependencies), Lua libraries, and more. Tengine is another web server based on Nginx that is known to be very stable and efficient.

You can also install the Lua modules yourself. The modules you will need are the Nginx Development Kit (ngx_devel_kit) and the ngx_lua module (lua-nginx-module), which depends on it.

The ngx_lua module exposes the Nginx environment to Lua via an API, while also allowing us to run Lua code snippets during the Nginx rewrite, access, content, and log phases.

A simple example of Lua scripting could be adding the following to $OPENSHIFT_DATA_DIR/conf/nginx.conf:

location /hello {
    default_type 'text/plain';
    content_by_lua '
        local name = ngx.var.arg_name or "Anonymous"
        ngx.say("Hello, ", name, "!")
    ';
}
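
Assuming this location block has been added to the server section Nginx is already serving for our application, you can exercise it with a plain HTTP request:

$ curl "http://nginx-<your-namespace>.rhcloud.com/hello?name=Alice"
Hello, Alice!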

Bundle Things Together

Lua brings the possibility of using a scripting language to write both simple and complex rules in your nginx.conf file. This means that part of your routing and proxy logic can be delegated to the Nginx server and doesn’t have to live in your application code.

As mentioned at the beginning of this chapter, Nginx can also be used as a load balancer across multiple application instances, to optimize resource utilization, maximize throughput, reduce latency, and ensure fault-tolerant configurations.

There are different load-balancing methods that you might consider implementing on Nginx; here we’ll only discuss some of them superficially, with the objective being to point you in the right direction should you need to do more research on the topic.

Nginx supports three approaches to load balancing:

Round-robin
Requests to the application servers are distributed in a round-robin fashion.
Least connected
Each incoming request is assigned to the server with the least number of active connections.
IP hash
A hash function is used to determine what server should be selected for the next request, based on the client’s IP address.

The simplest configuration for load balancing with Nginx involves specifying multiple instances of the same application running on a number of servers: srv1 through srv3. When a load-balancing method is not specifically configured, Nginx defaults to the round-robin mechanism. All requests are proxied to the server group myapp1, and Nginx applies HTTP load balancing to distribute the requests. The following configuration shows how this can be accomplished:

http { 
    upstream myapp1 { 
        server srv1.example.com; 
        server srv2.example.com; 
        server srv3.example.com; 
    } 
    server { 
        listen 80; 
        location / { 
            proxy_pass http://myapp1; 
        } 
    } 
} 

With the least-connected method you have a bit more control over the load on the application instances, and can account for situations when some requests take longer to complete than others. Nginx will try not to overload an already busy application server with excessive requests, and instead will distribute the new requests to servers that are less busy.

To use the least-connected strategy, you just have to specify it in your nginx.conf file, as in the following example:

upstream myapp1 { 
    least_conn; 
    server srv1.example.com; 
    server srv2.example.com; 
    server srv3.example.com; 
}
Note

If you are interested in trying IP hash load balancing, you need to replace the least_conn directive with ip_hash. This method ensures that requests from the same client are passed to the same server, unless that server is unavailable.
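
For reference, the ip_hash variant of the previous upstream block looks like this:

upstream myapp1 {
    ip_hash;
    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
}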

A different load-balancing strategy involves using server weights to further influence the algorithm. If no server weights are configured, as in the previous examples, Nginx treats all the specified servers as equally qualified for a particular load-balancing method. With the round-robin method, this results in requests being distributed fairly equally across the servers (provided there are enough requests, and the requests are processed in a uniform manner and completed quickly enough). In that situation the outcome is similar to what the least-connected method would produce.

When a weight parameter is specified for a server, this will be considered as part of the load-balancing decision. For example, with this configuration:

upstream myapp1 { 
    least_conn; 
    server srv1.example.com weight=3; 
    server srv2.example.com; 
    server srv3.example.com; 
}

Nginx will distribute three out of every five requests to srv1; the remaining requests will be distributed equally across srv2 and srv3.

Nginx also includes in-band server health checks in its reverse proxy implementation. This means that if a response from a certain server fails with an error, Nginx will mark that server as failed and will try to avoid selecting it for subsequent requests.

You can set the max_fails directive to indicate the maximum number of unsuccessful communication attempts to a particular server that may occur within the period specified by fail_timeout before that server is marked as failed. max_fails is set to 1 by default; setting it to 0 for a particular server disables health checks for that server.

The fail_timeout parameter also defines how long the server stays marked as failed once the health check is triggered. After that interval, Nginx will start to gracefully probe the server with live client requests. If the probes are successful, the server is marked as live again.
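
Both parameters are set per server inside the upstream block; here is a sketch with illustrative values:

upstream myapp1 {
    least_conn;
    server srv1.example.com weight=3 max_fails=3 fail_timeout=30s;
    server srv2.example.com max_fails=3 fail_timeout=30s;
    server srv3.example.com;
}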

Caching

Generally speaking, a cache is a computing component that transparently stores data so that future requests for that data can be answered faster. That data kept in the cache might be duplicates of values that are stored elsewhere, allowing faster access, or the result of previously executed operations that it is believed might be requested again in the near future.

Out of the box, Rails supports three types of caching (although, as we will see, two of them have been extracted into separate gems as of Rails 4):

  • Page Caching
  • Action Caching
  • Fragment Caching

In order to start playing with caching, you first need to set the following in the corresponding config/environments/*.rb file:

config.action_controller.perform_caching = true

This flag is disabled by default in the development and test environments, and enabled in production.

The first caching mechanism that you will find in Rails is Page Caching. It allows the web server (e.g., Nginx) to fulfill requests for generated pages without having to go through the Rails stack at all, resulting in very fast response times. Page Caching has been removed from the core dependencies in Rails 4, and needs to be installed as a gem. To use Page Caching, add the following line to your application’s Gemfile:

gem 'actionpack-page_caching'

This mechanism cannot be applied to every situation—for example, it won’t work when the app is requesting some information that might change the page, such as for a page that needs authentication—and since the web server is just serving a file from the filesystem, you’ll need to deal with the issue of cache expiration.
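
With the gem installed, caching a page is a matter of declaring caches_page for the action and expiring the cached file when the content changes. A minimal sketch (the controller and action are illustrative, and depending on the gem version you may also need to set config.action_controller.page_cache_directory):

class PagesController < ApplicationController
  caches_page :about   # the rendered response is written out as a static file

  def about
  end
end

# From another controller action (or a sweeper), expire the cached file
# when the underlying content changes:
# expire_page(controller: 'pages', action: 'about')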

Since we are dealing with APIs, we might want to use Action Caching instead to cache particular actions. Action Caching has also been removed in Rails 4, but can be used through a gem. Add the following to your Gemfile:

gem 'actionpack-action_caching'
 

Then, to enable Action Caching, you use the caches_action method:

class Api::V1::CategoryController < ApplicationController
  caches_action :show
  def show
    category = params[:category] ? params[:category] : \
      "Main_topic_classifications"
    @category = Category.where(:cat_title => category.capitalize).first
    if @category
      render :json => @category, serializer: CategorySerializer, root: "category"
    else
      render :json => {:error => {:text => "404 Not found", :status => 404}}
    end
  end
end

In this example we have added caching to our CategoryController and considered our show action. The first time anyone requests a certain category, Rails will generate the JSON response and use Rails.cache behind the scenes to store it; subsequent requests for the same category will be served from the cache.

Please refer to the Action Caching repository for further information and available options.

The action cache is responsible for handling caching for different representations of the same resource. By default, these representations are stored as cache fragments named according to the host and path of each request.

The default action cache path can be modified via the :cache_path option.

We might want additional control over how we use the caching mechanism provided. For example, we might want to add an expiration to our cache:

class Api::V1::CategoryController < ApplicationController
  respond_to :json
  caches_action :show, cache_path: { project: 1 }, expires_in: 1.hour
  def show
    category = params[:category] ? params[:category] : \
      "Main_topic_classifications"
    @category = Category.where(:cat_title => category.capitalize).first
    if @category
      render :json => @category, serializer: CategorySerializer, root: "category"
    else
      render :json => {:error => {:text => "404 Not found", :status => 404}}
    end
  end
end

What the action cache does behind the scenes is serve cache fragments after an incoming request has reached the Rails stack and Action Pack. This way, a mechanism like before_filter can still be run before the cache fragments are returned and served. This allows the action cache to be used even in situations where certain restrictions apply, like authenticated sessions.

You can clear the cache when needed using expire_action.
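
For example, a hypothetical update action could expire the cached show response after a successful write (the new_title parameter and the lookup are illustrative, and the options passed to expire_action should mirror those used with caches_action):

def update
  @category = Category.where(:cat_title => params[:category]).first
  if @category && @category.update(:cat_title => params[:new_title])
    # Drop the cached copy of the show action so the next request regenerates it
    expire_action :action => :show
    render :json => @category, serializer: CategorySerializer, root: "category"
  else
    render :json => {:error => {:text => "404 Not found", :status => 404}}
  end
end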

If you are using memcached or Ehcache, you can also pass :expires_in. In fact, all parameters not used by caches_action are sent to the underlying cache store.

As with Page Caching, Action Caching cannot always be used. Sometimes, especially with dynamic web applications, it is necessary to cache specific parts of the page independently from others, as well as with different expiration terms.

For such cases, Rails provides a mechanism called Fragment Caching that allows you to serve fragments of view logic wrapped in cache blocks out of the cache store when subsequent requests come in.

Fragment Caching means that you are caching only certain parts of the response to a request.

A very important technique involving Fragment Caching is called the Russian doll approach to caching. If you think of a Russian doll, you will picture a container with many smaller containers nested inside the first one.

The Russian doll caching mechanism works like this: cache fragments are nested inside one another, and a fragment expires whenever the timestamp of the object it caches changes.
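
In a view this translates into nested cache blocks. A sketch in ERB, where the pins association, pin.title, and the touch option are assumptions made for the example:

<% cache @category do %>
  <h1><%= @category.cat_title %></h1>
  <% @category.pins.each do |pin| %>
    <% cache pin do %>
      <%= pin.title %>
    <% end %>
  <% end %>
<% end %>

Declaring belongs_to :category, touch: true on the Pin model would update the category’s timestamp whenever a pin changes, so the outer fragment expires along with the inner one.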

There is yet another approach to caching that you might consider: it falls somewhere between Action and Fragment Caching and is referred to as model caching.

Model-level caching is often ignored since we tend to either cache the action or the view. This low-level approach is particularly useful for applications integrating different APIs: in such a situation you might have a model that, before an update, looks something up from another service through an API call.
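
A sketch of this low-level approach, using Rails.cache.fetch to memoize the result of an external lookup on the model (ExternalTopicsClient and the method names are hypothetical):

class Category < ActiveRecord::Base
  def related_topics
    # On a cache miss the block is executed and its result stored;
    # cache_key changes when the record is updated, invalidating the entry.
    Rails.cache.fetch("#{cache_key}/related_topics", expires_in: 1.hour) do
      ExternalTopicsClient.related_to(cat_title)  # hypothetical external API call
    end
  end
end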

Note

To find out more about low-level caching, check out the Ruby on Rails Guide.

For model-level caching we can go back to using Redis, since we do not require long-term persistence of caches. We met Redis in Chapter 12 when creating a temporary resource to be queried to get the status of an asynchronous REST request; as you’ll recall, it’s a very fast in-memory key/value store. Redis is particularly suited for storing cache fragments due to its speed and support for advanced data structures.

To initialize Redis, add the following to your Gemfile:

gem 'redis'
gem 'redis-namespace'
gem 'redis-rails'
gem 'redis-rack-cache'

The redis-rails gem provides caching support (among other things) for Ruby on Rails applications. You can configure redis-rails in config/application.rb:

config.cache_store = :redis_store, 
  'redis://localhost:6379/0/cache', { 
    expires_in: 90.minutes 
  }

You also need to add the redis.rb initializer in config/initializers:

$redis = Redis::Namespace.new("wikicat", :redis => Redis.new)

Now, all Redis functionality is available across the entire app through the $redis global. If you run the Rails console you can try it yourself:

$ rails console

If you want to test Redis you can try the following test key and value:

$redis.set("test_key", "Hello World!")

Then, to retrieve the key, simply run:

$redis.get("test_key")

The same mechanisms are used behind the scenes when getting and setting values within the view or the controller. If certain conditions are met, you can execute the get or set methods to retrieve or store values in the Redis cache instead of querying your database directly.
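
For instance, here is a sketch of the cache-aside pattern in a controller action, using the $redis global defined in the initializer (the key name and the JSON serialization are illustrative):

def index
  cached = $redis.get("categories")
  if cached
    categories = JSON.parse(cached)       # cache hit: skip the database
  else
    categories = Category.all.as_json     # cache miss: query the database
    $redis.set("categories", categories.to_json)
  end
  render :json => { :categories => categories }
end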

To set a certain expiration time on the Redis cache you can simply call:

# Expire the cache every 3 hours
$redis.expire("categories", 3.hours.to_i)

Please note that whether you decide to use Redis or another key/value store for your cache mechanism is independent from how you decide to cache in your app. Once you have set the cache_store and added the redis-rails gem, Rails will automagically switch to Redis for caching fragments instead of using memcached.   

Scaling Is Not Necessarily Difficult and Painful

The intention of this chapter was to show how it is possible to scale a Rails application gracefully without being limited by the very same framework.

Scaling a Rails application involves having a clear architectural picture of your service and keeping things relatively simple.

Most of the claims that Rails applications cannot scale trace back to the story of Twitter abandoning Ruby on Rails because, supposedly, it could not support the site’s growing user base.

First of all, we are not in 2008 anymore. Rails is a mature enough technology to make scaling easy if the system is well planned and architected. Furthermore, the microservices/microapps approach to building complex applications is already a scaling paradigm. You keep your applications minimal and simple enough that you can easily extract complex logic into a different service if you need to, or isolate slow actions into background jobs.

A background job is some task that needs to be executed asynchronously by the server. This can be a slow action or a long series of processes that will otherwise block an HTTP request/response for a longer time than is needed or ideal. We want to send the client an answer as soon as possible, even if this means telling the client that it needs to wait and will have to check back to get the status of a resource.

The most common approach to running these longer “jobs” is to hand them off to a separate process to execute them independently of the web server. There are plenty of great libraries out there that provide the “background job” functionality for you.

Background Jobs on RoR

There are different libraries to run background jobs on Ruby. The most popular of these are Resque and Sidekiq; check out their GitHub pages for details.
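
As a sketch of what handing work off to a background job looks like with Sidekiq (the worker class and the slow operation are hypothetical):

class ThumbnailWorker
  include Sidekiq::Worker

  def perform(image_id)
    # Runs in a separate process, outside the request/response cycle
    Image.find(image_id).generate_thumbnail!  # hypothetical slow operation
  end
end

# From a controller or model, enqueue the job and return immediately:
ThumbnailWorker.perform_async(image.id)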

The idea behind writing microapplications is to avoid writing complex applications that are difficult to run and maintain. If a service has to perform only a limited number of functions, its logic can be kept simpler and therefore easier to scale, evolve, integrate, and modify. The number of tests of a microservice does not need to grow exponentially, as happens with a monolithic application once it has reached a certain level of complexity.

Therefore, microservices can be tested faster and more frequently, which means they will be more robust and the code will be less prone to errors.

Wrapping Up

This chapter has been about scaling our platform and our Rails applications. I have tried to bust some myths about the concept that Rails doesn’t scale. In the next chapter, we will talk about the privacy and security of a Rails app and of data in general.