
How this single controller shielded our Rails app from Tuesday's Amazon S3 outage

While many popular services on the internet were severely affected during Tuesday’s Amazon S3 outage, SupportBee remained very much functional. We continued to import emails forwarded by our users, and our users continued to use SupportBee’s dashboard to write emails to their customers. Our users could even upload and send attachment files! We have faced Amazon S3 outages before and have been resilient to them for years. We achieved this resilience with a relatively small change to the way our Rails app serves files, and I’d like to elaborate on that today.

First, let’s start with attachment files. Attachment files are the most common kind of file in SupportBee. Here’s what the Attachment model in SupportBee looks like

# app/models/attachment.rb

class Attachment < ActiveRecord::Base
  mount_uploader :file, AttachmentUploader
  store_in_background :file

  before_create :set_file_tmp_host

  def set_file_tmp_host
    self.file_tmp_host = Socket.gethostname
  end

  def in_s3?
    file_tmp_host.nil?
  end

  def in_different_web_server?
    !in_s3? && (self.file_tmp_host != Socket.gethostname)
  end

  def access_token
    REDIS.get("attachments:#{id}:access_token") || generate_access_token
  end

  def tmp_file_path
    file.root.join(file.cache_dir, file_tmp).to_s
  end

  def tmp_file_url
    web_server_subdomain, domain = file_tmp_host, SB_CONFIG["domain"]

    url_helpers = Rails.application.routes.url_helpers
    url_helpers.file_attachment_url(self, host: domain, protocol: APP_PROTOCOL, subdomain: web_server_subdomain, access_token: access_token)
  end

  private

  def generate_access_token
    access_token = SupportBee::Utils::RandomTokenGenerator.generate
    REDIS.set("attachments:#{id}:access_token", access_token)
    REDIS.expire("attachments:#{id}:access_token", 5.minutes)
    access_token
  end
end

Let me explain the first few lines to give you context.

We use the excellent carrierwave gem to store files in Amazon S3

mount_uploader :file, AttachmentUploader

and the carrierwave-backgrounder gem to do that outside of the request-response cycle

store_in_background :file

The callback in the next line stores the web server to which a user uploads the attachment file

before_create :set_file_tmp_host

def set_file_tmp_host
  self.file_tmp_host = Socket.gethostname
end
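The routing decision driven by `file_tmp_host` can be illustrated with a small plain-Ruby sketch. `FakeAttachment` is a hypothetical stand-in for the model, not our production code:

```ruby
require "socket"

# Hypothetical, stripped-down stand-in for the Attachment model.
# file_tmp_host is nil once the file has been uploaded to S3;
# otherwise it names the web server holding the temporary copy.
FakeAttachment = Struct.new(:file_tmp_host) do
  def in_s3?
    file_tmp_host.nil?
  end

  def in_different_web_server?
    !in_s3? && file_tmp_host != Socket.gethostname
  end
end

uploaded  = FakeAttachment.new(nil)                          # already in S3
local     = FakeAttachment.new(Socket.gethostname)           # cached on this box
elsewhere = FakeAttachment.new("#{Socket.gethostname}-other") # cached on another box

uploaded.in_s3?                     # => true
local.in_different_web_server?      # => false
elsewhere.in_different_web_server?  # => true
```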

To serve attachment files to users, we have an AttachmentsController.

Here’s a simplified version of what that looks like

# app/controllers/attachments_controller.rb

class AttachmentsController < ActionController::Base
  # GET /attachments/:id/file
  def file
    @attachment = Attachment.find(params[:id])
    if @attachment.in_s3?
      redirect_to_s3
    elsif @attachment.in_different_web_server?
      redirect_to_different_web_server
    else
      send_attachment_file
    end
  end

  private

  def redirect_to_s3
    redirect_to @attachment.file_url # Redirect the user to Amazon S3
  end

  def redirect_to_different_web_server
    # If an attachment with id 1 is on the www1 web server, redirect the user to www1.supportbee.com/attachments/1/file
    redirect_to @attachment.tmp_file_url
  end

  def send_attachment_file
    send_file(@attachment.tmp_file_path, type: @attachment.content_type)
  end
end

If an attachment file is in S3, the AttachmentsController redirects the user to S3

if @attachment.in_s3?
  redirect_to_s3

If an attachment file wasn’t uploaded to S3 because of an outage, the AttachmentsController simply redirects the user to the web server that has the file

elsif @attachment.in_different_web_server?
  redirect_to_different_web_server
else
  send_attachment_file
end

Thanks to the AttachmentsController, even during an Amazon S3 outage, our users could access attachment files they recently uploaded (like the attachment files they uploaded when drafting an email, for example). And because our users could upload and access attachment files irrespective of the availability of Amazon S3, they could continue using SupportBee’s dashboard to write emails to their customers even during Amazon S3 outages.
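One detail worth spelling out: the cross-server URL built by tmp_file_url carries a short-lived access token stored in Redis, so the web server receiving the redirect can check that the request is legitimate. The set-with-expiry behaviour behind that token can be sketched with a Hash-backed stand-in for Redis (hypothetical code, not our production implementation):

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for Redis with per-key expiry.
class FakeRedis
  def initialize
    @data = {}
  end

  def set(key, value, ttl_seconds)
    @data[key] = { value: value, expires_at: Time.now + ttl_seconds }
  end

  def get(key)
    entry = @data[key]
    return nil if entry.nil? || Time.now >= entry[:expires_at]
    entry[:value]
  end
end

redis = FakeRedis.new

# Issue a token valid for five minutes, mirroring generate_access_token.
token = SecureRandom.hex(16)
redis.set("attachments:1:access_token", token, 300)

# The receiving web server compares the token from the query string
# against the stored one.
redis.get("attachments:1:access_token") == token   # => true
redis.get("attachments:1:access_token") == "bogus" # => false
```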

Having an AttachmentsController like the one above will shield many Rails apps from Amazon S3 outages. In our case, one additional step was necessary to achieve resilience. At SupportBee, in addition to users, we have Sidekiq workers that read attachment files to send emails. To prevent email delivery jobs from failing, we changed our Sidekiq workers to access attachment files the same way our users do

# app/services/attachment_services/read_attachment_file.rb

module AttachmentServices
  class ReadAttachmentFile
    def initialize(attachment)
      @attachment = attachment
    end

    def execute
      return read_from_s3 if @attachment.in_s3?
      begin
        return read_from_different_web_server if @attachment.in_different_web_server?
        return read_from_local_temporary_storage
      rescue Attachment::ReadFailed
        # Read may fail if a background Sidekiq worker uploaded the file to Amazon S3 at the same time as the read.
        # In that case, just read the file from Amazon S3.
        @attachment.reload
        read_from_s3
      end
    end

    private

    def read_from_s3
      @attachment.read
    end

    def read_from_different_web_server
      response = http.get(@attachment.tmp_file_url)
      raise Attachment::ReadFailed unless response.success?
      response.body
    end

    def read_from_local_temporary_storage
      File.read(@attachment.tmp_file_path)
    rescue
      raise Attachment::ReadFailed
    end

    def http
      @http ||= Faraday.new(request: { timeout: 10, open_timeout: 3 }) do |connection|
        connection.request :url_encoded
        connection.use FaradayMiddleware::FollowRedirects, limit: 3
        connection.adapter Faraday.default_adapter
      end
    end
  end
end
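The fallback at the heart of execute can be distilled into a few lines of plain Ruby. The names here are hypothetical stand-ins for illustration, not the production code:

```ruby
# Hypothetical sketch of the read-with-fallback pattern used above.
class ReadFailed < StandardError; end

def read_with_fallback(local_read, s3_read)
  local_read.call
rescue ReadFailed
  # The local copy vanished (e.g. a worker finished uploading it to S3
  # mid-read), so fall back to the canonical copy in S3.
  s3_read.call
end

read_with_fallback(-> { raise ReadFailed }, -> { "body from S3" })
# => "body from S3"
read_with_fallback(-> { "body from disk" }, -> { "body from S3" })
# => "body from disk"
```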

Any part of our codebase that reads an attachment file uses the above ReadAttachmentFile service to do it.

Here’s a mailer using the ReadAttachmentFile service, for example

email = Email.find(email_id)
attachment_bodies = email.attachments.map { |a| AttachmentServices::ReadAttachmentFile.new(a).execute }

We reuse the AttachmentsController and the ReadAttachmentFile service to access other kinds of files in SupportBee, like raw email files. Because of that, those places automatically get the same resilience.

Benefits and Caveats

There are significant benefits to this approach. Apart from being shielded from Amazon S3 outages, our code is safe from temporary local network issues (and the associated errors) that prevent our servers from reaching external services like Amazon S3.

There is one caveat. During an Amazon S3 outage, which usually lasts two to three hours, our users cannot access attachment files of older emails because those files have already been uploaded to Amazon S3. Given that everything else in SupportBee remains functional during the outage, we think it’s a fair tradeoff.

Wrapping Up

I’d like to give a shoutout to our CEO Hana Mohan who proposed the idea (despite moving away from dev, she remains the best engineer among us) and our former CTO Avinasha Shastry for implementing it in a few hours during a previous outage. Thanks to their efforts, we didn’t have a stressful day despite an ongoing Amazon S3 outage.

If you enjoyed reading the post, do upvote it on Reddit, and if you have any thoughts you’d like to share, feel free to leave a comment below.

