Terraform. Useful tricks.

We started using Terraform a few months ago as a way to create consistency and repeatability in the way we manage our infrastructure. Terraform is pretty new and not completely mature. While we are learning how to use it best, we picked up a few useful tricks along the way. I was going to call these best practices, but I don't think we have been practitioners long enough to really know if these are even good practices :-) Hopefully by sharing some of our learnings here, people can pick up a few things or tell us what we are doing wrong.

Use of Make

While Terraform is a single binary and running it is as easy as terraform plan or terraform apply, you need a better strategy for anything bigger than a few machines. Once your infrastructure grows beyond that, you will probably want to break it down into smaller logical chunks. Also, Terraform is pretty finicky about the directory it needs to be called from, in part because of the way it loads files and in part because of where it looks for its state file. It quickly makes sense to use something to wrap the work in tasks. You could use ant, gradle or grunt, but these would add more dependencies to your project. So, back to basics with make. Makefiles have been used to manage tasks (and dependencies between tasks) for a long time, and make is available on pretty much any Unix-based platform (and that includes macOS).
Depending on how you decide to organize your infrastructure, you can create tasks to manage its different parts as simply as:

  • make prod or make qa
  • make app or make api
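
For instance, a minimal Makefile sketch along those lines (the directory layout and target names here are hypothetical; remember that make recipes must be indented with tabs):

.PHONY: prod qa

# Run Terraform from the directory that holds that part of the
# infrastructure, since Terraform is picky about its working directory.
prod:
	@cd prod && terraform plan

qa:
	@cd qa && terraform plan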

Break down your infrastructure by services and environments

While in theory it might be nice to think you can manage your entire infrastructure with a single setup and potentially a single command, practice proves otherwise. You will quickly want to logically separate your infrastructure setup. For one, it reduces the number of files you have to manage at any given time. It also limits the damage when something goes bad: you don't want a single corruption of the state file to prevent you from managing your entire infrastructure, or even worse, a bad command taking down part or all of it.

So far, we have decided to break down our Terraform project by service (app, API, etc.) and, within each service, by environment (production, staging, etc.). With each service being independent and often managed by a different team, this was an obvious choice. The breakdown by environment lets us test changes before we apply them to our production infrastructure.
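
As an illustration, the resulting directory layout might look something like this (names are hypothetical):

terraform/
  app/
    production/
    staging/
  api/
    production/
    staging/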

This approach does have some drawbacks. You will find yourself duplicating quite a bit of code. Modules are here to help; while we use them in each setup, we haven't explored using global ones yet. Also, if you have pieces of your infrastructure shared across all your services, you won't always be able to reference them programmatically.

Save shared state in S3 using versioning

It is pretty well documented that as soon as you have more than one person working with Terraform, you will want to centralize your state file. Since we are hosted on AWS, S3 was the obvious choice. With the use of make, you can ensure you always pull the latest state before doing anything:

.PHONY: setup plan apply

setup:
  @echo "Getting state file from S3"
  @terraform remote config -backend=s3 \
    -backend-config="bucket=<bucket-name>" \
    -backend-config="key=<s3-key)" \
    -backend-config="region=<aws-region>"

plan: setup
  @terraform plan
  
apply: setup
  @terraform apply

From time to time, it's possible you will corrupt your state file. And that's no bueno. So, we enabled S3 object versioning on all our Terraform state files. This way, if anything goes wrong, we can always go back to a known stable state.
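
Enabling versioning on the bucket is a one-time operation; for example, with the AWS CLI (the bucket name is a placeholder):

aws s3api put-bucket-versioning \
  --bucket <bucket-name> \
  --versioning-configuration Status=Enabled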

Delete shared state between runs

This one is probably not a best practice per se. Because we use multiple AWS accounts (PROD vs. QA), it's not uncommon for us to run Terraform against one AWS account and then against a different one across multiple targets. On a couple of occasions, this corrupted our state file. Now we delete our local state file with each Terraform run: once at the very beginning, in case something was left over from a previous (failed) run, and once at the end when we are done. With the use of make, it's easy to pull the latest state file each time (see above).

.PHONY: setup plan apply

setup:
  @echo "Clean up local state"
  @rm -rf */**/.terraform
  # Other setup
  
plan: setup
  # Stuff to do
  @rm -rf */**/.terraform

apply: setup
  # Stuff to do
  @rm -rf */**/.terraform

Use ${args} to select target

The terraform command offers a few options. One that's particularly useful during development is -target=resource, which limits Terraform operations to that particular resource and its dependencies. When you manage a rather large infrastructure, this helps limit the output to something that's easier to read and debug. We integrate it into our Makefile with:

apply: setup
  @terraform apply ${args}

This allows us to call make with:

> make api args="-target=api_loadbalancer"


Know more tricks?

As the saying goes, that's all for now, folks! Do you know of any other useful Terraform tricks? Drop us a note.


If you like this article or this blog, don't forget to like it, share it, and follow us at http://devs.traackr.com or https://medium.com/traackr-devs



Round-robin Cross Subdomain AJAX Requests

Well, that's a mouthful! But what is it, and what is it useful for? If you have done any web development in the last 10 years, you have most likely used AJAX to make your web pages or, even more likely, your web-based app more responsive and dynamic. You also probably know that most browsers limit the number of concurrent AJAX requests that can be made back to the origin server (i.e. the same server the page is served from). Usually that limit is 6 to 8 concurrent requests (sometimes a bit more) depending on the browser.


This might be fine for what you are trying to do, but if your page has a lot of dynamic components, or if you are developing a single-page web app, it might make the initial loading of your page, and all its content, a bit slower. After your first 6 (or 8) AJAX requests have fired, any subsequent AJAX request back to that same origin server is queued and has to wait. Obviously your AJAX requests should be fairly fast, but even if each one takes just 50 milliseconds, a queued request spends 50 milliseconds waiting plus 50 milliseconds executing, or 100 milliseconds in total. These short waiting times add up pretty quickly.


The solution? Make these requests look like they are going to different origin servers, so that the browser will execute them in parallel even if they are, in fact, going to the same server. This allows more AJAX requests to run in parallel and therefore speeds up your site/app. Here I will show you how to accomplish this using jQuery (but the same strategy can be used with other frameworks).

Cross subdomain AJAX requests with a single server

Basic idea

The basic idea is pretty simple. Say you are serving your pages from webapp.traackr.com. You will want to send some of the AJAX requests through webapp1.traackr.com, some through webapp2.traackr.com, some through webapp3.traackr.com, etc. You get the idea. And we will set things up so that all these hosts are, in fact, the same server.

What I will show you here is how to do this transparently, so you don't have to manually set up each AJAX request and decide where it should go. Our little trick will randomly round-robin through all the available hosts. This works even if your application runs on a single server, and the best part is that if you need to scale and upgrade to multiple redundant servers, you will have little if anything to change.


Easy enough, right? Well, of course, there is just one more little thing. When you start making AJAX requests to a host different from the original host, you open the door to security vulnerabilities. I won't go into the specific details of the potential security issues, but suffice it to say that your browser won't let you make these cross-domain (cross-subdomain in our case) requests by default. Luckily, CORS comes to the rescue. CORS is a rule-based mechanism that allows for secure cross-domain requests. I will show you how to set up the proper CORS headers so everything works seamlessly.

The setup

First things first: you will have to create DNS entries for all these subdomain hosts. Because we are doing this with a single server, you need to create DNS entries for webapp1.traackr.com through webapp3.traackr.com that all resolve to the same host as webapp.traackr.com (they can be A or CNAME records). All set? Moving on.
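
For example, in a BIND-style zone file the aliases could be simple CNAME records pointing at the main host (entries shown are illustrative):

webapp1  IN  CNAME  webapp.traackr.com.
webapp2  IN  CNAME  webapp.traackr.com.
webapp3  IN  CNAME  webapp.traackr.com.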

The code

We need to accomplish two things:

  1. (Almost) randomly round-robin the AJAX requests to the various hosts we have just defined
  2. Modify AJAX requests so they can work across subdomains with CORS

Round-robin your AJAX requests

Here we are going to leverage the excellent jQuery framework, so I'm assuming you are using jQuery for all your AJAX requests.


The following JavaScript code should be executed as soon as your document is ready. This piece of code uses global AJAX settings in jQuery to hijack all AJAX requests made with jQuery. Once we identify an AJAX request going to the origin server, we rewrite it right before it hits the wire (beforeSend()).
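
The script itself was embedded in the original post and is not reproduced here; below is a minimal sketch of the idea. One adjustment: the post mentions beforeSend(), but by the time beforeSend() fires jQuery has already opened the request, so this sketch uses the $.ajaxPrefilter() hook instead, which runs earlier and lets you rewrite the URL. The host list matches the examples above; adapt it to your own aliases.

// Aliases that all point to the same server as webapp.traackr.com
var hosts = ['webapp1.traackr.com', 'webapp2.traackr.com', 'webapp3.traackr.com'];

$(function() {
  $.ajaxPrefilter(function(options) {
    // Only rewrite root-relative URLs, i.e. requests going back to the origin server
    if (options.url.charAt(0) === '/') {
      // Pick an alias (almost) at random to round-robin the requests
      var host = hosts[Math.floor(Math.random() * hosts.length)];
      options.url = window.location.protocol + '//' + host + options.url;
      options.crossDomain = true;
      // Send cookies with the cross-subdomain request; this is why the
      // server must return Access-Control-Allow-Credentials (see below)
      options.xhrFields = $.extend({ withCredentials: true }, options.xhrFields);
    }
  });
});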


Almost there. Now we need to make sure your server can handle these requests when they come via one of the aliases we defined.

Enable CORS for your AJAX requests

Before we look at the CORS headers needed for these cross-domain AJAX requests to work, you need to understand how cross-domain AJAX requests differ from regular AJAX requests. Because of CORS, your browser needs to ensure the target server will accept and respond to cross-domain AJAX requests. Your browser does this by issuing OPTIONS requests, called 'preflighted' requests. These requests are very similar to GET or POST requests, except that the server is not required to send anything in the body of the response. The HTTP headers are the only thing that matters in these requests.
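
To make the difference concrete, here is roughly what a preflighted exchange looks like (headers abridged; the path and hostnames are illustrative):

OPTIONS /api/search HTTP/1.1
Host: webapp1.traackr.com
Origin: http://webapp.traackr.com
Access-Control-Request-Method: POST

HTTP/1.1 200 OK
Access-Control-Allow-Origin: http://webapp.traackr.com
Access-Control-Allow-Credentials: true
Access-Control-Allow-Methods: GET, POST, OPTIONS

Only once this exchange succeeds does the browser send the actual POST request.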

So you need to make sure your server returns the proper headers. Remember, in the current setup webapp.traackr.com and webapp[1-3].traackr.com are the same server. There are many ways to do this. Here is one that leverages an Apache .htaccess config:
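
(The actual config was embedded in the original post; below is a sketch of what it can look like, assuming mod_headers and mod_setenvif are enabled and using our example hostnames.)

<IfModule mod_headers.c>
  # Echo the origin back only if it is one of our own subdomains
  SetEnvIf Origin "^(https?://webapp[0-9]?\.traackr\.com)$" ORIGIN_OK=$1
  Header set Access-Control-Allow-Origin "%{ORIGIN_OK}e" env=ORIGIN_OK
  # Required because we send cookies/credentials with these requests
  Header set Access-Control-Allow-Credentials "true" env=ORIGIN_OK
  Header set Access-Control-Allow-Methods "GET, POST, OPTIONS" env=ORIGIN_OK
  Header set Access-Control-Allow-Headers "Content-Type, X-Requested-With" env=ORIGIN_OK
</IfModule>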

Because we allow credentials to be passed in these cross-domain requests, the allowed origin header ('Access-Control-Allow-Origin') must specify a full hostname. You cannot use '*'; this is a requirement of the CORS specification.


An important note here: because these OPTIONS calls are made before each AJAX request, you want to make sure they are super fast (i.e. a few milliseconds). Since the headers are the only thing that matters (see above), I recommend you serve a static empty file for these requests. Here is another .htaccess config that will do the trick (make sure options.html exists and is empty):
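
(Again, the embedded config is not reproduced here; this sketch assumes mod_rewrite is enabled.)

<IfModule mod_rewrite.c>
  RewriteEngine On
  # Answer every preflight OPTIONS request with the static, empty file
  RewriteCond %{REQUEST_METHOD} OPTIONS
  RewriteRule ^.*$ /options.html [L]
</IfModule>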

And that is it! Pop open your JavaScript console and watch your AJAX requests being routed through the various aliases and executed in parallel.

Be warned

So what trade-offs are you making? Remember, this is engineering; there are always trade-offs! All of a sudden you might get twice as many (or more) requests hitting your server in parallel. Make sure it can handle the load.

What’s next?

There are probably a few things you can improve on. I will mention two we use at Traackr but will leave their details as an exercise for the reader:

  • You will probably want to avoid hardcoding your list of servers in the script.
  • Some corporate firewalls do not allow OPTIONS requests. This can cause this entire approach to fail. The script we use at Traackr will actually detect errors in OPTIONS requests and will fall back to regular AJAX requests.

Use a load balancer and let it scale

What can you do if/when you need to scale? One easy approach that doesn't require you to change much is to put two or more servers behind a load balancer (maybe an Elastic Load Balancer on AWS). Then update your DNS so that all your webapp[1-3].traackr.com entries point to your load balancer.


What magic happens then? Browsers will make more parallel requests, just as we have described all along. But now your load balancer will round-robin each of these requests across the many servers you have running behind it. Magically, you are spreading the load (and the love).


Thank you to the entire awesome Traackr engineering team for the help with this post. We are Traackr, the global and ultimate Influencer Management Platform. Everything you need to discover your influencers, manage key relationships, and measure their impact on your business. Check our blog. We are hiring.

