Mongo Follower, a MongoDB export and oplog event generator for Java

Traackr recently open sourced Mongo Follower, our easy-to-use harness for synchronizing your application with a Mongo database. It provides a simple interface to efficiently export all documents from a Mongo collection, seamlessly followed by an indefinite stream of Mongo oplog events. Your location in the oplog is saved to disk, allowing for clean restarts with no data loss.

At Traackr, we use MongoDB to store social data and keep that data synchronized with a separate analytics database. Over the years, we've used various tools to keep these two data stores synchronized -- MongoRiver, MongoConnector, Mongolastic, scripts, cron jobs and a lot of carefully crafted code. More and more, we've realized that utilizing the Mongo oplog to synchronize this data and keep our applications decoupled makes things easier to implement and less error prone. After creating several internal apps which use the oplog, we decided to build a generic library to make building and maintaining those applications even easier. We're calling it Mongo Follower.

How does it work?

A producer thread pulls data from Mongo, and a dispatch thread asynchronously sends that data to a special listener object. Processing happens in two phases: export and oplog. The export phase is only needed for the initial run. A Mongo query requests all documents from your collection, which are then sent to the 'exportDocument' callback. Once all documents are exported, the oplog phase begins.

The Mongo oplog acts like any other collection in MongoDB, so setting up the oplog tail is very similar to the initial export. The first difference is that the cursor is configured with cursorType(CursorType.TailableAwait). This means the cursor remains open even after reading the last record, because more records will become available when the collection changes. The query is also configured to skip already-processed entries in case of a restart, using a timestamp file that is loaded from disk on startup and automatically updated at regular intervals. Oplog documents have a special format indicating an insert, delete or update. Mongo Follower parses this format and calls the corresponding insert, delete and update callbacks to make the data easier to consume.
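To make the tailing mechanics concrete, here is a minimal sketch of a tailable oplog cursor using the plain MongoDB Java driver. It is not Mongo Follower's actual implementation, and the connection string and collection names are placeholders.

import com.mongodb.CursorType;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import org.bson.BsonTimestamp;
import org.bson.Document;

import static com.mongodb.client.model.Filters.gt;

public class OplogTailSketch {
    public static void main(String[] args) {
        // The oplog lives in the "local" database on a replica set member.
        MongoCollection<Document> oplog = MongoClients.create("mongodb://localhost:27017")
                .getDatabase("local")
                .getCollection("oplog.rs");

        // Resume point; Mongo Follower persists a timestamp like this to disk.
        BsonTimestamp lastSeen = new BsonTimestamp((int) (System.currentTimeMillis() / 1000), 0);

        // TailableAwait keeps the cursor open and waits for new entries to arrive.
        try (MongoCursor<Document> cursor = oplog.find(gt("ts", lastSeen))
                .cursorType(CursorType.TailableAwait)
                .iterator()) {
            while (cursor.hasNext()) {
                Document entry = cursor.next();
                // "op" is "i" (insert), "u" (update) or "d" (delete); "o" holds the document.
                System.out.println(entry.getString("op") + " -> " + entry.get("o"));
            }
        }
    }
}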

How do you use it?

Mongo Follower is currently available on GitHub and Maven Central and can be included in your project like any other Maven library:
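A dependency entry along these lines should work; the coordinates and version below are illustrative, so check Maven Central for the current ones:

<dependency>
  <groupId>com.traackr</groupId>
  <artifactId>mongo-follower</artifactId>
  <!-- illustrative version; use the latest release from Maven Central -->
  <version>1.0.0</version>
</dependency>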

Once the library has been included, you can implement the MongoEventListener interface. An instance of this listener will be provided to the Mongo Follower process:
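A listener might look roughly like this; the exact method signatures are defined by the library's MongoEventListener interface, so treat the ones below as illustrative:

import org.bson.Document;

// Illustrative listener -- check the MongoEventListener interface in the
// mongo-follower source for the real method signatures.
public class MyListener implements MongoEventListener {

    @Override
    public void exportDocument(Document doc) {
        // Called once per document during the initial export phase.
        System.out.println("exported: " + doc.get("_id"));
    }

    @Override
    public void insert(Document doc) {
        // Called for each insert seen in the oplog.
    }

    @Override
    public void update(Document doc) {
        // Called for each update seen in the oplog.
    }

    @Override
    public void delete(Document doc) {
        // Called for each delete seen in the oplog (typically only the _id is present).
    }
}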

To start Mongo Follower, we've created a Runner utility, and it can be used in a couple ways. The easiest way is to pass in a FollowerConfig object:
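Something along these lines; the FollowerConfig fields and Runner entry point shown here are illustrative, so refer to the README on GitHub for the actual API:

// Illustrative only -- field names and the exact Runner entry point may differ.
FollowerConfig config = FollowerConfig.builder()
        .mongoConnectionString("mongodb://localhost:27017")
        .mongoDatabase("social")
        .mongoCollection("posts")
        .oplogFile("/tmp/mongo-follower.timestamp") // where the resume timestamp is stored
        .listener(new MyListener())
        .build();

Runner.run(config);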

That's all there is to it! Additional details can be found on github.

What's next?

We're excited to try out a more reactive approach to dealing with our data. We can decouple creation and setup code when adding special documents, re-index documents to Elasticsearch whenever there are changes, optimize performance by observing usage patterns, or even test out alternative storage backends for some of our larger collections.

There are a handful of improvements we'd like to make, specifically:

  • Multiple listeners
  • Tailing and exporting multiple databases simultaneously
  • Built-in metrics
  • Cluster / Shard management

Even without those features, we're eager to continue leveraging the Mongo oplog to make our lives easier.


Adventures in Two Factor Authentication

Off we go!

Another hack week was upon us, and I had a definite idea in mind. I wanted to add two factor authentication to our web application. Having enabled it on web sites I use regularly, I was curious to see how difficult it was to implement and learn about the best practices associated with it. A bare bones implementation turned out to be remarkably trivial.

Our application runs on PHP, and it took minimal Google-fu to determine the RobThree/TwoFactorAuth library was the best option to try first -- well maintained, great documentation. One quick

composer require robthree/twofactorauth

to pull the library in, and it was right into setting it up. First hurdle reached! Our authentication code needed some tender loving refactoring to be able to support logins with and without two factor authentication enabled. Ah... the satisfaction of being able to clean up code, including squashing a bug that has long irritated you...

Once the authentication code was cleaned up, actually adding two factor authentication was trivial. It took more time to refactor our existing code than it did to use the library to generate the shared secret, display the QR code for scanning, and verify the entered token against the shared secret.
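The core of it boils down to three calls. This is a rough sketch against the library's 1.x API, with the issuer name and account label as placeholders:

use RobThree\Auth\TwoFactorAuth;

$tfa = new TwoFactorAuth('Our App'); // issuer name shown in the authenticator app

// Enrollment: create a shared secret, store it with the user's account...
$secret = $tfa->createSecret();

// ...and render a QR code the user scans into Google Authenticator or similar.
$qrDataUri = $tfa->getQRCodeImageAsDataUri('user@example.com', $secret);
echo '<img src="' . $qrDataUri . '">';

// Login: verify the token the user typed against the stored secret.
if ($tfa->verifyCode($secret, $_POST['token'])) {
    // second factor checks out; finish the login
}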

Which thirds to invite to the party?

By default, the PHP library reaches out to an external Google service to generate the QR code. However, you can use a local library to generate it if sending the shared secret over the wire is a security concern. Hrm... Should I be concerned about that? Should I use an external third party service for generating QR codes, or should I pull yet another third party library into our application?

Did a bit of research and found no compelling argument either way. Thinking it through, let's say I use an external third party and send the shared secret to them. What is the attack surface? They could sniff and record the secret, but they get no other details. How would they tie it back and use it with an actual username? They would need access to our database or server to make that association. If they have that, the game is already up, and they likely don't care or need the secret anyway.

What other drawbacks might there be then? There are two that spring to mind. First is the need to make an HTTP request to the service to get the QR code back. You are now dependent on the service being up and the network being fast and stable between you and the service. Second is the (likely) closed nature of the service; you have no way of vetting the code to make sure there are no shenanigans behind the curtain.

Given all of that, I stuck with the default Google service. Google knows just a little bit about keeping a service up and running, and their QR code service has likely been battle tested as much if not more than any other third party library I could use locally. However, the RobThree documentation provides an easy example of using a local library should the need or desire to switch arise.

Drifting away...

It was demo day; time to practice before presenting my work to the rest of the team! I'll just plug in my username and password... Type in the token from my device, and... Wait... Why am I not logged in? What did I break with my last commit?! All my work is for naught! I'm a frauuuud!

OK... OK... calm down... There must be a logical explanation. What would make a token work yesterday but not today? This is a time based protocol... My phone is set up to automatically keep its time correct. Let's check the time on my server... AHA! My local Vagrant instance is not running any time synchronization and has decided to drift away by about an hour. One quick NTP query to fix that, and... We're back in business! Whew... Crisis averted; demo proceeds without issue.

I've learned an important lesson about TOTP. Make sure your server is automatically keeping the time correct!

Dude... Where's my phone?

If you are like me, you have used two factor authentication on other sites and have come across "recovery codes". These are meant to be written down and stored offline; if you were to ever lose the device that generated the token for the site, you could enter one of these codes to complete authentication and establish the shared secret on a new device. While I had an idea in mind of how this could be implemented, I wanted to see if there were any established best practices people follow. How many codes should be generated? How should you generate each code? What is the best way to securely store them?

After some digging, I found... nothing. I've seen sites provide anywhere from five to ten codes, with the codes varying in length. I have yet to implement this feature but am leaning toward generating at least five codes. However I end up generating them, I will store them exactly like passwords -- as salted hashes.
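If and when I do, something like this sketch would cover generation and storage; the code length and count here are my own assumptions, not an established standard:

// Generate five single-use recovery codes and hash them like passwords.
$plainCodes = [];
$hashes = [];
for ($i = 0; $i < 5; $i++) {
    $code = bin2hex(random_bytes(5));   // 10 hex characters, e.g. "3f9c0a1b7d"
    $plainCodes[] = $code;              // shown to the user once, never stored in the clear
    $hashes[] = password_hash($code, PASSWORD_DEFAULT); // salted hash goes in the database
}

// Later, when the user redeems a code:
foreach ($hashes as $i => $hash) {
    if (password_verify($_POST['recovery_code'], $hash)) {
        // Valid -- mark this code as used so it cannot be replayed.
        break;
    }
}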

I need more...

One other flow I have encountered with other two factor enabled sites is the need to input multiple tokens if the server seems to think your token is out of sync. Again, I could find no information on best practices. How many tries should you give the user before entering this flow? How many tokens should you ask for? Does this really add any security value? For now, I am unlikely to implement this flow. While some big names use it, I'm not seeing much benefit from this flow versus just allowing the user to keep entering a single token.

That's all folks!

The RobThree library made it easy to add two factor authentication to our application. Follow the examples in the documentation, and you'll be up and running in no time. However, there seem to be no best practices (none I could easily surface at least) around the items like recovery codes and "out of sync... give us more than one token to proceed". Did I miss anything? Have any best practices from your own implementations? Sound off in the comments and let me know.


Staatus: Exploring the Slack API

Back in April, Slack released a cool new feature that allows you to set a status message on your account. Some of the examples they provide are:

  • In a meeting
  • Commuting
  • Out sick
  • Vacationing
  • Working remotely

"This is awesome!," I thought. "Now I can do what did on AIM in 1997!"

Slack did sprinkle this status in various places in the UI, but it's still a pain to go searching for it, especially if you haven't chatted with a person in a while. Plus, there doesn't seem to be an easy way to see a compiled status for a group (e.g. @engineering-team).

Enter: Staatus

I built Staatus on a PHP framework called Silex. It's a microframework based on Symfony that provides the guts for building simple single-file apps. Here's an example:

require_once __DIR__.'/../vendor/autoload.php';

$app = new Silex\Application();

$app->get('/hello/{name}', function($name) use($app) {
    return 'Hello '.$app->escape($name);
});

$app->run();

Staatus allows you to see a compiled status of an individual, a particular team, or everyone in a channel. The code is over on our GitHub.

I have to hand it to the devs at Slack. The API was super-easy to work with. I made use of these REST commands:

With that, I was able to display not only status emojis/messages, but also a blue diamond if the person is online.

The Code

Coding this up was very straightforward. I'd say the only quirk I ran into is that the calls take too long, and Slack does not like that. If your plugin doesn't return within a few seconds, you get an error message. My plugin is hosted on Heroku, so between the lag of Heroku waking up instances and the time it takes to run all the API calls (which can amount to many, especially when Staatus is run against a team name), I needed to figure out a way to return a delayed response. Normally, in a single-threaded language like PHP, that's not easy. Most frameworks follow a very "serial" manner of generating a response:

  1. A request comes in (possibly with parameters).
  2. A response is generated.
  3. The response is returned and the request dies.

So if step 2 was taking too long, how was I going to work around Slack's request time limits? The answer comes in Silex's $app->finish() method. It is documented as:

A finish application middleware allows you to execute tasks after the Response has been sent to the client (like sending emails or logging)

Here's how I used it:

$process = function(Request $request) use($app) {

    // first, validate the Slack verification token
    $token = $request->get('token');
    if (empty($token) || $token !== VERIFY_TOKEN) {
        $app->abort(403, "Invalid token.");
    }

    // immediately acknowledge the slash command
    $response = [
        'response_type' => 'ephemeral',
        'text' => 'Gathering staatus...'
    ];

    // just return; the rest of the processing will happen in finish()
    return $app->json($response);
};

$app->get('/', $process); // for testing
$app->post('/', $process);

$app->finish(function (Request $request, Response $response) use($app) {
    // ...
});

As you can see, my $process method returns immediately and I do the rest of the processing in a finish() method. This allows me to take as long as I'd like (give or take) to return a response back to Slack using a one-time response URL they provide. (Technically, you can't take as long as you'd like to respond. Slack has a hard limit of "up to 5 times within 30 minutes"):

$response = [
    'response_type' => 'ephemeral',
    'text' => $text
];

// POST the final status back to Slack's one-time response URL (Guzzle HTTP client).
$client2 = new Client(['base_uri' => $responseUrl]);
$response2 = $client2->post('', [
    'json' => $response,
    'verify' => false // skips SSL verification; acceptable for a hack, not for production
]);

And that's all, folks. So, go ahead and download Staatus, deploy it on Heroku, and you'll be partying like it's 1997 all over again. BOOYAH!


Round-robin Cross Subdomain AJAX Requests

Well, that’s a mouthful! But what is it and what is it useful for? If you have done any web development in the last 10 years, you have most likely used AJAX to make your web pages or, even more likely, your web based app more responsive and dynamic. You also probably know that most browsers limit the number of concurrent AJAX requests that can be made back to the origin server (i.e. the same server the page is served from). Usually that number of concurrent requests is 6 to 8 (sometimes a bit more) depending on the browser.

This might be fine for what you are trying to do, but if your page has a lot of dynamic components, or if you are developing a single-page web app, this might make the initial loading of your page, and all its content, a bit slower. After your first 6 (or 8) AJAX requests have fired, any subsequent AJAX query back to that same origin server will be queued and wait. Obviously your AJAX requests should be fairly fast, but even if each one takes just 50 milliseconds, any request after the initial 6 or 8 would have to wait 50 milliseconds in the queue before executing. So in total a request could take 50 milliseconds (waiting) + 50 milliseconds (executing) = 100 milliseconds. These short waiting times add up pretty quickly.

The solution? Make these requests look like they are going to different origin servers so that the browser will execute them in parallel, even if they are, in fact, going to the same server. This will allow more AJAX requests to run in parallel, and therefore speed up your site/app. Here I will show you how to accomplish this using JQuery (but the same strategy can be used with other frameworks).

Cross subdomain AJAX requests with a single server

Basic idea

The basic idea is pretty simple. Say you are serving your pages from webapp.traackr.com. You will want to send some of the AJAX requests through webapp1.traackr.com, some through webapp2.traackr.com, some through webapp3.traackr.com …etc. You get the idea. And we will set it up so that all these hosts are, in fact, the same server.

What I will show you here is how to do this transparently so you don’t have to manually set up each AJAX request and decide where it should go. Our little trick will randomly round-robin through all the available hosts. This will work even if your application runs on a single server and the best part is that if you need to scale and upgrade to multiple redundant servers you will have little if anything to change.

Easy enough, right? Well of course, there is just one more little thing. When you start making AJAX requests to a host different than the original host, you open the door to security vulnerabilities. I won’t go into the specific details of the potential security issues, but suffice it to say that your browser won’t let you make these cross-domain (subdomain in our case) requests. Luckily, CORS comes to the rescue. CORS is a rule-based mechanism that allows for secure cross-domain requests. I will show you how to setup the proper CORS headers so everything works seamlessly.

The setup

First things first - you will have to create DNS entries for all these subdomain hosts. Because we are doing this with a single server, you need to create DNS entries for webapp1.traackr.com through webapp3.traackr.com that all resolve to the same host as webapp.traackr.com (they can be A or CNAME records). All set? Moving on.

The code

We need to accomplish two things:

  1. (Almost) Randomly round-robin the AJAX requests to the various hosts we have just defined
  2. Modify AJAX requests so they can work across subdomains with CORS

Round-robin your AJAX requests

Here we are going to leverage the excellent JQuery framework, so I’m assuming you are using JQuery for all your AJAX requests.

The following Javascript code should be executed as soon as your document is ready. This piece of code uses global AJAX settings in JQuery to hijack all AJAX requests made with JQuery. Once we identify an AJAX request going to the origin server, we rewrite it right before it hits the wire (beforeSend()).
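The full script is embedded in the original post; the sketch below captures the idea, with the webapp1-3 aliases and the simple counter-based rotation as stand-ins for whatever list of hosts you actually use:

$(function() {
  var aliases = ['webapp1.traackr.com', 'webapp2.traackr.com', 'webapp3.traackr.com'];
  var next = 0;

  // Global AJAX settings: every JQuery AJAX request passes through beforeSend().
  $.ajaxSetup({
    xhrFields: { withCredentials: true }, // send cookies on cross-subdomain requests
    beforeSend: function(jqXHR, settings) {
      // Only rewrite relative URLs, i.e. requests headed back to the origin server.
      if (!/^https?:\/\//i.test(settings.url)) {
        var alias = aliases[next++ % aliases.length];
        settings.url = window.location.protocol + '//' + alias + settings.url;
        settings.crossDomain = true;
      }
    }
  });
});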

Almost there. Now we need to make sure your server can handle these requests when they come via one of the aliases we defined.

Enable CORS for your AJAX requests

Before we look at the CORS headers needed for these cross-domain AJAX requests to work, you need to understand how cross-domain AJAX requests differ from regular AJAX requests. Because of CORS, your browser needs to ensure the target server will accept and respond to cross-domain AJAX requests. Your browser does this by issuing OPTIONS requests, called ‘preflight’ requests. These requests are very similar to GET or POST requests, except that the server is not required to send anything in the body of the response; the HTTP headers are the only thing that matters. In short, a regular request goes straight to the server, while a cross-domain request is preceded by this extra OPTIONS round trip.

So you need to make sure your server returns the proper headers. Remember, in the current setup webapp.traackr.com and webapp[1-3].traackr.com are the same server. There are many ways to do this. Here is one that leverages an Apache .htaccess config:
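Something along these lines, assuming mod_headers and mod_setenvif are enabled (the subdomain pattern is specific to this example):

# Echo back the requesting subdomain as the allowed origin; '*' is not allowed
# when credentials are enabled.
SetEnvIf Origin "^https?://webapp[0-9]*\.traackr\.com$" CORS_ORIGIN=$0
Header always set Access-Control-Allow-Origin "%{CORS_ORIGIN}e" env=CORS_ORIGIN
Header always set Access-Control-Allow-Credentials "true" env=CORS_ORIGIN
Header always set Access-Control-Allow-Methods "GET, POST, OPTIONS" env=CORS_ORIGIN
Header always set Access-Control-Allow-Headers "Content-Type, X-Requested-With" env=CORS_ORIGIN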

Because we allow credentials to be passed in these cross-domain requests, the allowed origin header ('Access-Control-Allow-Origin') must specify a full hostname. You cannot use '*'; this is a requirement of the CORS specification.

An important note here. Because these OPTIONS calls are made before each AJAX request, you want to make sure they are super fast (i.e. a few milliseconds). Since the only things that matter are the headers (see above), I recommend you serve a static empty file for these requests. Here is another .htaccess config that will do the trick (make sure options.html exists and is empty):
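A sketch using mod_rewrite; it answers every OPTIONS preflight with the static, empty options.html while the headers above are still applied:

RewriteEngine On
# Short-circuit preflight requests to a static, empty file.
RewriteCond %{REQUEST_METHOD} OPTIONS
RewriteRule ^(.*)$ /options.html [L]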

And that is it! Pop open your JavaScript console and watch your AJAX requests being routed through the various aliases and executed in parallel.

Be warned

So what trade-offs are you making? Remember, this is engineering; there are always trade-offs! All of a sudden you might get twice (or more) as many requests hitting your server in parallel. Make sure it can handle the load.

What’s next?

There are probably a few things you can improve on. I will mention two we use at Traackr but will leave their details as an exercise for the reader:

  • You will probably want to avoid hardcoding your list of servers in the script.
  • Some corporate firewalls do not allow OPTIONS requests. This can cause this entire approach to fail. The script we use at Traackr will actually detect errors in OPTIONS requests and will fall back to regular AJAX requests.

Use a load balancer and let it scale

What can you do if/when you need to scale? One easy approach that doesn’t require you to change much is to put 2 or more servers behind a load balancer (maybe an Elastic Load Balancer on AWS). Then update your DNS so that all your webapp[1-3].traackr.com entries now point to your load balancer.

What magic happens then? Browsers will make more parallel requests, just as we have described all along here. But now your load balancer will round robin each of these requests to the many servers you have running behind it. Magically you are spreading the load (and the love).

Thank you to the entire awesome Traackr engineering team for the help with this post. We are Traackr, the global and ultimate Influencer Management Platform. Everything you need to discover your influencers, manage key relationships, and measure their impact on your business. Check our blog. We are hiring.

