Antoine Girbal's Corner: How to speed up MongoDB Map Reduce by 20x

EasyMock Delegation: Testing Twitter... without Twitter

So you’ve decided the pain of maintaining your third-party unit tests has finally outweighed the pain of reworking them into something more predictable, and you’ve entered into the wild world of mock objects. There’s no way around it, really… it was bound to happen. Look, even if your app didn’t talk to a third-party service, you’d still want to prevent any possible variation in the environment in which your test is running, outside of the exact thing you’re testing, making all things absolutely predictable. This way if your tests fail, you’ll know exactly where to look. But why am I telling you this? You are all pros! There are many resources out there for the “hows” and “whys” of mock objects, but if you’re already familiar with the concepts and libraries, this post is for you.

This is a post about how to use EasyMock to mock Twitter, and how Delegation lets you hand complex behavior over to an existing implementation instead of writing it all yourself.

Disclaimer: We’re using EasyMock, but you might also wanna check out jMock and Mockito as other options.

The Use Case

So let’s say you’ve built an app that integrates with a third-party API such as Twitter, and so you’ll probably want to test your app against specific responses from Twitter to make sure the business logic is performing correctly. For example, let’s just say every time Justin Bieber gets retweeted, your app sets off a fireworks show. Now, your unit tests want to test this behavior and you are using EasyMock to mock whatever homegrown REST client you’ve written, returning the mock JSON response you’ve put together. You set your Expectations and Answers, and voila! You’ve mocked Twitter!

But then you realize that popular third-party Java clients exist for a reason, and instead of going through the process of maintaining every single endpoint that Twitter provides, you decide to switch to something like Twitter4J, which makes integrating with Twitter very easy (it really does). So you figure, “I’ll just mock Twitter4J”, and it should be easy, right? Wrong. Here’s why:

Twitter4J’s User Timeline endpoint is defined as:

ResponseList<Status> Twitter.getUserTimeline(String screenName, Paging paging)

And this is the code you expect to write:

ResponseList<Status> list = ... // Figure out how to construct a ResponseList object with the Statuses you want.
Twitter mockTwitter = createMock(Twitter.class);
expect(mockTwitter.getUserTimeline(EasyMock.isA(String.class), EasyMock.isA(Paging.class))).andReturn(list);

… but you find yourself in a rabbit hole. At this point you’ve just painfully realized that creating the ResponseList object with custom JSON is not exactly trivial, and here’s why: ResponseList is an interface, and so is Status. Are there implementations? Well, if you follow the Twitter4J source far enough down, you’ll find a series of factories that translate the JSON responses from the HttpResponse objects into implementations of the return types we’re working with. If you want to put all the control of the response in your hands, that’s a lot of code to write, because essentially you’re going to have to recreate what Twitter4J does under the hood. So, I’ve put together some code that uses some of EasyMock’s tricks, so you can easily mock Twitter4J’s getUserTimeline() in a way that puts control back in your hands without writing a lot of code. So without further ado:

* Drumroll *

Step 1: Define Your JSON Response

This is the hard part :) Creating your own custom tweets as JSON Strings is such a pain. So do yourself a favor and run a few calls against Twitter, copy/paste, escape all those quotes, and make any necessary changes via grep, etc. So let’s say in your unit test, you’ll want the first Twitter UserTimeline call to return 30 statuses from Oprah, the second call to return 20 statuses from Justin Bieber, and the last call to return nothing. Your String array then looks something like:

public static String OPRAH_RESPONSE  = "[ { ... // 30 statuses from Oprah
public static String JBIEBER_RESPONSE = "[ { ..// 20 statuses from Biebs
public static String EMPTY_RESPONSE = "[]";

String[] expectedJsonResponses = new String[]{ OPRAH_RESPONSE, JBIEBER_RESPONSE, EMPTY_RESPONSE };

And let’s turn each String in that array into JSON (here twitterRawJSONResponse is a single response String):

JSONArray array = new JSONArray(twitterRawJSONResponse);

Step 2: Create A Method Which Mocks ResponseList

If you’ve taken a look at the Twitter4J source, you’ll notice that ResponseList is also an extension of java.util.List, so a bunch of its methods are already defined by the List interface. We’re going to use EasyMock to delegate those particular ResponseList methods to an implementation of List that we can more easily create and work with… an ArrayList. Creating an ArrayList is a lot less verbose than recreating Twitter4J’s ResponseList. Go ahead and check out the source, it’s rather complex.

This is what EasyMock refers to as Delegation. EasyMock describes delegation as:

“ <Delegation allows mock objects> to delegate the call to a concrete implementation of the mocked interface that will then provide the answer. The pros are that the arguments found in EasyMock.getCurrentArguments() for IAnswer are now passed to the method of the concrete implementation. This is refactoring safe. The cons are that you have to provide an implementation which is kind of doing a mock manually… Which is what you try to avoid by using EasyMock. It can also be painful if the interface has many methods.”

… however in this case, it works so beautifully!  Instead of redefining a behavior from scratch, you can just delegate to a similar object that already has the behavior defined, except your delegate is completely in your control.
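To make the idea concrete without pulling in any library, here is a minimal sketch of the same delegation principle using a plain JDK dynamic proxy. This is not EasyMock and not Twitter4J: FakeResponseList and getRateLimitRemaining() are hypothetical stand-ins invented for this illustration, and Java 8 or later is assumed for the lambda. Calls to methods that come from List are forwarded to a delegate ArrayList, and anything else fails loudly, much like a mock with no expectation recorded for that method.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DelegationSketch {

    // Hypothetical stand-in for an interface like ResponseList: a List plus one extra method.
    interface FakeResponseList<T> extends List<T> {
        int getRateLimitRemaining(); // hypothetical extra method, deliberately not delegated
    }

    // Returns a proxy that forwards List (and Object) methods to the delegate
    // and rejects every other call.
    @SuppressWarnings("unchecked")
    static <T> FakeResponseList<T> delegateTo(final List<T> delegate) {
        InvocationHandler handler = (proxy, method, args) -> {
            // True for methods declared by List, its super-interfaces, or Object.
            if (method.getDeclaringClass().isAssignableFrom(List.class)) {
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow what the delegate actually threw
                }
            }
            throw new AssertionError("Unexpected call: " + method.getName());
        };
        return (FakeResponseList<T>) Proxy.newProxyInstance(
                FakeResponseList.class.getClassLoader(),
                new Class<?>[] { FakeResponseList.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> tweets = new ArrayList<String>(Arrays.asList("first", "second"));
        FakeResponseList<String> response = delegateTo(tweets);
        System.out.println(response.size()); // prints 2
        System.out.println(response.get(1)); // prints second
    }
}
```

The delegate stays an ordinary ArrayList that the test fully controls, which is the whole appeal: the list behavior comes for free, and only the methods outside List need separate handling.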

Another scenario in which you would want to use Delegation could be around a database cursor, for example MongoDB’s DBCursor. It implements the Iterator interface. If you were mocking some database behavior and you had a pre-constructed list of objects you wanted to work with, instead of trying to mock the behavior of DBCursor.next() and manually plot out what each subsequent next() would return (because remember, a mock object is a recording that needs to be replayed in a certain order), you can just delegate the behavior to a more easily constructed type that you can work with.

Let’s see it in action and create an ArrayList object and populate it with some tweets!

final List<Status> list = new ArrayList<Status>();
if (array.length() > 0) {
  for (int x = 0; x < array.length(); x++) {
    JSONObject obj = (JSONObject) array.get(x);
    Status status = DataObjectFactory.createStatus(obj.toString());
    list.add(status);
  }
}

Now we’re going to create a mock of the ResponseList interface and delegate the methods being used by Twitter4J over to our List that we just created:

ResponseList<Status>  responseList = EasyMock.createMock(ResponseList.class);

Now you can use Delegation! Since you’re mocking the ResponseList and you’ll probably want to treat your ResponseList as a regular ArrayList to iterate through your results, all you have to do is simply delegate those method calls to your list object:

EasyMock.expect(responseList.toArray(EasyMock.isA(Status[].class))).andDelegateTo(list).anyTimes();

Your list contains the custom Status objects you’ve generated from your JSON, so all you need to do is delegate the work to ArrayList. This way, you don’t have to write custom code for every single Twitter scenario you want to test. Simply load the JSON into objects, parse them into Status objects, put them in a collection and delegate ResponseList’s behavior to that collection.

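Let’s put all this code into a method called buildTwitterServiceStatusResponseList(String jsonResponse). Like any EasyMock mock, responseList needs an EasyMock.replay(responseList) call before the method returns it, otherwise the recorded expectations never take effect.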

Step 3. Mock Twitter4J

Remember this from above?

String[] expectedJsonResponses = new String[]{ OPRAH_RESPONSE, JBIEBER_RESPONSE, EMPTY_RESPONSE };

Let’s pass those into the above method and create a mock Twitter4J:

Twitter twitter = EasyMock.createMock(Twitter.class);
IExpectationSetters<ResponseList<Status>> expectation = EasyMock.expect(twitter.getUserTimeline(EasyMock.isA(String.class), EasyMock.isA(Paging.class)));
for (String resp : expectedJsonResponses) {
     expectation.andReturn(buildTwitterServiceStatusResponseList(resp)).times(1);
}
EasyMock.replay(twitter);

That’s all there really is to it. I’ve put a gist together where you can see the code in its entirety.

Remember: If your code uses other methods for ResponseList, you’ll have to create an expectation for those method calls that you can either Answer, Return, or DelegateTo…  or EasyMock will fail.

In summary, delegation in EasyMock is useful if you care about the results that are coming back from your mocked service, and constructing a delegate is trivial compared to actually working with the objects themselves.

At TRAACKR our team spends a LOT of time thinking about and implementing the ins and outs, as well as the dos and don'ts, of interfacing with third-party APIs, and so we hope to provide engineers with some of the tools of the trade that will hopefully make your lives a little easier.


Redis and Memcached benchmarking on AWS

Proper Application Logging

Badass JavaScript: WebKit.js: Yes it has finally happened! Browser Inception is now possible.

Redis and Lua

RabbitMQ + Spring = Easy Integration

If you’ve never heard of RabbitMQ, it is a robust messaging system (written in Erlang) based on the AMQP standard, which is a wire-level protocol designed to be cross platform and language agnostic. We utilize RabbitMQ here at Traackr in our back-end technology stack, where we offload work that is not required for immediate consumption by our system. Not only have we found RabbitMQ to be incredibly fast and durable, but it also offers a slick UI and administrative controls for examining the state of your message queues and server.

We chose to integrate with RabbitMQ using the Spring AMQP project (http://www.springsource.org/spring-amqp). The Spring AMQP project provides all the necessary classes for setting up RabbitMQ transaction managers, connection factories, message templates, listeners, etc. As always, Spring provides terrific documentation and getting your application to successfully send and receive messages is a piece of cake - http://static.springsource.org/spring-amqp-net/docs/1.0.x/reference/html/amqp.html

Now, regarding message consumers, Spring recommends creating your listener containers with a bean definition (via XML) so that it can simply run in the background:

<bean id="MessageListenerContainer" class="org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer">
    <property name="connectionFactory" ref="RabbitConnectionFactory"/>
    <property name="queueNames" value="some.queue"/>
    <property name="messageListener" ref="SomeListener"/>
</bean>

However, we did things a little bit differently here at Traackr, and here is why you should too. We create and manage our own listener containers via our own Service, which resembles something like:

@Service
public class MyRabbitMQListenerService implements InitializingBean
{
    ...
    private Map<String, SimpleMessageListenerContainer> messageListenerContainers = new HashMap...
    ...
    @Override
    public void afterPropertiesSet() { // spring bean initializer
        // pseudo-code: myListOfQueues and listenerDelegate are placeholders for your own values
        for (String queue : myListOfQueues) {
            SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
            container.setQueueNames(queue);
            container.setMessageListener(listenerDelegate);
            // ... set other stuff
            messageListenerContainers.put(queue, container);

            // Start by default (or not if you're no fun)
            container.start();
        }
    }

    // Look up the container registered for a given queue
    public SimpleMessageListenerContainer getListenerContainerForQueue(String queue) {
        return this.messageListenerContainers.get(queue);
    }
}

This allows us to quickly access the SimpleMessageListenerContainer registered to each queue. One of the great things about the SimpleMessageListenerContainer class is that it provides methods for starting and stopping message consumption…

Now, suppose we expose two RESTful endpoints that look something like this:

@RequestMapping(value = "/startMessageContainer/{queue}", method = RequestMethod.GET)
@RequestMapping(value = "/stopMessageContainer/{queue}", method = RequestMethod.GET)

With some extra code to connect the dots, we can now pause and resume message consumption for any queue at will … Awesome stuff!

@RequestMapping(value = "/stopMessageContainer/{queue}", method = RequestMethod.GET)
@ResponseBody
String stopMessageContainer(@PathVariable("queue") String queue) {
   // Quick & dirty
   myRabbitMQListenerService.getListenerContainerForQueue(queue).stop();
   return "success";
}

What’s the advantage of this, you might ask? Well, suppose the responsibility of the message consumer is to take the message payload and send that information to a 3rd party service. So, what happens when that 3rd party service goes down? Well, most likely an exception will be raised by the message consumer, triggering a roll-back by the RabbitMQ transaction manager, and the message will get put back on the queue. Now if this 3rd party service is down for a day, this cycle of roll-backs would continue until the external service was back up (unfortunately, at the time of writing RabbitMQ had no notion of a dead letter queue; newer releases offer dead letter exchanges). If we were aware of the problem, we could easily shut down consumption on this queue and prevent all the unnecessary network chatter.

Even better, what if the message consumers imposed thresholds for certain types of errors and automatically stopped themselves once a threshold was met? Using the above approach of encapsulation, we can achieve that, and that is some VERY powerful stuff.
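As a rough illustration of that threshold idea, here is a minimal sketch in plain Java. The FailureThreshold class, its method names, and the choice to count consecutive failures are all assumptions made for this example, not part of Spring AMQP or of our actual service. The stop action is just a Runnable, so in practice it could call stop() on the container returned by getListenerContainerForQueue(queue).

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FailureThresholdSketch {

    // Hypothetical helper: counts consecutive failures and fires a stop action
    // once the configured limit is reached.
    static class FailureThreshold {
        private final int maxConsecutiveFailures;
        private final Runnable stopAction; // e.g. a call that stops the listener container
        private final AtomicInteger consecutiveFailures = new AtomicInteger();

        FailureThreshold(int maxConsecutiveFailures, Runnable stopAction) {
            this.maxConsecutiveFailures = maxConsecutiveFailures;
            this.stopAction = stopAction;
        }

        // A successful message resets the failure streak.
        void recordSuccess() {
            consecutiveFailures.set(0);
        }

        // A failed message extends the streak; the stop action runs when the limit is hit.
        void recordFailure() {
            if (consecutiveFailures.incrementAndGet() == maxConsecutiveFailures) {
                stopAction.run();
            }
        }
    }

    public static void main(String[] args) {
        final boolean[] stopped = { false };
        FailureThreshold threshold = new FailureThreshold(3, () -> stopped[0] = true);

        threshold.recordFailure();
        threshold.recordFailure();
        threshold.recordSuccess(); // resets the streak
        threshold.recordFailure();
        threshold.recordFailure();
        System.out.println(stopped[0]); // prints false
        threshold.recordFailure();
        System.out.println(stopped[0]); // prints true
    }
}
```

A consumer would call recordSuccess() after each message it handles and recordFailure() in its error path. How that gets wired into a real listener, and which errors should count, is left open here.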


Using Traackr API

Today Engage121 announced they are launching a new version of their product that integrates with Traackr: Engage121 Launches Version 2.1

How do they do that, you might ask? Well, very easy, they are using our awesome API. I thought I would show you how you can do it too. We are going to build a little Traackr widget from one of our alpha lists, Cloud Computing. The widget will display random posts from the list on a web page.

First off, the HTML for the page. Let’s keep it simple. We load jQuery because we will need it later to load the A-List via the API and display the posts.

The body contains a simple DIV and TABLE where we will display the image for the influencer and the text of the post.

<!DOCTYPE html>
<html>
    <head>
        <title>AList Widget</title>
        <script type="text/javascript"
            src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js">
        </script>
    </head>
    <body>
        <h1>A-List Widget</h1>

        <!-- alist title -->
        <div id="alist-title"><i>Loading</i></div>

        <!-- random post to display -->
        <div id="alist-post" style="display: none; margin-top: 15px;">
            <table><tr>
                <td>
                    <!-- author's image -->
                    <img id="author" src=""/>
                </td>
                <td>
                    <!-- post text -->
                    <div id="post"></div>
                </td>
            </tr></table>
        </div>

    </body>
</html>

Now, the fun part. The trick is to load the A-List via our API; here is the link for it. If you are a Traackr customer, this link is accessible from your campaign’s settings.

Once we have loaded the A-List, we can simply call the JavaScript function show_post() every 5 seconds to load a new post. We select each post by randomly selecting 1 influencer from the list, then 1 random channel from this influencer, and finally 1 random post. Here is what it looks like:

<script type="text/javascript">
            $(document).ready(function(){
                $.ajax({
                    url: 'http://alist.traackr.com/influencers/all/4233.json',
                    data: {sec: '2728ea00020714632aa811e6f4a89e3a'},
                    dataType: 'jsonp',
                    jsonp: 'jsonpcallback',
                    success: function(data) { show_alist(data); }
                });
            });

            var alist = null;

            var show_alist = function(data) {
                // read list and display title
                alist = data;
                $('#alist-title').html(alist.name);
                setTimeout(show_post, 5000);
            }; // End function show_alist()

            var show_post = function() {
                // Find random influencer (multiply by length, not length - 1,
                // so the last element can be picked too)
                var current_influencer = Math.floor(Math.random() * alist.list.length);
                var influencer = alist.list[current_influencer];
                // Find random channel
                var current_channel = Math.floor(Math.random() * influencer.channels.length);
                var channel = influencer.channels[current_channel];
                // Check that the channel has posts
                if ( channel.posts.length > 0 ) {
                    // Find random post
                    var current_post = Math.floor(Math.random() * channel.posts.length);
                    // get data
                    var img  = influencer.pics.small;
                    var post = channel.posts[current_post].title;
                    var url  = channel.posts[current_post].url;
                    // display
                    $('#alist-post').hide();
                    $('#author').attr('src', img);
                    $('#post').html('<a target="_blank" href="' + url + '">' + post + '</a>');
                    $('#alist-post').fadeIn(750);
                    setTimeout(show_post, 5000);
                }
                else {
                    // No posts in this channel: try again almost immediately
                    setTimeout(show_post, 100);
                }
            };
        </script>

15 min in the oven at 350 and we are done. Check out the final result.

And the best part about it? Traackr’s A-Lists refresh automatically weekly, so without having to do anything, just come back every week and discover new content.

That’s all folks!