How our Rails 5 app served more than a million people in 2.5 weeks.

In the last 2.5 weeks, our little side-project ispokemongoavailableyet.com served:

  • 2,785,168 pageviews
  • 1,170,496 unique visitors
  • With a peak of 1,700 concurrent visitors (as far as we know)

[Image: Monday morning cumulative]

Bob has written a story about the chain of events, the things we’ve learned and our next steps.

In this post, I’ll briefly go over the tech behind this site and the things we’ve done along the way to keep everything running smoothly.

Our basic stack

We built this Pokémon Go availability tracker with Rails 5.0. In essence, the app consists of a controller that renders the list of countries, a controller that handles signups, and a background job. The job queries the iTunes Search API for every country in the App Store territories list to check whether the app is available there, and queues the notification emails when it is.
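To make the availability check a bit more concrete, here is a minimal sketch of what such a per-country check could look like. It is not our actual job code: the helper names, the placeholder app ID and the choice of the lookup endpoint with a country parameter are assumptions for illustration, and the HTTP call itself is left out so the example runs offline.

```ruby
require "json"
require "uri"

# Placeholder App Store ID for illustration only; NOT the real Pokémon Go ID.
EXAMPLE_APP_ID = 123456789

# Builds the lookup URL for one app in one country's store.
# (Hypothetical helper; assumes the lookup endpoint with `id` and `country`.)
def lookup_url(app_id, country_code)
  query = URI.encode_www_form(id: app_id, country: country_code)
  "https://itunes.apple.com/lookup?#{query}"
end

# The API answers with JSON containing a "resultCount" field.
# We treat a count above zero as "available in this country".
def available?(response_body)
  JSON.parse(response_body).fetch("resultCount", 0) > 0
end

# Takes a hash of country code => raw response body and
# returns the country codes where the app was found.
def available_countries(responses)
  responses.select { |_country, body| available?(body) }.keys
end

# Made-up response bodies standing in for real API responses.
sample_responses = {
  "us" => '{"resultCount":1,"results":[{"trackName":"Example"}]}',
  "nl" => '{"resultCount":0,"results":[]}'
}

puts lookup_url(EXAMPLE_APP_ID, "nl")
puts available_countries(sample_responses).inspect
```

In a real job, the response bodies would come from an HTTP request per country, and the notification emails would be queued for every country that newly shows up in the result.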

We’re using Sidekiq to handle the background jobs and the mailer queue. For scheduling our App Store checker, we’re using sidekiq-cron.

Our application runs on a VPS at TransIP. It is quite a powerful server (quad-core, 16GB of RAM), which we share with many other (staging/testing) Rails apps.

We’re using Intercity Next to build, deploy and scale. Intercity Next uses Dokku, which made it an absolute breeze to scale our app when we needed to.

First hiccup

The first few days, everything was humming along just fine. Page loads took less than a second and we were handling ~100 concurrent visitors with ease.

On Monday July 11th, we decided to add our project to Product Hunt. This is where things took off. Our project became featured and got picked up all over the world. For example, a national paper in Peru wrote about it and CNN Chile dedicated a small post to our website, as did blogs in Austria, Malaysia, Taiwan and many other countries. There were even some YouTubers who decided to “vlog” about it.

Our 50-100 concurrent visitors quickly turned into 700+ concurrent visitors, and this is where our website started to get into trouble. Our single 5-thread worker wasn’t able to serve all requests anymore. People were getting 502 errors.

😥

This was when we made two simple and quick changes to tackle this problem from two angles:

1: Increase the number of workers to be able to serve more requests

Simply changing one line and redeploying added three more workers and multiplied the number of requests our website could handle by four!

[Image: Scaling with Dokku]
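The "by four" above is simple arithmetic. As a rough sketch, assuming every worker runs the same 5 threads as our original one and each thread serves one request at a time (the helper name is made up for this example):

```ruby
# Rough upper bound on requests that can be in flight at the same time.
# Hypothetical helper; assumes one request per thread.
def max_concurrent_requests(workers:, threads_per_worker:)
  workers * threads_per_worker
end

before = max_concurrent_requests(workers: 1, threads_per_worker: 5)
after  = max_concurrent_requests(workers: 4, threads_per_worker: 5)

puts "before: #{before}, after: #{after}, factor: #{after / before}"
```

This is only an upper bound on parallelism. Actual throughput also depends on how long each request takes and on the CPU and memory available on the server.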

2: Reduce the number of requests for each visit

We were using a gem that renders each flag as a separate image. It works well, but in our case we needed 155 flag images on a single page, which meant 155 requests just to fetch the images. By swapping it for the sprite from another gem, we reduced the number of image requests by (you guessed it) 154.

[Image: Sprite optimisation]

After making these two changes, our website kept running smoothly, even when we hit the 1700+ concurrent visitors mark.

Second hiccup

When the first batch of email notifications was supposed to be sent out, we hit the Postgres connection pool limit. With the default pool size being 5 and Sidekiq running 25 threads by default, you can see how all those threads asking for a database connection at once could exceed the pool.

Luckily, this was only a matter of increasing the pool size in our database.yml and everything was connecting fine again!
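To illustrate the mismatch, here is a toy model of a fixed-size pool. It is a deliberately simplified stand-in, not ActiveRecord's real connection pool: in a real app, a thread that can't get a connection waits for a while and then raises a timeout error, whereas this sketch just counts who would be left without one.

```ruby
# Toy stand-in for a fixed-size connection pool (not ActiveRecord's class).
class ToyPool
  def initialize(size)
    @free = size
  end

  # Returns true if a connection was handed out,
  # false if the pool is already exhausted.
  def checkout
    return false if @free.zero?

    @free -= 1
    true
  end
end

pool_size       = 5   # default pool size mentioned above
sidekiq_threads = 25  # Sidekiq threads mentioned above

pool = ToyPool.new(pool_size)

# Worst case: every thread wants a connection at the same moment
# and nobody has returned one yet.
results = Array.new(sidekiq_threads) { pool.checkout }

served  = results.count(true)
starved = results.count(false)

puts "served: #{served}, left waiting: #{starved}"
```

With these numbers, 20 of the 25 threads are left waiting, which is why setting the pool value in database.yml to at least the number of Sidekiq threads made the errors go away.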

Current status

It was nice to be able to test our own infrastructure like this. At this point, we’re still serving around 100k page views per day, but the number is slowly declining, simply because there are more and more countries where PokemonGoIsAvailableYet.

Questions or comments?

Do you have any questions or comments? Please let me know on Twitter via @JoshuaJansen or send me an email at joshua@firmhouse.com.