Jaco Labs: NodeJS + Docker => The Missing Manual

Danni Friedland | Behind The Scenes, docker, nodejs, technical

Preface

When we started Jaco we set ourselves the goal of handling massive amounts of data (for a startup). To do that, we realized our system would need to support hundreds of thousands of events per second.

As we started receiving more traffic, splitting up components into different parts started to make more sense as it provided us with an easier way to reason about code and scale different parts of the application depending on specific load.

“Building a startup is hard, building a startup that needs to collect hundreds of thousands of simultaneous events per second – is fun.”

This coincided with two things: Docker was being embraced by the industry, and the micro-services architecture re-emerged from the depths of system-architecture patterns (although it was not in any way new). Both seemed like a natural fit for us.

I won’t discuss the specifics of why we chose these two. You are welcome to ask for the details in the comments.

While building and developing our Docker+NodeJS infrastructure, we realized that the Docker+NodeJS guides tend to be along the lines of “See how easy it is to run a ‘hello world’ in docker”. There were no guides on how to run more complicated NodeJS applications (and their development environment counterparts).

In this post, I’ll share with you what we learned using these two technologies/patterns. I’ll keep updating this blog post with new patterns as they emerge.

In this guide I’ll cover the following:

  • Building a complete Docker+NodeJS environment
  • Connecting shared NodeJS modules to the env
  • Allowing access to all sub-modules using a reverse-proxy

The basics – Our simple NodeJS Dockerfile

Most of it is pretty straightforward. Let’s skip to the interesting parts:

Note that we copy only the package.json first. This is a very basic Docker layer-caching technique: as long as package.json is unchanged, the npm install layer that follows is reused from the cache. Note also that we copy it to /home/app/code, and not to /home/app like other tutorials suggest. We’ll see how this works for development and automatic restart later on.

Next, we install the packages. Note the --silent --progress=false part, which is reported to speed up npm install by 2x-3x. [Issue #11283]

Next, installing bower components:

We’re also going to install bower components (we won’t get into the discussion of whether we should commit bower_components or install them inside Docker). Nothing exciting here.
I separated bower into its own copy/run as it’s likely to change at a different rate than package.json. I put our bower_components below the npm install layer since they change faster than our package.json dependencies. If your situation is different, think about changing the order.
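To make the layering concrete, here is a minimal sketch of a Dockerfile along these lines, written out through a shell heredoc so it can be pasted into a terminal. The node:6 base image, the bower.json file name, the assumption that bower is listed in package.json, and the npm start command are all illustrative assumptions, not necessarily what we run at Jaco.

```shell
#!/bin/sh
# Sketch only: writes an illustrative Dockerfile into ./docker-sketch.
set -eu

out_dir="${OUT_DIR:-docker-sketch}"
mkdir -p "$out_dir"

cat > "$out_dir/Dockerfile" <<'EOF'
# Assumption: any official Node image would do; node:6 is just an example.
FROM node:6

WORKDIR /home/app/code

# Copy only the manifest so the npm layer is rebuilt only when it changes.
COPY package.json /home/app/code/package.json
RUN npm install --silent --progress=false

# Bower gets its own COPY/RUN pair because it changes at a different rate.
# Assumption: bower is listed in package.json, so its binary is in node_modules/.bin.
COPY bower.json /home/app/code/bower.json
RUN ./node_modules/.bin/bower install --allow-root

# Application source goes last; editing it leaves the layers above cached.
COPY . /home/app/code

CMD ["npm", "start"]
EOF

echo "wrote $out_dir/Dockerfile"
```

With a file like this, a .dockerignore that excludes node_modules and bower_components keeps the final COPY from overwriting the packages installed in the earlier layers.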

Running the system using docker-compose

docker-compose is probably one of my favorite tools for docker management. It has the perfect mix of convention and configuration that feels, almost always, just right.

Let’s take a look at our simple docker-compose.yml:

You can ignore most of it; it’s fairly simple. This is the verbose form and it could be simplified a bit, but since we’re going to build on it I wanted to include the full version.

The interesting parts here are:

If you’re unfamiliar with anonymous volumes this volume syntax might look weird to you. Go ahead and read this more detailed post.

TL;DR: This code creates a volume at /home/app/code and mounts our current directory into it. Unfortunately, since we installed our node_modules and bower_components inside the image at that same path, the mount also hides them (the host directory is shown in place of the image’s contents), essentially making them disappear.

A lot of tutorials out there overcome this issue by installing node_modules one directory up (e.g. /home/app/node_modules), where Node’s module lookup still finds them, but there is a much simpler solution:

This creates an anonymous volume at the node_modules path. Docker fills a new anonymous volume with whatever the image already had there and mounts it on top of our bind mount, which brings back our node_modules (Yay!)
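As a rough illustration of the two kinds of volume discussed above, the following shell snippet writes a small compose file that combines them. The service name service_1, the version "2" file format and the output directory are assumptions made for the sketch; only the /home/app/code paths come from the text.

```shell
#!/bin/sh
# Sketch only: writes an illustrative docker-compose.yml into ./volumes-sketch.
set -eu

out_dir="${OUT_DIR:-volumes-sketch}"
mkdir -p "$out_dir"

cat > "$out_dir/docker-compose.yml" <<'EOF'
version: "2"
services:
  # Hypothetical service name.
  service_1:
    build: .
    volumes:
      # Bind mount: the host source tree hides the image's /home/app/code.
      - .:/home/app/code
      # Anonymous volumes: filled from the image and mounted over the bind mount,
      # so the packages installed at build time are visible again.
      - /home/app/code/node_modules
      - /home/app/code/bower_components
EOF

echo "wrote $out_dir/docker-compose.yml"
```

One thing to keep in mind with this approach: anonymous volumes outlive image rebuilds, so after changing package.json it may be necessary to run docker-compose down -v before bringing the service back up, otherwise the old node_modules can be reused.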

This trick allows us to keep the architecture inside docker as it would have been in a regular pre-docker app.

Using Træfik as a reverse proxy/load balancer

In their own words:

“Træfik is a modern HTTP reverse proxy and load balancer made to deploy microservices with ease. It supports several backends (Docker, Swarm, Mesos/Marathon, Kubernetes, Consul, Etcd, Zookeeper, BoltDB, Rest API, file…) to manage its configuration automatically and dynamically.”

Træfik is a modern, super simple and extensible load balancer with an awesome community behind it.

Attaching Træfik to your service is extremely simple. First, let’s take a look at our Træfik docker-compose:

It’s not important to understand exactly what’s going on here; you can use it as is.

Next, you need to add some Træfik-related labels to your service:

These special labels are fairly self-explanatory and set various Træfik settings.
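For readers who want something to start from, here is a hedged sketch of a Træfik service plus a labelled application service, written out via a shell heredoc. It assumes Træfik 1.x (the traefik:1.7 image and its label names; later major versions use a different label syntax), and the host name service1.localhost, the service name and port 3000 are hypothetical values chosen for illustration.

```shell
#!/bin/sh
# Sketch only: writes an illustrative docker-compose.yml into ./traefik-sketch.
set -eu

out_dir="${OUT_DIR:-traefik-sketch}"
mkdir -p "$out_dir"

cat > "$out_dir/docker-compose.yml" <<'EOF'
version: "2"
services:
  traefik:
    # Assumption: Traefik 1.x; --docker enables the Docker backend,
    # --api exposes the dashboard on port 8080.
    image: traefik:1.7
    command: --api --docker
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      # Traefik watches the Docker socket to discover containers and labels.
      - /var/run/docker.sock:/var/run/docker.sock

  # Hypothetical application service.
  service_1:
    build: .
    labels:
      - "traefik.backend=service_1"
      - "traefik.frontend.rule=Host:service1.localhost"
      # Assumption: the Node app listens on port 3000 inside the container.
      - "traefik.port=3000"
EOF

echo "wrote $out_dir/docker-compose.yml"
```

With a setup like this, requests arriving on port 80 with the matching Host header are routed to the labelled container, so the service itself does not need to publish a port on the host.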

Our final docker-compose.yml looks like this:

The cool part – Handling shared modules

Let’s assume that you have an internal shared module called “DAL”, that provides you with internal Data Layer abstractions. This shared module is on npm/github, and the various services all depend on it as a regular external component in package.json.

Problem #1:

During development, you want to be able to quickly iterate and change the DAL, and have all dependent modules automatically restart and work with the new version. Since we installed our node_modules on the docker image, and each image is hard-wired to a single DAL version, this is currently not possible.

Solution #1:

Assuming your directory is structured as follows:
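A layout along these lines is assumed, with the three projects as sibling directories. The root directory name jaco-sketch is hypothetical; the sub-directory names follow the ones used in this section.

```shell
#!/bin/sh
# Sketch only: creates the assumed directory layout under ./jaco-sketch.
set -eu

root="${PROJECT_ROOT:-jaco-sketch}"

# The shared module and the two services that depend on it live side by side.
mkdir -p "$root/dal" "$root/service_1" "$root/service_2"

ls "$root"
```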

Where both service_1 and service_2 depend on DAL, deploying these services is as easy as defining a dependency in your package.json. But what about local development while using Docker?

If you have ever worked with shared components and npm, your mind might immediately shout “npm link”. But this won’t work, as Docker’s COPY currently does not follow symbolic links.

Another thing you need to keep in mind is that we need to re-install the DAL’s node_modules for each service since each service might run a different NodeJS version or Linux version. This rules out the shared data volume approach.

Our solution was to add 2 new volumes to the container:
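One plausible shape for those two volumes is sketched here, under the assumption that the compose file sits in the project root next to dal/ and service_1/, and that the DAL is installed under node_modules/dal. The first new volume bind-mounts the local DAL checkout over the installed copy; the second is an anonymous volume that gives each container its own private copy of the DAL’s dependencies.

```shell
#!/bin/sh
# Sketch only: writes an illustrative docker-compose.yml into ./dal-sketch.
set -eu

out_dir="${OUT_DIR:-dal-sketch}"
mkdir -p "$out_dir"

cat > "$out_dir/docker-compose.yml" <<'EOF'
version: "2"
services:
  service_1:
    build: ./service_1
    volumes:
      - ./service_1:/home/app/code
      - /home/app/code/node_modules
      # New volume 1: the local DAL source replaces the installed package,
      # so edits on the host are seen by the service immediately.
      - ./dal:/home/app/code/node_modules/dal
      # New volume 2: anonymous volume for the DAL's own dependencies,
      # built inside this container for its Node and Linux versions.
      - /home/app/code/node_modules/dal/node_modules
EOF

echo "wrote $out_dir/docker-compose.yml"
```

Because the DAL’s node_modules live in a per-container anonymous volume, each service can compile native dependencies against its own runtime without touching the files on the host.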

Then add a new file called start_dev.sh, which is in charge of bootstrapping our dev environments:

And change the start command to:

The purpose of start_dev.sh is to install the local copy of the DAL into the new anonymous volume (with a few optimizations):

This code checks if we already installed the dal into this service. If we did, it skips npm altogether (as even doing npm install when nothing changes will slow your service boot time).
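The check-then-skip logic can be sketched roughly as follows. This is an illustration, not our exact script: it assumes the local DAL is mounted at /home/app/code/node_modules/dal, uses a hypothetical marker file named .dal_installed to remember that the install already ran, and takes the real start command as arguments.

```shell
#!/bin/sh
# start_dev.sh (sketch): install the mounted DAL's dependencies once, then
# hand over to the real start command passed as arguments.
set -eu

install_dal_deps() {
  # Assumed mount point of the local DAL checkout inside the container.
  dal_dir="${DAL_DIR:-/home/app/code/node_modules/dal}"
  # Hypothetical marker file recording that the install already ran.
  marker="$dal_dir/node_modules/.dal_installed"

  if [ -f "$marker" ]; then
    echo "dal already installed, skipping npm install"
    return 0
  fi

  echo "installing dal dependencies in $dal_dir"
  # NPM can be overridden, which is handy for dry runs.
  (cd "$dal_dir" && "${NPM:-npm}" install --silent --progress=false)
  mkdir -p "$dal_dir/node_modules"
  touch "$marker"
}

# Only act when a start command is given, e.g. `sh start_dev.sh npm start`.
if [ "$#" -gt 0 ]; then
  install_dal_deps
  exec "$@"
fi
```

In a compose file this would be wired in with something like command: sh start_dev.sh npm start. Because the marker lives inside the anonymous volume, the first boot of a container pays for the install and later restarts go straight to the start command.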

That’s it! You can now start your environment by running docker-compose up and accept your new role as dev-ops king.

Conclusion

The entire premise of Docker is environment parity between development and production, and it’s getting pretty close. However, you still need to jump through a lot of hoops before you can deploy a large-scale micro-service architecture on Docker.

Have a question? Leave a comment.
