Augment your Ruby back end with Elixir

Ruby is a wonderful language for developing back ends. However, concurrency is not its strength. Here we share our experience using Elixir to make up for this weakness in Pericles, our specifications tool.

At first there was Ruby

As we said in our previous article, Backing back ends, we are avid lovers of Ruby. It is thus no surprise that we chose it to build Pericles, our internal API specification tool, whose first version was up and running within a few weeks. More and more people ported their projects to it, and the proxy feature became a blockbuster. One day, a team working for a customer with SLOW (>20s 😬) webservices plugged its test suite into the proxy. That was the moment we finally reached the limit of Ruby.

The choice of Elixir

Our problem was caused by requests taking forever while they waited for the answers of slow webservices. So we were looking for a tech that is good at async. Node, Akka and Rust all do a great job, but Elixir seemed like the perfect candidate for us. Elixir is built on the Erlang VM, a VM (Virtual Machine) designed to handle huge numbers of concurrent connections, such as WhatsApp traffic. But Erlang syntax is a bit harsh, so the lovely people at Plataformatec (Ruby lovers too) built a language on top of this VM, inspired by the Ruby syntax, and called it Elixir.
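To make the concurrency argument concrete, here is a minimal sketch, not taken from Pericles. The module SlowServiceDemo and the function call_slow_service are hypothetical, and the slow webservice is simulated with a plain sleep. Each call runs in its own lightweight process, so waiting on one does not block the others.

```elixir
defmodule SlowServiceDemo do
  # Hypothetical stand-in for a slow webservice call: it only sleeps.
  def call_slow_service(id, delay_ms) do
    Process.sleep(delay_ms)
    {:ok, id}
  end

  # Starts one lightweight process per call, then collects every result.
  def run(count, delay_ms) do
    1..count
    |> Enum.map(fn id -> Task.async(fn -> call_slow_service(id, delay_ms) end) end)
    |> Enum.map(fn task -> Task.await(task, delay_ms + 5_000) end)
  end
end

# Time 100 simulated calls of 200 ms each.
{elapsed_us, results} = :timer.tc(fn -> SlowServiceDemo.run(100, 200) end)
IO.puts("#{length(results)} calls finished in #{div(elapsed_us, 1000)} ms")
```

Because the calls wait concurrently, the total time should stay close to the duration of a single call rather than the sum of all of them.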

A train ride and a dozen commits

It all started at the end of July, with a long train ride from Paris to Toulouse. I opened the awesome Elixir School website for the first time, and could not stop reading it for 4 hours in a row. Not only was the syntax familiar to me, but the toolset was also largely similar to Ruby's (mix is a combination of bundle and rake, plug is an equivalent of rack).

From the very first chapters I had only one thing in mind: starting my first project. But with network coverage in trains being what it is, downloading the Erlang VM was definitely too much. I had to settle for the 30s load time of each chapter's webpage, and so I read everything I could about the language without being able to execute a single line of code. The next day at work, I jumped on my terminal and installed the Erlang VM and the Elixir tools in a blink (the gods of network are so unfair).

Bootstrapping my project - writing a proxy - was a piece of cake thanks to the familiar tooling and syntax. Since a proxy already existed in Ruby, I could concentrate on thinking the project through in a functional way: piped functions, immutable data. However, some syntax really puzzled me at first, such as pattern matching in function arguments. Guess what our friend below is doing?

defp put_resp_headers(conn, []), do: conn

defp put_resp_headers(conn, [{header, value} | rest]) do
  conn
  |> Conn.put_resp_header(String.downcase(header), value)
  |> put_resp_headers(rest)
end

Answer: It works recursively, extracting one header/value from the second parameter at each call, until the second parameter is empty.
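For readers who want to run the idea without a web stack, here is a self-contained sketch. The module HeaderDemo and its put_header function are hypothetical stand-ins: a plain map plays the role of the connection, and put_header only imitates what Conn.put_resp_header does. The sketch puts the recursive version next to an equivalent one written with Enum.reduce/3.

```elixir
defmodule HeaderDemo do
  # Hypothetical stand-in for Conn.put_resp_header/3, working on a plain map.
  # It replaces an existing header with the same name, or appends a new one.
  def put_header(conn, header, value) do
    %{conn | resp_headers: List.keystore(conn.resp_headers, header, 0, {header, value})}
  end

  # Recursive version: the first clause matches the empty list and stops.
  def put_all_recursive(conn, []), do: conn

  # The second clause takes the first pair off the list and recurses on the rest.
  def put_all_recursive(conn, [{header, value} | rest]) do
    conn
    |> put_header(String.downcase(header), value)
    |> put_all_recursive(rest)
  end

  # Equivalent version: Enum.reduce/3 threads the connection through each pair.
  def put_all_reduce(conn, headers) do
    Enum.reduce(headers, conn, fn {header, value}, acc ->
      put_header(acc, String.downcase(header), value)
    end)
  end
end

headers = [{"Content-Type", "application/json"}, {"X-Request-Id", "abc"}]
conn = %{resp_headers: []}

IO.inspect(HeaderDemo.put_all_recursive(conn, headers))
IO.inspect(HeaderDemo.put_all_reduce(conn, headers))
```

Both calls produce the same map. Which form to use is largely a matter of taste: the recursive one shows the pattern matching explicitly, while the reduce hides the recursion.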

Working my way through a lot of unknown stuff (after all this time, it felt good to be a beginner again and learn at every step), it took me only a dozen commits to get a first prototype plugged into the Pericles database. A few more commits to handle edge cases, add tests and make it configurable, and it was ready to be deployed. Once again our old friend Heroku did not disappoint us: a single git push was enough to make the Elixir proxy available on the web.

No tech is perfect

Of course, not everything worked on the first try, and we faced some obstacles along the way. The first was a timeout that triggered after 5s, which matters because, remember, I’m waiting for my 20s webservices.

It turned out that the parameter called timeout, which I used in the HTTP client, was not at all what I expected. The correct one was recv_timeout. I suppose that once the library is more mature, its parameters will have less ambiguous names.
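As an illustration of the difference, here is a minimal sketch. It assumes the client is HTTPoison, which the article does not name. In that library, timeout bounds the time to establish the connection, while recv_timeout bounds the wait for the response. The module ProxyTimeouts and both values are hypothetical.

```elixir
defmodule ProxyTimeouts do
  # Assumption: option names as used by HTTPoison.
  # :timeout      - time allowed to establish the connection
  # :recv_timeout - time allowed to wait for the response
  # Both values are illustrative, in milliseconds.
  @connect_timeout_ms 8_000
  @recv_timeout_ms 30_000

  # Builds the options for a request to a slow webservice.
  def request_options do
    [timeout: @connect_timeout_ms, recv_timeout: @recv_timeout_ms]
  end
end

IO.inspect(ProxyTimeouts.request_options())

# With HTTPoison as a dependency, the options would go in the last argument:
# HTTPoison.get(url, headers, ProxyTimeouts.request_options())
```

Raising only timeout leaves the response wait at its default, which would explain a request that gives up after 5s even though the connection itself was established quickly.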

But debatable parameter names are not the only snag when you play with the latest cool tech. The proxy worked perfectly on my machine and was battle-tested with unit tests, so I was pretty confident when I deployed its first version. I sent it the first few requests and got the answers from the proxied webservice, but no report was persisted to my database.

I won’t go into detail, but to sum it up, it’s like a function returning normally, but with half of the work missing. Since it happened only in the remote environment, every attempt to tweak a parameter or comment out a portion of code required a new deploy. As handy as Heroku is, this added up to hours of debugging.

I finally found a fix, which consisted in creating the report before setting the response of my connection. No part of the documentation mentioned that setting the connection response could have such an effect, nor that this effect would happen only in production.

I asked the Elixir community for an explanation of my weird experience, but nothing has come of it yet.

One last bug was hard to fix. After a great deal of debugging, I found that an occasional crash was caused by a bug, still open, in the underlying Erlang library of my Elixir HTTP client. Fortunately, leveraging open source libraries means you can contribute and fix things yourself. However, here it meant contributing a fix for a very specific issue in an unknown language.

It’s worth the sweat

Without being flawless, Elixir really is an excellent choice for Rubyists stuck with concurrency performance issues. The learning curve is much gentler if you come from Ruby, and the performance improvement you can benefit from is huge.

The strategy of recoding only a small portion of our application was, in retrospect, a very good tradeoff: a small investment with a huge impact on the overall performance of the system. If you take this road, you will of course face issues similar to those we presented, but from our perspective it’s worth the sweat.