About Erlang/OTP and Multi-core performance in particular – Kenneth Lundin


I attended an awesome talk by Kenneth Lundin about Erlang/OTP at the Erlang Factory in London. The main topic was SMP and its improvements in the latest release(s). That’s exactly one of the main reasons for using Erlang: parallelizing computations across many cores without worrying about locks in shared memory.

Some of the issues they’ve been working on:

  1. Erlang now detects the CPU topology automatically at startup.
  2. Multiple run queues
  3. You can bind schedulers to logical CPUs
  4. Improved message passing – reduced lock time

They improved more things, of course, but as far as SMP is concerned these are the most important ones.

  1. Erlang now detects the CPU topology of your system automatically at startup. You may still override this automatic setup using:
    erl +sct L0-3c0-3
  2. Multiple run queues … what does that mean? We should first take a look at how Erlang does SMP:
    • Erlang without SMP:
      Without SMP support, the Erlang VM has one scheduler and one run queue. All jobs are pushed onto that one queue and fetched by that one scheduler.
    • Erlang SMP / before R13
      Several schedulers were started, all pulling jobs from one shared queue. That sounds more parallel, but it still did not perform as well as desired on many cores.
    • Erlang SMP R13
      Several schedulers as in the former solution, but each of them has its own run queue. The problem with this approach is that you can end up with some empty and some full queues, because processes run for different lengths of time. So they built something called migration logic that controls and balances the different run queues.

    The migration logic does the following:

    • Collect statistics about the maximum length of every scheduler’s run queue
    • Set up migration paths
    • Take jobs away from fully loaded schedulers and push them onto the queues of schedulers with low load

    What happens depends on whether the system is running at full load or not. If not all schedulers are fully loaded, jobs are migrated to the schedulers with lower ids, so that some schedulers become inactive.

    This makes perfect sense, because the more schedulers and run queues you have, the more migration has to be done. Using SMP support with many schedulers only makes sense if you’re really optimizing for many cores; on systems with few cores you will see decreased performance.

  3. Binding schedulers to CPUs is really worth looking at. The more cores your CPU has, the more important it becomes and the more performance you gain. You can force the Erlang VM to bind schedulers with:
    fabrizio@machine:~$ erl +sbt db
    1> erlang:system_info(scheduler_bindings).
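To check quickly whether the binding from item 3 actually took effect, a small helper can count the bound schedulers. This is only a sketch: the module name `bind_check` and the function `bound_count/0` are made up for illustration. It relies on `erlang:system_info(scheduler_bindings)` returning a tuple with one element per scheduler, where an unbound scheduler shows up as the atom `unbound`.

```erlang
-module(bind_check).
-export([bound_count/0]).

%% Hypothetical helper: returns how many schedulers are currently bound
%% to a logical processor. Schedulers that are not bound are reported
%% as the atom 'unbound' by the runtime system.
bound_count() ->
    Bindings = tuple_to_list(erlang:system_info(scheduler_bindings)),
    length([B || B <- Bindings, B =/= unbound]).
```

Started with plain `erl`, this would typically return 0. Started with `erl +sbt db`, it should return a positive number on systems where binding is supported and the CPU topology is known.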

Benchmark - Scheduler Binding - Kenneth Lundin
Source: presentation Kenneth Lundin – Erlang-Factory
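Coming back to the migration logic from item 2: the balancing idea can be illustrated with a toy model. This is not the algorithm used inside the Erlang VM, just a sketch of "take jobs from long queues and give them to short ones". The module name `rq_balance` and the function `balance/1` are invented for this example. The input is a list of run-queue lengths, one per scheduler, ordered by scheduler id.

```erlang
-module(rq_balance).
-export([balance/1]).

%% Toy model only, NOT the VM's real migration logic.
%% Takes the current run-queue lengths (one per scheduler) and returns
%% the lengths after spreading the jobs as evenly as possible.
%% Leftover jobs go to the schedulers with the lowest ids.
balance([]) ->
    [];
balance(QueueLengths) ->
    N = length(QueueLengths),
    Total = lists:sum(QueueLengths),
    Base = Total div N,      % jobs every queue gets
    Extra = Total rem N,     % leftover jobs, one each for the first queues
    [Base + 1 || _ <- lists:seq(1, Extra)] ++
        [Base || _ <- lists:seq(1, N - Extra)].
```

For example, `rq_balance:balance([6,0,1,1]).` evens four queues out to two jobs each.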

You can test and benchmark SMP using the following flags:
fabrizio@machine:~$ erl -smp disable       # default is auto
fabrizio@machine:~$ erl +S 4:2               # Number of schedulers : schedulers online (online must not exceed schedulers)
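To have something to measure under these flags, a minimal CPU-bound benchmark could look like the sketch below. The module `smp_bench` and its functions are hypothetical names for this post, and the workload (a simple counting loop) is an arbitrary choice. It spawns a number of worker processes, waits for all of them, and returns the elapsed wall-clock time in microseconds as measured by `timer:tc/1`.

```erlang
-module(smp_bench).
-export([run/2]).

%% Hypothetical micro-benchmark: spawn Workers processes that each run a
%% CPU-bound loop of Iterations steps, wait until all have finished, and
%% return the elapsed wall-clock time in microseconds.
run(Workers, Iterations) ->
    {Micros, ok} = timer:tc(fun() -> spawn_and_wait(Workers, Iterations) end),
    Micros.

spawn_and_wait(Workers, Iterations) ->
    Parent = self(),
    Pids = [spawn_link(fun() ->
                               burn(Iterations, 0),
                               Parent ! {done, self()}
                       end)
            || _ <- lists:seq(1, Workers)],
    %% Collect exactly one 'done' message per worker.
    [receive {done, Pid} -> ok end || Pid <- Pids],
    ok.

%% Busy loop that just sums numbers; it exists only to keep a core busy.
burn(0, Acc) -> Acc;
burn(N, Acc) -> burn(N - 1, Acc + N).
```

Running `smp_bench:run(8, 50000000).` once in a VM started with `erl +S 1:1` and once with `erl +S 4:4` should give a rough feeling for how the workload scales. The numbers will of course depend on the machine.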

With erlang:system_info/1 you can use the following atoms:

# cpu_topology
# multi_scheduling
# scheduler_bind_type
# schedulers_online

The ones marked with # can also be set using erlang:system_flag/2.
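As a small sketch of how the two calls work together: the module below reads the values listed above and temporarily changes the number of schedulers online. The module name `sched_info` and both function names are made up for illustration. The only assumption about the runtime is that `erlang:system_flag(schedulers_online, N)` returns the previous value, which lets us restore it afterwards.

```erlang
-module(sched_info).
-export([report/0, with_schedulers_online/2]).

%% Returns a list of {Key, Value} pairs for the SMP-related settings.
report() ->
    [{Key, erlang:system_info(Key)}
     || Key <- [cpu_topology, multi_scheduling, scheduler_bind_type,
                schedulers, schedulers_online]].

%% Runs Fun with N schedulers online and restores the old setting
%% afterwards, even if Fun crashes.
%% N must be between 1 and erlang:system_info(schedulers).
with_schedulers_online(N, Fun) ->
    Old = erlang:system_flag(schedulers_online, N),
    try
        Fun()
    after
        erlang:system_flag(schedulers_online, Old)
    end.
```

For instance, `sched_info:with_schedulers_online(1, fun() -> erlang:system_info(schedulers_online) end).` runs the fun with a single scheduler online and then puts the previous value back.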
