Ruby threads, gotcha with local vars and shared state

I end up doing a fair amount of work with multi-threading in ruby. (There is some multi-threaded concurrency in Umlaut, bento_search, and traject.) Contrary to some belief, multi-threaded concurrency can be useful even in MRI ruby (which can’t do true parallelism due to the GIL) for tasks that spend a lot of time waiting on I/O. That is the purpose in Umlaut and bento_search, which in both cases are waiting on external HTTP APIs. Traject uses multi-threaded concurrency for true parallelism in jruby (or soon rbx) for high performance.

There’s a gotcha with ruby threads that I haven’t seen covered much. What do you think this code will output from the ‘puts’?

value = 'original'

t = Thread.new do
  sleep 1
  puts value
end

value = 'changed'

t.join

It outputs “changed”. The local var `value` is shared between both threads, so changes made in the primary thread affect the value of `value` in the created thread too. This issue is not unique to threads; it is a result of how closures work in ruby. The local variables used in a closure don’t capture the fixed value at the time of closure creation, they refer to the original local variables themselves. (I’m not entirely sure if this is traditional for closures, or if some other languages do it differently, or the correct CS terminology for talking about this stuff.) It confuses people in other contexts too, but can especially lead to problems with threads.
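To see that this really is about closures and not threads, here is a minimal sketch using plain lambdas with no threads involved. The method name `closure_demo` and the strings are made up for illustration.

```ruby
# A closure holds on to the variable itself, not a snapshot of its value.
def closure_demo
  value = 'original'
  reader = lambda { value }   # closes over the local variable `value`
  value = 'changed'           # reassigned after the lambda was created
  seen_after_reassign = reader.call

  # The link works in the other direction too: the closure can
  # reassign the outer local.
  writer = lambda { value = 'set inside closure' }
  writer.call

  [seen_after_reassign, value]
end

p closure_demo # prints ["changed", "set inside closure"]
```

A thread block is the same kind of closure, it just happens to run at some later, unpredictable time.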

Consider a loop which in each iteration prepares some work to be done, then dispatches to a thread to actually do the work.  We’ll do a very simple fake version of that, watch:

threads = []
i = 0
10.times do
  # pretend to prepare a 'work order', which ends up in local
  # var i
  i += 1
  # now do some stuff with 'i' in the thread
  threads << Thread.new do
    sleep 1 # pretend this is a time consuming computation
    # now we do something else with our work order...
    puts i
  end
end

threads.each {|t| t.join}

Do you think you’ll get “1”, “2”, … “10” printed out? You won’t. You’ll get 10 10’s. (With newlines in random places because of interleaving of ‘puts’, but that’s not what we’re talking about here.) You thought you dispatched 10 threads each with different values for ‘i’, but the threads are actually all sharing the same ‘i’. When it changes, it changes for all of them.

Oops.

Ruby stdlib Thread.new has a mechanism to deal with this, although like much in ruby stdlib (and much about multi-threaded concurrency in ruby), it’s under-documented. But you can pass args to Thread.new, which will be passed to the block too, and allow you to avoid this local var linkage:

require 'thread'

value = 'original'

t = Thread.new(value) do |t_value|
  sleep 1
  puts t_value
end

value = 'changed'

t.join

Now that prints out “original”. That’s the point of passing one or more args to Thread.new.
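Applied to the earlier loop, the same technique might look like the sketch below. The method name `run_work_orders` and the `results` queue are my own additions so the outcome can be collected and checked; they are not part of the original example.

```ruby
require 'thread'

# Dispatch `count` threads, handing each one its own work order
# through Thread.new's arguments instead of the shared local `i`.
def run_work_orders(count)
  results = Queue.new # thread-safe collection for the output
  threads = []
  i = 0
  count.times do
    i += 1
    # `i` is evaluated here, in the dispatching thread, and handed
    # to the block as its own block-local `work_order`.
    threads << Thread.new(i) do |work_order|
      sleep 0.01 # pretend this is a time consuming computation
      results << work_order
    end
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }.sort
end

p run_work_orders(10) # prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

One caveat: this protects against the variable being reassigned, not against the object it points to being mutated in place. The thread still gets a reference to the same object, not a copy.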

You might think you could get away with this instead:

require 'thread'

value = 'original'

t = Thread.new do
  # nope, not a safe way to capture the value, there's
  # still a race condition
  t_value = value
  sleep 1
  puts t_value
end

value = 'changed'

t.join

While that will seem to work for this particular example, there’s still a race condition there: the value could change before the first line of the thread block is executed. Part of dealing with concurrency is giving up any expectations of what gets executed when, until you wait on a `join`.
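The race is hard to hit on purpose, but it can be made deterministic for demonstration. In this sketch a `Queue` acts as a gate that holds the thread at its first line until the main thread has already changed the variable, standing in for the scheduler simply starting the thread late. The method name and the gate are illustration only.

```ruby
require 'thread'

def copy_inside_block_demo
  value = 'original'
  gate = Queue.new

  t = Thread.new do
    gate.pop         # blocks here, simulating a thread that starts late
    t_value = value  # the "safe" copy is taken too late
    t_value
  end

  value = 'changed'  # main thread reassigns before the copy happens
  gate << :go        # now let the thread proceed

  t.value            # joins the thread and returns the block's result
end

puts copy_inside_block_demo # prints "changed"
```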

So, yeah, the arguments to Thread.new. Which other libraries involving threading sometimes propagate. With a concurrent-ruby ThreadPoolExecutor:

require 'concurrent'

work = 'original'
pool = Concurrent::FixedThreadPool.new(5)
pool.post(work) do |t_work|
  sleep 1
  puts t_work # is safe
end

work = 'new'

pool.shutdown
pool.wait_for_termination

And it can even be a problem with Futures from concurrent-ruby. Futures seem so simple and idiot-proof, right? Oops.

require 'concurrent'

value = 100

future = Concurrent::Future.execute do
  sleep 1
  # DANGER will robinson!
  value + 1
end

value = 200

puts future.value # you get 201, not 101!

I’m honestly not even sure how you get around this problem with Concurrent::Future. Unlike Concurrent::ThreadPoolExecutor, it does not seem to copy stdlib Thread.new in its method of being able to pass block arguments. There might be something I’m missing (or a way to use Futures that avoids this problem?), or maybe the authors of concurrent-ruby haven’t considered it yet either? I’ve asked the question of them. (PS: The concurrent-ruby package is super awesome. It’s still building to 1.0 but usable now, and I am hoping that its existence will do great things for practical use of multi-threaded concurrency in the ruby community.)
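One approach that should not depend on any library support is to create the closure inside a method, so that the block closes over the method’s own parameter rather than the caller’s local. The sketch below uses a plain Thread so it runs without the gem. I’d expect the same scoping to apply to a block handed to Concurrent::Future.execute, since this is just how method scope works in ruby, but treat that as an assumption I haven’t tested against the gem. The method names are made up.

```ruby
# The block closes over the parameter `value`, which lives in this
# method's scope. The caller has no way to reassign it.
def start_increment(value)
  Thread.new do
    sleep 0.01 # pretend this is a time consuming computation
    value + 1
  end
end

def scoped_capture_demo
  value = 100
  worker = start_increment(value)
  value = 200   # reassigns the caller's local only
  worker.value  # joins the thread and returns the block's result
end

p scoped_capture_demo # prints 101
```

As with Thread.new arguments, this only guards against reassignment. If the object passed in is mutated in place, the thread will still see the mutation.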

This is, for me, one of the biggest, most dangerous, most confusing gotchas with ruby concurrency. It can easily lead to hard-to-notice, hard-to-reproduce, and hard-to-debug race condition bugs.
