I’ve been messing with ruby/rails ActiveRecord multi-threaded concurrency for a while. It’s a topic that doesn’t seem to get much attention. Perhaps this means it also doesn’t get much use — although my blog posts on it are always the most viewed entries on my blog.
Since then, there’s some good news and some bad news for multi-threaded use of ActiveRecord.
The good news is that ActiveRecord 3-2-stable branch (presumably the next 3.2.x release) is the most bug-free and performant ActiveRecord yet.
The bad news is that Rails/ActiveRecord plans to eliminate support for a particular manner of making multi-threaded use of AR, a manner that I believe is a common one that should be supported, and which I use personally anyhow. (I believe this will happen in 4.x, targeted within the next year or so?)
The Good News
ActiveRecord in the rails 3-2-stable branch, presumably to become 3.2.3 when it’s released, is the best ActiveRecord for multi-threaded use yet.
Rails 3.0 already had an ActiveRecord more robust than previously for multi-threaded concurrency. In the run-up to Rails 3.2.2 there were more commits from tenderlove and others improving things; here are some to ConnectionPool, the main relevant class.
I got my multi-threaded-use-of-ActiveRecord app up to 3.2.2, but it was still behaving in some ways that didn’t seem right to me. I decided to spend the past few weeks trying to get to the bottom of it (and yeah, it was a good portion of the past few weeks; multi-threaded coding is hard and debugging is harder, don’t let anyone tell you otherwise). And I found a couple of semi-major bugs that are now fixed in 3-2-stable and presumably will be in the next 3.2.x release, 3.2.3.
One is related to a regression beginning in 3.2.0 specifically, where `with_connection` would sometimes not properly release its connection back to the pool. (Aaron Patterson/tenderlove actually fixed the problem here and here, and possibly in another couple of subsequent related commits.) (So, yeah, 3.2.0-3.2.2 is pretty broken for concurrency; here’s hoping we’ll see a 3.2.3 soon off of 3-2-stable.)
Another was related to some deep ruby concurrency stuff, but the effect was that clients checking out connections would, in a certain race condition, end up ‘giving up’ on a connection long before the timeout. I think this problem has probably existed since at least rails 2.3; I’d been noticing issues I now attribute to this one for quite a while. (It answers this mystery I posted recently.)
Both are fixed in 3-2-stable branch, along with perhaps a few other commits related to ConnectionPool, the chief operator in ActiveRecord concurrency.
Off the 3-2-stable branch, my app using multi-threaded concurrency is now behaving how I’d expect, and the best yet. Having lots of threads using `with_connection` for brief spurts of ActiveRecord activity seems to work pretty well. (While ConnectionPool gives you three options for appropriate use, for actual multi-threaded use you really want to stick to the `with_connection` option; anything else is asking for trouble.)
Many thanks to Aaron Patterson for helping me figure out how ConnectionPool worked, tolerating my stupid questions and spurious bug reports, and helping get fixes for these legit issues committed to 3-2-stable pretty quickly, once we figured out what was going on.
It was from a comment from Aaron on the latter issue that I learned the Bad News.
The Bad News
Current rails master branch is what’s destined to be the next non-patch release of rails, which I believe will be called Rails 4.0 (no 3.3 is planned).
Code in master changes the design of ActiveRecord under multi-threaded concurrency in some significant ways, which rule out at least one class of solution using multi-threaded ActiveRecord.
I learned of this in a comment from Aaron on the second issue, or else I wouldn’t know of it either. I’m glad I know now, so I can do some ‘succession’ planning for a Rails 4 world, and so I’ll share it with you. tenderlove notes that:
In master, we assume that threads have a 1:1 relationship with connections. So if you have 100 threads that need to access the database, you should crank the connection pool up to 100. If you have more worker threads than connections, you need to use a Queue (or some other thread safe data structure) to manage communication between them.
At first I couldn’t figure out what this would mean, it so didn’t match my understanding of how to deal with multi-threading. (and then I found it a bit astounding, as you’ll see in my followup comments that perhaps didn’t belong on that issue about a specific problem). So I went to look at the actual ConnectionPool code in master (fixed-linking to version of master that exists as I write this).
Okay, what he means by “1:1 relationship” becomes clear. Previously, if a thread tried a connection #checkout and all connections were in use, #checkout would block on a mutex condition variable to wait for a connection (waiting no longer than the @timeout, eventually giving up and raising if needed).
In master, instead, there’s no ConditionVariable, no wait/signal. If a checkout is attempted and there aren’t any available connections, an exception will be immediately raised.
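To make the difference concrete, here is a toy pool in plain Ruby. It is not ActiveRecord’s actual code: `ToyPool`, its `blocking:` flag and `NoConnectionError` are names I made up for illustration, and the "connections" are just strings. With `blocking: true` a checkout waits on a ConditionVariable up to the timeout, roughly the older behavior described above; with `blocking: false` it raises as soon as the pool is empty, roughly what master does.

```ruby
require 'thread'

# Toy model only. Names and structure are hypothetical, not ActiveRecord's.
class ToyPool
  class NoConnectionError < StandardError; end

  def initialize(size:, timeout: 5, blocking: true)
    @available = Array.new(size) { |i| "conn-#{i}" } # stand-ins for real connections
    @timeout   = timeout
    @blocking  = blocking
    @mutex     = Mutex.new
    @cond      = ConditionVariable.new
  end

  def checkout
    @mutex.synchronize do
      deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + @timeout
      while @available.empty?
        # Fail-fast mode: give up immediately when nothing is free.
        raise NoConnectionError, "pool exhausted" unless @blocking

        # Blocking mode: wait for a checkin, but never past the deadline.
        # Looping re-checks the pool after every wakeup, so a waiter that
        # loses the race for a freed connection keeps waiting instead of
        # giving up early.
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        raise NoConnectionError, "timed out waiting for a connection" if remaining <= 0
        @cond.wait(@mutex, remaining)
      end
      @available.pop
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @available.push(conn)
      @cond.signal # wake one waiting thread, if any
    end
  end

  # Check out, yield, and always check back in.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end
end
```

The whole difference between the two modes is the single `raise ... unless @blocking` line plus the wait that follows it, which is about the scale of change I’m talking about.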
what does this mean you can’t do anymore?
The design this rules out? A bunch of threads that check out connections for very short periods of time. There are fewer connections available than threads, but they take em out for only short periods of time, so can share em just fine.
This is actually a pretty common design for accessing contended bottleneck resources: ‘lock’ the resource for as short a period as you can get away with, so it can be shared. It’s even a pretty common design on other platforms for database connections specifically. I believe the Tomcat JDBC Connection Pool and the Java c3p0 JDBC connection pool both use this kind of block-on-checkout-when-full strategy to support this kind of concurrent resource sharing.
yeah, rails doesn’t do it itself, but you still could before
Now, that’s not the design Rails itself uses for interacting with ActiveRecord, and isn’t really the ‘default’ easiest way to do things. If you are using Rails `config.threadsafe!` to get concurrent request handling, then you don’t get this design — you need as many connections in the pool as you could possibly have overlapping concurrent requests anyhow. This is maybe a larger flaw in ActiveRecord, that would take much more rearchitecture to fix — I think possibly DataMapper uses this design more ‘baked in’, only checking out DB connections for individual DataMapper actions that require them, and not requiring the actual client to worry about it at all (can anyone confirm or reject that? I’m not sure).
But you can (at least in 3-2-stable) use this design with AR, and it can work out pretty well — if you are using ActiveRecord in a non-Rails application (it’s a great ORM, why not?) or using Rails in odd ways (like me, not concurrent request handling but yes multi-threaded ActiveRecord), it works out pretty nice.
But not in Rails master/4. If you have 100 threads that might at some point want an ActiveRecord database connection, you need 100 connections in your pool, because the nature of multi-threaded programming is that you can’t predict in what sequence those threads may run; they might all end up wanting a connection concurrently.
I might have a program whose design requires 100 threads, and each one just needs an occasional database call here and there (that is in fact what I have, more or less). I don’t have 100 CPUs of course, but the threads are all maybe doing some intensive stuff that blocks on I/O, and I want them to give up the CPU while blocked. A multi-threaded design can be convenient. But they don’t really need 100 database connections; maybe my database can support 20 connections for this app just fine but not 100. In 3-2-stable, those 100 threads can share 20 connections no problem, so long as they’re well behaved, checking out a connection only for short periods of time. In master/rails4, you need 100 connections in your pool.
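Here is a small stand-alone simulation of that 100-threads-on-20-connections arrangement, using only Ruby’s standard library. It is a sketch, not ActiveRecord: a plain `Queue` of strings plays the role of the pool (its `pop` blocks when empty), the `sleep` calls stand in for real I/O and real queries, and all the variable names are made up.

```ruby
require 'thread'

pool_size    = 20
thread_count = 100

# A Queue of tokens acts as a blocking pool: pop waits until one is free.
pool = Queue.new
pool_size.times { |i| pool << "conn-#{i}" }

in_use = 0
peak   = 0
stats  = Mutex.new

workers = Array.new(thread_count) do
  Thread.new do
    5.times do
      sleep(rand * 0.005)   # stand-in for non-database work (I/O and so on)
      conn = pool.pop       # blocks briefly if all 20 are checked out
      begin
        stats.synchronize do
          in_use += 1
          peak = [peak, in_use].max
        end
        sleep 0.001         # stand-in for one short query
      ensure
        stats.synchronize { in_use -= 1 }
        pool << conn        # give it back right away
      end
    end
  end
end
workers.each(&:join)

puts "threads: #{thread_count}, pool: #{pool_size}, peak in use: #{peak}"
```

However the threads get scheduled, the peak can never exceed the pool size, and every thread still gets its work done; threads that find the pool empty simply wait a moment.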
This is made even more inconvenient by the current lack of a feature in ActiveRecord to ever give up connections. Connections aren’t opened until they’re needed (I think?), but once opened they are never closed on idleness or anything (at least not unless/until the underlying database closes them on a timeout on its end). Certainly this could be improved in AR, with a reaper cleaning up idle connections that haven’t been used in a while, so you can handle ‘peak’ usage without needing to hold on to that number of connections forever.
make a complicated worker pool system?
Alternately, I think what tenderlove is maybe suggesting is you can write your own thread-safe library for… I’m not sure what or how to do it. All I can think of is that you’d have a limited number of long-running threads that actually do ActiveRecord calls, and any other threads would never do ActiveRecord calls but just send directives to those long-running threads to do so. Unless there’s some trick I’m not thinking of, this would be a significant challenge to write, when the existing architecture worked just fine for me — modulo bugs in AR that finally seem to have all been taken care of. That’s part of what makes this frustrating — a feature which finally became mature, with years of attention to make it robust and performant, gets removed.
There’s a reason many other db abstraction layers’ connection pools support what rails did in 3-2-stable: it’s a solid pattern.
Disappointment and Attempted Sober Evaluation
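For what it’s worth, here is my best guess at the minimal shape of the worker-pool arrangement described above, in plain Ruby. Everything in it is hypothetical: `DbWorkers`, `run` and `shutdown` are names I invented, and each worker’s "connection" is just a string where a real version would hold a database connection for the life of the thread. Other threads never touch a connection; they hand a block to a worker over a `Queue` and wait on a reply queue.

```ruby
require 'thread'

# Hypothetical sketch: a fixed set of threads own the connections,
# everyone else sends them work over a Queue.
class DbWorkers
  def initialize(size)
    @jobs = Queue.new
    @threads = Array.new(size) do |i|
      Thread.new do
        conn = "conn-#{i}" # stand-in for a connection owned by this worker
        while (job = @jobs.pop) != :stop
          block, reply = job
          begin
            reply << [:ok, block.call(conn)]
          rescue => e
            reply << [:error, e] # ship the exception back to the caller
          end
        end
      end
    end
  end

  # Called from any thread. Blocks until a worker has run the block,
  # then returns its result or re-raises its exception.
  def run(&block)
    reply = Queue.new
    @jobs << [block, reply]
    status, value = reply.pop
    raise value if status == :error
    value
  end

  def shutdown
    @threads.size.times { @jobs << :stop }
    @threads.each(&:join)
  end
end

workers = DbWorkers.new(2)
callers = Array.new(6) do |i|
  Thread.new { workers.run { |conn| "job #{i} ran on #{conn}" } }
end
callers.each { |t| puts t.value }
workers.shutdown
```

Even this toy leaves out timeouts, transactions that span several calls, and returning model objects safely across threads, which is where I’d expect the real difficulty to be.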
There’s been a lot of hating on Rails lately, that I won’t bother finding and linking to. I’ve generally been a defender of Rails with regard to that stuff. What Rails is trying to do as a framework is hard, and Rails ain’t perfect, but it constantly gets better. Yeah, Rails in 3.x has made some steps that add complexity and learning curve, which is unfortunate. But I saw exactly why those steps were made and what the corresponding benefits were; the benefits seemed reasonable, and there was no obvious way to get them without the downsides. I understood what was going on, and the trade-offs seemed reasonable.
This is the first time I’ve seen a decision I simply do not understand, which seems just entirely wrong-headed to me, and it’s certainly frustrating. Why does rails master/4 remove the cooperative connection sharing behavior from the pool? I’m not entirely sure. The removal of this functionality from ConnectionPool takes out a couple dozen lines at most. Perhaps just because writing correct multi-threaded code is hard, and rails committers want to get out of the business. But it’s a business you need to be in if you want to support multi-threaded code.
Perhaps nobody really needs real multi-threaded support anyhow. Maybe Rails should just get out of the business. (Will `config.threadsafe!` stick around?) I do think this change makes AR less of a serious contender in the field of ORMs for non-Rails usage, which is too bad; I actually really like AR.
These days, fibers is the new ruby hotness. ActiveRecord doesn’t do much of a job of supporting cooperative connection sharing among fibers in 3-2-stable or master; it’d certainly be nice if it did; maybe backing away from supporting multi-thread is necessary to support multi-fiber instead without over-complicating the code to support both. I dunno. (At the moment, even if AR did support cooperative connection sharing among fibers better, in my analysis using fibers with rails is still not for the faint of heart, but presumably it will continue to get better supported and easier.)
Maybe it was the right choice. But it’s sure a pain for me. I’ve spent a lot of time learning ActiveRecord. I actually rather like ActiveRecord (esp in Rails 3.x, it’s pretty sweet). I don’t want to give it up and learn something else. In particular, I’ve spent a whole lot of time on and off for the past few months (mostly ‘on’ the past couple weeks) debugging concurrency issues in rails 3.2 — if I had known I was just remodelling a house scheduled for demolition, I would have perhaps spent my time preparing to move to a new house instead.
Alternate Options for the future?
Okay, so you, like me, may find your architecture no longer viable in rails4. (Alternately, I may be the only one on the planet currently actually trying to do concurrent use of rails, ha). If so, what are our options?
- Convince rails core team to reconsider. Seems like @tenderlove is the decider here and he’s dead-set against it, not going to happen.
- Switch from ActiveRecord to DataMapper or Sequel. Both are relatively mature (these days) products. Both can be used with Rails, although using a non-official component like this means you run the risk of future dependency conflict issues and such. Both claim to support multi-threaded use just fine, although experience shows that multi-threaded is hard, and only experience will show. There are various ways for an ORM-layer to ‘support multi-threading’, and I can’t find documentation on exactly the strategy either of these products is using, which worries me. But they’re both definitely worth investigating.
- Re-architect your app to not use multiple threads. In many cases, there’s a better solution. Multi-threaded programming is tricky. In my own case, each approach had various pros and cons, but I was still thinking that on the whole, the cost-benefit for a multi-threaded approach was still best. (I could have been wrong.) These developments may change my analysis of course.
- Provide a ‘third party’ gem which customizes ActiveRecord to restore the missing functionality. It’s not entirely clear to me how feasible/maintainable this would be. I think AR would need some documented/supported hooks to use your own ConnectionPool implementation or strategy within ConnectionPool, as suggested by @tenderlove and @drogus in the issue where @tenderlove veto’d continuing the 3-2-stable behavior. I wouldn’t want to try and monkey patch in the customization without documented support; AR ConnectionPool historically changes things up enough to make that a dependency/maintenance nightmare.
- Writing this post and thinking about it more, I may have come up with a way to write your own very light wrapper on top of AR ConnectionPool, providing its own `MyThing.with_connection` method that manages N-to-M connections and then calls the underlying ConnectionPool.with_connection only after access is granted. Have to think about that more. But man, AR used to do it for me!
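A rough sketch of that last idea, in plain Ruby. `GatedPool` is a name I made up; it wraps any object that responds to `with_connection` and uses a `SizedQueue` as a counting semaphore, so callers block at the gate instead of hitting an exhausted pool. `FailFastPool` is only a stand-in so the example runs without a database; with real ActiveRecord you would presumably pass in `ActiveRecord::Base.connection_pool` instead, and I have not verified how that behaves against master.

```ruby
require 'thread'

# Hypothetical wrapper: at most `limit` callers are let through to the
# underlying pool at once; the rest wait here.
class GatedPool
  def initialize(pool, limit)
    @pool    = pool
    @permits = SizedQueue.new(limit) # push blocks once `limit` permits are taken
  end

  def with_connection(&block)
    @permits.push(:permit)           # wait for a free slot
    begin
      @pool.with_connection(&block)  # access granted, now use the real pool
    ensure
      @permits.pop                   # free the slot for the next waiter
    end
  end
end

# Stand-in for a pool that raises when empty instead of waiting.
class FailFastPool
  def initialize(size)
    @conns = Array.new(size) { |i| "conn-#{i}" }
    @lock  = Mutex.new
  end

  def with_connection
    conn = @lock.synchronize { @conns.pop }
    raise "no connection available" if conn.nil?
    begin
      yield conn
    ensure
      @lock.synchronize { @conns.push(conn) }
    end
  end
end

gated = GatedPool.new(FailFastPool.new(2), 2)
threads = Array.new(10) do |i|
  Thread.new do
    gated.with_connection do |conn|
      sleep 0.01                     # stand-in for a short query
      "thread #{i} used #{conn}"
    end
  end
end
threads.each { |t| puts t.value }
```

This version waits forever at the gate; a real one would want a timeout there, and the gate limit has to be no larger than the underlying pool’s size for the trick to work.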
A Brief History of ActiveRecord Concurrency
As I recall it from my experience. I spot-checked a couple of things in source repos, but I may have some details or exact timelines wrong; I haven’t fact-checked myself everywhere.
At some hypothetical point, rails/activerecord may not have worked multi-threaded at all, maybe in rails 1.0, but that’s before I started messing with it.
Rails 2.1-2.2, thread-safe AR (but not ActionPack), but not pretty
When I first came in was around rails 2.1 and 2.2. At this point multi-threaded use was supported, sort of; at least it was thread-safe, if you used some magic config incantations anyway. Each thread would wind up with its own brand new connection to the database; there was no pooling. So you’d wind up with a potentially unbounded number of connections to the database, depending on the number of concurrent threads. It wasn’t entirely easy (or well-documented) to figure out how to close these connections, and yet they’d never be re-used by any other thread, so if you were creating your own threads it was easy to get into a situation of an ever-increasing number of open network connections to the db. If you were not creating threads yourself, but just using Rails, it would probably take care of that for you, but probably still open up a brand new connection to the db for every request that came in.
It was around this time, in 2007, I wrote my first blog post on threading with ActiveRecord, explaining the trials and tribulations in rails 2.1/2.2. It’s historically my most viewed blog post, and it continues to get a steady stream of hits actually, so that suggests that at least there is (or was) interest in multi-threading with Rails/ActiveRecord. (Looking back now, I even have a comment thanking me for my post from Jose Valim, who went on to eventually be a rails committer but at that time was not ‘famous’, I think?!)
Around that time there was a lot of attention in the ruby and rails community to threading issues — I think because jruby was finally starting to mature, a ruby implementation which could use java’s superior threading. Rails committer Michael Koziarski even applied some patches to problems I mentioned in my blog post. Someone (I forget who) emailed me asking if they could use my blog post in a conference presentation they made at some ruby conf about the state of concurrency in rails.
Rails 2.3: Thread-safe request handling and a connection pool
And meanwhile behind the scenes people were working on making rails better for multi-threaded concurrency. Rails 2.1/2.2 supported a certain kind of thread-safe access to ActiveRecord. But it didn’t support concurrent request handling, which was where a lot of people’s interest was for whatever reason. (For my own use case, not so important, but that was the hot thing to talk about at the time, I believe mostly with the idea of jruby and java deployment).
Rails 2.3 took a big step there. Concurrent request handling was supported for the first time. And instead of Rails 2.1/2.2’s ‘brand new network connection to the db for every thread’ model, we saw the first introduction of the ConnectionPool, which kept a pool of network-connected connections, and let individual threads check them in or out, using much the same strategy and methods that we still see in the ConnectionPool in rails 3-2-stable.
And there was much excitement and posts: hey, really, Rails really can do things concurrently now. And then people saying but what about the GIL. And other people saying, yeah, but jruby. Here’s a good summary. And just a few people saying, even with the GIL, thread-safety and threading can be useful to have a thread blocked on I/O give up the CPU until it comes back (including me!). And then we realized that the ruby mysql adapter broke this. Along with a buncha other C extensions. And then people fixed that, phew. And then we got MRI 1.9 that at least did OS threads, although still with the GIL.
Ruby and Rails both have had such a checkered history with supporting multi-threaded concurrency, that we still get periodic “No, really, guys, ruby and/or rails can do multi-threaded stuff” posts.
The thing with threading in general is that, just like there’s no one answer to “is it thread safe” (depends on what you want to do and how), there’s no one answer to “are threads okay in ruby?” (again, depends on what you want to do and how). But I submit that for a large class of things (wanting to context switch on I/O block, mainly, which is in fact often quite useful; don’t forget that threads were invented in programming languages before there even were multi-core machines!), even MRI 1.9 is pretty decent. But you’ve got to know what you’re doing.
And for a large class of things you might want to do with threads, Rails 3.x, especially 3-2-stable with bugfixes (but not 3.2.0-3.2.2 with their significant bugs), is just fine too. But yeah, you’ve got to have a clue what you’re doing.
rails future (4/current master)
A class of things with a significant chunk cut out in rails 4/master. I don’t understand why rails would want to take a step backwards, but c’est la vie. Rails has a reputation for ‘not taking threading seriously’ (says Yehuda), which I think is not entirely deserved in modern rails with `mysql2` etc. But this step makes me think it’s at least partially deserved. Maybe rails4 will do something nifty with fiber-based concurrency, I dunno.