Kaminari dangerous for high page counts

Note: this is an old blog post; the problem has long since been fixed in kaminari, as noted in the kaminari issue referenced below. -jrochkind, feb 2013

kaminari is a rails plugin with a very nice API for pagination.

When migrating my app from rails2 to rails3, I replaced the under-maintained and not-as-flexible-as-desired will_paginate library with kaminari. (This was actually a change done in the Blacklight engine gem, not my local app, but same difference).

Kaminari turned out to be the culprit behind the most pathological part (but not all) of my previously reported slowdown.

It turns out kaminari’s design has significant problems displaying pagination when the total number of pages is large. I have a large number of possible hits: in the most pathological case, you could get a total hit count of 3 million. Not all of these are returned from the db (actually, from Solr), but at 10 hits per page the total number of pages is still ~300k.

The kaminari _paginator partial calls Paginator#each_page, which iterates over the entire range for this page count and creates a Kaminari PageProxy object for each page (even though only a handful of them will actually be displayed in the pagination).
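To make the cost concrete, here is a minimal sketch of the pattern described above next to a windowed alternative. This is not kaminari's actual code: PageStub, eager_pages and windowed_pages are hypothetical names invented for illustration, and the window of 4 pages on either side of the current page is an assumption.

```ruby
# Hypothetical stand-in for a per-page proxy object; not kaminari's real class.
PageStub = Struct.new(:number, :current_page, :total_pages)

# Eager pattern: build one object for every page, then keep the few
# that fall inside the display window. Returns [visible_numbers, objects_built].
def eager_pages(current_page, total_pages, window)
  built = 0
  visible = []
  (1..total_pages).each do |n|
    page = PageStub.new(n, current_page, total_pages)
    built += 1
    visible << page.number if (n - current_page).abs <= window
  end
  [visible, built]
end

# Windowed alternative: only visit page numbers that can be displayed.
def windowed_pages(current_page, total_pages, window)
  first = [current_page - window, 1].max
  last  = [current_page + window, total_pages].min
  built = 0
  visible = (first..last).map do |n|
    built += 1
    PageStub.new(n, current_page, total_pages).number
  end
  [visible, built]
end

eager_visible, eager_built = eager_pages(5, 300_000, 4)
window_visible, window_built = windowed_pages(5, 300_000, 4)
puts "same links shown: #{eager_visible == window_visible}"
puts "objects built: eager=#{eager_built}, windowed=#{window_built}"
```

Both versions produce the same visible page numbers; the difference is only in how many throwaway objects get built along the way.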

It is not surprising that this led to GC problems. Heck, even the iteration and object creation alone may be a performance problem just in terms of CPU time, even without accounting for memory and GC of all those objects.

I reported this in a Kaminari ticket. In the meantime, I guess I need to not use kaminari for my pagination.

Now, kaminari was responsible for the lion’s share of my performance problems, the truly pathological (20s response time in MRI!) results. But I’m not out of the woods yet: even with Kaminari taken out, I still have significant performance problems in MRI (~4s response times, when my Rails2 app had ~1s response times. Yes, the response times were largish even in Rails2, largely because of external calls to APIs of services necessary to create the response, where those services are not as fast as I’d like. But ~1s was acceptable, although undesirable. ~20s is obviously not, and neither is ~4s really).

Now, if I use a GC-tuned REE instead of MRI, the performance starts to approach that of the rails2 version (under MRI). That suggests to me the remaining problem is probably still related to too many objects being created, somewhere I have not yet discovered. I might end up deploying with GC-tuned REE and calling it good enough, but I worry that’s still just papering over some undiscovered underlying problem, which will just cause problems down the road if it’s not found and fixed.

Some numbers

I found and used the excellent memprof utility to get a count of objects created during a particular segment of code (in this case the call to kaminari’s #paginate helper method). Works fine on an RVM-installed 1.8 MRI or REE, just as advertised. (I still haven’t gotten far with new_relic, despite everyone recommending it. I think the free new_relic account doesn’t include the tools that would be useful for this sort of diagnostic, although the paid one might).

With a per_page of 10 and a total count of 3,095,569, kaminari indeed creates hundreds of thousands of objects in #paginate. With a Memprof.start and Memprof.stats wrapping the call to #paginate:

418328 /home/rochkind/.rvm/gems/ree-1.8.7-2011.03/gems/kaminari-0.12.4/lib/kaminari/helpers/paginator.rb:37:__varmap__
209164 /home/rochkind/.rvm/gems/ree-1.8.7-2011.03/gems/kaminari-0.12.4/lib/kaminari/helpers/paginator.rb:79:Array
209164 /home/rochkind/.rvm/gems/ree-1.8.7-2011.03/gems/kaminari-0.12.4/lib/kaminari/helpers/paginator.rb:38:Kaminari::Helpers::Paginator::PageProxy
209164 /home/rochkind/.rvm/gems/ree-1.8.7-2011.03/gems/kaminari-0.12.4/lib/kaminari/helpers/paginator.rb:38:Hash

Hundreds of thousands of objects created. No surprise that this leads to disastrous performance.
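For anyone wanting to repeat this kind of measurement on a newer Ruby, where memprof is not an option, the same wrap-the-call idea can be sketched with GC.stat. This is a swapped-in technique, not what I used above: allocations_during is a hypothetical helper name, and unlike memprof it only gives a total count, with no breakdown by file, line and class.

```ruby
# Counts objects allocated while the block runs.
# Assumption: Ruby 2.2 or later, where GC.stat(:total_allocated_objects)
# is available. On 1.8 with memprof, the equivalent pattern is
# Memprof.start before the call and Memprof.stats after it.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Example: a block that builds 1000 throwaway objects.
count = allocations_during { Array.new(1000) { Object.new } }
puts "objects allocated in block: #{count}"
```

Wrapping only the pagination helper call in the block keeps the count focused on that one segment of code, which is the same reason for starting memprof right before #paginate.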

(Not quite sure what the __varmap__ thing is, some kind of anonymous class? Also not sure why memprof logs only 209164 PageProxy objects created, noticeably fewer than the total_pages (~300k) I’d expect from reviewing the code, but at any rate it’s the same order of magnitude, and clearly problematic.)
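As a quick check on the arithmetic, here is a small sketch comparing the expected page count with what memprof logged. The expected_total_pages helper is a hypothetical name, and the ceiling division is my assumption about how total pages are derived from the total count and per_page; it does not explain the gap, it only sizes it.

```ruby
# Ceiling division: number of pages needed to hold total_count items.
def expected_total_pages(total_count, per_page)
  (total_count + per_page - 1) / per_page
end

expected = expected_total_pages(3_095_569, 10)
logged   = 209_164 # PageProxy count from the memprof output above

puts "expected pages: #{expected}"
puts "logged PageProxy objects: #{logged}"
puts "unexplained gap: #{expected - logged}"
```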
