Cleaning up the Rails backtrace cleaner; Or, The Engine Stays in the Picture!

Rails has for a while included a BacktraceCleaner that removes some lines from backtraces, and reformats others to be more readable.

(There’s an ActiveSupport::BacktraceCleaner, although the one in your app by default is actually a subclass of it, Rails::BacktraceCleaner, which sets some defaults. That’s a somewhat odd way to implement Rails defaults on an AS::BacktraceCleaner, but oh well.)

This is pretty crucial, especially since recent versions of Rails can have pretty HUGE call stacks, due to reliance on Rack middleware and other architectural choices.

I rely on clean stack traces in the standard Rails dev-mode error page and in my log files of fatal uncaught exceptions — but also in some log files I write myself, where I catch and recover from an exception but want to log where it came from anyway, ideally with a clean stacktrace: `Rails.backtrace_cleaner.clean(exception.backtrace)`

A few problems I had with it though:

  • Several of my apps are based on a kind of ‘one big Rails engine’ (Blacklight, Umlaut). The default cleaner will strip out any lines that aren’t part of the local app, but I really want to leave the ‘main engine’ lines in. That was my main motivation to look into this, but as long as I was at it, a couple other inconveniences…
  • The default cleaner nicely reformats lines from gems to remove the filepath to the gem dir and replace it with just the name of the gem. But this didn’t seem to work for gems listed in Bundler as :path (or, I think, :github?) that don’t live in the standard gem repo. And that ‘main engine gem’ would often be checked out thus, especially in development.
  • Stack trace lines that come from ERB templates include a dynamically generated internal method name, which is really long and makes the stack trace confusing — the line number in the ERB file is really all we need. (At first I thought the Rails ‘render template pattern filter’ was meant to deal with that, but I think it’s meant for something else)
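To make the ERB problem concrete, here’s roughly what such a line looks like and what we want done to it (the autogenerated method name below is made up for illustration; the real one varies):

```ruby
# A raw backtrace line from an ERB template includes ERB's autogenerated
# internal method name (this exact name is hypothetical):
raw = "app/views/things/show.html.erb:12:in `__app_views_things_show_html_erb___1234_5678'"

# All we actually need is the template path and line number:
cleaned = raw.sub(/(\.erb:\d+):in `__.*$/, "\\1")
# => "app/views/things/show.html.erb:12"
```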

Fortunately, you can remove and/or add your own silencers (which remove lines from the stack trace) and filters (which reformat stack trace lines) on the ActiveSupport/Rails::BacktraceCleaner.

Here’s what I’ve done to make it the way I want. I wanted it built directly into Umlaut (a Rails Engine), so this is written to go in Umlaut’s `< Rails::Engine` class. But you could do something similar in a local app, probably in the `initializers/backtrace_silencers.rb` file that Rails has already left as a stub for you.

Note that all filters are executed before silencers, so your silencer has to be prepared to recognize already-filtered input.
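To see why that ordering matters, here’s a simplified pure-Ruby sketch of how the cleaner applies filters and silencers (this is not the actual ActiveSupport implementation, and the paths are made up):

```ruby
# Filters run first, transforming each line; silencers then decide which
# (already filtered!) lines to drop. Paths here are hypothetical.
filters   = [->(line) { line.sub(%r{\A/gems/umlaut-1\.0/}, "umlaut ") }]
silencers = [->(line) { !line.start_with?("app/", "umlaut ") }]

backtrace = [
  "/gems/umlaut-1.0/app/controllers/foo_controller.rb:10",
  "/gems/rack-1.5/lib/rack/something.rb:99",
  "app/models/bar.rb:5",
]

cleaned = backtrace
  .map    { |line| filters.reduce(line) { |l, f| f.call(l) } }
  .reject { |line| silencers.any? { |s| s.call(line) } }
# => ["umlaut app/controllers/foo_controller.rb:10", "app/models/bar.rb:5"]
```

Note that a silencer keyed only on the raw engine path may never match, because the filter has already rewritten those lines; that’s why the silencer in the real code below checks for both the raw path and the filtered form.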

module Umlaut
  class Engine < Rails::Engine
    engine_name "umlaut"

    #...

    initializer "#{engine_name}.backtrace_cleaner" do |app|
      engine_root_regex = Regexp.escape(self.root.to_s + File::SEPARATOR)

      # Clean those ERB lines, we don't need the internal autogenerated
      # ERB method, what we do need (line number in ERB file) is already there
      Rails.backtrace_cleaner.add_filter do |line|
        line.sub(/(\.erb:\d+):in `__.*$/, "\\1")
      end

      # Remove our own engine's path prefix, even if it's
      # being used from a local path rather than the gem directory.
      Rails.backtrace_cleaner.add_filter do |line|
        line.sub(/^#{engine_root_regex}/, "#{engine_name} ")
      end

      # Keep Umlaut's own stacktrace in the backtrace -- we have to remove Rails
      # silencers and re-add them how we want.
      Rails.backtrace_cleaner.remove_silencers!

      # Silence what Rails silenced, UNLESS it looks like
      # it's from Umlaut engine
      Rails.backtrace_cleaner.add_silencer do |line|
        (line !~ Rails::BacktraceCleaner::APP_DIRS_PATTERN) &&
        (line !~ /^#{engine_root_regex}/  ) &&
        (line !~ /^#{engine_name} /)
      end
    end

    #...
  end
end


Cardo is a really nice free webfont

Some of the fonts on google web fonts aren’t that great. And I’m not that good at picking the good ones from the not-so-good ones on first glance either.

Cardo is a really nice old-style serif font that I originally found recommended on some list of “the best of google fonts”.

It’s got a pretty good character repertoire for latin text (and I think Greek). The Google Fonts version doesn’t seem to include Hebrew, even though some other versions might. For library applications, the more characters the better, and it should have enough to deal stylishly with whatever letters and diacritics you throw at it in latin/germanic languages, and all the usual symbols (currency, punctuation, etc.).

I’ve used it in a project that my eyeballs have spent a lot of time looking at (not quite done yet), and I’ve been increasingly pleased by it; it’s nice to look at and to read, especially on a ‘retina’ display. (I wouldn’t use it for headlines though.)


Defeating IE forced ‘compatibility mode’

We recently deployed a new version of our catalog front end (Rails, Blacklight), which is based on Bootstrap 3 CSS.

Bootstrap3 supports IE10 fine, IE9 mostly, and IE8 partially. IE8 has no media queries out of the box, so columns will be collapsed to the single-column small-screen versions of Bootstrap3’s mobile-first CSS — although you can use the third-party respond.js to bring media queries to IE8. We tested IE8 with respond.js, and everything seemed fine.

IE7, according to Bootstrap, “should look and behave well enough… though not officially supported.” We weren’t aware of any on-campus units that still had IE7 installed (although we certainly can’t say with certainty there aren’t any), and in general we decided that IE7 was old enough that we were comfortable no longer supporting it (especially if the alternative was essentially not upgrading to the latest version of Blacklight).

I did do some limited testing with IE7, and found that our Bootstrap3-based app definitely, as expected, fell back to a single column view on all monitor sizes (IE7 lacks media queries).   In a limited skim, all functionality did seem available, although some screen areas on some pages could look pretty jumbled and messy.

Meanwhile, however, Bootstrap also says that “Bootstrap is not supported in the old Internet Explorer compatibility modes.”

What we did not anticipate is that some units in our large and heterogeneous academic/medical organization(s) use not only a fairly old version of IE (we were able to convince them to upgrade from IE8 to IE9, but no further) — but one configured by group policy to use ‘compatibility mode’ for all websites. IE9 would have been great — but ‘compatibility mode’ not so much.

They reported that the upgraded catalog was unusable in their browsers.

The bootstrap web site recommend adding a meta tag to your pages to “be sure you’re using the latest rendering mode for IE”:

<!-- note: not what we ended up doing or recommend -->
<meta http-equiv="X-UA-Compatible" content="IE=edge">

However, we didn’t have much luck getting this to work. Google research suggested it probably would have worked if it were placed immediately after the opening <head> tag (and not in a conditional comment), to make sure IE encounters it before its ‘rendering mode’ is otherwise fixed. But this seemed fragile and easy for us to accidentally break with future development, especially when there’s no good way to have an automated test ensuring it’s working, and we don’t have access to an IE configured exactly like theirs to test ourselves either.

What did work was sending that as an actual HTTP header: “X-UA-Compatible: IE=edge,chrome=1”

In a Rails4 app, this can be easily configured in your config/application.rb:

config.action_dispatch.default_headers.merge!(
  'X-UA-Compatible' => 'IE=edge,chrome=1'
)

After adding this header, affected users reported that the catalog site was displaying manageably again. 

Also, I discovered that I could mimic the forced compatibility mode, at least to some extent, in my own IE11, by clicking on the settings sprocket icon, choosing “Compatibility View Settings”, and then adding our top-level domain to “Websites you’ve added to Compatibility View” (only top-level domains are accepted there). This did successfully force our catalog to be displayed in horrible compatibility mode — but only until we added that header. I can’t say this is identical to an IE9 set by group policy to display all websites in compatibility mode, but in this case it seemed to behave equivalently.

I think, with enough work on our CSS, we could have made the site display in an ugly but workable single-column layout even in IE8 with compatibility mode. It wasn’t doing that initially; many areas of pages were entirely missing. But it probably would have been quite a bit of work, and with this simple alternative the site displays much better than we could ever have achieved with that approach.



UIUC and Academic Freedom

Professor Steven Salaita was offered a job at the University of Illinois in Urbana-Champaign (UIUC), as associate professor of American Indian Studies, in October 2013. He resigned his previous position at Virginia Tech, and his partner also made arrangements to move with him. 

On August 1 2014, less than a month before classes were to begin, the UIUC Chancellor rescinded the offer, due to angry posts he had made on Twitter about Israel’s attack on Gaza. 

This situation seems to me a pretty clear assault on academic freedom. I don’t think UIUC or its chancellor dispute these basic facts — Chancellor Wise’s letter and the Board of Trustees statement of support for the Chancellor claim that “The decision regarding Prof. Salaita was not influenced in any way by his positions on the conflict in the Middle East nor his criticism of Israel,” but they are somewhat less direct about the grounds on which ‘the decision’ was made, implying that Salaita’s tweets constituted “personal and disrespectful words or actions that demean and abuse either viewpoints themselves or those who express them,” and that this is good cause to rescind a job offer (that is, effectively fire a professor). (Incidentally, Salaita has a proven history of excellence in classroom instruction, including respect for diverse student opinions.)

[I have questions about what constitutes "demeaning and abusing viewpoints themselves", and generally thought that "demeaning viewpoints themselves", although never one's academic peers personally, was a standard and accepted part of scholarly discourse. But anyway.]

I’ve looked through Salaita’s tweets, and am not actually sure which ones are supposed to be the ones justifying effective dismissal.   I’m not sure Chancellor Wise or the trustees are either.  The website Inside Higher Ed made an open records request and received emails indicating that pressure from U of I funders motivated the decision — there are emails from major donors and university development (fund-raising) administrators pressuring the Chancellor to get rid of Salaita. 

This raises academic freedom issues not only in relation to firing a professor because of his political beliefs; but also issues of faculty governance and autonomy, when an administrator rescinds a job offer enthusiastically made by an academic department because of pressure from funders. 

I’ve made no secret of my support for Palestinian human rights, and an end to the Israeli occupation and apartheid system.  However, I stop to consider whether I would have the same reaction if a hypothetical professor had made the same sorts of tweets about the Ukraine/Russia conflict (partisan to either side), or tweeting anti-Palestinian content about Gaza instead. I am confident I would be just as alarmed about an assault on academic freedom. However, the fact that it’s hard to imagine funders exerting concerted pressure because of a professor’s opinions on Ukraine — or a professor’s anti-Palestinian opinions — is telling about the political context here, and I think indicates that this really is about Salaita’s “positions on the conflict in the Middle East and his criticism of Israel.”

So lots of academics are upset about this. So many that I suspected, when this story first developed, the UIUC would clearly have to back down, but instead they dug in further. The American Association of University Professors (AAUP) has expressed serious concern about violations of Salaita’s academic freedom — and the academic freedom of the faculty members who selected him for hire. The AAUP also notes that they have “long objected to using criteria of civility and collegiality in faculty evaluation,” in part just because of how easy it is to use those criteria as a cover for suppression of political dissent. 

The Chronicle of Higher Ed, in a good article covering the controversy, reports that “Thousands of scholars in a variety of disciplines signed petitions pledging to avoid the campus unless it reversed its decision to rescind the job offer,” and some have already carried through on their pledge of boycott, including David J. Blacker, director of the Legal Studies Program and a professor of Philosophy at the University of Delaware, who cancelled an appearance in a prestigious lecture series. The UIUC Education Justice project cancelled a conference due to the boycott. The executive council of the Modern Language Association has sent a letter to UIUC urging them to reconsider.

This isn’t a partisan issue. Instead, it’s illustrative of the increasingly corporatized academy, where administrative decisions in deference to donor preferences or objections take precedence over academic freedom or faculty decisions about their own departmental hiring and other scholarly matters.  Also, the way the university was willing to rescind a job offer due to political speech after Salaita had resigned his previous position, reminds us of the general precarity of junior faculty careers, and the lack of respect and dignity faculty receive from university administration.  

A variety of disciplinary-specific open letters and boycott pledges have been started in support of Salaita.

I think librarians have a special professional responsibility to stand up for academic freedom.  

Dr. Sarah T. Roberts, a UIUC LIS alumnus and professor of Media Studies at Western University in Ontario, hosts a pledge in support of Salaita from LIS practitioners, students and scholars, with a boycott pledge to “not engage with the University of Illinois at Urbana-Champaign, including visiting the campus, providing workshops, attending conferences, delivering talks or lectures, offering services, or co-sponsoring events of any kind.”  

I’ve signed the letter, and I encourage you to consider doing so as well. I know I see at least one other signer I know from the Code4Lib community already.   I think it is important for librarians to take action to stand up for academic freedom. 


Colombian student faces jail time for sharing scholarly thesis

Colombia’s copyright laws were strengthened in 2006, basically at U.S. demand as part of a free trade agreement.

As a result, according to the Nature News Blog, Diego Gómez Hoyos, a Colombian student, faces jail time for posting someone else’s thesis on Scribd.

In the U.S., of course, ‘grey’ sharing of copyrighted scholarly work without permission is fairly routine. We call it ‘grey’ only because everyone does it, and so far publishers in the U.S. have shown little inclination to stop it when it’s being done amongst scholars on a one-by-one basis — not because it’s legal in the U.S. If you search Google (or Google Scholar) for recent scholarly publications, you can quite frequently find ‘grey’ publicly accessible copies on the public internet, including on Scribd.

What is done routinely by scholars in the U.S. and ignored gets you a trial and possible jail time in Colombia — because of laws passed to satisfy the U.S. in ‘free trade’ agreements. This case may start going around the facebooks as “copyright out of control”, and it is that, but it’s also about how neo-colonialism is alive and well, how what’s good for the metropole isn’t good for the periphery, and how ‘free trade’ agreements are never about equality.

http://blogs.nature.com/news/2014/08/student-may-be-jailed-for-posting-scientists-thesis-on-web.html

Student may be jailed for posting scientist’s thesis on web
Posted on behalf of Michele Catanzaro

 

A Colombian biology student is facing up to 8 years in jail and a fine for sharing a thesis by another scientist on a social network.

 

Diego Gómez Hoyos posted the 2006 work, about amphibian taxonomy, on Scribd in 2011. An undergraduate at the time, he had hoped that it would help fellow students with their fieldwork. But two years later, in 2013, he was notified that the author of the thesis was suing him for violating copyright laws. His case has now been taken up by the Karisma Foundation, a human rights organization in Bogotá, which has launched a campaign called “Sharing is not a crime”.

 

[...]

 

Gómez says that he deleted the thesis from the social network as soon as he was notified of the legal proceedings. But the case against him is rolling on, with the most recent hearing taking place in Bogotá in May. He faces between 4 and 8 years in jail if found guilty. The next hearing will be in September.

 

The student, who is currently studying for a master’s degree in conservation of protected areas at the National University of Costa Rica in Heredia, refuses to reveal who is suing him. He says he does not want to “put pressure on this person”. “My lawyer has tried unsuccessfully to establish contacts with the complainant: I am open to negotiate and get to an agreement to move this issue out of the criminal trial,” he told Nature.

 

The case has left Gómez feeling disappointed. “I thought people did biology for passion, not for making money,” he says. “Now other scientists are much more circumspect [about sharing publications].”

 


Google Scholar Alerts notifies me of a citation to me

So I still vainly subscribe to a Google Scholar Alert on my name, although the service doesn’t work too well these days.

Today (after returning from summer vacation), I found an alert in my inbox to Googlization of Libraries, edited by William Miller and Rita Pellen.

Except, oddly, the Google Books version wasn’t searchable, so I couldn’t find where my name was mentioned. (But clearly Google has/had the text at some point, to generate the alert for me!)

But the Amazon copy was searchable. Amazon doesn’t let you copy and paste from books, but I’ll retype. 

Of course, some aspects of this comparison do not fit. For example, it is unlikely that the existence of Google Scholar is going to “dumb down” research (It might, however, make possible the distribution of less reputable research, unfinished manuscripts, etc. Scholars like Jonathan Rochkind have explored this concept. [32]).

From Standing on the Shoulders of Libraries by Charlie Potter in, Googlization of Libraries, edited by William Miller and Rita Pellen, Routledge 2009. Page 18. 

I don’t actually recall exploring that concept.  Let’s see what the cite is…  doh, the page of citations for that chapter isn’t included in the Amazon preview. Let’s see Google… afraid not, Google wouldn’t show me the page either. 

I wonder how many scholars are doing research like this, from the freely available previews on Google/Amazon, giving up when they run up against the wall.

Maybe I’ll ILL the book, Amazon search says I’m cited a few more times in other chapters, although it won’t show me them. 


ActiveRecord Concurrency in Rails4: Avoid leaked connections!

My past long posts about multi-threaded concurrency in Rails ActiveRecord are some of the most visited posts on this blog, so I guess I’ll add another one here. If you’re a “tl;dr” type, you should probably bail now; but past long posts have proven useful to people over the long term, so here it is.

I’m in the middle of updating my app, which uses multi-threaded concurrency in unusual ways, to Rails4. The good news is that the significant bugs I ran into in Rails 3.1 etc., reported in the earlier post, have been fixed.

However, the ActiveRecord concurrency model has always made it too easy to accidentally leak orphaned connections, and in Rails4 there’s no good way to recover these leaked connections. Later in this post, I’ll give you a monkey patch to ActiveRecord that will make it much harder to accidentally leak connections.

Background: The ActiveRecord Concurrency Model

Is pretty much described in the header docs for ConnectionPool, and the fundamental architecture and contract hasn’t changed since Rails 2.2.

Rails keeps a ConnectionPool of individual connections (usually network connections) to the database. Each connection can only be used by one thread at a time, and needs to be checked out and then checked back in when done.

You can check out a connection explicitly using the `checkout` and `checkin` methods. Or, better yet, use the `with_connection` method to wrap database use. So far so good.
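The checkout/checkin contract can be modeled in a few lines of plain Ruby (a toy stand-in, not the real ConnectionPool class) to show why `with_connection` is the safer pattern:

```ruby
require "thread"

# Toy model of the ConnectionPool contract (NOT the real ActiveRecord
# class): a fixed set of connections, each usable by one thread at a time.
class ToyPool
  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  def checkout
    @available.pop(true) # non-blocking; raises ThreadError if exhausted
  end

  def checkin(conn)
    @available << conn
  end

  # The with_connection pattern: checkin happens in an ensure block, so
  # the connection goes back even if the block raises -- it can't leak.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end
end

pool = ToyPool.new(1)
pool.with_connection { |conn| "pretend database work with #{conn}" }
pool.checkout # => "conn-0" -- the connection came back
```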

But ActiveRecord also supports an automatic/implicit checkout. If a thread performs an ActiveRecord operation, and that thread doesn’t already have a connection checked out to it (ActiveRecord keeps track of whether a thread has a checked out connection in Thread.current), then a connection will be silently, automatically, implicitly checked out to it. It still needs to be checked back in.

And you can call `ActiveRecord::Base.clear_active_connections!`, and all connections checked out to the calling thread will be checked back in. (Why might there be more than one connection checked out to the calling thread? Mostly only if you have more than one database in use, with some models in one database and others in others.)

And that’s what ordinary Rails use does, which is why you haven’t had to worry about connection checkouts before.  A Rails action method begins with no connections checked out to it; if and only if the action actually tries to do some ActiveRecord stuff, does a connection get lazily checked out to the thread.

And after the request has been processed and the response delivered, Rails itself will call `ActiveRecord::Base.clear_active_connections!` inside the thread that handled the request, checking back in any connections that were checked out.

The danger of leaked connections

So, if you are doing “normal” Rails things, you don’t need to worry about connection checkout/checkin. (modulo any bugs in AR).

But if you create your own threads to use ActiveRecord (inside or outside a Rails app, doesn’t matter), you absolutely do. If you proceed blithely to use AR like you’re used to in Rails, but have created Threads yourself — then connections will be automatically checked out to you when needed… and never checked back in.

The best thing to do in your own threads is to wrap all AR use in a `with_connection`. But if some code somewhere accidentally does an AR operation outside of a `with_connection`, a connection will get checked out and never checked back in.

And if the thread then dies, the connection will become orphaned or leaked, and in fact there is no way in Rails4 to recover it.  If you leak one connection like this, that’s one less connection available in the ConnectionPool.  If you leak all the connections in the ConnectionPool, then there’s no more connections available, and next time anyone tries to use ActiveRecord, it’ll wait as long as the checkout_timeout (default 5 seconds; you can set it in your database.yml to something else) trying to get a connection, and then it’ll give up and throw a ConnectionTimeout. No more database access for you.
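For reference, the pool size and `checkout_timeout` both live in `database.yml`; something like the following, with the adapter and values purely illustrative:

```yaml
# config/database.yml (sketch; adapter and values are just examples)
production:
  adapter: postgresql
  database: myapp_production
  pool: 5               # number of connections in the ConnectionPool
  checkout_timeout: 5   # seconds to wait for a free connection
```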

In Rails 3.x, there was a method `clear_stale_cached_connections!`, that would  go through the list of all checked out connections, cross-reference it against the list of all active threads, and if there were any checked out connections that were associated with a Thread that didn’t exist anymore, they’d be reclaimed.   You could call this method from time to time yourself to try and clean up after yourself.

And in fact, if you tried to check out a connection, and no connections were available — Rails 3.2 would call clear_stale_cached_connections! itself to see if there were any leaked connections that could be reclaimed, before raising a ConnectionTimeout. So if you were leaking connections all over the place, you still might not notice, the ConnectionPool would clean em up for you.

But this was a pretty expensive operation, and in Rails4, not only does the ConnectionPool not do this for you, but the method isn’t even available to you to call manually.  As far as I can tell, there is no way using public ActiveRecord API to clean up a leaked connection; once it’s leaked it’s gone.

So this makes it pretty important to avoid leaking connections.

(Note: There is still a method `clear_stale_cached_connections!` in Rails4, but it’s been redefined in a way that doesn’t do the same thing at all, and does not do anything useful for leaked connection cleanup. That it uses the same method name is, I think, based on a misunderstanding by Rails devs of what it’s doing. See Fear the Reaper below.)

Monkey-patch AR to avoid leaked connections

I understand where Rails is coming from with the ‘implicit checkout’ thing.  For standard Rails use, they want to avoid checking out a connection for a request action if the action isn’t going to use AR at all. But they don’t want the developer to have to explicitly check out a connection, they want it to happen automatically. (In no previous version of Rails, back from when AR didn’t do concurrency right at all in Rails 1.0 and Rails 2.0-2.1, has the developer had to manually check out a connection in a standard Rails action method).

So, okay, it lazily checks out a connection only when code tries to do an ActiveRecord operation, and then Rails checks it back in for you when the request processing is done.

The problem is, for any more general-purpose usage where you are managing your own threads, this is just a mess waiting to happen. It’s way too easy for code to ‘accidentally’ check out a connection, that never gets checked back in, gets leaked, with no API available anymore to even recover the leaked connections. It’s way too error prone.

That API contract of “implicitly checkout a connection when needed without you realizing it, but you’re still responsible for checking it back in” is actually kind of insane. If we’re doing our own `Thread.new` and using ActiveRecord in it, we really want to disable that entirely, and so code is forced to do an explicit `with_connection` (or `checkout`, but `with_connection` is a really good idea).

So, here, in a gist, is a couple-dozen-line monkey patch to ActiveRecord that lets you, on a thread-by-thread basis, disable the “implicit checkout”. Apply this monkey patch (just throw it in a config/initializer, that works), and if you’re ever manually creating a thread that might (even accidentally) use ActiveRecord, the first thing you should do is:

Thread.new do
  ActiveRecord::Base.forbid_implicit_checkout_for_thread!

  # stuff
end

Once you’ve called `forbid_implicit_checkout_for_thread!` in a thread, that thread will be forbidden from doing an ‘implicit’ checkout.

If any code in that thread tries to do an ActiveRecord operation outside a `with_connection` without a checked out connection, instead of implicitly checking out a connection, you’ll get an ActiveRecord::ImplicitConnectionForbiddenError raised — immediately, fail fast, at the point the code wrongly ended up trying an implicit checkout.

This way you can enforce that your code only uses `with_connection` like it should.
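The core mechanism is simple enough to model without ActiveRecord at all: a thread-local flag consulted by the lazy checkout path. This is a toy sketch of the idea, not the actual gist or AR internals:

```ruby
# Toy model of "forbid implicit checkout" (NOT ActiveRecord itself):
# the lazy checkout consults a per-thread flag and fails fast if set.
class ImplicitCheckoutForbidden < StandardError; end

def forbid_implicit_checkout!
  Thread.current[:forbid_implicit_checkout] = true
end

def current_connection
  Thread.current[:connection] ||= begin
    if Thread.current[:forbid_implicit_checkout]
      # fail fast, right where code wrongly relied on an implicit checkout
      raise ImplicitCheckoutForbidden
    end
    "implicitly-checked-out-connection"
  end
end

t = Thread.new do
  forbid_implicit_checkout!
  begin
    current_connection
    :no_error
  rescue ImplicitCheckoutForbidden
    :failed_fast
  end
end
t.value # => :failed_fast
```

Other threads that haven’t set the flag still get the lazy checkout behavior, which is what lets you opt in thread by thread.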

Note: This code is not battle-tested yet, but it seems to be working for me with `with_connection`. I have not tried it with explicitly checking out a connection with ‘checkout’, because I don’t entirely understand how that works.

DO fear the Reaper

In Rails4, the ConnectionPool has an under-documented thing called the “Reaper”, which might appear to be related to reclaiming leaked connections.  In fact, what public documentation there is says: “the Reaper, which attempts to find and close dead connections, which can occur if a programmer forgets to close a connection at the end of a thread or a thread dies unexpectedly. (Default nil, which means don’t run the Reaper).”

The problem is, as far as I can tell by reading the code, it simply does not do this.

What does the reaper do?  As far as I can tell trying to follow the code, it mostly looks for connections which have actually dropped their network connection to the database.

A leaked connection hasn’t necessarily dropped its network connection. That really depends on the database and its settings — most databases will drop unused connections after a certain idle timeout, often hours long by default. A leaked connection probably hasn’t yet had its network connection closed, and a properly checked-out, not-leaked connection can have its network connection closed (say, there’s been a network interruption or error, or a very short idle timeout on the database).

The Reaper actually, if I’m reading the code right, has nothing to do with leaked connections at all. It’s targeting a completely different problem (dropped network connections, not connections that were checked out and never checked back in). Dropped network is a legit problem you want handled gracefully; I have no idea how well the Reaper handles it (the Reaper is off by default, I don’t know how much use it’s gotten, and I haven’t put it through its paces myself). But it’s got nothing to do with leaked connections.

Someone thought it did, they wrote documentation suggesting that, and they redefined `clear_stale_cached_connections!` to use it. But I think they were mistaken. (Did not succeed at convincing @tenderlove of this when I tried a couple years ago when the code was just in unreleased master; but I also didn’t have a PR to offer, and I’m not sure what the PR should be; if anyone else wants to try, feel free!)

So, yeah, Rails4 has redefined the existing `clear_stale_cached_connections!` method to do something entirely different than it did in Rails3; it’s triggered in entirely different circumstances. Yeah, kind of confusing.

Oh, maybe fear ruby 1.9.3 too

When I was working on upgrading the app I’m working on, I was occasionally getting a mysterious deadlock exception:

ThreadError: deadlock; recursive locking:

In retrospect, I think I had some bugs in my code and wouldn’t have run into that if my code had been behaving well. However, the fact that my errors resulted in that exception rather than a more meaningful one may possibly have been a bug in ruby 1.9.3 that’s fixed in ruby 2.0.

If you’re doing concurrency stuff, it seems wise to use ruby 2.0 or 2.1.

Can you use an already loaded AR model without a connection?

Let’s say you’ve already fetched an AR model in. Can a thread then use it, read-only, without ever trying to `save`, without needing a connection checkout?

Well, sort of. You might think, oh yeah, what if I follow a not yet loaded association, that’ll require a trip to the db, and thus a checked out connection, right? Yep, right.

Okay, what if you pre-load all the associations, then are you good? In Rails 3.2, I did this, and it seemed to be good.

But in Rails4, it seems that even though an association has been pre-loaded, the first time you access it, some under-the-hood things need an ActiveRecord Connection object. I don’t think it’ll end up taking a trip to the db (it has been pre-loaded after all), but it needs the connection object. Only the first time you access it. Which means it’ll check one out implicitly if you’re not careful. (Debugging this is actually what led me to the forbid_implicit_checkout stuff again).

Didn’t bother trying to report that as a bug, because AR doesn’t really make any guarantees that you can do anything at all with an AR model without a checked out connection; it doesn’t really consider the question one way or another.

Safest thing to do is simply don’t touch an ActiveRecord model without a checked out connection. You never know what AR is going to do under the hood, and it may change from version to version.

Concurrency Patterns to Avoid in ActiveRecord?

Rails has officially supported multi-threaded request handling for years, but in Rails4 that support is turned on by default — although there still won’t actually be multi-threaded request handling going on unless you have an app server that does that (Puma, Passenger Enterprise, maybe something else).

So I’m not sure how many people are using multi-threaded request dispatch to find edge case bugs; still, it’s fairly high profile these days, and I think it’s probably fairly reliable.

If you are actually creating your own ActiveRecord-using threads manually though (whether in a Rails app or not; say in a background task system), from prior conversations @tenderlove’s preferred use case seemed to be creating a fixed number of threads in a thread pool, making sure the ConnectionPool has enough connections for all the threads, and letting each thread permanently check out and keep a connection.

I think you’re probably fairly safe doing that too, and it’s the way background task pools are often set up.
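That pattern (a fixed set of threads, each holding one connection for its whole life) can be sketched with just the ruby stdlib; here a plain `Queue` stands in for ActiveRecord’s ConnectionPool, and all the names are illustrative, not AR’s real API:

```ruby
# Toy sketch of "fixed thread pool, each thread permanently checks out a
# connection". A Queue pre-filled with tokens stands in for AR's
# ConnectionPool; in real code you'd size the pool in database.yml so
# pool >= number of threads.
POOL_SIZE = 3
pool = Queue.new
POOL_SIZE.times { |i| pool << "conn-#{i}" }

results = Queue.new

threads = POOL_SIZE.times.map do
  Thread.new do
    conn = pool.pop  # check out once, keep it for the thread's lifetime
    begin
      5.times { results << "#{conn} did a unit of work" }
    ensure
      pool << conn   # check back in only when the thread is finished
    end
  end
end
threads.each(&:join)

results.size  # => 15
```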

That’s not what my app does. I wouldn’t necessarily design my app the same way today if I was starting from scratch (the app was originally written for Rails 1.0, which gives you a sense of how old some of its design choices are; although the concurrency-related stuff really only dates from the relatively recent Rails 2.1 (!)).

My app creates a variable number of threads, each of which is doing something different (using a plugin system). The things it’s doing generally involve HTTP interactions with remote APIs, which is why I wanted to do them in concurrent threads (huge wall time speedup even with the GIL, yep). The threads do need to occasionally do ActiveRecord operations to look at input or store their output (I tried to avoid concurrency headaches by making all inter-thread communication go through the database; this is not a low-latency-requirement situation; I’m not sure how much headache I’ve actually avoided though!)

So I’ve got an indeterminate number of threads coming into and going out of existence, each of which needs only occasional ActiveRecord access. Theoretically, AR’s concurrency contract can handle this fine, just wrap all the AR access in a `with_connection`.  But this is definitely not the sort of concurrency use case AR is designed for and happy about. I’ve definitely spent a lot of time dealing with AR bugs (hopefully no longer!), and just parts of AR’s concurrency design that are less than optimal for my (theoretically supported) use case.
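The actual call is `ActiveRecord::Base.connection_pool.with_connection { ... }`. Its checkout/checkin contract (connection held only for the duration of the block, always returned, even on error) can be illustrated with a toy pool; nothing below is AR’s real internals, just the shape of the contract:

```ruby
# Toy pool illustrating with_connection's contract: check a connection out
# for the duration of a block, and always check it back in, even if the
# block raises.
class ToyPool
  def initialize(size)
    @queue = Queue.new
    size.times { |i| @queue << "conn-#{i}" }
  end

  def with_connection
    conn = @queue.pop      # blocks if every connection is checked out
    yield conn
  ensure
    @queue << conn if conn # returned even if the block raised
  end
end

pool = ToyPool.new(2)
used = Queue.new
threads = 10.times.map do
  Thread.new { pool.with_connection { |conn| used << conn } }
end
threads.each(&:join)

used.size  # => 10
```

Ten short-lived threads sharing two connections works fine here; the pain comes when checkouts are held too long or leak, which is the kind of thing the real ConnectionPool has to police.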

I’ve made it work. And it probably works better in Rails4 than any time previously (although I haven’t load tested my app yet under real conditions, upgrade still in progress). But, at this point,  I’d recommend avoiding using ActiveRecord concurrency this way.

What to do?

What would I do if I had it to do over again? Well, I don’t think I’d change my basic concurrency setup — lots of short-lived threads still makes a lot of sense to me for a workload like I’ve got, of highly diverse jobs that all do a lot of HTTP I/O.

At first, I was thinking “I wouldn’t use ActiveRecord, I’d use something else with a better concurrency story for me.”  DataMapper and Sequel have entirely different concurrency architectures; while they use similar connection pools, they try to spare you from having to know about it (at the cost of lots of expensive under-the-hood synchronization).

Except if I had actually acted on that when I thought about it a couple years ago, when DataMapper was the new hotness, I probably would have switched to or used DataMapper, and now I’d be stuck with a large unmaintained dependency. And be really regretting it. (And yeah, at one point I was this close to switching to Mongo instead of an rdbms, also happy I never got around to doing it).

I don’t think there is or is likely to be a ruby ORM as powerful, maintained, and likely to continue to be maintained throughout the life of your project, as ActiveRecord. (although I do hear good things about Sequel).  I think ActiveRecord is the safe bet — at least if your app is actually a Rails app.

So what would I do different? I’d try to have my worker threads not actually use AR at all. Instead of passing in an AR model as input, I’d fetch the AR model in some other, safer main thread, convert it to a pure business object without any AR, and pass that to my worker threads. Instead of having my worker threads write their output out directly using AR, I’d have a dedicated thread pool of ‘writers’ (each of which held onto an AR connection for its entire lifetime), and have the indeterminate number of worker threads pass their output through a threadsafe queue to that dedicated threadpool of writers.
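A stdlib-only sketch of that design, with all names hypothetical: worker threads never touch AR; a bounded `SizedQueue` carries plain result objects to a small fixed pool of writer threads, each of which, in real code, would hold an AR connection for its lifetime:

```ruby
# Workers push plain result structs onto a bounded queue; a fixed pool of
# writer threads drains it. The SizedQueue gives backpressure if writers
# fall behind. Here `written` stands in for rows saved via ActiveRecord.
Result = Struct.new(:job_id, :payload)

queue   = SizedQueue.new(100)
written = Queue.new

WRITER_COUNT = 2
writers = WRITER_COUNT.times.map do
  Thread.new do
    # in real code: check out an AR connection here and keep it
    while (result = queue.pop)    # nil is the shutdown sentinel
      written << result           # in real code: save via ActiveRecord
    end
  end
end

workers = 5.times.map do |i|
  Thread.new do
    3.times { |j| queue << Result.new(i, "output-#{j}") }
  end
end
workers.each(&:join)

WRITER_COUNT.times { queue << nil }  # one sentinel per writer
writers.each(&:join)

written.size  # => 15
```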

That would have seemed like huge over-engineering to me at some point in the past, but at the moment it’s sounding like just the right amount of engineering, if it lets me avoid using ActiveRecord in concurrency patterns that, while officially supported, it isn’t very happy about.

Posted in General | Leave a comment

SAGE retracts 60 papers in “peer review citation ring”

A good reminder that a critical approach to scholarly literature doesn’t end with “Beall’s list”, and maybe doesn’t even begin there. I still think academic libraries/librarians should consider it part of their mission to teach students (and faculty) about current issues in the trustworthiness of scholarly literature, and to approach ‘peer review’ critically.

http://www.uk.sagepub.com/aboutus/press/2014/jul/7.htm

London, UK (08  July 2014) – SAGE announces the retraction of 60 articles implicated in a peer review and citation ring at the Journal of Vibration and Control (JVC). The full extent of the peer review ring has been uncovered following a 14 month SAGE-led investigation, and centres on the strongly suspected misconduct of Peter Chen, formerly of National Pingtung University of Education, Taiwan (NPUE) and possibly other authors at this institution.

In 2013 the then Editor-in-Chief of JVC, Professor Ali H. Nayfeh, and SAGE became aware of a potential peer review ring involving assumed and fabricated identities used to manipulate the online submission system SAGE Track powered by ScholarOne Manuscripts™. Immediate action was taken to prevent JVC from being exploited further, and a complex investigation throughout 2013 and 2014 was undertaken with the full cooperation of Professor Nayfeh and subsequently NPUE.

In total 60 articles have been retracted from JVC after evidence led to at least one author or reviewer being implicated in the peer review ring. Now that the investigation is complete, and the authors have been notified of the findings, we are in a position to make this statement.

Some more summary from retractionwatch.com, which notes this isn’t the first time fake identities have been fraudulently used in peer review.


Botnet-like attack on EZProxy server

So once last week, and then once again this week, I got reports that our EZProxy server was timing out.

When it happened this week, I managed to investigate while the problem was still occurring, and noticed that the EZProxy process on the server was taking ~100% of available CPU, maxing it out. As normally the EZProxy process doesn’t get above 10 or 20% of CPU in `top`, even during our peak times, something was up.

Looking at the EZProxy logs, I noticed a very high volume of requests logged with the “%r” LogFormat placeholder as eg:

"GET http://proxy1.library.jhu.edu:80http://ib.adnxs.com/ttj?id=3018854&size=728x90&cb=[CACHEBUSTER]&referrer=[REFERRER_URL]&pubclick=[INSERT_CLICK_TAG] HTTP/1.0"

ib.adnxs.com seems to be related to serving ads; and these requests were coming from hundreds of different IP’s.  So first guess is that this is some kind of bot-net trying to register clicks on web ads for profit. (And that guess remains my assumption).

I was still confused about exactly what that logged request meant — two URLs jammed together like that, what kind of request are these clients actually making?

Eventually OCLC EZProxy support was able to clarify that this is what’s logged when a client tries to make a standard HTTP Proxy request to EZProxy, as if EZProxy were a standard HTTP Proxy server. Ie,

curl --proxy proxy1.library.jhu.edu:80 http://ib.adnxs.co/ttj...

Now, EZProxy isn’t a standard HTTP Proxy server, so does nothing with this kind of request.  My guess is that some human or automated process noticed a DNS hostname involving the word ‘proxy’, and figured it was worth a try to sic a bot army on it. But it’s not accomplishing what it wanted to accomplish, this ain’t an open HTTP proxy, or even a standard HTTP proxy at all.

But, the sheer volume of them was causing problems. Apparently it takes EZProxy enough logic just to determine that it can do nothing with such a request that the volume of these requests was driving EZProxy to 100% CPU utilization, even though it would do nothing with them.

It’s not such a large volume of traffic that it overwhelms the OS network stack or anything; if I block all the IP addresses involved in EZProxy config with `RejectIP`, then everything’s fine again, CPU utilization is back to <10%.  It’s just EZProxy the app that is having trouble dealing with all these.

So first, I filed a feature/bug request with OCLC/EZProxy, asking for EZProxy to be fixed/improved so that if something makes a standard HTTP Proxy request against it, it ignores it in a less CPU-intensive way, and can thus shrug off a higher volume of these requests.

Secondly, our local central university IT network security team thinks they may have the tools to block these requests at the network perimeter, before they even reach our server. There is no legitimate reason for a standard HTTP Proxy request to our EZProxy server, and nothing useful EZProxy can do with one, so any request that looks like one can safely be blocked.

If all this fails, I may need to write a cronjob script which regularly scans the EZProxy logs for lines that look like standard HTTP Proxy requests, notes the IP’s, and then automatically adds them to an EZProxy config file with `RejectIP` (restarting EZProxy to take effect). This is a pain, would have some delay before banning abusive clients (you don’t want to go restarting EZProxy every 60 seconds or anything), and would possibly end up banning legitimate users (who are infected by malware, but would stay banned even after they got rid of the malware; or who got confused and accidentally configured EZProxy as an HTTP Proxy in their web browser, but again would stay banned even after they fixed it).

I guess another alternative would be putting EZProxy behind an apache or nginx reverse proxy, so we could write rules in the front-end web server to filter out these requests before they make it to EZProxy.

Or having a log scanning cronjob, which actually blocks bad ip’s with the OS iptables (perhaps using the ‘fail2ban’ script), rather than with EZProxy `RejectIP` config (thus avoiding need for an EZProxy restart when adding blocked IP’s).
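A sketch of what the log-scanning part might look like. The log format here is hypothetical (adjust the regex to your actual LogFormat); the double-URL signature is the one shown in the logged request above:

```ruby
# Pull client IPs out of log lines whose request target is two URLs jammed
# together, the signature of a standard HTTP Proxy request hitting EZProxy.
# Assumes an Apache-ish log line starting with the client IP (hypothetical).
PROXY_STYLE_REQUEST = %r{"GET https?://\S+?:\d+https?://}

def proxy_abuser_ips(log_lines)
  log_lines.grep(PROXY_STYLE_REQUEST)
           .map { |line| line[/\A(\d{1,3}(?:\.\d{1,3}){3})/, 1] }
           .compact
           .uniq
end

lines = [
  '192.0.2.10 - - "GET http://proxy1.example.edu:80http://ib.adnxs.com/ttj HTTP/1.0"',
  '198.51.100.7 - - "GET http://proxy1.example.edu/login?url=http://example.com HTTP/1.0"',
]
proxy_abuser_ips(lines)  # => ["192.0.2.10"]
```

Note the second line, a normal EZProxy login URL that also contains two URLs, doesn’t match, because only proxy-style requests have `:port` jammed directly against a second scheme. The resulting IP’s could then be written out as `RejectIP` lines, or handed off to something like fail2ban.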

But the best solution would be for EZProxy itself to be fixed so it doesn’t take excessive CPU under a high volume of standard HTTP Proxy requests, and simply ignores them cheaply. I have no idea how likely it is for OCLC to fix EZProxy like this.

Ascii-ization transliteration built into ruby I18n gem in Rails

A while ago, I was looking for a way in ruby to turn text with diacritics (é) and ligatures (Æ) and other such things into straight ascii (e ; AE).

I found there were various gems that said they could do such things, but they all had problems. In part, because the ‘right’ way to do this… is really unclear in the general case; there are all sorts of edge cases, and locale-dependent choices, and the giant universe of unicode to deal with. In fact, it looks like at one point the Unicode/CLDR suite included such an algorithm, but it kind of looks like it’s been abandoned and is no longer supported, with no notes as to why, but I suspect the problem proved intractable. (Some unicode libraries currently support it anyway; part of Solr actually does in one place; communication about these things seems to travel slowly).
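To see one reason it’s hard: the obvious stdlib-only attempt is to NFD-decompose and strip combining marks, which handles precomposed diacritics but does nothing at all for ligatures (this is my own illustration of the problem, not any gem’s approach):

```ruby
# Naive ascii-folding: decompose to NFD, then strip combining marks (\p{Mn}).
# Handles "é" but not "Æ": a ligature is its own character, not a base
# letter plus a diacritic, so there's nothing to strip.
def naive_fold(str)
  str.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

naive_fold("é")  # => "e"
naive_fold("Æ")  # => "Æ"
```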

For what I was working on before, I realized that “transliterating to ascii” wasn’t the right solution after all — instead, what I wanted was the Unicode Collation Algorithm, which you can use to produce a collation string, such that for instance “é” will transform to the same collation string as “e”, and “Æ” to the same collation string as “AE” — but that collation string isn’t meant to be user-displayable, and it won’t necessarily actually be “e” or “AE”. It can still be used for sorting or comparing in a “down-sampled to ascii” invariant way. And, like most of the Unicode suite, it’s pretty well-thought-through and robust to many edge cases.

For that particular case of sorting or comparing in a “down-sampled to ascii invariant way”, you want to create a Unicode collation sort key, for :en locale, with “maximum level” set to 1. And it works swimmingly.  In ruby, you can do that with the awesome twitter_cldr gem — I contributed a patch to support maximum_level, which I think has made it into the latest version.

Anyway, after that lengthy preface explaining why you probably don’t really want to “transliterate to ascii” exactly, and it’s doomed to be imperfect and incomplete…

…I recently noticed that the ruby i18n gem, as used in Rails, actually has a transliterate-to-ascii feature built in, with some support for localization of transliteration rules that I don’t entirely understand. But anyhow, if I ever want this function in the future — knowing it’s going to be imperfect and incomplete — I’d use the one from I18n, rather than go hunting for it in some probably less maintained gem.

I guess you might want to do this for creating ‘slugs’ in URL paths, because non-ascii in URLs ends up being such a mess… It would probably work well enough for an app which really is mostly English, but if you’re dealing heavily in non-ascii and especially non-roman text, it’s going to get more complicated than this fast. Anyway.

 I18n.transliterate("Ærøskøbing")
 # => "AEroskobing"

 # When it can't handle it, you get ? marks. 
 I18n.transliterate("日本語")
 # => "???"

Still haven’t figured out: How to get the ruby irb/pry/debugger console on my OSX workstation to let me input UTF8, which would make playing out stuff like this and figuring stuff out a lot easier!  Last time I tried to figure it out, I got lost in many layers of yak shaving involving homebrew, readline libraries, rebuilding ruby from source… and eventually gave up.  I am curious if every ruby developer on OSX has this problem, or if I’ve somehow wound up unique.
