“Is the semantic web still a thing?”

A post on Hacker News asks:

A few years ago, it seemed as if everyone was talking about the semantic web as the next big thing. What happened? Are there still startups working in that space? Are people still interested?

Note that “linked data” refers to basically the same technologies as “semantic web”; it’s essentially the new branding for “semantic web,” with some minor changes in focus.

The top-rated comment in the discussion says, in part:

A bit of background, I’ve been working in environments next to, and sometimes with, large scale Semantic Graph projects for much of my career — I usually try to avoid working near a semantic graph program due to my long histories of poor outcomes with them.

I’ve seen uncountably large chunks of money put into KM projects that go absolutely nowhere and I’ve come to understand and appreciate many of the foundational problems the field continues to suffer from. Despite a long period of time, progress in solving these fundamental problems seem hopelessly delayed.

The semantic web as originally proposed (Berners-Lee, Hendler, Lassila) is as dead as last year’s roadkill, though there are plenty out there that pretend that’s not the case. There’s still plenty of groups trying to revive the original idea, or like most things in the KM field, they’ve simply changed the definition to encompass something else that looks like it might work instead.

The reasons are complex but it basically boils down to: going through all the effort of putting semantic markup with no guarantee of a payoff for yourself was a stupid idea.

The entire comment, and really the entire thread, are worth a read. There seems to be a lot of energy in libraryland behind trying to produce “linked data,” and I think it’s important to pay attention to what’s going on in the larger world here.

Especially because much of the stated motivation for library “linked data” seems to have been: “Because that’s where non-library information management technology is headed, and for once let’s do what everyone else is doing and not create our own library-specific standards.” It turns out that may not be the case. If your motivation for library linked data was “so we can be like everyone else,” that motivation may simply be inaccurate: everyone else doesn’t seem to be heading there in the way people hoped a few years ago.

On the other hand, some of the reasons that semantic web/linked data have not caught on are commercial and have to do with business models.

One of the reasons that whole thing died was that existing business models simply couldn’t be reworked to make it make sense. If I’m running an ad driven site about Cat Breeds, simply giving you all my information in an easy to parse machine readable form so your site on General Pet Breeds can exist and make money is not something I’m particularly inclined to do. You’ll notice now that even some of the most permissive sites are rate limited through their API and almost all require some kind of API key authentication scheme to even get access to the data.

Libraries and other civic organizations, without business models predicated on competition, may be a better fit for implementing semantic web technologies. And the sorts of data that libraries deal with (bibliographic and scholarly) may be better suited to semantic data than general commercial business data is. It may be that at the moment libraries, cultural heritage, and civic organizations make up the majority of entities exploring linked data.

Still, the coarsely stated conclusion of that top-rated HN comment is worth repeating:

going through all the effort of putting semantic markup with no guarantee of a payoff for yourself was a stupid idea.

Putting data into linked data form simply because we’ve been told that “everyone is doing it,” without carefully understanding the use cases such reformatting is supposed to benefit and making sure that it does, risks incurring great expense for no payoff. Especially when everyone is not in fact doing it.

GIGO

Taking the same data you already have and reformatting it as “linked data” does not necessarily add much value. If it was poorly controlled, poorly modelled, or incomplete data before, it still is in RDF. You can potentially add a lot more value, and more additional uses of your data, by improving the data quality than by working to reformat it as linked data/RDF. The idea that simply reformatting it as RDF would add significant value was predicated on an ecology of software and services built to use linked data: software and services exciting enough that making your data available to them would result in added value. That ecology has not really materialized, and it’s hardly clear that it will. (To the extent it does, it may only be because libraries and cultural heritage organizations create it; we are unlikely to get a free ride on more general tools from a wider community.)

But please do share your data

To be clear, I still highly advocate taking the data you do have and making it freely available under open (or public domain) license terms. In whatever formats you’ve already got it in.  If your data is valuable, developers will find a way to use it, and simply making the data you’ve already got available is much less expensive than trying to reformat it as linked data.  And you can find out if anyone is interested in it. If nobody’s interested in your data as it is — I think it’s unlikely the amount of interest will be significantly greater after you model it as ‘linked data’. The ecology simply hasn’t arisen to make using linked data any easier or more valuable than using anything else (in many contexts and cases, it’s more troublesome and challenging than less abstract formats, in fact).

Following the bandwagon vs doing the work

Part of the problem is that modelling data is inherently a context-specific act. There is no universally applicable model — and I’m talking here about the ontological level of entities and relationships, what objects you represent in your data as distinct entities and how they are related. Whether you model it as RDF or just as custom XML, the way you model the world may or may not be useful or even usable by those in different contexts, domains and businesses.  See “Schemas aren’t neutral” in the short essay by Cory Doctorow linked to from that HN comment.  But some of the linked data promise is premised on the idea that your data will be both useful and integrate-able nearly universally with data from other contexts and domains.

These are not insoluble problems, they are interesting problems, and they are problems that libraries as professional information organizations rightly should be interested in working on. Semantic web/linked data technologies may very well play a role in the solutions (although it’s hardly clear that they are THE answer).

It’s great for libraries to be interested in working on these problems. But working on these problems means working on these problems, it means spending resources on investigation and R&D and staff with the right expertise and portfolio. It does not mean blindly following the linked data bandwagon because you (erroneously) believe it’s already been judged as the right way to go by people outside of (and with the implication ‘smarter than’) libraries. It has not been.

For individual linked data projects, it means being clear about what specific benefits they are supposed to bring to use cases you care about — short and long term — and what other outside dependencies may be necessary to make those benefits happen, and focusing on those too.  It means understanding all your technical options and considering them in a cost/benefit/risk analysis, rather than automatically assuming RDF/semantic web/linked data and as much of it as possible.

It means being aware of the costs and the hoped-for benefits, and making wise decisions about how best to allocate resources to maximize the chances of achieving those benefits. Blindly throwing resources into taking your same old data and sharing it as “linked data,” because you’ve heard it’s the thing to do, does not in fact help.


Google Scholar is 10 years old

An article by Steven Levy about the guy who founded the service, and its history:

Making the world’s problem solvers 10% more efficient: Ten years after a Google engineer empowered researchers with Scholar, he can’t bear to leave it

“Information had very strong geographical boundaries,” he says. “I come from a place where those boundaries are very, very apparent. They are in your face. To be able to make a dent in that is a very attractive proposition.”

Acharya’s continued leadership of a single, small team (now consisting of nine) is unusual at Google, and not necessarily seen as a smart thing by his peers. By concentrating on Scholar, Acharya in effect removed himself from the fast track at Google….  But he can’t bear to leave his creation, even as he realizes that at Google’s current scale, Scholar is a niche.

…But like it or not, the niche reality was reinforced after Larry Page took over as CEO in 2011, and adopted an approach of “more wood behind fewer arrows.” Scholar was not discarded — it still commands huge respect at Google which, after all, is largely populated by former academics—but clearly shunted to the back end of the quiver.

…Asked who informed him of what many referred to as Scholar’s “demotion,” Acharya says, “I don’t think they told me.” But he says that the lower profile isn’t a problem, because those who do use Scholar have no problem finding it. “If I had seen a drop in usage, I would worry tremendously,” he says. “There was no drop in usage. I also would have felt bad if I had been asked to give up resources, but we have always grown in both machine and people resources. I don’t feel demoted at all.”


Catching HTTP OPTIONS /* request in a Rails app

Apache sometimes seems to send an HTTP “OPTIONS /*” request to Rails apps deployed under Apache Passenger. (Or is it “OPTIONS *”? Not entirely sure.) It comes with a User-Agent of “Apache/2.2.3 (CentOS) (internal dummy connection)”.

Apache does document that this happens sometimes, although I don’t understand it.

I’ve been trying to take my Rails error logs more seriously to make sure I handle any bugs revealed. 404’s can indicate a problem, especially when the referrer is my app itself. So I wanted to get all of those 404’s for Apache’s internal dummy connection out of my log.  (How I managed to fight with Rails logs enough to actually get useful contextual information on FATAL errors is an entirely different complicated story for another time).

How can I make a Rails app handle them?

Well, first, let’s do a standards check and see that RFC 2616 HTTP 1.1 Section 9 (I hope I have a current RFC that hasn’t been superseded) says:

If the Request-URI is an asterisk (“*”), the OPTIONS request is intended to apply to the server in general rather than to a specific resource. Since a server’s communication options typically depend on the resource, the “*” request is only useful as a “ping” or “no-op” type of method; it does nothing beyond allowing the client to test the capabilities of the server. For example, this can be used to test a proxy for HTTP/1.1 compliance (or lack thereof).

Okay, sounds like we can basically reply with whatever we want to this request, it’s a “ping or no-op”.  How about a 200 text/plain with “OK\n”?

Here’s a line I added to my Rails routes.rb file that seems to catch the “*” requests and just respond with such a 200 OK.

  match ':asterisk', via: [:options],
     constraints: { asterisk: /\*/ },
     to: lambda { |env| [200, {'Content-Type' => 'text/plain'}, ["OK\n"]] }


Since “*” is a special glob character in Rails routing, it looks like you have to use that weird constraints trick to actually match it. (Thanks to mbklein; this does not seem to be documented, and I never would have figured it out on my own.)

And then we can use a little “Rack app implemented in a lambda” trick to just return a 200 OK right from the routing file, without actually having to write a controller action somewhere else just to do this.

I have not yet tested this extensively, but I think it works? (Still worried if Apache is really requesting “OPTIONS *” instead of “OPTIONS /*” it might not be. Stay tuned.)
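If you want to poke at it by hand, here’s a quick check with stdlib Net::HTTP (a sketch; assumes the app is running locally on port 3000):

require 'net/http'

http = Net::HTTP.new('localhost', 3000)
# Net::HTTP passes the path through verbatim, so this sends "OPTIONS /* HTTP/1.1"
response = http.request(Net::HTTP::Options.new('/*'))
response.code  # => "200" if the route caught it
response.body  # => "OK\n"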


Umlaut News: 4.0 and two new installations

Umlaut is, well, now I’m going to call it a known-item discovery layer, usually but not necessarily serving as a front-end to SFX.

Umlaut 4.0.0 has been released

This release is mostly back-end upgrades, including:

  •  Support for Rails 4.x (Rails 3.2 included to make migration easier for existing installations, but recommend upgrading to Rails 4.1 asap, and starting with Rails 4.1 in new apps)
  • Based on Bootstrap 3 (Umlaut 3.x was Bootstrap 2)
  • internationalization/localization support
  • A more streamlined installation process with a custom installer

Recent Umlaut Installations

Princeton University has a beta install of Umlaut, and is hoping to go live in production soon.

Durham University (UK) has a beta/soft launch of Umlaut live. 


Non-digested asset names in Rails 4: Your Options

Rails 4 removes the ability to produce non-digest-named assets in addition to digest-named assets (i.e. ‘application.css’ in addition to ‘application-810e09b66b226e9982f63c48d8b7b366.css’).

There are a variety of ways to work around this by extending asset compilation. After researching and considering them all, I chose to use a custom Rake task that uses the sprockets manifest.json file. In this post, I’ll explain the situation and the options.

The Background

The Rails asset pipeline, powered by sprockets, compiles (sass, coffeescript, others), aggregates (combines multiple source files into one file for performance purposes), and post-processes (minification, gzip’ing) your assets.

It produces assets to be delivered to the client that are fingerprinted with a digest hash based on the contents of the file — such as ‘application-810e09b66b226e9982f63c48d8b7b366.css’.  People (and configuration) often refer to this filename-fingerprinting as “digested assets”.

The benefit of this is that because the asset filenames are guaranteed to change if their content changes, the individual files can be cached indefinitely, which is great. (You still probably need to adjust your web server configuration to take advantage of this, which you may not be doing).

In Rails 3, a ‘straight’-named copy of the assets (e.g. `application.js`) was also produced, alongside the fingerprinted digest-named assets.

Rails 4 stopped doing this by default, and also took away any ability to do it even as a configurable option. While I can’t find the thread now, I recall seeing discussion that in Rails 3, the production of non-digest-named assets was accomplished by actually asking sprockets to compile everything twice, which made asset compilation take roughly twice as long as it should. Which is indeed a problem.

Rather than looking to fix the sprockets API to make it possible to compile the file once but simply write it twice, the Rails devs decided there was no need for the straight-named files at all, and simply removed the feature.

Why would you need straight-named assets?

Extensive and combative discussion on this feature change occurred in sprockets-rails issue #49.

The title of this issue reveals one reason people wanted the non-digest-named assets: “breaks compatibility with bad gems”. This mainly applies to gems that supply javascript which needs to generate links to assets, but wasn’t written to look up the current digest-named URLs. It’s really about javascript, not “gems”; it can apply to javascript you’ve included without gemifying it, too.

The Rails devs expressing opinions on this issue believed (at least initially) that these “bad gems” should simply be fixed; accommodating them was the wrong thing to do, as it eliminates the ability to cache forever the assets they refer to.

I think they under-estimate the amount of work it can take to fix these “bad” JS dependencies, which are often included through multi-level dependency trees (requiring getting patches accepted by multiple upstreams). It also basically requires wrapping all JS assets in rubygems that apply sprockets/rails-specific patches on top, instead of, say, just using bower.

I think there’s a good argument for accommodating JS assets which the community has not yet had the time/resources to make respect the sprockets fingerprinting. Still, it is definitely preferable, and always at least theoretically possible, to make all your JS respect sprockets asset fingerprinting — and in most of my apps, I’ve done that.

But there’s other use cases: like mine!

I have an application that needs to offer a Javascript file at a particular stable URL, as part of its API; think JS “widgets”.

I want it to go through the asset pipeline, for source control, release management, aggregation, SASS, minification, etc. The suggestion to just “put it in /public as a static asset” is no good at all. But I need the current version available at a persistent URL.

In Rails 3, this Just Worked, since the asset pipeline created a non-digested name. In Rails 4, we need a workaround. I don’t need every asset to have a non-digest-named version, but I do need a whitelist of a few that are part of my public API.

I think this is a pretty legitimate use case, and not one that can be solved by ‘fixing bad gems’. I have no idea if Rails devs recognize it or not.

(It’s been suggested that HTML emails linking to CSS stylesheets (or JS?) is another use case. I haven’t done that and don’t understand it well enough to comment. Oh, and other people want em for their static 500 error pages.)

Possible Workaround Options

So that giant Github Issue thread? At first it looks like just one of those annoying ones with continual argument by uninformed people that will never die, and eventually @rafaelfranca locked it. But it’s also got a bunch of comments with people offering their solutions, and is the best aggregation of possible workarounds to consider — I’m glad it wasn’t locked sooner. Another example of how GitHub qualitatively improves open source development — finding this stuff on a listserv would have been a lot harder.

The Basic Rake Task

Early in the thread, Rails core team member @guilleiguaran suggested a Rake task, which simply looks in the file system for fingerprinted assets and copies them over to the un-digest-named version. Rails core team member @rafaelfranca later endorsed this approach too. 
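The suggested task is roughly this shape (my reconstruction of the idea, not the exact code from the thread): scan public/assets for digest-named files, and copy each one over to its plain name.

require 'fileutils'

namespace :assets do
  task :non_digested do
    assets = Dir.glob(File.join(Rails.root, 'public/assets/**/*'))
    digest_regex = /-[0-9a-f]{32}\./   # the "-<md5 hexdigest>." fingerprint
    assets.each do |file|
      next if File.directory?(file) || file !~ digest_regex
      FileUtils.cp(file, file.sub(digest_regex, '.'))
    end
  end
end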

The problem is it won’t work. I’ve got nothing against a rake task solution. It’s easy to wire things up so your new rake task automatically gets called every time after `rake assets:precompile`, no problem!

The problem is that a deployed Rails app may have multiple fingerprinted versions of a particular asset file around, representing multiple releases. And really you should set things up this way —  because right after you do a release, there may be cached copies of HTML (in browser caches, or proxying caches including a CDN) still around, still referencing the old version with the old digest fingerprint. You’ve got to keep it around for a while.

(How long? Depends on the cache headers on the HTML that might reference it. The fact that sprockets only supports keeping around a certain number of releases, and not releases made within a certain time window, is a different discussion. But, yeah, you need to keep around some old versions).

So it’s unpredictable which of the several versions you’ve got hanging around the rake task will copy to the non-digest-named location; there’s no guarantee it’ll be the latest one. (Maybe it depends on their lexicographic sort?) That’s no good.

Enhance the core-team-suggested rake task?

Before I realized this problem, I had already spent some time trying to implement the basic rake task, add a whitelist parameter, etc. So I tried to keep going with it after realizing this problem.

I figured: okay, there are multiple versions of the asset around, but sprockets and Rails have to know which one is current (to serve it in the running application), so I must be able to use the sprockets ruby API in the rake task to figure that out and copy that one.

  • It was kind of challenging to figure out how to get sprockets to do this, but eventually it was sort of working.
  • Except I started to get worried that I might be triggering the double compilation that Rails 3 did, which I didn’t want, and I got confused even trying to figure out whether I was.
  • And I wasn’t really sure if I was using sprockets API meant to be public or internal. It didn’t seem to be clearly documented, and sprockets and sprockets-rails have been pretty churny, I thought I was taking a significant risk of it breaking in future sprockets/rails version(s) and needing continual maintenance.

Verdict: Nope, not so simple, even though it seems to be the rails-core-endorsed solution. 

Monkey-patch sprockets: non-stupid-digest-assets

Okay, so maybe we need to monkey-patch sprockets, I figured.

@alexspeller provides a gem to monkey-patch Sprockets to support non-digested-asset creation, the unfortunately combatively named non-stupid-digest-assets.

If someone else has already figured it out and packaged it in a gem, great! Maybe they’ll even take on the maintenance burden of keeping it working with churny sprockets updates!

But non-stupid-digest-assets just takes the same kind of logic as that basic rake task, another pass through all the assets post-compilation, and implements it with a sprockets monkey-patch instead of a rake task. It does add a whitelist. I can’t quite figure out if it’s still subject to the same might-end-up-with-an-older-version-of-the-asset problem.

There’s really no benefit to using a monkey patch instead of a rake task doing the same thing, and it has increased risk of breaking with new Rails releases. Some have already reported it not working with the Rails 4.2 betas; I haven’t investigated myself to see what’s up with that, and @alexspeller doesn’t seem to be in any hurry to either.

Verdict: Nope. non-stupid-digest-assets ain’t as smart as it thinks it is. 

Monkey-patch sprockets: The right way?

If you’re going to monkey-patch sprockets and take on forwards-compat risk, why not actually do it right, and make sprockets simply write the compiled file to two different file locations (and/or use symlinks) at the point of compilation?

@ryana suggested such code. I’m not sure how tested it is, and I’d want to add the whitelist feature.
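The shape of it would be something like this (an untested sketch against sprockets 2.x as I understand it, not @ryana’s exact code; the whitelist constant is my own invention):

require 'fileutils'

# config/initializers/non_digest_assets.rb
NON_DIGEST_WHITELIST = %w[my_widget.js]  # hypothetical stable-URL assets

module NonDigestCompile
  # After the manifest compiles assets, also copy each whitelisted one
  # to its plain logical name in the same output directory.
  def compile(*args)
    super.tap do
      NON_DIGEST_WHITELIST.each do |logical_path|
        digested = assets[logical_path]  # the manifest's logical => digested map
        FileUtils.cp(File.join(dir, digested), File.join(dir, logical_path)) if digested
      end
    end
  end
end

Sprockets::Manifest.send(:prepend, NonDigestCompile)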

At this point, I was too scared of the forwards-compatibility-maintenance risks of monkey patching sprockets, and realized there was another solution I liked better…

Verdict: It’s the right way to do it, but carries some forwards-compat maintenance risk as an unsupported monkey patch

Use the Manifest, Luke, erm, Rake!

I had tried and given up on using the sprockets ruby API to determine the ‘current digest-named asset’. But as I was reading back through the Monster Issue looking for ideas again, I noticed @drojas suggested using the manifest.json file that sprockets creates, in a rake task.

Yep, this is where sprockets actually stores info on the current digest-named assets. Forget the sprockets ruby API; we can just get the mapping from there, and make sure we copy (or symlink) the current digested version to the non-digested name.

But are we still using private api that may carry maintenance risk with future sprockets versions?  Hey, look, in a source code comment Sprockets tells us “The JSON is part of the public API and should be considered stable.” Sweet!

Now, even if sprockets devs remember that one of them once said this was public API (I hope this blog post helps), and even if sprockets is committed to semantic versioning, that still doesn’t mean it can never change. In fact, the way some of rubydom treats semver, it doesn’t even mean it can’t change soon and frequently; it just means they’ve got to bump the sprockets major version number when it changes. Hey, at least that’d be a clue.

But note that changes can happen in between Rails major releases. Rails 4.1 uses sprockets-rails 2.x, which uses sprockets 2.x. Rails 4.2 (no Rails major version number change) will use sprockets-rails 3.x, which, oh, still uses sprockets 2.x; but clearly there’s no commitment from Rails not to change sprockets-rails/sprockets major versions without a Rails major version change.

Anyway, what can you do, you pays your money and you takes your chances. This solution seems pretty good to me.

Here’s my rake task, just a couple dozen lines of code, no problem.
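In outline, the manifest-based approach looks something like this (a simplified sketch of my task; the whitelist names are placeholders):

require 'fileutils'
require 'json'

namespace :assets do
  task :non_digested => :environment do
    whitelist = %w[my_widget.js]  # logical names that need stable URLs

    # sprockets writes the manifest as public/assets/manifest-<fingerprint>.json
    manifest_path = Dir.glob(File.join(Rails.root, 'public/assets/manifest-*.json')).first
    manifest = JSON.parse(File.read(manifest_path))

    # manifest['assets'] maps logical names to the *current* digest-named files
    manifest['assets'].each do |logical_path, digested_path|
      next unless whitelist.include?(logical_path)
      FileUtils.cp(
        File.join(Rails.root, 'public/assets', digested_path),
        File.join(Rails.root, 'public/assets', logical_path)
      )
    end
  end
end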

 Verdict: Pretty decent option, best of our current choices

The Redirect

One more option is using a redirect to take requests for the non-digest-named asset, and redirect it to the current digest-named asset.

@Intrepidd suggests using rack middleware to do that. I think it would also work to just use a Rails route redirect with a lambda. (I’m kind of allergic to middleware.) Same difference either way as far as what your app is doing; a sketch of the route version follows.
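Here’s the route-redirect sketch, for one hypothetical stable-URL asset (the “/widget.js” URL and “my_widget.js” name are made up):

# config/routes.rb
# Redirect the stable URL to the current digest-named asset URL.
# The redirect response itself should get a short cache time.
get '/widget.js', to: redirect { |params, request|
  ActionController::Base.helpers.asset_path('my_widget.js')
}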

I didn’t really notice this one until I had settled on The Manifest. It requires two HTTP requests every time a client wants the asset at the persistent URL, though. The first request touches your app and needs a short cache time; it then redirects to the digest-named asset, which can be served directly by the web server and cached forever. I’m not really sure if the performance implications are significant; it probably depends on your use cases and request volume. @will-r suggests it won’t work well with CDNs, though.

Verdict: Meh, maybe, I dunno, but it doesn’t feel right to introduce the extra latency

The Future

@rafaelfranca says Rails core has changed their mind and are going to deal with “this issue” “in some way”. Although I don’t think it made it into Rails 4.2 after all.

But what’s “this issue” exactly? I dunno, they are not sharing what they see as the legitimate use cases to handle, and requirements on legitimate ways to handle em.

I kinda suspect they might just be dealing with the “non-Rails JS that needs to know asset URLs” issue, and considering some complicated way to automatically make it use digest-named assets without having to repackage it for Rails.  Which might be a useful feature, although also a complicated enough one to have some bug risks (ah, the story of the asset pipeline).

And it’s not what I need, anyway; there are other use cases than the “non-Rails JS” one that need non-digest-named assets.

I just need sprockets to produce parallel non-digested asset filenames for certain whitelisted assets. That really is the right way to handle my use case. Yes, it means you need to know the implications and how to use cache headers responsibly. If you don’t give me enough rope to hang myself, I don’t have enough rope to climb the rock face either. I thought Rails’ target audience was people who know what they’re doing?

It doesn’t seem like this would be a difficult feature for sprockets to implement (without double compilation!). @ryana’s monkey-patch seems like pretty simple code that is most of the way there. It’s the feature I need.

I considered making a pull request to sprockets (the first step; then sprockets-rails would probably need to support passing on the config settings). But you know what, I don’t have the time or psychic energy to get into an argument about it in a PR; the Rails/sprockets devs seem opposed to this feature for some reason. Heck, I just spent hours figuring out how to make my app work now, and writing it all up for you instead!

But, yeah, just add that feature to sprockets, pretty please.

So, if you’re reading this post in the future, maybe things will have changed, I dunno.


first rule of responding to support tickets

note to self:

Always, always, always, start with “thank you for reporting this problem.”

1. Because it’s true, we need the problem reports to know about problems, and too often people are scared to report problems because of past bad experiences, or don’t report problems because they figure someone else already has, or because they are busy and don’t have the time.

2. Because it gets the support interaction off to a good collegial cooperative start.

It works. Do it every time. Even when the problem being reported doesn’t make any sense and you’re sure (you think!) that it’s not a real problem.

If they give a good problem report with actual reproduction steps and a clear explanation of why the outcome is not what they expected, thank them extra special. 


Rubyists, don’t forget about the dir glob!

If you are writing configuration to take a pattern to match against files in a file system…

You probably want Dir.glob, not regexes. Dir.glob is in the stdlib. Dir.glob’s unix-shell-style patterns are less expressive than regexes, but probably expressive enough for anything you need in this use case, and much simpler to deal with for its common patterns.

Dir.glob("root/path/**/*.rb")

vs.

%r{\Aroot/path/.*\.rb\Z}

Or

Dir.glob("root/path/*.rb")

vs.

…I don’t even feel like thinking about how to express as a regexp that you don’t want files from child directories, only files directly in the directory.

Dir.glob will find matches from within a directory on the local file system. But if you have a file path in a string that you want to test against a glob pattern, you can easily do that too, with File.fnmatch (also available as Pathname#fnmatch), which does not even require the string to correspond to a file on the local file system.
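For example (File::FNM_PATHNAME gives shell-like behavior, where “*” doesn’t cross a “/”):

require 'pathname'

File.fnmatch('root/path/*.rb', 'root/path/thing.rb', File::FNM_PATHNAME)
# => true
File.fnmatch('root/path/*.rb', 'root/path/sub/thing.rb', File::FNM_PATHNAME)
# => false -- "*" won't cross a directory separator
File.fnmatch('root/path/**/*.rb', 'root/path/sub/thing.rb', File::FNM_PATHNAME)
# => true -- "**/" matches directories recursively
Pathname.new('root/path/thing.rb').fnmatch('root/path/*.rb', File::FNM_PATHNAME)
# => true -- same check via Pathname; the file needn't exist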

Some more info and examples from Shane da Silva, who points out some annoying inconsistent gotchas to be aware of.


Umlaut 4.0 beta

Umlaut is an open source specific-item discovery layer, often used on top of SFX, and based on Rails.

Umlaut 4.0.0.beta2 is out! (Yeah, don’t ask about beta1 :) ).

This release is mostly back-end upgrades, including:

  • Support for Rails 4.x (Rails 3.2 included to make migration easier for existing installations, but recommend starting with Rails 4.1 in new apps)
  • Based on Bootstrap 3 (Umlaut 3.x was Bootstrap 2)
  • internationalization/localization support
  • A more streamlined installation process with a custom installer

Anyone interested in beta testing? Probably most interesting if you have an SFX to point it at, but you can take it for a spin either way.

To install a new Umlaut app, see: https://github.com/team-umlaut/umlaut/wiki/Installation


Cleaning up the Rails backtrace cleaner; Or, The Engine Stays in the Picture!

Rails has for a while included a BacktraceCleaner that removes some lines from backtraces, and reformats others to be more readable.

(There’s an ActiveSupport::BacktraceCleaner, although the one in your app is by default actually Rails::BacktraceCleaner, a subclass that sets some defaults. That’s a somewhat odd way to implement Rails defaults on top of AS::BacktraceCleaner, but oh well.)

This is pretty crucial, especially since recent versions of Rails can have pretty HUGE call stacks, due to reliance on Rack middleware and other architectural choices.

I rely on clean stack traces in the standard Rails dev-mode error page and in my log files of fatal uncaught exceptions, but also in some log files I write myself, where I catch and recover from an exception but want to log where it came from anyway, ideally with a clean stacktrace, via `Rails.backtrace_cleaner.clean(exception.backtrace)`.
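Something like this sketch (`do_something_risky` is a stand-in for whatever might raise):

begin
  do_something_risky
rescue => e
  # log the rescued error with a cleaned, readable backtrace, then carry on
  lines = [e.message] + Rails.backtrace_cleaner.clean(e.backtrace)
  Rails.logger.error(lines.join("\n  "))
end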

A few problems I had with it though:

  • Several of my apps are based on a kind of ‘one big Rails engine’ (Blacklight, Umlaut). The default cleaner will strip out any lines that aren’t part of the local app, but I really want to leave the ‘main engine’ lines in. That was my main motivation to look into this, but as long as I was at it, a couple other inconveniences…
  • The default cleaner nicely reformats lines from gems to remove the filepath to the gem dir, and replace with just the name of the gem. But this didn’t seem to work for gems listed in Bundler as :path (or, I think, :github ?), that don’t live in the standard gem repo. And that ‘main engine gem’ would often be checked out thus, especially in development.
  • Stack trace lines that come from ERB templates include a dynamically generated internal method name, which is really long and makes the stack trace confusing — the line number in the ERB file is really all we need. (At first I thought the Rails ‘render template pattern filter’ was meant to deal with that, but I think it’s meant for something else)

Fortunately, you can remove and add your own silencers (which remove lines from the stack trace) and filters (which reformat stack trace lines) on the ActiveSupport/Rails::BacktraceCleaner.

Here’s what I’ve done to make it the way I want. I wanted it built directly into Umlaut (a Rails Engine), so this is written to go in Umlaut’s `< Rails::Engine` class. But you could do something similar in a local app, probably in the `config/initializers/backtrace_silencers.rb` file that Rails has left as a stub for you already.

Note that all filters are executed before silencers, so your silencer has to be prepared to recognize already-filtered input.

module Umlaut
  class Engine < Rails::Engine
    engine_name "umlaut"

    #...

    initializer "#{engine_name}.backtrace_cleaner" do |app|
      engine_root_regex = Regexp.escape(self.root.to_s + File::SEPARATOR)

      # Clean those ERB lines, we don't need the internal autogenerated
      # ERB method, what we do need (line number in ERB file) is already there
      Rails.backtrace_cleaner.add_filter do |line|
        line.sub(/(\.erb:\d+)\:in `__.*$/, "\\1")
      end

      # Remove our own engine's path prefix, even if it's
      # being used from a local path rather than the gem directory.
      Rails.backtrace_cleaner.add_filter do |line|
        line.sub(/^#{engine_root_regex}/, "#{engine_name} ")
      end

      # Keep Umlaut's own stacktrace in the backtrace -- we have to remove Rails
      # silencers and re-add them how we want.
      Rails.backtrace_cleaner.remove_silencers!

      # Silence what Rails silenced, UNLESS it looks like
      # it's from Umlaut engine
      Rails.backtrace_cleaner.add_silencer do |line|
        (line !~ Rails::BacktraceCleaner::APP_DIRS_PATTERN) &&
        (line !~ /^#{engine_root_regex}/  ) &&
        (line !~ /^#{engine_name} /)
      end
    end

    #...
  end
end

Cardo is a really nice free webfont

Some of the fonts on google web fonts aren’t that great. And I’m not that good at picking the good ones from the not-so-good ones on first glance either.

Cardo is a really nice old-style serif font that I originally found recommended on some list of “the best of google fonts”.

It’s got a pretty good character repertoire for Latin text (and I think Greek). The Google Fonts version doesn’t seem to include Hebrew, even though some other versions might. For library applications, the more characters the better, and it should have enough to deal stylishly with whatever letters and diacritics you throw at it in Latin/Germanic languages, plus all the usual symbols (currency, punctuation, etc.).

I’ve used it in a project that my eyeballs have spent a lot of time looking at (not quite done yet), and I’ve been increasingly pleased by it; it’s nice to look at and to read, especially on a ‘retina’ display. (I wouldn’t use it for headlines, though.)
