Consider TTY::Command for all your external process/shell out needs in ruby

When writing a ruby app, I regularly need to execute and wait for an external non-ruby “command line” process. Sometimes I think of this as a “shell out”, but in truth, depending on how you do it, a shell (like bash or sh) may not be involved at all; the ruby process can execute the external process directly. Typical examples for me are the imagemagick/graphicsmagick command lines.

(Which is, incidentally, I think what the popular ruby minimagick gem does: just execute an external process using the IM command line. As opposed to rmagick, which tries to actually use the system C IM libraries. Sometimes “shelling out” to a command line utility is just simpler and easier to get right.)

There are a few high-level ways built into ruby to execute external processes easily, including the simple system and backticks (`), which are usually what most people start with; they’re simple and right there for you!
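
For reference, the built-ins look roughly like this (input_path here is just a hypothetical variable):

# Kernel#system: returns true/false, output goes straight to our own stdout/stderr
system("vips", "dzsave", input_path)

# backticks: go through a shell, capture stdout only (not stderr),
# and string interpolation is exactly where injection vulnerabilities creep in
output = `vips dzsave #{input_path}`
ok = $?.success?   # exit status ends up in the $? global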

But I think many people end up finding what I have: the most common patterns I want in a “launch and wait for an external command line process” function are difficult with system and backticks. I definitely want the exit value — I usually am going to want to raise an exception if the exit value isn’t 0 (unix for “success”). I usually want to suppress stdout/stderr from the external process (instead of having it end up in my own process’s stdout/stderr and/or logs), but I want to capture them in a string (sometimes separate strings for stdout/stderr), because in an error condition I do want to log them and/or include them in an exception message. And of course there’s making sure you are safe from command injection vulnerabilities.

Neither system nor backticks will actually give you all this. You end up having to use Open3.popen3 to get full control. And it ends up pretty confusing, verbose, and tricky to make sure you’re doing what you want, without accidentally deadlocking on some of the weirder combinations. In part because popen3 is just an old-school low-level C-style OS API being exposed to you in ruby.
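
(For what it’s worth, Open3.capture3 covers the most basic version of the pattern; a minimal hand-rolled sketch might look like the following, though you’re back in popen3 territory as soon as you want streaming output, timeouts, or fancier redirection.)

require 'open3'

# passing the command as separate arguments (no single shell string) avoids
# an intermediate shell and injection-via-interpolation
stdout, stderr, status = Open3.capture3('vips', 'dzsave', input_file_path_string)
unless status.success?
  raise "vips dzsave failed (exit #{status.exitstatus}):\n#{stdout}\n#{stderr}"
end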

The good news is @piotrmurach’s TTY::Command will do it all for you. It’s got the right API to easily express the common use cases you actually have, succinctly and clearly, while taking care of the tricky pipe/blocking stuff for you.

One common use case I have is: execute an external process. Don’t let it write to my stdout/stderr, but do capture its stdout/stderr in string(s). If the command fails, raise with the captured stdout/stderr included (the output I intentionally didn’t send to my logs, but do wanna see on error). Do it all with proper protection from command injection attacks, of course.

TTY::Command.new(printer: :null).run('vips', 'dzsave', input_file_path_string)

Woah, done! run will already:

If the command fails (with a non-zero exit code), a TTY::Command::ExitError is raised. The ExitError message will include: the name of command executed; the exit status; stdout bytes; stderr bytes

Does exactly what I need, cause, guess what, what I need is a very common use case, and Piotr recognized that, probably from his own work.

Want to not raise on the error, but still detect it and log stdout/stderr? No problem.

result = TTY::Command.new(printer: :null).run!("vips", "dzsave", whatever)
if result.failed?
  $stderr.puts("Our vips thing failed!!! with this output:\n #{result.stdout} #{result.stderr}")
end

Not raising on error but still detecting it (that’s run! instead of run, above), passing a custom ENV, a bunch of other things: TTY::Command has got ya. Supply stdin too? No prob. Supply a custom output formatter, so stuff goes to stdout/stderr but properly colorized/indented for your own command line utility, to look all nice and consistent with your other output? Yup. You even get a dry-run mode!
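
A rough sketch of a few of those options as I understand them from the docs (the commands are made up, and double-check the option names against the README before copying):

cmd = TTY::Command.new(printer: :null)

cmd.run("some_tool", env: { "TMPDIR" => "/scratch" })   # custom environment
cmd.run("sort", input: "banana\napple\n")               # supply stdin
cmd.run("slow_thing", timeout: 30)                      # give up after 30 seconds

# dry run: just prints what it *would* execute
TTY::Command.new(dry_run: true).run("rm", "-rf", "tmp/cache")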

Ordinary natural rubyish options for just about anything I can think of I might want to do, and some things I hadn’t realized I might want to do until I saw em doc’d as options in TTY::Command. Easy-peasy.

In the past, I’ve sometimes ended up writing bash scripts when I’m writing something that calls a lot of external processes, cause bash seems like the suitable fit for that; it can be annoying and verbose to do a lot of that how you want in a ruby script. Inevitably the bash script grows to the point that I’m looking up non-trivial parts of bash (I’m not an expert), fighting with them, and regretting that I used bash. In the future, when I have the thought “this might be best in bash”, I plan to try just ruby with TTY::Command; I think it’ll lessen the pain of lots of external processes in ruby to the point where there’s no reason to even consider bash.
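
To be concrete, a little orchestration script that might otherwise tempt me toward bash could look something like this (the commands, URLs, and bucket are made up, just to show the shape):

#!/usr/bin/env ruby
require "tty-command"

cmd = TTY::Command.new   # the default printer echoes each command, nice for scripts

cmd.run("curl", "-fsSL", "-o", "data.tif", "https://example.org/data.tif")
cmd.run("gm", "convert", "data.tif", "-resize", "800x800", "thumb.jpg")

result = cmd.run!("aws", "s3", "cp", "thumb.jpg", "s3://my-bucket/thumb.jpg")
warn "upload failed: #{result.stderr}" if result.failed?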

 


Gem dependency use among Sufia/Hyrax apps

I have a little side project that uses the GitHub API (and a little bit of the rubygems API) to analyze what gem dependencies and versions (from among a list of ‘interesting’ ones) are being used across a list of open GitHub repos with `Gemfile.lock`s. I wrote it out of curiosity about sufia/hyrax apps, but I think it could turn into a useful tool for any ruby open source community using common dependencies, to see what the community is up to.
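
The core trick is just that Bundler itself will parse a Gemfile.lock for you; a minimal sketch of that piece (the gem list and method here are illustrative, not my actual code):

require "bundler"

INTERESTING_GEMS = %w[rails sufia hyrax blacklight active-fedora]

# lockfile_contents: the raw Gemfile.lock text, e.g. fetched via the GitHub contents API
def interesting_versions(lockfile_contents)
  specs = Bundler::LockfileParser.new(lockfile_contents).specs
  specs.each_with_object({}) do |spec, out|
    out[spec.name] = spec.version.to_s if INTERESTING_GEMS.include?(spec.name)
  end
end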

It’s far from done; it just generates an ASCII report, and is missing many features I’d like. There are things I’m curious about that it doesn’t report on yet, like the history of dependency use, and how often people upgrade a given dependency. And I’d like an interactive HTML interface that lets you slice and dice the data a bit (of people using a given gem, how many are also using another gem, etc.). And then maybe set it up so it’s on the public web and regularly updates itself.

But it’s been a couple of months since I’ve worked on it, and I thought just the current snapshot in limited ASCII report format was useful enough that I should share a report.

The report, intentionally, for now, does not tell you which repos are using which dependencies; it just gives aggregate descriptive statistics. (Although you could of course manually find that out from their open Gemfile.locks.) I wanted to avoid seeming to ‘call out’ anyone for using old versions or whatever. Although it would be useful to know, so you can, say, get in touch with people using the same things or same versions as you, I wanted to get some community feedback first. Thoughts on whether it should?

I got the list of repos from various public lists of sufia or hyrax repos. Some things on the lists didn’t actually have open github repos at that address anymore — or had an open repo, but without a Gemfile.lock! I can only analyze repos with a Gemfile.lock. And I don’t really know which of these repos are in production, and which might be not yet, no longer, or never were. If you have a repo you’d like me to add or remove from the list, let me know! Also let me know any other things you might want the report to include, questions you might want it to help you answer, or additional ‘interesting’ gems you’d like included in the report.

I do think it’s pretty cool what the combination of machine-readable Gemfile.locks and the GitHub API lets us do here! If I get around to writing an interactive HTML interface, I’m thinking of trying to do it all in static-file Javascript. That would require rewriting some of the analysis tools I’ve already written in ruby in JS, but it might be a good project to experiment with, say, vue.js. I don’t have much fancy new-gen JS experience, and this is a nice isolated thing for trying it out.

I am not sure what to read into these results. They aren’t necessarily good or bad, they just are a statement of what things are, which I think is interesting and useful in itself, and helps us plan and coordinate. I do think it’s worth recognizing that when developers in the community are on old major versions of shared dependencies, it increases the cost for them to contribute back upstream, makes it harder to do as part of “scratching their own itch”, and probably decreases such contributions.  I also found it interesting how many repos use unreleased straight-from-github versions of some dependencies (17 of 28 do at least once), as well as the handful of gems that are fairly widely used in production but still don’t have a 1.0 release.

And here’s the ugly ascii report!

38 total input URLs, 28 with fetchable Gemfile.lock
total apps analyzed: 28
with dependencies on non-release (git or path) gem versions: 17
  with git checkouts: 16
  with local path deps: 1
Date of report: 2017-08-30 15:11:20 -0400


Repos analyzed:

https://github.com/psu-stewardship/scholarsphere
https://github.com/psu-stewardship/archivesphere
https://github.com/VTUL/data-repo
https://github.com/gwu-libraries/gw-sufia
https://github.com/gwu-libraries/scholarspace
https://github.com/duke-libraries/course-assets
https://github.com/ualbertalib/HydraNorth
https://github.com/ualbertalib/Hydranorth2
https://github.com/aic-collections/aicdams-lakeshore
https://github.com/osulp/Scholars-Archive
https://github.com/durham-university/collections
https://github.com/OregonShakespeareFestival/osf_digital_archives
https://github.com/cul/ac3_sufia
https://github.com/ihrnexuslab/research-repo
https://github.com/galterlibrary/digital-repository
https://github.com/chemheritage/chf-sufia
https://github.com/vecnet/vecnet-dl
https://github.com/vecnet/dl-discovery
https://github.com/osulibraries/dc
https://github.com/uclibs/scholar_uc
https://github.com/uvalib/Libra2
https://github.com/pulibrary/plum
https://github.com/curationexperts/laevigata
https://github.com/csuscholarworks/bravado
https://github.com/UVicLibrary/Vault
https://github.com/mlibrary/heliotrope
https://github.com/ucsdlib/horton
https://github.com/pulibrary/figgy


Gems analyzed:

rails
hyrax
sufia
curation_concerns
qa
hydra-editor
hydra-head
hydra-core
hydra-works
hydra-derivatives
hydra-file_characterization
hydra-pcdm
hydra-role-management
hydra-batch-edit
browse-everything
solrizer
blacklight-access_controls
hydra-access-controls
blacklight
blacklight-gallery
blacklight_range_limit
blacklight_advanced_search
active-fedora
active_fedora-noid
active-triples
ldp
linkeddata
riiif
iiif_manifest
pul_uv_rails
mirador_rails
osullivan
bixby
orcid



rails:
  apps without dependency: 0
  apps with dependency: 28 (100%)

  git checkouts: 0
  local path dep: 0

  3.x (3.0.0 released 2010-08-29): 1 (4%)
    3.2.x (3.2.0 released 2012-01-20): 1 (4%)

  4.x (4.0.0 released 2013-06-25): 16 (57%)
    4.0.x (4.0.0 released 2013-06-25): 2 (7%)
    4.1.x (4.1.0 released 2014-04-08): 1 (4%)
    4.2.x (4.2.0 released 2014-12-20): 13 (46%)

  5.x (5.0.0 released 2016-06-30): 11 (39%)
    5.0.x (5.0.0 released 2016-06-30): 8 (29%)
    5.1.x (5.1.0 released 2017-04-27): 3 (11%)

  Latest release: 5.1.4.rc1 (2017-08-24)



hyrax:
  apps without dependency: 20 (71%)
  apps with dependency: 8 (29%)

  git checkouts: 4 (50%)
  local path dep: 0

  1.x (1.0.1 released 2017-05-24): 4 (50%)
    1.0.x (1.0.1 released 2017-05-24): 4 (50%)

  2.x ( released unreleased): 4 (50%)
    2.0.x ( released unreleased): 4 (50%)

  Latest release: 1.0.4 (2017-08-22)



sufia:
  apps without dependency: 10 (36%)
  apps with dependency: 18 (64%)

  git checkouts: 8 (44%)
  local path dep: 0

  0.x (0.0.1.pre1 released 2012-11-15): 1 (6%)
    0.1.x (0.1.0 released 2013-02-04): 1 (6%)

  3.x (3.0.0 released 2013-07-22): 1 (6%)
    3.7.x (3.7.0 released 2014-02-07): 1 (6%)

  4.x (4.0.0 released 2014-08-21): 2 (11%)
    4.1.x (4.1.0 released 2014-10-31): 1 (6%)
    4.2.x (4.2.0 released 2014-11-25): 1 (6%)

  5.x (5.0.0 released 2015-06-06): 1 (6%)
    5.0.x (5.0.0 released 2015-06-06): 1 (6%)

  6.x (6.0.0 released 2015-03-27): 6 (33%)
    6.0.x (6.0.0 released 2015-03-27): 2 (11%)
    6.2.x (6.2.0 released 2015-07-09): 1 (6%)
    6.3.x (6.3.0 released 2015-08-12): 1 (6%)
    6.6.x (6.6.0 released 2016-01-28): 2 (11%)

  7.x (7.0.0 released 2016-08-01): 7 (39%)
    7.0.x (7.0.0 released 2016-08-01): 1 (6%)
    7.1.x (7.1.0 released 2016-08-11): 1 (6%)
    7.2.x (7.2.0 released 2016-10-01): 4 (22%)
    7.3.x (7.3.0 released 2017-03-21): 1 (6%)

  Latest release: 7.3.1 (2017-04-26)


curation_concerns:
  apps without dependency: 21 (75%)
  apps with dependency: 7 (25%)

  git checkouts: 1 (14%)
  local path dep: 1 (14%)

  1.x (1.0.0 released 2016-06-22): 7 (100%)
    1.3.x (1.3.0 released 2016-08-03): 2 (29%)
    1.6.x (1.6.0 released 2016-09-14): 3 (43%)
    1.7.x (1.7.0 released 2016-12-09): 2 (29%)

  Latest release: 2.0.0 (2017-04-20)



qa:
  apps without dependency: 11 (39%)
  apps with dependency: 17 (61%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2013-10-04): 9 (53%)
    0.3.x (0.3.0 released 2014-06-20): 1 (6%)
    0.8.x (0.8.0 released 2016-07-07): 1 (6%)
    0.10.x (0.10.0 released 2016-08-16): 3 (18%)
    0.11.x (0.11.0 released 2017-01-04): 4 (24%)

  1.x (1.0.0 released 2017-03-22): 8 (47%)
    1.2.x (1.2.0 released 2017-06-23): 8 (47%)

  Latest release: 1.2.0 (2017-06-23)



hydra-editor:
  apps without dependency: 3 (11%)
  apps with dependency: 25 (89%)

  git checkouts: 2 (8%)
  local path dep: 0

  0.x (0.0.1 released 2013-06-13): 3 (12%)
    0.5.x (0.5.0 released 2014-08-27): 3 (12%)

  1.x (1.0.0 released 2015-01-30): 6 (24%)
    1.0.x (1.0.0 released 2015-01-30): 4 (16%)
    1.2.x (1.2.0 released 2016-01-21): 2 (8%)

  2.x (2.0.0 released 2016-04-28): 1 (4%)
    2.0.x (2.0.0 released 2016-04-28): 1 (4%)

  3.x (3.1.0 released 2016-08-09): 15 (60%)
    3.1.x (3.1.0 released 2016-08-09): 6 (24%)
    3.3.x (3.3.1 released 2017-05-04): 9 (36%)

  Latest release: 3.3.2 (2017-05-23)



hydra-head:
  apps without dependency: 1 (4%)
  apps with dependency: 27 (96%)

  git checkouts: 0
  local path dep: 0

  5.x (5.0.0 released 2012-12-11): 1 (4%)
    5.4.x (5.4.0 released 2013-02-06): 1 (4%)

  6.x (6.0.0 released 2013-03-28): 1 (4%)
    6.5.x (6.5.0 released 2014-02-18): 1 (4%)

  7.x (7.0.0 released 2014-03-31): 3 (11%)
    7.2.x (7.2.0 released 2014-07-18): 3 (11%)

  9.x (9.0.1 released 2015-01-30): 6 (22%)
    9.1.x (9.1.0 released 2015-03-06): 2 (7%)
    9.2.x (9.2.0 released 2015-07-08): 2 (7%)
    9.5.x (9.5.0 released 2015-11-11): 2 (7%)

  10.x (10.0.0 released 2016-06-08): 16 (59%)
    10.0.x (10.0.0 released 2016-06-08): 1 (4%)
    10.3.x (10.3.0 released 2016-09-02): 3 (11%)
    10.4.x (10.4.0 released 2017-01-25): 4 (15%)
    10.5.x (10.5.0 released 2017-06-09): 8 (30%)

  Latest release: 10.5.0 (2017-06-09)



hydra-core:
  apps without dependency: 1 (4%)
  apps with dependency: 27 (96%)

  git checkouts: 0
  local path dep: 0

  5.x (5.0.0 released 2012-12-11): 1 (4%)
    5.4.x (5.4.0 released 2013-02-06): 1 (4%)

  6.x (6.0.0 released 2013-03-28): 1 (4%)
    6.5.x (6.5.0 released 2014-02-18): 1 (4%)

  7.x (7.0.0 released 2014-03-31): 3 (11%)
    7.2.x (7.2.0 released 2014-07-18): 3 (11%)

  9.x (9.0.0 released 2015-01-30): 6 (22%)
    9.1.x (9.1.0 released 2015-03-06): 2 (7%)
    9.2.x (9.2.0 released 2015-07-08): 2 (7%)
    9.5.x (9.5.0 released 2015-11-11): 2 (7%)

  10.x (10.0.0 released 2016-06-08): 16 (59%)
    10.0.x (10.0.0 released 2016-06-08): 1 (4%)
    10.3.x (10.3.0 released 2016-09-02): 3 (11%)
    10.4.x (10.4.0 released 2017-01-25): 4 (15%)
    10.5.x (10.5.0 released 2017-06-09): 8 (30%)

  Latest release: 10.5.0 (2017-06-09)



hydra-works:
  apps without dependency: 13 (46%)
  apps with dependency: 15 (54%)

  git checkouts: 1 (7%)
  local path dep: 0

  0.x (0.0.1 released 2015-06-05): 15 (100%)
    0.12.x (0.12.0 released 2016-05-24): 1 (7%)
    0.14.x (0.14.0 released 2016-09-06): 2 (13%)
    0.15.x (0.15.0 released 2016-11-30): 2 (13%)
    0.16.x (0.16.0 released 2017-03-02): 10 (67%)

  Latest release: 0.16.0 (2017-03-02)



hydra-derivatives:
  apps without dependency: 2 (7%)
  apps with dependency: 26 (93%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2013-07-23): 4 (15%)
    0.0.x (0.0.1 released 2013-07-23): 1 (4%)
    0.1.x (0.1.0 released 2014-05-10): 3 (12%)

  1.x (1.0.0 released 2015-01-30): 6 (23%)
    1.0.x (1.0.0 released 2015-01-30): 1 (4%)
    1.1.x (1.1.0 released 2015-03-27): 3 (12%)
    1.2.x (1.2.0 released 2016-05-18): 2 (8%)

  3.x (3.0.0 released 2015-10-07): 16 (62%)
    3.1.x (3.1.0 released 2016-05-10): 3 (12%)
    3.2.x (3.2.0 released 2016-11-17): 7 (27%)
    3.3.x (3.3.0 released 2017-06-15): 6 (23%)

  Latest release: 3.3.2 (2017-08-17)



hydra-file_characterization:
  apps without dependency: 3 (11%)
  apps with dependency: 25 (89%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2013-09-17): 25 (100%)
    0.3.x (0.3.0 released 2013-10-24): 25 (100%)

  Latest release: 0.3.3 (2015-10-15)



hydra-pcdm:
  apps without dependency: 13 (46%)
  apps with dependency: 15 (54%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2015-06-05): 15 (100%)
    0.8.x (0.8.0 released 2016-05-12): 1 (7%)
    0.9.x (0.9.0 released 2016-08-31): 14 (93%)

  Latest release: 0.9.0 (2016-08-31)



hydra-role-management:
  apps without dependency: 17 (61%)
  apps with dependency: 11 (39%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2013-04-18): 11 (100%)
    0.2.x (0.2.0 released 2014-06-25): 11 (100%)

  Latest release: 0.2.2 (2015-08-14)



hydra-batch-edit:
  apps without dependency: 10 (36%)
  apps with dependency: 18 (64%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2012-06-15): 1 (6%)
    0.1.x (0.1.0 released 2012-12-21): 1 (6%)

  1.x (1.0.0 released 2013-05-10): 10 (56%)
    1.1.x (1.1.0 released 2013-10-01): 10 (56%)

  2.x (2.0.2 released 2016-04-20): 7 (39%)
    2.0.x (2.0.2 released 2016-04-20): 1 (6%)
    2.1.x (2.1.0 released 2016-08-17): 6 (33%)

  Latest release: 2.1.0 (2016-08-17)



browse-everything:
  apps without dependency: 3 (11%)
  apps with dependency: 25 (89%)

  git checkouts: 3 (12%)
  local path dep: 0

  0.x (0.1.0 released 2013-09-24): 25 (100%)
    0.6.x (0.6.0 released 2014-07-31): 1 (4%)
    0.7.x (0.7.0 released 2014-12-10): 1 (4%)
    0.8.x (0.8.0 released 2015-02-27): 5 (20%)
    0.10.x (0.10.0 released 2016-04-04): 5 (20%)
    0.11.x (0.11.0 released 2016-12-31): 1 (4%)
    0.12.x (0.12.0 released 2017-03-01): 2 (8%)
    0.13.x (0.13.0 released 2017-04-30): 2 (8%)
    0.14.x (0.14.0 released 2017-07-07): 8 (32%)

  Latest release: 0.14.0 (2017-07-07)



solrizer:
  apps without dependency: 1 (4%)
  apps with dependency: 27 (96%)

  git checkouts: 1 (4%)
  local path dep: 0

  2.x (2.0.0 released 2012-11-30): 1 (4%)
    2.1.x (2.1.0 released 2013-01-18): 1 (4%)

  3.x (3.0.0 released 2013-03-28): 25 (93%)
    3.1.x (3.1.0 released 2013-05-03): 1 (4%)
    3.3.x (3.3.0 released 2014-07-17): 7 (26%)
    3.4.x (3.4.0 released 2016-03-14): 17 (63%)

  4.x (4.0.0 released 2017-01-26): 1 (4%)
    4.0.x (4.0.0 released 2017-01-26): 1 (4%)

  Latest release: 4.0.0 (2017-01-26)



blacklight-access_controls:
  apps without dependency: 12 (43%)
  apps with dependency: 16 (57%)

  git checkouts: 0
  local path dep: 0

  0.x (0.1.0 released 2015-12-01): 16 (100%)
    0.5.x (0.5.0 released 2016-06-08): 1 (6%)
    0.6.x (0.6.0 released 2016-09-01): 15 (94%)

  Latest release: 0.6.2 (2017-03-28)



hydra-access-controls:
  apps without dependency: 1 (4%)
  apps with dependency: 27 (96%)

  git checkouts: 0
  local path dep: 0

  5.x (5.0.0 released 2012-12-11): 1 (4%)
    5.4.x (5.4.0 released 2013-02-06): 1 (4%)

  6.x (6.0.0 released 2013-03-28): 1 (4%)
    6.5.x (6.5.0 released 2014-02-18): 1 (4%)

  7.x (7.0.0 released 2014-03-31): 3 (11%)
    7.2.x (7.2.0 released 2014-07-18): 3 (11%)

  9.x (9.0.0 released 2015-01-30): 6 (22%)
    9.1.x (9.1.0 released 2015-03-06): 2 (7%)
    9.2.x (9.2.0 released 2015-07-08): 2 (7%)
    9.5.x (9.5.0 released 2015-11-11): 2 (7%)

  10.x (10.0.0 released 2016-06-08): 16 (59%)
    10.0.x (10.0.0 released 2016-06-08): 1 (4%)
    10.3.x (10.3.0 released 2016-09-02): 3 (11%)
    10.4.x (10.4.0 released 2017-01-25): 4 (15%)
    10.5.x (10.5.0 released 2017-06-09): 8 (30%)

  Latest release: 10.5.0 (2017-06-09)



blacklight:
  apps without dependency: 0
  apps with dependency: 28 (100%)

  git checkouts: 0
  local path dep: 0

  4.x (4.0.0 released 2012-11-30): 2 (7%)
    4.0.x (4.0.0 released 2012-11-30): 1 (4%)
    4.7.x (4.7.0 released 2014-02-05): 1 (4%)

  5.x (5.0.0 released 2014-02-05): 10 (36%)
    5.5.x (5.5.0 released 2014-07-07): 2 (7%)
    5.9.x (5.9.0 released 2015-01-30): 1 (4%)
    5.11.x (5.11.0 released 2015-03-17): 1 (4%)
    5.12.x (5.12.0 released 2015-03-24): 1 (4%)
    5.13.x (5.13.0 released 2015-04-10): 1 (4%)
    5.14.x (5.14.0 released 2015-07-02): 2 (7%)
    5.18.x (5.18.0 released 2016-01-21): 2 (7%)

  6.x (6.0.0 released 2016-01-21): 16 (57%)
    6.3.x (6.3.0 released 2016-07-01): 1 (4%)
    6.7.x (6.7.0 released 2016-09-27): 5 (18%)
    6.10.x (6.10.0 released 2017-05-17): 6 (21%)
    6.11.x (6.11.0 released 2017-08-10): 4 (14%)

  Latest release: 6.11.0 (2017-08-10)



blacklight-gallery:
  apps without dependency: 4 (14%)
  apps with dependency: 24 (86%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2014-02-05): 24 (100%)
    0.1.x (0.1.0 released 2014-09-05): 2 (8%)
    0.3.x (0.3.0 released 2015-03-18): 2 (8%)
    0.4.x (0.4.0 released 2015-04-10): 5 (21%)
    0.6.x (0.6.0 released 2016-07-07): 4 (17%)
    0.7.x (0.7.0 released 2017-01-24): 1 (4%)
    0.8.x (0.8.0 released 2017-02-07): 10 (42%)

  Latest release: 0.8.0 (2017-02-07)



blacklight_range_limit:
  apps without dependency: 24 (86%)
  apps with dependency: 4 (14%)

  git checkouts: 0
  local path dep: 0

  5.x (5.0.0 released 2014-02-11): 1 (25%)
    5.0.x (5.0.0 released 2014-02-11): 1 (25%)

  6.x (6.0.0 released 2016-01-26): 3 (75%)
    6.0.x (6.0.0 released 2016-01-26): 1 (25%)
    6.1.x (6.1.0 released 2017-02-17): 2 (50%)

  Latest release: 6.2.0 (2017-08-29)



blacklight_advanced_search:
  apps without dependency: 11 (39%)
  apps with dependency: 17 (61%)

  git checkouts: 0
  local path dep: 0

  2.x (2.0.0 released 2012-11-30): 2 (12%)
    2.1.x (2.1.0 released 2013-07-22): 2 (12%)

  5.x (5.0.0 released 2014-03-18): 9 (53%)
    5.1.x (5.1.0 released 2014-06-05): 7 (41%)
    5.2.x (5.2.0 released 2015-10-12): 2 (12%)

  6.x (6.0.0 released 2016-01-22): 6 (35%)
    6.0.x (6.0.0 released 2016-01-22): 1 (6%)
    6.1.x (6.1.0 released 2016-09-28): 2 (12%)
    6.2.x (6.2.0 released 2016-12-13): 3 (18%)

  Latest release: 6.3.1 (2017-06-15)



active-fedora:
  apps without dependency: 1 (4%)
  apps with dependency: 27 (96%)

  git checkouts: 1 (4%)
  local path dep: 0

  5.x (5.0.0 released 2012-11-30): 1 (4%)
    5.6.x (5.6.0 released 2013-02-02): 1 (4%)

  6.x (6.0.0 released 2013-03-28): 1 (4%)
    6.7.x (6.7.0 released 2013-10-29): 1 (4%)

  7.x (7.0.0 released 2014-03-31): 3 (11%)
    7.1.x (7.1.0 released 2014-07-18): 3 (11%)

  9.x (9.0.0 released 2015-01-30): 6 (22%)
    9.0.x (9.0.0 released 2015-01-30): 1 (4%)
    9.1.x (9.1.0 released 2015-04-16): 1 (4%)
    9.4.x (9.4.0 released 2015-09-03): 1 (4%)
    9.7.x (9.7.0 released 2015-11-30): 2 (7%)
    9.8.x (9.8.0 released 2016-02-05): 1 (4%)

  10.x (10.0.0 released 2016-06-08): 3 (11%)
    10.0.x (10.0.0 released 2016-06-08): 1 (4%)
    10.3.x (10.3.0 released 2016-11-21): 2 (7%)

  11.x (11.0.0 released 2016-09-13): 13 (48%)
    11.1.x (11.1.0 released 2017-01-13): 2 (7%)
    11.2.x (11.2.0 released 2017-05-18): 4 (15%)
    11.3.x (11.3.0 released 2017-06-13): 3 (11%)
    11.4.x (11.4.0 released 2017-06-28): 4 (15%)

  Latest release: 11.4.0 (2017-06-28)



active_fedora-noid:
  apps without dependency: 9 (32%)
  apps with dependency: 19 (68%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2015-02-14): 1 (5%)
    0.3.x (0.3.0 released 2015-07-14): 1 (5%)

  1.x (1.0.1 released 2015-08-06): 3 (16%)
    1.0.x (1.0.1 released 2015-08-06): 1 (5%)
    1.1.x (1.1.0 released 2016-05-10): 2 (11%)

  2.x (2.0.0 released 2016-11-29): 15 (79%)
    2.0.x (2.0.0 released 2016-11-29): 8 (42%)
    2.2.x (2.2.0 released 2017-05-25): 7 (37%)

  Latest release: 2.2.0 (2017-05-25)



active-triples:
  apps without dependency: 3 (11%)
  apps with dependency: 25 (89%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2014-04-29): 25 (100%)
    0.2.x (0.2.0 released 2014-07-01): 3 (12%)
    0.6.x (0.6.0 released 2015-01-14): 2 (8%)
    0.7.x (0.7.0 released 2015-05-14): 7 (28%)
    0.11.x (0.11.0 released 2016-08-25): 13 (52%)

  Latest release: 0.11.0 (2016-08-25)



ldp:
  apps without dependency: 6 (21%)
  apps with dependency: 22 (79%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2013-07-31): 22 (100%)
    0.2.x (0.2.0 released 2014-12-11): 1 (5%)
    0.3.x (0.3.0 released 2015-04-03): 1 (5%)
    0.4.x (0.4.0 released 2015-09-18): 4 (18%)
    0.5.x (0.5.0 released 2016-03-08): 3 (14%)
    0.6.x (0.6.0 released 2016-08-11): 6 (27%)
    0.7.x (0.7.0 released 2017-06-12): 7 (32%)

  Latest release: 0.7.0 (2017-06-12)



linkeddata:
  apps without dependency: 10 (36%)
  apps with dependency: 18 (64%)

  git checkouts: 0
  local path dep: 0

  1.x (1.0.0 released 2013-01-22): 12 (67%)
    1.1.x (1.1.0 released 2013-12-06): 7 (39%)
    1.99.x (1.99.0 released 2015-10-31): 5 (28%)

  2.x (2.0.0 released 2016-04-11): 6 (33%)
    2.2.x (2.2.0 released 2017-01-23): 6 (33%)

  Latest release: 2.2.3 (2017-08-27)



riiif:
  apps without dependency: 21 (75%)
  apps with dependency: 7 (25%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.1 released 2013-11-14): 2 (29%)
    0.2.x (0.2.0 released 2015-11-10): 2 (29%)

  1.x (1.0.0 released 2017-02-01): 5 (71%)
    1.4.x (1.4.0 released 2017-04-11): 3 (43%)
    1.5.x (1.5.0 released 2017-07-20): 2 (29%)

  Latest release: 1.5.1 (2017-08-01)



iiif_manifest:
  apps without dependency: 24 (86%)
  apps with dependency: 4 (14%)

  git checkouts: 1 (25%)
  local path dep: 0

  0.x (0.1.0 released 2016-05-13): 4 (100%)
    0.1.x (0.1.0 released 2016-05-13): 2 (50%)
    0.2.x (0.2.0 released 2017-05-03): 2 (50%)

  Latest release: 0.2.0 (2017-05-03)



pul_uv_rails:
  apps without dependency: 26 (93%)
  apps with dependency: 2 (7%)

  git checkouts: 2 (100%)
  local path dep: 0

  2.x ( released unreleased): 2 (100%)
    2.0.x ( released unreleased): 2 (100%)

  No rubygems releases



mirador_rails:
  apps without dependency: 28 (100%)
  apps with dependency: 0

  git checkouts: 0
  local path dep: 0

  Latest release: 0.6.0 (2017-08-02)



osullivan:
  apps without dependency: 27 (96%)
  apps with dependency: 1 (4%)

  git checkouts: 0
  local path dep: 0

  0.x (0.0.2 released 2015-01-16): 1 (100%)
    0.0.x (0.0.2 released 2015-01-16): 1 (100%)

  Latest release: 0.0.3 (2015-01-21)



bixby:
  apps without dependency: 26 (93%)
  apps with dependency: 2 (7%)

  git checkouts: 0
  local path dep: 0

  0.x (0.1.0 released 2017-03-30): 2 (100%)
    0.2.x (0.2.0 released 2017-03-30): 2 (100%)

  Latest release: 0.2.2 (2017-08-07)



orcid:
  apps without dependency: 27 (96%)
  apps with dependency: 1 (4%)

  git checkouts: 1 (100%)
  local path dep: 0

  0.x (0.0.1.pre released 2014-02-21): 1 (100%)
    0.9.x (0.9.0 released 2014-10-27): 1 (100%)

  Latest release: 0.9.1 (2014-12-09)

full-res pan-and-zoom JS viewer on a sufia/hyrax app

Our Digital Collections web app  is written using the samvera/hydra stack, and is currently based on sufia 7.3.

The repository currently has around 10,000 TIFF scanned page and photographic images. They are stored (for better or worse) as TIFFs with no compression, and can be 100MB and up, each. (Typically around 7500 × 4900 pixels). It’s a core use case for us that viewers be able to pan-and-zoom on the full-res images in the browser. OpenSeadragon is the premiere open source solution to this, although some samvera/hydra stack apps use other JS projects that wrap OpenSeadragon with more UI/UX, like UniversalViewer.   All of our software is deployed on AWS EC2 instances.

OpenSeadragon works by loading ‘tiles’: sub-regions of the source image, at the appropriate zoom level, for what’s in the viewport. In the samvera/hydra community it seems to be popular to use an image server that complies with the IIIF Image API as a tile source, but OpenSeadragon (OSD) can work with a variety of types of tile sources, with an easy plug-in architecture for adding your own, too.

Our team ended up spending 4 or 5 weeks investigating various options, finding various levels of suitability, before arriving at a solution that was best for us. I’m not sure if our needs and environment are different than most others in the community; if others are having better success with some solutions than we did; if others are using other solutions not investigated by us. At any rate, I share our experiences hoping to give others a head-start. It can sometimes be difficult for me/us to figure out what use cases, options, or components in the samvera/hyrax stack are mature, production-battle-tested, and/or commonly used, and which are at more of the proof-of-concept stage.

As tile sources for OpenSeadragon, we investigated, experimented with, and developed at least basic proofs of concept for: riiif, imgix.com, cantaloupe, and pre-generated “Deep Zoom Image” tiles to be stored on AWS S3. We found riiif unsuitable; imgix worked but was going to have some fairly high latency for users; cantaloupe worked pretty fine, but may take fairly expensive server resources to handle heavy load; the pre-generated DZI images actually ended up being what we chose to put into production, with excellent performance and maintainability for minimal cost.

Details on each:

Riiif

A colleague and I learned about riiif at Advanced Hydra Camp in May in Minneapolis. riiif is a Rails engine that lets you add a IIIF server to a new or existing Rails app. It was easy to set up a simple proof of concept at the workshop, and easy to incorporate authorization and access controls for your images. I left with the impression that riiif was more-or-less a community standard, and was commonly used in production.

So we started out our work assuming we would be using riiif, and got to work on implementing it.

Knowing that tile generation would likely be CPU-intensive, disk-IO-intensive, and RAM-intensive, we decided at the outset that the actual riiif IIIF image server would live on a different server than our main Rails app, so it could be scaled independently and wouldn’t have resource contention with the main app. I included the riiif stuff in the same Rails app and repo, but used some rails routing definition tricks so that our main app server(s) would refuse to serve riiif IIIF routes, and the “image server” would refuse to serve anything but IIIF routes.
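
The routing trick was roughly along these lines (a sketch of the idea, not our exact code; SERVER_ROLE is a hypothetical env var distinguishing the two deployments):

# config/routes.rb
Rails.application.routes.draw do
  if ENV["SERVER_ROLE"] == "image_server"
    # the dedicated image server only gets the riiif IIIF routes
    mount Riiif::Engine => "/image-service", as: :riiif
  else
    # main app server(s): all the normal app routes, and no riiif
    # ... the rest of the app's routes ...
  end
end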

Since the riiif image server was obviously also not on the same server as our fedora repo, and shared disk can be expensive and/or unreliable in AWS-land, riiif would be using its HTTPFileResolver to fetch the originals from fedora over HTTP. Since our originals are big and we figured this would be slow, we planned to give it enough disk space to cache all of them. We additionally worked up code to ‘ping’ the riiif server with an ‘info’ request for all of our images, forcing it to download them to its local cache on bootstrapped startup or future image uploads, thereby “pre-loading” them.
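
The “pre-load” ping itself is simple enough, something like this (the base URL and id scheme here are assumptions standing in for our real configuration):

require "net/http"

RIIIF_BASE = "https://images.example.org/image-service"

# requesting info.json forces the riiif server to fetch the original from
# fedora into its local cache, so later tile requests don't pay that cost
def preload_original(image_id)
  uri = URI("#{RIIIF_BASE}/#{image_id}/info.json")
  Net::HTTP.get_response(uri).is_a?(Net::HTTPSuccess)
end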

Later, in experiments with other tools, I think we saw that downloading even huge files from a fedora on one AWS EC2 to another EC2 on the same account is actually pretty quick (the AWS internal network is awfully fast), so this may have been a premature optimization. However, it did allow us to do some performance testing knowing that time to download originals from fedora was not a factor.

riiif worked fine in development, and even worked okay with only one user using in a deployed staging app. (Sometimes you had to wait 2-3 seconds for viewer tiles, which is not ideal, but not as disastrous as it got…).

But when we did a test with 3 or 4 (manual, human) users simultaneously opening viewers (on the same or different objects), things got very rough. People waiting 10+ seconds for tiles, or even timing out on OpenSeadragon’s default 30 second timeout.

We spent a lot of time trying to diagnose and trying different things. Among other things, we tried having riiif use GraphicsMagick instead of ImageMagick. When testing image operations individually, GM did perform around 50% faster than IM, and I recommend using it instead of IM wherever appropriate. We also tried increasing the size of our EC2 instance. We tried an m4.large, then a c4.xlarge, and then also keeping our data on a RAID-arrayed EBS trying to increase disk access speeds.   But things were still disastrous under multi-user simultaneous use.

Originally, we were using riiif additionally for generating thumbnails for our ‘show’ pages and search results pages. I had the impression from a slack conversation this was a common choice, and it certainly is convenient if you already have an image server to use it this way. (Also makes it super easy to generate multiple resolutions for responsive srcset attribute). At one point in trying to get riiif working better, we turned off riiif for thumbs, using riiif only on the viewer, to significantly reduce load on the image server. But still, no dice.

After much investigation, we saw that CPU use would often go to 99-100%, and disk IO levels were also through the roof during concurrent use tests. (RAM was actually okay).  Also doing a manual command-line imagemagick  conversion on the server at the same time as concurrent use testing, operations were seen to sometimes take 30+ seconds that would only take a few seconds on an otherwise unloaded server.  We gave up on riiif before diagnosing exactly what was going on (this kind of lower-level OS/infrastructure profiling and diagnosis is kinda hard!), but my guess is saturated disk IO.  If you look at what riiif does, this seems plausible. Riiif will do a separate shell-out to an imagemagick or graphicsmagick command line operation for every image request, if the derivative requested is not yet cached.

If you open up an OpenSeadragon viewer, OSD can start out by sending requests for a dozen+ different tiles. Each one, with a riiif-backed tile source, will result in an independent shell-out to imagemagick/graphicsmagick command line tool, with one of our 100MB+ source image TIFFs as input argument. With several people concurrently using the viewer, this could be several dozens of imagemagick shellouts, each trying to use a 100MB+ file on disk as input.  You could easily imagine this filling up even a fairly fat disk IO pipeline, and/or all of these processes fighting for access to various OS concurrency locks involved in reading from the file system, etc. But this is just hypothesis supported by what we observed, we didn’t totally nail down a diagnosis.

At some point, after spending a week+ on trying to solve this, and with deadlines looming, we decided to explore the other tile source alternatives we’ll discuss, even without being entirely sure what was going on with riiif.

It may be that other people have more success with riiif. Perhaps they are using smaller original sources; or running on AWS EC2 may have exacerbated things for us; or we just had bad luck for some as of yet undiscovered reason.

But we got curious how many people were using riiif in production, and what their image corpus looked like. We asked on samvera-tech listserv, and got only a handful of answers, and none of them were using riiif! I have an in-progress side-project I’m working on that gives some dependency-use statistics for public github repos holding hydra/samvera apps — of 20 public repos I had listed, 3 of them had riiif as a dependency, but I’m not sure how many of those are in production.   I did happen to talk to one other developer on samvera slack who confirmed they were having similar problems to ours. Still interested in hearing from anyone that is using riiif successfully in production, and if so what your source images are like, and how many concurrent viewers it can handle.

Cantaloupe

Wanting to try alternatives to riiif, the obvious choice was another IIIF server. There aren’t actually a whole lot of mature, reliable, open source IIIF server options, I think IIIF as a technology hasn’t caught on much outside of library/cultural heritage digital repositories. We knew from the samvera-tech listserv question that Loris (written in python) was a popular choice in the community, but Loris is currently looking for a new maintainer, which gave us pause.

We eventually decided on Cantaloupe, written in Java, as the best bet to try first. While its being in Java gave us a bit of concern, since nobody on the team is much of a Java dev, even without being Java experts we could tell by looking at the source code that it was impressively clean and readable. The docs are good, and reveal attention to performance issues. The Github repo has all the signs of being an active and well-maintained project.

Cantaloupe was also having a bit of performance trouble when we tried using it for show-page thumbnails (we have a thumb for every page in a work on our ‘show’ page, as in default sufia/hyrax, and our works can have 200+ pages). So we decided fairly early on that we’d just keep using a pre-generated derivative for thumbs, and stick to our priority use case, the viewer, in our tests of all the alternatives from here on out.

And then, Cantaloupe did pretty well, so long as it had a powerful and RAM-ful enough EC2.

Cantaloupe lets you configure caching separately for originals and derivatives, but even with no caching turned on, it was somehow doing noticeably better than riiif. Max 1-2 second wait times even with 3-4 simultaneous viewers. I’m not really sure how it pulled off doing better, but it did!

Our largest image is a whopping 1GB TIFF. When we asked cantaloupe to generate derivatives for it, it unfortunately crashed with a Java OOM, and was then unresponsive until manually restarted. We gave the server and cantaloupe more RAM, and then it handled that fine too (although our EC2 was getting more expensive). We hadn’t quite figured out how to define how many simultaneous viewers we needed to support, and how much EC2 was necessary to do that, before moving on to other alternatives.

We started out running cantaloupe on an EC2 c4.xlarge (4 core Xeon E5 and 7.5 GB RAM), and ended up with a m4.2xlarge (8 core and 32 GB RAM) which could handle our 1G image, and generally seemed to handle load better with lower latency.  We used the JAI image processor for Cantaloupe. (It seemed to perform better than the Java2D processor; Cantaloupe also supports ImageMagick or GraphicsMagick as a processor, but we didn’t get to trying those out much).

Auth

If you’re not using riiif, and have images meant to be only available to certain/all logged-in users, you need to think about auth. With any external image server, you could do auth by proxying all access through your rails app, which would check auth in the usual way. But I worry about taking up web worker processes/threads in the main app with dozens of image requests. It would be better to keep the servers entirely separate.

There might also be a way to have apache/nginx proxy directly, rather than going through the rails app, which would make me more comfortable, but you’d have to figure out how to use a custom auth plugin for apache or nginx.

Cantaloupe also has the very nice option of writing your own custom auth in ruby (even though the server is Java; thanks JRuby!), so we could actually check the existing Rails session (just make sure the cantaloupe server knows your Rails secret key, and figure out the right classes to call to decrypt and load data from the session cookie), and then Fedora/Solr to check auth in the usual samvera way.

Any live checking of auth before delivering an image tile is of course going to increase image response latency.

These were the options we thought of, but we didn’t get to any of them before ultimately deciding to choose pre-generated tile images.

However, Cantaloupe was definitely our second choice — and would be our first choice if we really did need a full IIIF server. It for sure looked like it could have worked well, although at potentially somewhat expensive AWS charges.

Imgix.com

Imgix.com is a commercial cloud-hosted image server.  I had a good opinion of it from using it for thumbnail-generation on ecommerce projects while working at Friends of the Web last year.  Imgix pricing is pretty affordable.

Imgix does not conform to IIIF API, but it does do pretty much all the image operations that IIIF can do, plus more. Definitely everything we needed for an OpenSeadragon tile source.

I figured, let’s give it a try, get out of the library/cultural-heritage silo, and use a popular, successful, well-regarded commercial service operating in the general web app space.

OpenSeadragon can not use imgix.com as a tile source out of the box, but OSD makes it pretty easy to write your own tile source. In a day I had a working proof of concept for an OSD imgix.com tile source, and in a couple more had filed off all the rough edges.

It totally worked. But. It was slow.  Imgix.com was willing to take our 100MB TIFF sources, but it was clear this was not really the use case it was designed for.  It was definitely slow downloading our original sources from our web app–the difference, I guess, between downloading directly from fedora on the same AWS subnet, and downloading via our Rails app from who knows where. (I did have to make a bugfix/improvement to samvera code to make sure HTTP headers were delivered quicker for a streaming download, to avoid timing out imgix. Once that was done, no more imgix timeout problems).  We tried pinging it to “pre-load” all originals as we had been doing with riiif — but as a cloud service, and one not expecting originals to be so huge, we had no control over when imgix purged originals from cache, and we found it did sometimes purge not-recently-accessed originals fairly quickly.

Also imgix has a (not really unreasonable) 512MB max for original images; our one 1G TIFF was not gonna be possible (and of course, that’s the one you really want a pan-and-zoom viewer for, it’s huge!).

On the plus side:

  • with imgix, we don’t need to worry about the infrastructure at all, it’s outsourced. We don’t need to plan some redundancy for max uptime or scaling for heavy use, they’re already doing it.
  • The response times are unlikely to change under heavy use, it’s already a cloud-scale service designed for heavy use.
  • Can handle the extra load of using it for thumbs too, just as well as it can for viewer tiles.
  • Our cost estimates had it looking cheaper (by 50%+) than hosting our own Cantaloupe on an EC2.
  • Once originals and derivatives being accessed (say tiles for a given viewer) were cached, it was lightning fast, reliably just 10s of ms for a tile image. But again, you have no control over how long these stay in cache before being purged.

For non-public images, imgix offers signed-and-expiring URLs.  The downside of these is you kind of lose HTTP cacheability of your images. And imgix.com doesn’t provide any way to auth imgix to get your originals, so if they’re not public you would have to put in some filters recognizing imgix.com IP addresses (which are subject to change, although they’re good at giving you advance notice), and let them in to private images.

But ultimately the latency was just too high. For images where the originals were cached but not the derivatives, it could take 1-4 seconds to get our tile derivatives; if the originals were not cached, it could take 10 seconds or more. (One option would be trying to give it a much smaller lzw- or zip-compressed TIFF as a source, instead of our uncompressed originals, cutting down the transfer time for fetching originals. But I think this would be unlikely to improve latency sufficiently, and we moved on to pre-generated DZI. We would need to give imgix a lossless full-res original of some kind, cause full-res zoom is the whole goal here!)

I think imgix is potentially a workable last resort (and I still love it for creating thumbs for more reasonably sized sources), but it wasn’t as good an option as other alternatives explored for this use case, a tile source for enormous TIFFs.

Pre-Generated Deep Zoom Tiles

Eventually we came back to an earlier idea we had originally considered, but then assumed would be too expensive and abandoned/forgot about. When I realized Cantaloupe was recommending pyramidal TIFFs, which require some preprocessing/prerendering anyway, why not go all the way and preprocess/prerender every tile, and store them somewhere (say, cheap S3)? OpenSeadragon has a number of tile sources it supports that are or can be pre-generated images, including the first format it ever supported, Deep Zoom Image (file suffix .dzi). (I had earlier done a side-project using OpenSeadragon and Deep Zoom tiles to put the awesome Beehive Collective Mesoamerica Resiste poster online).

But then we learned about riiif and it looked cool and we got on that tip, and kind of assumed pre-generating so many images would be unworkable. It took us a while to get back to exploring pre-generated Deep Zoom tiles.  But it actually works great (of course we had learned a lot of domain knowledge about manipulating giant images and tile sources at this point!).

We use vips (rather than imagemagick or graphicsmagick) to generate all the tiles. vips is really optimized for speed, CPU and RAM usage, and if you’re creating all the tiles at once vips can read the source image in once and slice it up into tiles.  We do this as a background job, that we have running on a different server than the Rails app; the built-in sufia bg jobs still run on our single rails app server. (In sufia 7.3, out of the box they can’t run on a different server without a shared file system; I think this may have been improved in as-yet-unreleased-hyrax-master).
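
The heart of the tile generation is a single vips invocation, roughly like this (the paths are placeholders; vips dzsave writes a .dzi manifest plus a _files/ directory of tiles next to the output basename):

require "tty-command"

# local_tiff: the original fetched down from fedora
# dzi_base: output basename, e.g. "tmp/dzi/#{file_set_id}"
#   => produces tmp/dzi/<id>.dzi and tmp/dzi/<id>_files/...
TTY::Command.new(printer: :null).run("vips", "dzsave", local_tiff, dzi_base)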

We hook into the sufia/hyrax actor stack to trigger the bg job on file upload. A small EC2  (t2.medium 4 GB RAM, 2 core CPU) with five resque workers can handle the dzi creation much faster than the existing actor stack can trigger them when doing a bulk ingest (the existing actor stack is slow, and the dzi creation can’t happen until the file is actually in fedora, so that the job can retrieve it from fedora to make it into dzi tiles. So DZI’s can’t be generated any faster than sufia can do the ingests no matter what).  The files are uploaded to an S3 bucket.

We also provide a rake task to create the .dzi files for all Files in our fedora repo, for initial bootstrapping or if corruption ever needs to be fixed, etc. For our 8000-file staging server, running the creation task on our EC2 t2.medium, it takes around 7 hours to create and upload them all to S3 (I use some multi-threaded concurrency in the uploading), and results in ~3.2 million S3 keys taking up approx 32GB.
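
The multi-threaded upload is nothing fancy; a sketch of the idea, assuming the aws-sdk-s3 gem, AWS credentials/region configured via the usual ENV vars, a local directory of generated tiles, and a made-up bucket name:

require "aws-sdk-s3"

THREAD_COUNT = 8
bucket = Aws::S3::Resource.new.bucket("my-dzi-bucket")

queue = Queue.new
Dir.glob("tmp/dzi/**/*").select { |f| File.file?(f) }.each { |f| queue << f }

threads = THREAD_COUNT.times.map do
  Thread.new do
    loop do
      path = begin
        queue.pop(true)        # non-blocking pop raises ThreadError when empty
      rescue ThreadError
        break
      end
      key = path.sub("tmp/dzi/", "")
      bucket.object(key).upload_file(path)
    end
  end
end
threads.each(&:join)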

Believe it or not, this is actually the cheapest option, taking account of S3 storage and our bg jobs EC2 instance for dzi creation (that we’ll probably try to move more bg jobs to in the future). Cheaper than imgix, cheaper than our own Cantaloupe on an EC2 big enough to handle it.

If you have 800K or 8 million images instead of 8,000, it’ll get more complicated/expensive. But S3 is so cheap, and a spot-priced temporary fleet of EC2s to do a mass dzi-creation ingest is affordable enough, that you might be surprised how affordable it is. Alas, fedora makes it a lot less straightforward to parallelize ingest than a more conventional stack would, but there’s probably some way to do it. Unless/until fedora itself becomes your bottleneck. There are costs to our stack.

It handles our 1GB original source just fine (it takes about 30 seconds to create all the tiles for that one). It’s also definitely fast for the end-user: once the tiles are pre-generated, it’s just a request from S3, which I’m seeing take 40-80ms in Chrome Dev Tools. Under really heavy load (I’m guessing 100+ users concurrently using the viewer), or to reduce latency beyond that 40-80ms, the standard approach would be to put a CDN in front of S3, probably either Amazon’s own CloudFront or CloudFlare. This should be simple and affordable. But since this is already reliably faster than any of our other options, and can handle way more concurrent load without a CDN than our other options could, we aren’t worrying about it for now. When/if we want to add a CDN, it oughta only be a couple clicks and a hostname change to implement.

And of course, there’s very little server maintenance to deal with, once the files are generated, they’re just static assets on S3, there’s nothing to “crash” really. (Well, except S3 itself, which happens very occasionally. If you wanted to be very safe, you’d mirror your S3 bucket to another AWS region or another cloud provider). Just one pretty standard and easily-scalable bg-job-running server for creating the DZIs on image upload.

We’re still punting on auth for now. Which talking on slack channel, seems to be a common choice with auth and IIIF image servers. One dev told me they just didn’t allow non-public images to be viewed in the image viewer (or via their image server) at all, which I guess works if your non-public images are all really just in-progress-in-workflow only viewable to staff.  As is true for us here. Another dev told me they just don’t worry about it, no links will be generated to non-public images, but they’ll be there via the image server without auth — which again works if your non-public images aren’t actually sensitive or legally un-shareable, they’re just in-process-not-quite-ready. Which is also true for us, for now anyway.  (I would not ever count on “nobody knows the URL”-based-security for actual sensitive or legally un-shareable content, for anything where it actually matters if someone happens to come across it. For our current and foreseeable future content, it doesn’t really. knock on wood. It does make me nervous!).

There are some auth options with the S3 approach, read about them as well as some internal overview documentation of what we’ve done in our code here, or see our PR  for initial implementation of the pre-generated-DZI-on-S3 approach for our complete solution.  Pre-generated DZI on S3 is indeed the approach we are going with.

IIIF vs Not

Some readers may have noticed that two of the alternatives we examined are not IIIF servers, and the one we ended up with — pre-generated DZI tiles — is not a dynamic image server at all. You may be reacting with shocked questions: You can do that? But what are you missing? Can you still use UniversalViewer? What about those other IIIF things?

Well, the first thing we miss is truly dynamic image generation. We instead need to pre-generate all the image derivatives we need upon image upload, including the DZI tiles. If we wanted a feature like, say, the user enters a number of pixels and we deliver a JPG scaled to that user-specified width, a dynamic image server would be a lot more convenient. But I only thought of that feature when brainstorming for things that would be hard without a dynamic image server; it’s not something we are likely to prioritize. For thumbs and downloads at various preset sizes, pre-generating should work just fine with regards to performance and cost, especially with a bg job running on its own jobs server and derivatives stored on S3 (neither happens out of the box in sufia, but both may in latest unreleased-master hyrax).

So, UniversalViewer. UniversalViewer uses OpenSeadragon to do the actual pan-and-zoom viewing. Mirador seems to as well. I think OpenSeadragon is pretty much the only viable open source JS pan-and-zoom viewer, which is fine, because OSD is pretty great. UV, I believe, just wraps OSD in additional UI/UX, with extra features like table-of-contents viewing, downloads, etc.

We decided, even when we were still planning on using riiif, to not use UniversalViewer but instead develop directly with OpenSeadragon. Some of the fancier UV features we didn’t really need right now, and it was unclear if it would take more time to customize UV UX to our needs, or just develop a relatively light-weight UI of our own on top of OSD.

As these things do, our UI took slightly longer to develop than estimated, but it was still relatively quick, and we’re pretty happy with what we’ve got. It is still subject to change as we continue to user-test — but developing our own gives us a lot more control of the UI/UX to respond to that feedback. It later paid off in useful non-visual UX ways too: in our DZI implementation, we put something in our front-end that, if the dzi file is not available on S3, automatically degrades to a smaller not-very-zoomable image with an apology/warning message. I’m not sure I would have wanted to try to hack that into UV.

So, using OpenSeadragon directly, we don’t need to give it an IIIF Image API URL; we can give it anything it handles (or that you write a plugin for), and it works just fine. No code changes were necessary except giving it a URL pointing to a different thing; it required no extra work in our front-end to use DZI instead of IIIF. (We did do some extra work to add feature toggles so we could switch between the various back-ends easily.) The format of your tile source, so long as OSD can handle it, is a very loosely coupled dependency.

But what if you want to use UV or Mirador? (And we might in the future ourselves, if we need features they provide and we discover they are non-trivial to develop in our homegrown UI).  They take IIIF as input, right?

To be clear, we need to distinguish between the IIIF Image API (the one where a server provides image derivatives on demand), and the IIIF Manifest spec. The Manifest spec, part of the IIIF Presentation API, defines a JSON-ld file that “represents a single object and any intellectual work or works embodied within that object…  includes the descriptive, rights and linking information for the object… embeds the sequence(s) of canvases that should be rendered to the user.”

It’s the IIIF Manifest that is input to UV or Mirador. Normally these tools would extract one or more IIIF Image API URLs out of the Manifest, and just hand them to OpenSeadragon. Do they do anything else with an IIIF Image API url except hand it to OSD? I’m not sure, but I don’t think so. So if they just handed any other URI that OSD can handle to OSD, it should work fine? I think so.

An IIIF Manifest doesn’t actually need to include an IIIF Image API url. “If a IIIF Image API service is available for the image, then a link to the service’s base URI should be included.” If. And an IIIF Manifest can include any other sort of image resource, from any external service, identified by a URI in its @context field. So you can include a link to the .dzi file in the IIIF Manifest, completely legally: the same IIIF Manifest you’d produce otherwise, just with a .dzi link instead of an IIIF Image API link. You’d just have to choose a @context URI to identify it as a DZI. Perhaps `https://msdn.microsoft.com/en-us/library/cc645077(VS.95).aspx`, although that might not be the most reliable URI identifier. But, really, we could be just as standards-compliant as ever and put a DZI URL in the IIIF Manifest instead of an IIIF Image API URL.

Of course, as with all linked data work, standards-compliant doesn’t actually make it interoperable. We need mutually-recognizable shared vocabulary. UV or Mirador would have to recognize the image resource URL supplied in the Manifest as being a DZI, or at any rate at least as something that can be passed to OSD. As far as I know UV or Mirador won’t do this now. It should probably be pretty trivial to get them to, though, perhaps by supporting configuration for “recognize this @context uri as being something you can pass to OSD.”  If we in the future have need for UV/Mirador (or IIIF Manifests), I’d look into getting them to do that, but we don’t right now.

What about these other tools that take IIIF Manifests and aggregate images from different sites?  Probably the same deal, they just gotta recognize an agreed-upon identifier meaning “DZI format”, and know they can pass such to OpenSeadragon.

I’m not sure if any such tools currently exist used for real work or even real recreation, rather than as more of a demo. I’ll always choose to achieve greatness rather than mediocrity for our current actual real prioritized use cases, above engineering for a hoped-for-but-uncertain future. Of course, when you can do both without increasing expense or sacrificing quality for your present use cases, that’s great too, and it’s always good to keep an eye out for those opportunities.

But I’m feeling pretty good about our DZI choice at the moment. It just works so well, cheaply, with minimal expected ongoing maintenance, compared to other options — and works better for end-users too, with reliable nearly instantaneous delivery of tiles even under heavy load. Now, if you have a lot more images than us, the cost-benefit calculus may end up different. Especially because a dynamic image server scales (gets more expensive) with number of concurrent users/requests more or less regardless of number of images, while the pre-gen DZI solution gets more expensive with more images more or less regardless of concurrent request level. If you have a whole lot of images (say two orders of magnitude bigger than our 10K), your app typically gets pretty low use (and you don’t care about it supporting lots of concurrent use), and maybe additionally if your original source images aren’t nearly as large as ours, pre-gen DZI might not be such a sweet spot. However, you might be surprised, pre-generated DZI is in the end just such a simple solution, and S3 storage is pretty affordable.

On hooking into sufia/hyrax after a file has been uploaded

Our app (not yet publicly accessible) is still running on sufia 7.3. (A digital repository framework based on Rails, also known in other versions or other drawings of lines as hydra, samvera, and hyrax).

I had a need to hook into the point after a file has been added to fedora, to do some post-processing at that point.

(Specifically, we are trying to run a riiif instance on another server, without a shared file system (shared FS are expensive and/or tricky on AWS). So, the riiif server needs to copy the original image asset down from fedora. Since our original images are uncompressed TIFFs that average around 100MB, this is somewhat slow, and we want to have the riiif server “pre-load” at least the originals, if not the derivatives it will create. So after a new image is uploaded, we want to ‘ping’ the riiif server with an info request, causing it to download the original, so it’s there waiting for conversion requests, and at least it won’t have to do that. But it can’t pull down the file until it’s in fedora, so we need to wait until after fedora has it to ping. phew.)
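
To make that concrete, here’s a rough sketch of the kind of “ping” I have in mind (not our actual code): just request the IIIF info.json for the image from the riiif server, which should cause it to fetch and cache the original from fedora. The riiif base URL, the route, and how you derive the image id are placeholders you’d adjust for your own setup.

require "net/http"

# Best-effort pre-load: ask the riiif server for info.json, so it pulls the
# original down from fedora and caches it before any real conversion requests.
def preload_riiif_original(image_id, riiif_base: "http://riiif.internal.example.com")
  uri = URI("#{riiif_base}/image-service/#{image_id}/info.json") # route is an assumption
  Net::HTTP.get_response(uri)
rescue StandardError => e
  warn "riiif preload ping failed (ignoring): #{e.message}"
end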

Here are the cooperating objects in Sufia 7.3 that lead to actual ingest in Fedora. As far as I can tell. Much thanks to @jcoyne for giving me some pointers as to where to look to start figuring this out.

Keep in mind that I believe “actor” is just hydra/samvera’s name for a service object involved in handling ‘update a persisted thing’. Don’t get it confused with the concurrency notion of an ‘actor’, it’s just an ordinary fairly simple ruby object (although it can and often does queue up an ActiveJob for further processing).

The sufia default actor stack at ActorFactory includes the Sufia::CreateWithFilesActor.

  • AttachFilesToWork job does some stuff, but then calls out to CurationConcerns::Actors::FileSetActor#create_content (we are using curation_concerns 1.7.7 with sufia 7.3) — at least if it was a direct file upload, which I think is what this means. If the file was a `CarrierWave::Storage::Fog::File` (I’m not totally sure in what circumstances it would be), it instead kicks off an ImportUrlJob. But we’ll ignore that for now; I think the FileSetActor is the one my codepath is following.

  • We are using hydra-works 0.16.0. AddFileToFileSet I believe actually finishes things off synchronously without calling out to anything else related to ‘get this thing into fedora’. Although I don’t really totally understand what the code does, honestly.
    • It does call out to Hydra::PCDM::AddTypeToFile, which is confusingly defined in a file called add_type.rb, not add_type_to_file.rb. (I’m curious how that doesn’t break things terribly, but didn’t look into it).

So in summary, we have six cooperating objects involved in following the code path of “how does a file actually get added to fedora.” They go across 3-4 different gems (sufia, curation_concerns, hydra-works, and maybe hydra-pcdm, although that one might not be relevant here). Some of the classes involved inherit from, mix in, or have references to classes from other gems. The path involves at least two (sometimes more, in some paths?) bg jobs — a bg job that queues up another bg job (and maybe more).

And that’s just trying to follow the path involved in “get this uploaded file into fedora.” Some of those cooperating objects also call out to other cooperating objects (and maybe queue bg jobs?) to do other things, involving a half-dozen-ish additional cooperating objects and maybe one or two more gem dependencies, but I didn’t trace those; this was enough!

I’m not certain how much this changed in hyrax (1.0 or 2.0), at the very least there’d be one fewer gem dependency involved (since Sufia and CurationConcerns were combined into Hyrax). But I kind of ran out of steam for compare and contrast here, although it would be good to prepare for the future with whatever I do.

Oh yeah, what was I trying to do again?

Hook into the point “after the thing has been successfully ingested in fedora” and put some custom code there.

So… I guess… that would be hooking into ::IngestFileJob (located in CurationConcerns), and doing something after it’s completed. It might be nice to use the ActiveJob#after_perform hook for this. I actually hadn’t known about that callback, and haven’t used it before — we’d need to get at least the file_set arg passed into it, which the docs say maybe you can get from the passed-in job.arguments? That’s a weird way to do things in ruby (why don’t ActiveJob instances just expose their arguments as ordinary instance state? I dunno), but okay! Or, of course, we could just monkey-patch override-and-call-super on perform to get a hook.
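
Something like this is what I have in mind (a sketch, not tested code). It assumes the file_set is the first argument to IngestFileJob in the version we’re on, which is worth verifying, and RiiifPreloader is just a hypothetical stand-in for whatever custom post-ingest code we actually want to run:

# e.g. in an initializer (or a to_prepare block, so it survives code reloading)
::IngestFileJob.class_eval do
  # ActiveJob callback: runs after perform completes without raising.
  # job.arguments gives the arguments perform was originally called with.
  after_perform do |job|
    file_set = job.arguments.first # assumption: file_set is the first argument
    RiiifPreloader.ping(file_set)  # hypothetical local class, our custom code
  end
end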

Or we could maybe hook into Hydra::Works::AddFileToFileSet instead; I think that does the actual work. There are no callbacks there, so that’d just be monkey-patch-and-call-super on #call, I guess.
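
Again, just a sketch of the shape that might take, using Module#prepend rather than alias-method-chain so the override stays easy to see. It assumes .call is a class-level method on Hydra::Works::AddFileToFileSet in hydra-works 0.16; if it’s an instance method in your version, prepend to the class itself rather than its singleton class:

module AddFileToFileSetPostHook
  def call(*args)
    result = super # let hydra-works do its normal work first
    # ... custom "the file should now be in fedora" code would go here ...
    result
  end
end

Hydra::Works::AddFileToFileSet.singleton_class.prepend(AddFileToFileSetPostHook)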

This definitely seems a little bit risky, for a couple different reasons.

  • There’s at least one place where a potentially different path is followed, if you’re uploading a file that ends up as a CarrierWave::Storage::Fog::File instead of a CarrierWave::SanitizedFile.  Maybe there are more I missed? So configuration or behavior changes in the app might cause my hook to be ignored, at least in some cases.

  • Forward-compatibility seems unreliable. Will this complicated graph of cooperating instances get refactored? Has it already, in later versions of Hyrax? If it gets refactored, will it mean the object I hook into no longer exists (not even with a different namespace/name), or exists but isn’t called in the same way? In some of those failure modes, it might be an entirely silent failure where no error is ever raised; the code I’m trying to insert just never gets called. Which is sad. (Sure, one could try to write a spec for this to warn you… think about how you’d do that. I still am.) Between IngestFileJob and AddFileToFileSet, is one ‘safer’ to hook into than the other? Hard to say. If I did research in the hyrax master branch, it might give me some clues.

I guess I’ll probably still do one of these things, or find another way around it. (A colleague suggested there might be an entirely different place to hook into instead, not the ‘actor stack’, but maybe in other code around the controller’s update action).

What are the lessons?

I don’t mean to cast any aspersions on the people who put in a lot of work, very well-intentioned work, conscientious work, to get hydra/samvera/sufia/hyrax where it is, being used by lots of institutions. I don’t mean to say that I could or would have done differently if I had been there when this code was written — I don’t know that I could or would have.

And, unfortunately, I’m not saying I have much idea of what changes to make to this architecture now, in the present environment: with regard to backwards compat, and with regard to the fact that I’m still on code one or two major versions (and a name change) behind current development (which makes the local benefit from any work I put into careful upstream PRs a lot more delayed, for a lot more work; I’m not alone here, there’s a lot of dispersion in which versions of these shared dependencies people are using, which adds a lot of cost to our shared development). I really don’t! My brain is pretty tired after investigating what it’s already doing. Trying to make a shared architecture which is easily customizable like this is hard, no way around it. (ActiveSupport::Callbacks are trying to do something pretty analogous to the ‘actor stack’, and are one of the most maligned parts of Rails.)

But I don’t think that should stop us from some evaluation. Going forward, making architecture that works well for us is aided immensely by understanding how what we’ve done before has actually worked out.

If the point of the “Actor stack” was to make it easy/easier to customize code in a safe/reliable way (meaning reasonably forward-compatible)–and I believe it was–I’m not sure it can be considered a success. We gotta start with acknowledging that.

Is it better than what it replaced? I’m not sure, I wasn’t there for what it replaced. It’s probably harder to modify in the shared codebase going forward than the presumably simpler thing it replaced, though… I can say I’d personally much rather have just one or two methods, or one ActiveJob, that I just hackily monkey-patch to do what I want: something that, if it breaks in a future version, will break in a simple way, and that takes less time and brain to figure out what’s going on anyway. That wouldn’t be a great architecture, but I’d prefer it to what’s there now, I think. Of course it’s a pendulum, and the grass is always greener; if I had that, I’d probably be wanting something cleaner, and maybe arrive at something like the ‘actor stack’ — but we’re all here now with what we’ve got, so we can at least consider that this may have gone in some unuseful directions.

What are those unuseful directions? I think, not just in the actor stack but in many parts of hydra, there’s an ethos that breaking things into many very small single-purpose classes/instances, then wiring them all together, is the way to go. Ideally with lots of dependency injection so you can swap ’em in and out. This reminds me of what people often satirize and demonize in stereotypical maligned Java community architecture, and there’s a reason it’s satirized and demonized. It doesn’t… quite work out.

To pull this off well — especially in a shared library/gem codebase, which I think has different considerations than a local bespoke codebase, mainly that API stability is more important because you can’t just search-and-replace all consumers in one codebase when an API changes — you’ve got to have fairly stable APIs, which are also consistent, easily comprehensible, and semantically reasonable. So you can replace or modify one part, and have some confidence you know what it’s doing, when it will be called, and that it will keep doing this for at least a few months of future versions. To have fairly stable and comfortable APIs, you need to actually design them carefully, and think about developer use cases. How are developers intended to intervene in here to customize? And you’ve got to document those. And those use cases also give you something to evaluate later — did it work for those use cases?

It’s just not borne out by experience that if you make everything into the smallest single-purpose classes possible and throw them all together, you’ll get an architecture which is infinitely customizable. You’ve got to think about the big picture. Simplicity matters, but simplicity of the architecture may be more important than simplicity of the individual classes. Simplicity of the API is definitely more important than simplicity of the internal non-public implementation.

When in doubt, if you’re not sure you’ve got a solid, stable, comfortable API, fewer cooperating classes with clearly defined interfaces may be preferable to more classes that each have only a few lines. In this regard, rubocop-based development may steer us wrong: too much attention to the micro, not enough to the forest.

To do this, you’ve got to be careful, and intentional, and think things through, and consider developer use cases, and maybe go slower and support fewer use cases. Or you wind up with an architecture that not only does not easily support customization, but is very hard to change or improve. Cause there are so many interrelated, coupled, cooperating parts, and changing any of them requires changes to lots of them, and breaks lots of dependent code in local apps in the process. You can actually end up making forwards-compatible code harder to write, not easier.

And this gets even worse when the cooperating objects in a data flow are spread across multiple gem dependencies, as they often are in the hydra/samvera stack. If a change in one requires a change in another, now you’ve got dependency compatibility nightmares to deal with too, making it even harder (rather than easier, as was the original goal) for existing users to upgrade to new versions of dependencies, as well as harder to maintain all these dependencies. It’s a nice idea, small dependencies which can work together — but again, it only works if they have very stable and comfortable APIs. Which again requires care and consideration of developer use cases. (Just as the Java community gives us a familiar cautionary lesson about over-architecture, I think the Javascript community gives us a familiar cautionary lesson about ‘dependency hell’. The path to code hell is often paved with good intentions.)

The ‘actor stack’ is not the only place in hydra/samvera that suffers from some of these challenges, as I think most developers in the stack know.  It’s been suggested to me that one reason there’s been a lack of careful, considered, intentional architecture in the stack is because of pressure from institutions and managers to get things done, why are you spending so much time without new features?  (I know from personal experience this pressure, despite the best intentions, can be even stronger when working as a project-based contractor, and much of the stack was written by those in that circumstance).

If that’s true, that may be something that has to change. Either a change to those pressures — or resisting them by not doing rearchitectures under those conditions. If you don’t have time to do it carefully, it may be better not to commit the architectural change and new API at all.  Hack in what you need in your local app with monkey-patches or other local code instead. Counter-intuitively, this may not actually increase your maintenance burden or decrease your forward-compatibility!  Because the wrong architecture or the wrong abstractions can be much more costly than a simple hack, especially when put in a shared codebase. Once a few people have hacked it locally and seen how well it works for their use cases, you have a lot more evidence to abstract the right architecture from.

But it’s still hard!  Making a shared codebase that does powerful things, that works out of the box for basic use cases but is still customizable for common use cases, is hard. It’s not just us. I worked last year with spree/solidus, which has an analogous architectural position to hydra/samvera, also based on Rails, but in ecommerce instead of digital repositories. And it suffers from many of the same sorts of problems, even leading to the spree/solidus fork, where the solidus team thought they could do better… and they have… maybe… a little.  Heck, the challenges and setbacks of Rails itself can be considered similarly.

Taking account of this challenge may mean scaling back our aspirations a bit, and going slower. It may not be realistic to think you can be all things to all people. It may not be realistic to think you can make something that can be customized safely by experienced developers and by non-developers just writing config files (that last one is a lot harder). Not every use case a participant or would-be participant has will be able to be officially or comfortably supported by the codebase. Use cases and goals have to be identified, lines have to be drawn. Which means there has to be a decision-making process for who draws them and how they are drawn, re-drawn, and adjudicated too, whether that’s a single “benevolent dictator” person or institution like many open source projects have (for good or ill), or something else. (And it’s still hard to do that, it’s just that there’s no way around it.)

And finally, the touchiest evaluation of all for the hydra/samvera project: the hydra project is 5-7 years old, long enough to evaluate some basic premises. I’m talking about the twin closely related requirements which have been more or less assumed by the community for most of the project’s history:

1) That the stack has to be based on fedora/fcrepo, and

2) that the stack has to be based on native RDF/linked data, or even coupled to RDF/linked data at all.

I believe these were uncontroversial assumptions rather than entirely conscious decisions (edit 13 July, this may not be accurate and is a controversial thing to suggest among some who were around then. See also @barmintor’s response.), but I think it’s time to look back and wonder how well they’ve served us, and I’m not sure it’s well.  A flexible powerful out-of-the-box-app shared codebase is hard no matter what, and the RDF/fedora assumptions/requirements have made it a lot harder, with a lot more uncharted territory to traverse, best practices to be invented with little experience to go on, more challenging abstractions, less mature/reliable/performant components to work with.

I think a lot of the challenges and breakdowns of the stack are attributable to those basic requirements — I’m again really not blaming a lack of skill or competence of the developers (and certainly not a lack of good intentions!). Looking at the ‘actor stack’ in particular: it would need to do much simpler things if it was an ordinary ActiveRecord app with paperclip (or better yet shrine), it would be able to lean harder on mature shared-upstream paperclip/shrine to do common file handling operations, it would have a lot less code in it, and less code is always easier to architect and improve than more code. And meanwhile, the actually realized business/institutional/user benefits of these commitments — now after several+ years of work put into it — are still unclear to me. If this is true, or becomes a consensus view, and an evaluation of the fedora/rdf commitments and foundation does not look kindly upon them… where does that leave us, with what options?

On open source, consensus, vision, and scope

Around minute 27 of Building Rails ActionDispatch::SystemTestCase Framework from Eileen Uchitelle.

What is unique to open source is that the stakeholders you are trying to find consensus with have varying levels of investment in the end result…

…but I wasn’t prepared for all the other people who would care. Of course caring is good, I got a lot of productive and honest feedback from community members, but it’s still really overwhelming to feel like I needed to debate — everyone.

Rails ideologies of simplicity differ a lot from capybara’s ideology of lots of features. And all the individuals who were interested in the feature had differing opinions as well… I struggled with how to respect everyone’s opinions while building system tests, but also maintaining my sense of ownership.

I knew that if I tried to please all groups and build systems tests by consensus, then I would end up pleasing no one. Everyone would end up unhappy because consensus is the enemy of vision. Sure, you end up adding everything everyone wants, but the feature will lose focus, and the code will lose style, and I will lose everything that I felt like was important.

I needed to figure out a way to respect everyone’s opinions without making systems tests a hodgepodge of ideologies, or feeling like I threw out everything I cared about. I had to remind myself that we all had one goal: to integrate systems testing into rails. Even if we disagreed about the implementation, this was our common ground.

With this in mind, there are a few ways you can keep your sanity when dealing with multiple ideologies in the open source world. One of the biggest things is to manage expectations. In open source there are no contracts, you can’t hold anyone else accountable (except for yourself) and nobody else is going to hold you accountable either… You are the person who has to own the scope, and you are the person who has to say ‘no’. There were a ton of extra features suggested for systems tests that I would love to see, but if I had implemented all of them it still wouldn’t be in rails today. I had to manage the scope and the expectations of everyone involved to keep the project in budget…

…When you are building open source features, you are building something for others. If you are open to suggestions the feature might change for the better. Even if you don’t agree, you have to be open to listening to the other side of things. It’s really easy to get cagey about the code that you’ve worked so hard to write. I still have to fight the urge to be really protective of systems test code… but I also have to remember that it’s no longer mine, and never was mine, it now belongs to everyone that uses Rails….

I knew that if I tried to please all groups and build systems tests by consensus, then I would end up pleasing no one. Everyone would end up unhappy because consensus is the enemy of vision. Sure, you end up adding everything everyone wants, but the feature will lose focus, and the code will lose style…

On the graphic design of rubyland.news

I like to pay attention to design, and enjoy good design in the world, graphic and otherwise. A well-designed printed page, web page, or physical tool is a joy to interact with.

I’m not really a trained designer in any way, but in my web development career I’ve often effectively been the UI/UX/graphic designer of apps I work on. I always try to do the best I can (our users deserve good design), and to develop my skills by paying attention to graphic design in the world, reading up (I’d recommend Donald Norman’s The Design of Everyday Things, Robert Bringhurst’s The Elements of Typographic Style, and one free online one, Butterick’s Practical Typography), and trying to improve my practice, and I think my graphic design skills are always improving. (I also learned a lot looking at and working with the productions of the skilled designers at Friends of the Web, where I worked last year).

Implementing rubyland.news turned out to be a great opportunity to practice some graphic and web design. Rubyland.news has very few graphical or interactive elements, it’s a simple thing that does just a few things. The relative simplicity of what’s on the page, combined with it being a hobby side project — with no deadlines, no existing branding styles, and no stakeholders saying things like “how about you make that font just a little bit bigger” — made it a really good design exercise for me, where I could really focus on trying to make each element and the pages as a whole as good as I could in both aesthetics and utility, and develop my personal design vocabulary a bit.

I’m proud of the outcome, while I don’t consider it perfect (I’m not totally happy with the typography of the mixed-case headers in Fira Sans), I think it’s pretty good typography and graphic design, probably my best design work. It’s nothing fancy, but I think it’s pleasing to look at and effective.  I think probably like much good design, the simplicity of the end result belies the amount of work I put in to make it seem straightforward and unsophisticated. :)

My favorite element is the page-specific navigation (and sometimes info) “side bar”.

[screenshot: the page-specific nav links rendered as a “side bar”]

At first I tried to put these links in the site header, but there wasn’t quite enough room for them, I didn’t want to make the header two lines — on desktop or wide tablet displays, I think vertical space is a precious resource not to be squandered. And I realized that maybe anyway it was better for the header to only have unchanging site-wide links, and have page-specific links elsewhere.

Perhaps encouraged by the somewhat hand-written look (especially of all-caps text) in Fira Sans, the free font I was trying out, I got the idea of trying to include these as a sort of ‘margin note’.

[screenshot: wide screen, page-specific links as a margin note to the left of the centered content]

The CSS got a bit tricky, with screen-size responsiveness (flexbox is a wonderful thing). On wide screens, the main content is centered in the screen, as you can see above, with the links to the left: The ‘like a margin note’ idea.

On somewhat narrower screens, where there’s not enough room to have margins on both sides big enough for the links, the main content column is no longer centered.

[screenshot: somewhat narrower screen, main content column no longer centered]

And on very narrow screens, where there’s not even room for that, such as most phones, the page-specific nav links switch to being above the content. On narrow screens, which are typically phones that are much higher than they are wide, it’s horizontal space that becomes precious, with some more vertical to spare.

[screenshot: narrow phone-width screen, page-specific nav links above the content]

Note that on really narrow screens, which is probably most phones, especially held in vertical orientation, the margins on the main content disappear completely: you get actual content, with its white background, from edge to edge. This seems an obvious thing to do to me on phone-sized screens: why waste any horizontal real estate with different-colored margins, or provide a visual distraction with even a few pixels of different-colored margin or border jammed up against the edge? I’m surprised it seems a relatively rare thing to do in the wild.

[screenshot: really narrow screen, content running edge-to-edge with no margins]

Nothing too fancy, but I quite like how it turned out. I don’t remember exactly what CSS tricks I used to make it so. And I still haven’t really figured out how to write clear maintainable CSS code; I’m less proud of the actual messy CSS source code than I am of the result. :)

One way to remove local merged tracking branches

My git workflow involves creating a lot of git feature branches, as remote tracking branches on origin. They eventually get merged and deleted (via github PR), but I still have dozens of them lying around.

Via googling, getting StackOverflow answers, and sort of mushing together some stuff I don’t totally understand, here’s one way to deal with it: create an alias, git-prune-tracking, in your ~/.bash_profile:

alias git-prune-tracking='git branch --merged | grep -v "*" | grep -v "master" | xargs git branch -d; git remote prune origin'

And periodically run git-prune-tracking from a git project dir.

I must admit I do not completely understand what this is doing, and there might be a better way? But it seems to work. Anyone have a better way, one where you understand what it’s doing? I’m kinda surprised this isn’t built into the git client somehow.

Use capistrano to run a remote rake task, with maintenance mode

So the app I am now working on is still in its early stages, not even live to the public yet, but we’ve got an internal server. We periodically have to change a bunch of data in our (non-rdbms) “production” store. (First devops unhappiness: I think there should be no scheduled downtime for planned data transformation. We’re working on it. But for now it happens.)

We use capistrano to deploy. Previously/currently, the process for one of these scheduled-downtime maintenance windows looked like:

  • on your workstation, do a cap production maintenance:enable to start some downtime
  • ssh into the production machine, cd to the cap-installed app, and bundle exec a rake task. Which could take an hour+.
  • Remember to come back when it’s done and `cap production maintenance:disable`.

A couple more devops unhappiness points here: 1) In my opinion you should ideally never be ssh’ing to production, at least in a non-emergency situation. 2) You have to remember to come back and turn off maintenance mode — and if I start the task at 5pm to avoid disrupting internal stakeholders, I gotta come back after business hours to do that! I also think anything you have to do “outside business hours” that’s not an emergency is a sign of a not-yet-done ops environment.

So I decided to try to fix this. Since the existing maintenance mode stuff was already done through capistrano, and I wanted to do it without a manual ssh to the production machine, capistrano seemed a reasonable tool. I found a plugin to execute rake via capistrano, but it didn’t do quite what I wanted, and its implementation was so simple that I saw no reason not to copy-and-paste it and make it do just what I wanted.

I’m not gonna maintain this for the public at this point (make a gem/plugin out of it, nope), but I’ll give it to you in a gist if you want to use it. One of the tricky parts was figuring out how to get “streamed” output from cap, since my rake tasks use ruby-progressbar — it’s got decent non-TTY output already, and I wanted to see it live in my workstation console. I managed to do that! Although I never figured out how to get a cap recipe to require files from another location (I have no idea how I couldn’t make it work), so the custom class is ugly inlined in.

I also ordinarily want maintenance mode to be turned off even if the task fails, but still want a non-zero exit code in those cases (anticipating future further automation — really what I need is to be able to execute this all via cron/at too, so we can schedule downtime for the middle of the night without having to be up then).

Anyway here’s the gist of the cap recipe. This file goes in ./lib/capistrano/tasks in a local app, and now you’ve got these recipes. Any tips on how to organize my cap recipe better are quite welcome.
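
For a rough sense of the shape, here’s a simplified sketch (not the actual gist: the streamed-progressbar-output part is omitted, the task and variable names are placeholders, and it assumes the maintenance:enable/disable tasks we already use plus :rails_env being set, e.g. by capistrano-rails):

# lib/capistrano/tasks/remote_rake.rake
namespace :app do
  desc "Run a rake task on the primary app server, wrapped in maintenance mode"
  task :rake_with_maintenance do
    task_name = ENV["TASK"] or raise "usage: cap production app:rake_with_maintenance TASK=some:task"
    invoke "maintenance:enable"
    begin
      on primary(:app) do
        within current_path do
          with rails_env: fetch(:rails_env) do
            execute :bundle, :exec, :rake, task_name
          end
        end
      end
    ensure
      # turn maintenance mode back off even if the task blew up; the error
      # still propagates, so cap exits non-zero and future automation can notice
      invoke "maintenance:disable"
    end
  end
end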

Hash#map ?

I frequently have griped that Hash didn’t have a useful map/collect function, something allowing me to transform the hash keys or values (usually values) into another, transformed hash. I even go looking for it in ActiveSupport::CoreExtensions sometimes; surely they’ve added something, everyone must want to do this… nope.

Thanks to realization triggered by an example in BigBinary’s blog post about the new ruby 2.4 Enumerable#uniq… I realized, duh, it’s already there!

olympics = {1896 => 'Athens', 1900 => 'Paris', 1904 => 'Chicago', 1906 => 'Athens', 1908 => 'Rome'}
olympics.collect { |k, v| [k, v.upcase]}.to_h
# => {1896=>"ATHENS", 1900=>"PARIS", 1904=>"CHICAGO", 1906=>"ATHENS", 1908=>"ROME"}

Just use ordinary Enumerable#collect, with two block args — it works to get key and value. Return an array from the block, to get an array of arrays, which can be turned to a hash again easily with #to_h.
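
If you do this a lot, it’s easy enough to wrap the pattern in a tiny helper (just a sketch; the name is whatever you want, nothing standard):

# Transform just the values of a hash, returning a new hash.
def map_hash_values(hash)
  hash.collect { |k, v| [k, yield(v)] }.to_h
end

map_hash_values(olympics) { |city| city.upcase }
# => {1896=>"ATHENS", 1900=>"PARIS", 1904=>"CHICAGO", 1906=>"ATHENS", 1908=>"ROME"}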

It’s a bit messy, but not really too bad. (I somehow learned to prefer collect over its synonym map, but I think maybe I’m in the minority? collect still seems more descriptive to me of what it’s doing. But this is one place where I wouldn’t have held it against Matz if he had decided to give the method only one name, so we were all using the same one!)

(Did you know Array#to_h turned an array of duples into a hash? I am not sure I did! I knew about Hash(), but I don’t think I knew about Array#to_h… ah, it looks like it was added in ruby 2.1.0. The equivalent before that would have been more like Hash[ hash.collect {|k, v| [k, v.upcase]} ], which I think is too messy to want to use.)

I’ve been writing ruby for 10 years, and periodically thinking “damn, I wish there was something like Hash#collect” — and didn’t realize that Array#to_h was added in 2.1, and makes this pattern a lot more readable. I’ll def be using it next time I have that thought. Thanks BigBinary for using something similar in your Enumerable#uniq example that made me realize, oh, yeah.


“Polish”; And, What makes well-designed software?

Go check out schneems’ post on “polish”. (No, not the country).

Polish is what distinguishes good software from great software. When you use an app or code that clearly cares about the edge cases and how all the pieces work together, it feels right. Unfortunately, this is the part of the software that most often gets overlooked, in favor of more features or more time on another project…

…When we say something is “polished” it means that it is free from sharp edges, even the small ones. I view polished software to be ones that are mostly free from frustration. They do what you expect them to and are consistent.…

…In many ways I want my software to be boring. I want it to harbor few surprises. I want to feel like I understand and connect with it at a deep level and that I’m not constantly being caught off guard by frustrating, time stealing, papercuts.

I definitely have experienced the difference between working with and on a project that has this kind of ‘polish’ and, truly, experiencing a deep-level connection with the code that lets me be crazy effective with it — and working on or with projects that don’t have this. And on projects that started out with it, but lost it! (An irony is that it takes a lot of time, effort, skill, and experience to design an architecture that seems like the only way it would make sense to do it, obvious, and, as schneems says, “boring”!)

I was going to say “We all have experienced the difference…”, but I don’t know if that’s true. Have you?

What do you think one can do to work towards a project with this kind of “polish”, and keep it there?  I’m not entirely sure, although I have some ideas, and so does schneems. Tolerating edge-case bugs is a contraindication — and even though I don’t really believe in ‘broken windows theory’ when it comes to neighborhoods, I think it does have an application here. Once the maintainers start tolerating edge case bugs and sloppiness, it sends a message to other contributors, a message of lack of care and pride. You don’t put in the time to make a change right unless the project seems to expect, deserve, and accommodate it.

If you don’t even have well-defined enough behavior/architecture to have any idea what behavior is desirable or undesirable, what’s a bug– you’ve clearly gone down a wrong path incompatible with this kind of ‘polish’, and I’m not sure if it can be recovered from. A Fred Brooks “Mythical Man Month” quote I think is crucial to this idea of ‘polish’: “Conceptual integrity is central to product quality.”  (He goes on to say that having an “architect” is the best way to get conceptual integrity; I’m not certain, I’d like to believe this isn’t true because so many formal ‘architect’ roles are so ineffective, but I think my experience may indeed be that a single or tight team of architects, formal or informal, does correlate…).

There’s another Fred Brooks quote that now I can’t find and I really wish I could cause I’ve wanted to return to it and meditate on it for a while, but it’s about how the power of a system is measured by what you can do with it divided by the number of distinct architectural concepts. A powerful system is one that can do a lot with few architectural concepts.  (If anyone can find this original quote, I’ll buy you a beer or a case of it).

I also know you can’t do this until you understand the ‘business domain’ you are working in — programmers as interchangeable cross-industry widgets is a lie. (‘business domain’ doesn’t necessarily mean ‘business’ in the sense of making money, it means understanding the use-cases and needs of your developer users, as they try to meet the use cases and needs of their end-users, which you need to understand too).

While I firmly believe in general in the caution against throwing out a system and starting over, a lot of this caution is about losing the domain knowledge encoded in the system (really, go read Joel’s post). But if the system was originally architected by people (perhaps past you!) who (in retrospect) didn’t have very good domain knowledge (or the domain has changed drastically?), and you now have a team (and an “architect”?) that does, and your existing software is consensually recognized as having the opposite of the quality of ‘polish’, and is becoming increasingly expensive to work with (“technical debt”) with no clear way out — that sounds like a time to seriously consider it. (Although you will have to be willing to accept that it’ll take a while to get feature parity, if those were even the right features.) (Interestingly, Fred Brooks was I think the originator of the ‘build one to throw away’ idea that Joel is arguing against. I think both have their place, and the place of domain knowledge is a crucial concept in both.)

All of these are more vague hand wavy ideas than easy to follow directions, I don’t have any easy to follow directions, or know if any exist.

But I know that the first step is being able to recognize “polish”, a well-designed parsimoniously architected system that feels right to work with and lets you effortlessly accomplish things with it.  Which means having experience with such systems. If you’ve only worked with ball-of-twine difficult to work with systems, you don’t even know what you’re missing or what is possible or what it looks like. You’ve got to find a way to get exposure to good design to become a good designer, and this is something we don’t know how to do as well with computer architecture as with classic design (design school consists of exposure to design, right?)

And the next step is desiring and committing to building such a system.

Which also can mean pushing back on or educating managers and decision-makers.  The technical challenge is already high, but the social/organizational challenge can be even higher.

Because it is harder to build such a system than not to: designing and implementing good software is not easy; it takes care, skill, and experience. Not every project deserves or can have the same level of ‘polish’. But if you’re building a project meant to meet complex needs, to be used by a distributed (geographically and/or organizationally) community, and to hold up for years, this is what it takes. (Whether that’s a polished end-user UX, or developer-user UX, which means API, or both, depending on the nature of this distributed community.)