Adding constraints to projects for success

The company GitHub is known as a place where engineers have an unusual amount of freedom to self-organize around which projects they work on.

Here’s a very interesting blog post from a GitHub engineer, Brandon Keepers, on “Lessons learned from a cancelled project” — he has six lessons, which are really six principles or pieces of advice for structuring a project and a project team.

In some ways the lessons learned are particular to an environment with so much freedom — however, reading through I was struck by how many apply to the typical academic library environment too.

Since an academic library isn’t a for-profit business that measures success by its bottom line, we too can suffer from a lack of defined measures of success or failure. (“Define success and failure“).

I think this isn’t just about being able to “evaluate if what you’re doing is working” — it’s also about knowing when to stop, whether as a success or a failure. Many of us find ourselves in projects that seem to go on forever in pursuit of perfection, when there are other things we ought to be attending to instead of letting one project monopolize our own time, or our team’s, or our organization’s.

This is related to Keepers’ second principle as well, “Create meaningful artificial constraints“. While we can’t say “money is not a factor” in an academic library, or that we are given ultimate freedom to do whatever we want, I think we often do find ourselves with too many options (and too many stakeholders trying to evaluate all the options), or the expectation that “we can meet all the requirements at once,” which Keepers suggests is “paralyzing.” (Sound familiar?) In a typical for-profit startup, freedom is constrained by a focus on the “minimum viable product”, the quickest way to sustainable revenue. When you have too much freedom — whether because of GitHub’s culture of self-organizing, or an academic library’s… let’s say lack of focus — you need to define some artificial constraints in order to make progress.

Keepers highlights the “milestone” as an artificial constraint — a fixed-date ‘deadline’, but one which “should never include scope”. You say “first beta user 2 weeks from today”, but you don’t say exactly what features are to be included. “If you work 60 hours per week trying to meet a deadline, then you have missed the point. A constraint should never be used to get someone to work harder. It is a tool to enable you to work smarter.”

Which brings us to “Curate a collective vision“, and “People matter more than product.”  On that second one, Keepers writes

“For the first 9 months, I cared more about the outcome of the product than the people on the team. I gave feedback on ideas, designs and code with the assumption that the most important thing about that interaction was creating a superior product. I was wrong… If you care about people, the product will take care of itself. Pour all your energy into making sure your teammates are enjoying what they are doing. Happier people create better products.”

That’s a lesson it took me a while to learn and I still have a lot of trouble remembering.

I don’t know if academic libraries just end up similar to the GitHub environment, or if Keepers has come up with lessons that really apply to just about any organization (or any large non-startup organization?). Either way, I found a lot of meat in his relatively short blog post, and I encourage reading it and reflecting on how those lessons may apply to your workplace, and how you might account for them in your own organization.

Posted in General | Leave a comment

Academics investigate Big Deals

Gowers, a British mathematician, has a very interesting blog post exploring the ramifications and characteristics of the current market in which universities pay for electronic access to academic journals.

The latter half of the essay has a few details of how Big Deal negotiations and contracts work, and some info on actual prices paid. Just a few details, because the vendors want to keep this all confidential, of course.

Elsevier journals — some facts; from Gowers’ Weblog

Some of the issues discussed certainly differ between the US and the UK (we lack the national centralized negotiating of JISC in the US), but it’s still illuminating how… bizarre it is. I suspect it’s equally bizarre in the US, although in different ways. Even though I work in a library (and am arguably a ‘librarian’), much of this was new to me; I suspect it will be to others too. This stuff is both so secretive, and so complex and technical, that those not directly working on it are often not familiar with it.

Increasing numbers of faculty seem to be paying attention to these sorts of issues, which is encouraging as libraries are pretty much beholden to faculty desires (regardless of how reasonable they are), and we’re not going to be able to change the market much without faculty education and support. Gowers focuses on Elsevier — for reasons not entirely clear to me, Elsevier especially is bearing the brunt of faculty ire. I’m not sure they are particularly worse than anyone else, but I guess they are in some ways bigger and more central than anyone else, as a very large publisher that hosts and provides electronic access directly (instead of using an intermediary aggregator or provider).

It would be nice if more librarians were publishing educational essays and articles on this topic; seems like it ought to be a core part of our role to be educating our peers and faculty on these issues, not just leaving it to the faculty.   (I don’t know how much of that lack is because librarians are scared to offend the publishers/providers, while a faculty member can be more insulated from that concern.)

One welcome example of librarians publishing articles on scholarly communication is a recent piece in the online journal-blog, In the Library with the Lead Pipe: Librarian, Heal Thyself: A Scholarly Communication Analysis of LIS Journals by Micah Vandegrift and Chealsye Bowley. I haven’t had a chance to read it yet myself, but it’s on my list. There’s no reason librarians need to restrict ourselves to analyzing scholarly communication in LIS journals, though — all scholarly communication is part of our domain of study and expertise!


Amateur digital archival forensics

Two interesting projects in archival digital forensics — that is, rescuing digital content on legacy media and in legacy formats — recently came to my attention. Both involved significant contributions from amateurs and/or volunteers, which is interesting.

Firstly, according to a press release from Carnegie Mellon, some digital art by Andy Warhol circa 1985 has been rescued from Amiga floppy disks in the Warhol archives.

While the press release leads with “A multi-institutional team of new-media artists, computer experts, and museum professionals”, it actually sounds like most of the digital recovery work was done by a student group at CMU, the “CMU Computer Club.” 

I haven’t read the detailed report they link to, but I’d imagine there were challenges both in accessing the physical media and in converting the files to something usable by modern software. 

Some of the art looks pretty cool; go check it out at the link.

Secondly, from a Wired Magazine article: The Hackers Who Recovered NASA’s Lost Lunar Photos

So, some photos from early lunar orbiters were originally stored on digital tape; they were printed out for use at the time, and the digital tapes put on a shelf somewhere.

After the low-fi printing, the tapes were shoved into boxes and forgotten.

They changed hands several times over the years, almost getting tossed out before landing in storage in Moorpark, California. Several abortive attempts were made to recover data from the tapes, which were well kept, but it wasn’t until 2005 that NASA engineer Keith Cowing and space entrepreneur Dennis Wingo were able to bring the materials and the technical know how together.

When they learned through a Usenet group that former NASA employee Nancy Evans might have both the tapes and the super-rare Ampex FR-900 drives needed to read them, they jumped into action. They drove to Los Angeles, where the refrigerator-sized drives were being stored in a backyard shed surrounded by chickens. At the same time, they retrieved the tapes from a storage unit in nearby Moorpark, and things gradually began to take shape. Funding the project out of pocket at first, they were consumed with figuring out how to release the images trapped in the tapes.

With the original digital files and modern technology, they can actually get information and resolution out of them that were never originally obtained in the printouts. Pretty neat.

Sounds like eventually this turned into an actual official funded NASA project, but it began as some interested people doing it in their own time.

  •  Certainly archivists have been talking about digital preservation and forensics for some time. I think it’s going to become an increasingly prominent issue in popular awareness as time proceeds and the quantity of interesting cultural history trapped on legacy media and formats grows.
  • It may just be anecdotal, but it seems like a lot of successful digital archival forensics work is being done by amateurs and volunteers. Does the actual library/archives/museum sector lack capacity for what’s needed? If so, I would not assume this lack of capacity is professionals ‘not keeping up’; rather, it points to a lack of funding or priorities from cultural sector institutions. Will we see this change?
  • On the other hand, the fact that amateurs voluntarily get involved in recovering legacy digital cultural history means that people really do care about this stuff. Which maybe is encouraging for cultural institutions getting funding for it? I realize it’s not as simple as that, but it’s something.
  • The techniques and tools that digital archivists use to recover legacy digital content overlap an awful lot with those law enforcement uses in digital forensics to recover digital evidence. This is actually part of my interest in it: I think it’s crucial for civil society to understand law enforcement capabilities and practices there, so we as a society can make democratic decisions about what is appropriate, and so individuals can protect their private personal effects (whether from criminals or unjust law enforcement). If law enforcement are the only ones who understand this stuff, we all just have to take their word for whatever they tell us, rather than engaging in multi-stakeholder informed dialog. Library/archive/museum workers who specialize in recovering legacy digital media are one source of civil society expertise in digital forensics.

Large collections in JS in the browser

Developers from the New York Times have released some open source software meant for displaying and managing large digital content collections, and doing so client-side, in the browser with JS.

Developed for journalism, this has some obvious potential relevance to the business of libraries too, right?  Large collections (increasingly digital), that’s what we’re all about, ain’t it?

Pourover and Tamper

Today we’re open-sourcing two internal projects from The Times:

  • PourOver.js, a library for fast filtering, sorting, updating and viewing large (100k+ item) categorical datasets in the browser, and
  • Tamper, a companion protocol for compressing categorical data on the server and decompressing in your browser. We’ve achieved a 3–5x compression advantage over gzipped JSON in several real-world applications.

Collections are important to developers, especially news developers. We are handed hundreds of user submitted snapshots, thousands of archive items, or millions of medical records. Filtering, faceting, paging, and sorting through these sets are the shortest paths to interactivity, direct routes to experiences which would have been time-consuming, dull or impossible with paper, shelves, indices, and appendices….

…The genesis of PourOver is found in the 2012 London Olympics. Editors wanted a fast, online way to manage the half a million photos we would be collecting from staff photographers, freelancers, and wire services. Editing just hundreds of photos can be difficult with the mostly-unimproved, offline solutions standard in most newsrooms. Editing hundreds of thousands of photos in real-time is almost impossible.

Yep, those sorts of tasks sound like things libraries are involved in, or would like to be involved in, right?
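The core trick behind this kind of fast client-side faceting is worth sketching. This is not PourOver’s actual API — the function names below are my own invention — just the underlying idea: precompute, for each value of each categorical field, the set of matching item indices, then answer a query by intersecting those sets (smallest first).

```javascript
// Build an inverted index for one categorical field:
// each distinct value maps to a Set of item indices.
function buildIndex(items, field) {
  const index = new Map();
  items.forEach((item, i) => {
    const value = item[field];
    if (!index.has(value)) index.set(value, new Set());
    index.get(value).add(i);
  });
  return index;
}

// Answer a query like { color: "red", size: "M" } by intersecting
// the precomputed sets, starting from the smallest one.
function filterItems(indexes, query) {
  const sets = Object.entries(query).map(
    ([field, value]) => indexes[field].get(value) || new Set()
  );
  sets.sort((a, b) => a.size - b.size);
  let result = [...sets[0]];
  for (const s of sets.slice(1)) {
    result = result.filter((i) => s.has(i));
  }
  return result;
}

// Usage:
const items = [
  { color: "red", size: "M" },
  { color: "blue", size: "M" },
  { color: "red", size: "L" },
];
const indexes = {
  color: buildIndex(items, "color"),
  size: buildIndex(items, "size"),
};
filterItems(indexes, { color: "red", size: "M" }); // → [0]
```

Because the per-value sets are built once up front, each interactive filter or facet click is just set intersections — no rescanning of 100k items per keystroke.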

The actual JS does some neat things with figuring out how to incrementally and just-in-time send deltas of data, etc., and has some good UI tools. Look at the page for more.
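The Tamper side — why categorical data can beat gzipped JSON by several times — can also be illustrated with a toy sketch. This is not Tamper’s actual wire protocol, just the underlying arithmetic: a column with k distinct values needs only ceil(log2 k) bits per item, rather than a repeated string label per item.

```javascript
// Pack a categorical column: assign each distinct value a small
// integer code, then store ceil(log2 k) bits per item in a byte array.
function packColumn(values) {
  const labels = [...new Set(values)];
  const codes = new Map(labels.map((v, i) => [v, i]));
  const bits = Math.max(1, Math.ceil(Math.log2(labels.length)));
  const packed = new Uint8Array(Math.ceil((values.length * bits) / 8));
  values.forEach((v, i) => {
    const code = codes.get(v);
    for (let b = 0; b < bits; b++) {
      if (code & (1 << b)) {
        const pos = i * bits + b; // absolute bit position
        packed[pos >> 3] |= 1 << (pos & 7);
      }
    }
  });
  return { labels, bits, packed };
}

// Reverse the process: read each item's code and look up its label.
function unpackColumn({ labels, bits, packed }, length) {
  const out = [];
  for (let i = 0; i < length; i++) {
    let code = 0;
    for (let b = 0; b < bits; b++) {
      const pos = i * bits + b;
      if (packed[pos >> 3] & (1 << (pos & 7))) code |= 1 << b;
    }
    out.push(labels[code]);
  }
  return out;
}

// A column with 4 distinct values needs just 2 bits per item, so
// 6 items fit in 2 bytes (plus the one-time label table):
const column = ["photo", "video", "photo", "graphic", "photo", "map"];
const packedCol = packColumn(column);     // packedCol.bits === 2
unpackColumn(packedCol, column.length);   // round-trips to the original
```

JSON would repeat the string `"photo"` for every photo in the collection; the packed form pays for each label once and then spends two bits per item, which is where compression wins of that magnitude come from.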

I am increasingly interested in what ‘digital journalism’ is up to these days. It’s an enterprise with some similarities to libraries: an information-focused business that is having to deal with a lot of internet-era ‘disruption’. Journalistic enterprises are generally for-profit (unlike most of the libraries we work in), but still have a certain public service ethos. And some of the technical problems they deal with overlap heavily with our areas of focus.

It may be that the grass is always greener, but I think the journalism industry is rising to the challenges somewhat better than ours is, or at any rate is putting more resources into technical innovation. When was the last time something that probably took as many developer-hours as this stuff, and is of potential interest outside the specific industry, came out of libraries?


“You build it, you run it”

I have seen several different approaches to division of labor in developing, deploying, and maintaining web apps.

The one that seems to work best to me is when the same team responsible for developing an app is the team responsible for deploying it and keeping it up, as well as for maintaining it. The same team — and ideally the same individual people (at least at first; job roles and employment changes over time, of course).

If the people responsible for writing the app in the first place are also responsible for deploying it with good uptime stats, then they have an incentive to create software that can be easily deployed and can stay up reliably. If it can’t at first, then the people who feel that pain are the same people best placed to improve the software, because they are most familiar with its structure and how it might be altered.

Software is a living organism; it’s never simply “done”. It will need modifications in response to what you learn from how its users use it, as well as from changing contexts and environments. Software is always under development; the first time it becomes public is just one marker in its development lifecycle, not a clear boundary between “development” and “deployment”.

Compare this to other divisions of labor, where maybe one team does “R&D” on a nice prototype, then hands their code over to another team to turn it into a production service, or to figure out how to get it deployed, keep it deployed reliably, and respond to trouble tickets. Sometimes these teams may be in entirely different parts of the organization. If it doesn’t deploy as easily or reliably as the ‘operations’ people would like, do they need to convince the ‘development’ people that this is legit and something should be done? And when it needs additional enhancements or functional changes, maybe it’s the crack team of R&Ders who do it, even though they’re on to newer and shinier things; or maybe it’s the operations people who are expected to do it, even though they’re not familiar with the code since they didn’t write it; or maybe there’s nobody to do it at all, because the organization is operating on the mistaken assumption that developing software is like constructing a building: when it’s done, it’s done.[1]

I just don’t find that this division of labor works well for creating robust, reliable software that can evolve to meet changing requirements.


Recently I ran into a quote from an interview with Werner Vogels, Chief Technology Officer at Amazon, expressing these benefits of “You build it, you run it.”:

There is another lesson here: Giving developers operational responsibilities has greatly enhanced the quality of the services, both from a customer and a technology point of view. The traditional model is that you take your software to the wall that separates development and operations, and throw it over and then forget about it. Not at Amazon. You build it, you run it. This brings developers into contact with the day-to-day operation of their software. It also brings them into day-to-day contact with the customer. This customer feedback loop is essential for improving the quality of the service.

I was originally directed to that quote by this blog post on the need for shared dev and ops responsibility, which I recommend too.

In this world of silos, development threw releases at the ops or release team to run in production.

The ops team makes sure everything works, everything’s monitored, everything’s continuing to run smoothly.

When something breaks at night, the ops engineer can hope that enough documentation is in place for them to figure out the dial and knobs in the application to isolate and fix the problem. If it isn’t, tough luck.

Putting developers in charge of not just building an app, but also running it in production, benefits everyone in the company, and it benefits the developer too.

It fosters thinking about the environment your code runs in and how you can make sure that when something breaks, the right dials and knobs, metrics and logs, are in place so that you yourself can investigate an issue late at night.

As Werner Vogels put it on how Amazon works: “You build it, you run it.”

The responsibility of maintaining your own code in production should encourage any developer to make sure that it breaks as little as possible, and that when it breaks you know what to do and where to look.

That’s a good thing.

None of this means you can’t have people who focus on ops and other people who focus on dev; but I think it means they should be situated organizationally close to each other, on the same teams, and that the dev people have to share some ops responsibilities, so they feel some pain from products that are hard to deploy, hard to keep running reliably, or hard to maintain or change.

[1] Note some people think even constructing a building shouldn’t be “when it’s done it’s done”, but that buildings too should be constructed in such a way that allows continual modification by those who inhabit them, in response to changing needs or understandings of needs.


Thank you again, Edward Snowden

According to this Reuters article, the NSA intentionally weakened encryption in popular encryption software from the company RSA.

They did this because they wanted to make sure they could continue eavesdropping on us all, but in the process they made us more vulnerable to eavesdropping from other attackers too. Once you put in a backdoor, anyone else who figures it out can use it too; it wasn’t some kind of NSA-only backdoor. I bet, for instance, China’s hackers and mathematicians are as clever as ours.

“We could have been more sceptical of NSA’s intentions,” RSA Chief Technologist Sam Curry told Reuters. “We trusted them because they are charged with security for the U.S. government and U.S. critical infrastructure.”

I’m not sure if I believe him — the $10 million NSA paid RSA for inserting the mathematical backdoors probably did a lot to assuage their skepticism too. What did they think NSA was paying for?

On the other hand, sure, the NSA is charged with improving our security, and does have expertise in that.  It was fairly reasonable to think that’s what they were doing. Suggesting they were intentionally putting some backdoors in instead would have probably got you called paranoid… pre-Snowden.  Not anymore.

It is thanks only to Edward Snowden that nobody will be making that mistake again for a long time. Edward Snowden, thank you for your service.


Academic freedom in Israel and Palestine

While I mostly try to keep this blog focused on professional concerns, I do think academic freedom is a professional concern for librarians, and I’m going to again use this platform to write about an issue of concern to me.

On December 17th, 2013, the American Studies Association membership endorsed a Resolution on Boycott of Israeli Academic Institutions. This resolution endorses and joins in a campaign organized by Palestinian civil society organizations for boycott of Israel for human rights violations against Palestinians — and specifically, for an academic boycott called for by Palestinian academics.

In late December and early January, many American university presidents released letters opposing and criticizing the ASA boycott resolution, usually on the grounds that the ASA action threatened the academic freedom of Israeli academics.

Here at Johns Hopkins, the President and Provost issued such a letter on December 23rd. I am quite curious about what organizing took place that resulted in letters from so many university presidents within a few weeks. Beyond letters of disapproval from presidents, there has also been organizing to prevent scholars, departments, and institutions from affiliating with the ASA, or to retaliate against scholars who do so (such efforts are, ironically, quite a threat to academic freedom themselves).

The ASA resolution (and the Palestinian academic boycott campaign in general) does not call for prohibition of cooperation with Israeli academics, but only against formal collaborations with Israeli academic institutions — and in the case of the ASA, only formal partnerships by the ASA itself; they are not trying to require any particular actions by members as a condition of membership in the ASA. You can read more about the parameters of the ASA resolution, and the motivation that led to it, in the ASA’s FAQ on the subject, a concise and well-written document I definitely recommend reading.

So I don’t actually think the ASA resolution will have significant effect on academic freedom for scholars at Israeli institutions.  It’s mostly a symbolic action, although the fierce organizing against it shows how threatening the symbolic action is to the Israeli government and those who would like to protect it from criticism.

But, okay: if the academic boycott of Israel continues to gain strength, then some academics at Israeli institutions will, at the very least, be inconvenienced in their academic affairs. I can understand why some people find academic boycott an inappropriate tactic — even though I disagree with them.

But here’s the thing. The academic freedom of Palestinian scholars and students has been regularly, persistently, and severely infringed for quite some time.  In fact, acting in solidarity with Palestinian colleagues facing restrictions on freedom of movement and expression and inquiry was the motivation of the ASA’s resolution in the first place, as they write in their FAQ and the language of the resolution itself.

You can read more about restrictions in Palestinian academic freedom, and the complicity of Israeli academic institutions in these restrictions, in a report from Palestinian civil society here; or this campaign web page from Birzeit University and other Palestinian universities;  this report from the Israeli Alternative Information Center;  or in this 2006 essay by Judith Butler; or this 2011 essay by Riham Barghouti, one of the founding members of the Palestinian Campaign for the Academic and Cultural Boycott of Israel.

What are we to make of the fact that so many university presidents spoke up in alarm at an early sign of possible, in their views, impingements to academic freedom of scholars at Israeli institutions, but none have spoken up to defend significantly beleaguered Palestinian academic freedom?

Here at Hopkins, Students for Justice in Palestine believes that we all have a responsibility to speak up in solidarity with our Palestinian colleagues, students and scholars, whose freedoms of inquiry and expression are severely curtailed, and that administrators’ silence on the issue does not in fact represent our community. Hopkins SJP thinks the community should speak out in concern and support for Palestinian academic freedom, and they’ve written a letter Hopkins affiliates can sign on to.

I’ve signed the letter. I’d urge any readers who are also affiliated with Hopkins to read it, and consider signing it as well. Here it is.


“users hate change”

reddit comment with no particularly significant context:

Would be really interesting to put a number on “users hate change”.

Based on my own experience at a company where we actually researched this stuff, the number I would forward is 30%. Given an existing user base, on average 30% will hate any given change to their user experience, independent of whether that experience is actually worse or better.

“Some random person on reddit” isn’t scientific evidence or anything, but it definitely seems pretty plausible to me that some very significant portion of any user base will generally dislike any change at all. I think I’ve been one of those users for software I don’t develop; I’m thinking of recent changes to Google Maps, many changes to Facebook, etc.

I’m not quite sure what to do with that though, or how it should guide us.  Because, if our users really do want stability over (in the best of cases) improvement, we should give it to them, right? But if it’s say 1/3rd of our users who want this, and not necessarily the other 2/3rds, what should that mean?  And might we hear more from that 1/3rd than the other 2/3rds and over-estimate them yet further?

But, still, say, 1/3rd, that’s a lot. What’s the right balance between stability and improvement? Does it depend on the nature of the improvement, or how badly some other portion of your userbase are desiring change or improvement?

Or, perhaps, work on grouping changes into more occasional releases instead of constant releases, to at least minimize the occurrences of disruption?  How do you square that with software improvement through iteration, so you can see how one change worked before making another?

Eventually users will get used to change, or even love the change and realize it helped them succeed at whatever they do with the software (and then the change-resistant won’t want the new normal changed either!) — does it matter how long this period of adjustment is? Might it be drastically different for different user bases or contexts?

Does it matter how much turnover you should expect or get in your user base? If you’re selling software, you probably want to keep all the users you’ve got and keep getting more, but the faster you’re growing, the quicker the old users (the only ones to whom a change is actually a change) get diluted by newcomers. If you’re developing software for an ‘enterprise’ (such as most kinds of libraries), then the turnover of your userbase is a function of the organization, not of your market or marketing. Either way, if you have less turnover, does that mean you can even less afford to irritate the change-resistant portion of the userbase, or is it irrelevant?

In commercial software development, the answer (for better or worse) is often “whatever choice makes us more money”, and the software development industry has increasingly sophisticated tools for measuring the effect of proposed changes on revenue. If the main goal(s) of your software development effort is something other than revenue, then perhaps it’s important to be clear about exactly what those goals are,  to have any hope of answering these questions.


blacklight_advanced_search 5.0.0 released for blacklight 5.x

blacklight_advanced_search 5.0.0 has been released.

If you were previously using the gem directly from its GitHub repo on the ‘blacklight5’ branch, I recommend you switch to using the released gem instead, with a line in your Gemfile like this:

gem 'blacklight_advanced_search', "~> 5.0"

Note that the URL format of advanced search form facet limits has changed from previous versions; if you had a previous version deployed, there is a way to configure redirects for the old style, in order to keep previously bookmarked URLs working. See the README.


“Code as Research Object”

Mozilla Science Lab, GitHub and Figshare team up to fix the citation of code in academia

Academia has a problem. Research is becoming increasingly computational and data-driven, but the traditional paper and scientific journal has barely changed to accommodate this growing form of analysis. The current referencing structure makes it difficult for anyone to reproduce the results in a paper, either to check findings or build upon their results. In addition, scientists that generate code for middle-author contributions struggle to get the credit they deserve.

The Mozilla Science Lab, GitHub and Figshare – a repository where academics can upload, share and cite their research materials – are starting to tackle the problem. The trio have developed a system so researchers can easily sync their GitHub releases with a Figshare account. It creates a Digital Object Identifier (DOI) automatically, which can then be referenced and checked by other people.

[HackerNews thread]
