technical debt/technical weight

Bart Wronski writes a blog post about “technical weight”, a concept related to but distinct from “technical debt.”  I can relate some of what he’s talking about to some library-centered open source projects I’ve worked on.

Technical debt… or technical weight?

…What most post don’t cover is that recently huge amount of technical debt in many codebases comes from shifting to naïve implementations of agile methodologies like Scrum, working sprint to sprint. It’s very hard to do any proper architectural work in such environment and short time and POs usually don’t care about it (it’s not a feature visible to customer / upper management)…

 

…I think of it as a property of every single technical decision you make – from huge architectural decisions through models of medium-sized systems to finally way you write every single line of code. Technical weight is a property that makes your code, systems, decisions in general more “complex”, difficult to debug, difficult to understand, difficult to change, difficult to change active developer.…

 

…To put it all together – if we invested lots of thought, work and effort into something and want to believe it’s good, we will ignore all problems, pretend they don’t exist and decline to admit (often blaming others and random circumstances) and will tend to see benefits. The more investment you have and heavier is the solution – the more you will try to stay with it, making other decisions or changes very difficult even if it would be the best option for your project.…

Posted in General | Leave a comment

UC Berkeley Data Science intro to programming textbook online for free

Looks like a good resource for library/information professionals who don’t know how to program but want to learn a little programming along with (more importantly) computational and inferential thinking, to understand the technological world we work in. It also looks good for anyone who wants to learn ‘data science’!

http://www.inferentialthinking.com/

Data are descriptions of the world around us, collected through observation and stored on computers. Computers enable us to infer properties of the world from these descriptions. Data science is the discipline of drawing conclusions from data using computation. There are three core aspects of effective data analysis: exploration, prediction, and inference. This text develops a consistent approach to all three, introducing statistical ideas and fundamental ideas in computer science concurrently. We focus on a minimal set of core techniques that apply to a vast range of real-world applications. A foundation in data science requires not only understanding statistical and computational techniques, but also recognizing how they apply to real scenarios.

For whatever aspect of the world we wish to study—whether it’s the Earth’s weather, the world’s markets, political polls, or the human mind—data we collect typically offer an incomplete description of the subject at hand. The central challenge of data science is to make reliable conclusions using this partial information.

In this endeavor, we will combine two essential tools: computation and randomization. For example, we may want to understand climate change trends using temperature observations. Computers will allow us to use all available information to draw conclusions. Rather than focusing only on the average temperature of a region, we will consider the whole range of temperatures together to construct a more nuanced analysis. Randomness will allow us to consider the many different ways in which incomplete information might be completed. Rather than assuming that temperatures vary in a particular way, we will learn to use randomness as a way to imagine many possible scenarios that are all consistent with the data we observe.

Applying this approach requires learning to program a computer, and so this text interleaves a complete introduction to programming that assumes no prior knowledge. Readers with programming experience will find that we cover several topics in computation that do not appear in a typical introductory computer science curriculum. Data science also requires careful reasoning about quantities, but this text does not assume any background in mathematics or statistics beyond basic algebra. You will find very few equations in this text. Instead, techniques are described to readers in the same language in which they are described to the computers that execute them—a programming language.

Posted in General | Leave a comment

How to see if current version of a gem is greater than X

I sometimes need to do this, and always forget how. I want to check the currently loaded version of a particular gem, and see if it’s greater than a certain version X.

Mainly because I’ve monkey-patched that gem, and want to either automatically stop monkey patching it if a future version is installed, or more likely output a warning message “Hey, you probably don’t need to monkey patch this anymore.”

I usually forget the right rubygems API, so I’m leaving this partially as a note to myself.

Here’s how you do it.

# If some_gem_name is at 2.0 or higher, warn that this patch may
# not be needed. Here's a URL to the PR we're back-porting: <URL>
if Gem.loaded_specs["some_gem_name"].version >= Gem::Version.new('2.0')
   msg = "Please check and make sure this patch is still needed " \
         "at #{__FILE__}:#{__LINE__}\n\n"
   $stderr.puts msg
   Rails.logger.warn msg
end

Whenever I do this, I always include the URL to the github PR that implements the fix we’re back-porting with the monkey patch, in a comment right next to the version check.

The `$stderr.puts` is there to make sure the warning shows up in the console when running tests.

Unfortunately:

Gem::Version.new("1.4.0.rc1") >= Gem::Version.new("1.4")
# => false

I really want the warning to trigger if I’m using a pre-release too. Hmm.

Aha! Perusing the docs, this seems like it’ll work:

if Gem.loaded_specs["some_gem_name"].version.release >= Gem::Version.new('2.0')

`Gem::Version#release` trims off the prerelease tags.
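
To illustrate with made-up version numbers (just a quick sketch, not tied to any particular gem):

v = Gem::Version.new("2.0.0.rc1")

v >= Gem::Version.new("2.0")           # => false, the prerelease sorts below 2.0
v.release.to_s                         # => "2.0.0"
v.release >= Gem::Version.new("2.0")   # => true

So comparing `.release` against the target version triggers the warning even when I’m running a pre-release of the new version.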

Posted in General | Leave a comment

Handy introspection for debugging Rails routes

I always forget how to do this, so I’m leaving this here partly as a note to myself. It’s adapted from Zobie’s Blog and Mike Blyth’s Stack Overflow answer.

 

 routes = Rails.application.routes
 
 # figure out what route a path maps to:
 routes.recognize_path "/station/index/42.html"
 #  => {:controller=>"station", :action=>"index", :format=>"html", :id=>"42"}
 # or get an ActionController::RoutingError if it doesn't match

 # figure out what url is generated for given params, i.e. what url
 # corresponds to a certain controller/action/parameters:
 routes.generate :controller => :station, :action => :index, :id => 42

If you have an isolated Rails engine mounted, its paths don’t seem to be accessible from the
`Rails.application.routes` router. You may need to try that specific engine’s router, like `Spree::Core::Engine.routes`.
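
For instance, something like this (a hypothetical example, assuming a mounted Spree engine; substitute a path your engine actually serves, relative to its mount point):

 engine_routes = Spree::Core::Engine.routes

 engine_routes.recognize_path "/products/some-widget"
 # => a params hash along the lines of {:controller=>"spree/products", :action=>"show", :id=>"some-widget"}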

It seems to me there must be a way to get the actual ‘master’ router used for recognizing
incoming urls, since something has to dispatch to the mounted engine routes as appropriate
based on path. But I haven’t figured out how to do that.

Posted in General | Leave a comment

GREAT presentation on open source development

I highly recommend Schneems’ presentation on “Saving Sprockets”, which he has also turned into a written narrative. Not so much for what it says about Sprockets, but for what it says about open source development.

I won’t say I agree with 100% of it, but probably 85%+. Some of the stuff I agree with is really important and useful, and Schneems analyzes what’s going on very well and figures out how to say it very well.

Some of my favorite points:

“To them, I ask: what are the problems? Do you know what they are? Because we can’t fix what we can’t define, and if we want to attempt a re-write, then a re-write would assume that we know better. We still have the same need to do things with assets, so we don’t really know better.”

A long-term maintainer is really important; coders aren’t just interchangeable widgets:

“While I’m working on Sprockets, there’s so many times that I say “this is absolutely batshit insane. This makes no sense. I’m going to rip this all out. I’m going to completely redo all of this.” And then, six hours later, I say “wow, that was genius,” and I didn’t have the right context for looking at the code. Maintainers are really historians, and these maintainers, they help bring context. We try to focus on good commit messages and good pull requests. Changelog entries. Please keep a changelog, btw. But none of that compares to having someone who’s actually there. A story is worth 1000 commit messages. For example, you can’t exactly ask a commit message a question, like, “hey, did you consider trying to uh…” and the commit message is like, “uh, I’m a commit message.” It doesn’t store the context about the conversations around that”

“These are all different people with very different needs who need different documentation. Don’t make them hunt down the documentation that they need. When I started working on Sprockets, somebody would ask, “is this expected?” and I would say honestly, “I don’t know, you tell me. Was it happening before?” And through doing that research, I put together some guides, and eventually we could definitively say what was expected behavior. The only way that I could make those guides make sense is if I split them out, and so, we have a guide for “building an asset processing framework”, if you’re building the next Rails asset pipeline, or “end user asset generation”, if you are a Rails user, or “extending Sprockets” if you want to make one of those plugins. It’s all right there, it’s kind of right at your fingertips, and you only need to look at the documentation that fits your use case, when you need it.

We made it easier for developers to find what they need. Also, it was a super useful exercise for me as well. One thing I love about these guides is that they live in the source and not in a wiki, because documentation is really only valid for one point in time.”

I also really like the concept that figuring out how to support or fix someone else’s code (which is really all ‘legacy’ means) is an exercise in a sort of code archeology.  I’ve been doing that a lot lately.  The same goes for figuring out how to use someone else’s code that isn’t documented sufficiently.  It’s sort of fun sometimes, but it’s better to have good docs.

Posted in General | Leave a comment

Really slow rspec suite? Use the fuubar formatter!

I am working on a ‘legacy’-ish app that unfortunately has a pretty slow test suite (10 minutes+).

I am working on some major upgrades to dependencies that require running the full test suite, or a major portion of it, iteratively lots of times. I’m starting with a bunch of broken tests and whittling them down.

It was painful. I was getting really frustrated with the built-in rspec formatters: I’d see an ‘F’ in the output, but wouldn’t know what test had failed until the whole suite finished. Or I could control-C, or run with `--fail-fast`, to see the first failure (or some subset of failures) as it happened, but that interrupts the suite, so I’d never see other, later failures.

Then I found the fuubar rspec formatter.  Perfect!

  • A progress bar makes the suite seem faster psychologically, even though it isn’t. There are reasons a progress bar is considered good UI for a long-running task!
  • Outputs failed specs as they happen, but keeps running the whole suite. For a long-running suite, this lets me start investigating a failure as soon as it happens, without having to wait for the suite to finish, while still letting the suite run to completion so I can see the total picture of how I’m doing and what other sorts of failures I’m getting.

I recommend fuubar; it’s especially helpful for slow suites. I had been wanting something like this for a couple of months, and wondering why it wasn’t a built-in formatter in rspec. I just ran across it now in a reddit thread (started by someone else considering writing such a formatter, who didn’t know fuubar already existed!). So I’m writing this blog post to hopefully increase its exposure!
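
For reference, setup is roughly this (a minimal sketch of the usual Bundler wiring; check the fuubar README for the current instructions):

# Gemfile
group :development, :test do
  gem 'fuubar'
end

Then either run `rspec --format Fuubar`, or put `--format Fuubar` in your `.rspec` file so it’s the default.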

Posted in General | Leave a comment

Commercial gmail plugin to turn gmail into a help desk

This looks like an interesting product; I didn’t even know gmail supported plugins at this level.

http://www.keeping.com/

Help desk ticketing, with assignment, priorities, notes, and built-in response-time metrics, all within your gmail inbox (support emails are in a separate tab from your regular email).

The cost is $49/month for the ‘unlimited’ plan, or $29/month for a plan capped at 5 users.

I think this product could be a good fit for libraries dealing with patron reference/help questions; many libraries don’t have very user-friendly interfaces for this at present. The price seems pretty reasonable at $1000/year, probably cheaper than most alternatives and within the budgets of many libraries.

Posted in General | Leave a comment

Sequential JQuery AJAX using recursive creation of Promises

So I’m in JQuery-land.

I’ve got an array of 100K IDs on the client side, and I want to POST them to a back-end API which will respond with JSON, in batches of 100 at a time. So that’s 1000 individual posts.

I don’t want to just loop and create 1000 `$.post`s, because I don’t want the browser trying to do 1000 requests “at once.” So some kind of promise chaining is called for.

But I don’t really even want to create all 1000 promises at once, that’s a lot of things in memory, doing who knows what.  I want to go in sequence through each batch, waiting for each batch to be done, and creating the next promise/AJAX request in the chain only after the first one finishes.

Here’s one way to do it, using a recursive function to create the AJAX promises.

var bigArrayOfIds; // assume exists
var bigArrayLength = bigArrayOfIds.length;
var batchSize = 100;

function batchPromiseRecursive() {
  // note splice is destructive, removing the first batch off
  // the array
  var batch = bigArrayOfIds.splice(0, batchSize);

  if (batch.length == 0) {
    return $.Deferred().resolve().promise();
  }

  return $.post('/endpoint/post', {ids: batch})
    .done(function(serverData) {
      // Do something after each batch finishes. 
      // Update a progress bar is probably a good idea. 
    })
    .fail(function(e) {
      // if a batch fails, say server returns 500,
      // do something here. 
    })
    .then(function() {
      return batchPromiseRecursive();
    });
}
            

batchPromiseRecursive().then(function() {
  // something to do when it's all over. 
});

In this version, if one batch fails, execution stops entirely. To record the failure but keep going with the next batches, I think you’d just have to take the `then` inside the `batchPromiseRecursive` function and give it a second error argument that converts the failed promise to a successful one. I haven’t gotten that far. I think the jQuery (ES6?) promise API is a bit more confusing/less concise than it could be for converting a failed state to a resolved one in your promise chain.
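
If I were going to try it, my untested guess is something like this, replacing the done/fail/then combination with a two-argument `then`. The failure handler returns the next batch’s promise, so the chain keeps going even when a batch fails (`failedBatches` is a hypothetical array for recording what went wrong):

var failedBatches = [];

function batchPromiseRecursive() {
  var batch = bigArrayOfIds.splice(0, batchSize);

  if (batch.length == 0) {
    return $.Deferred().resolve().promise();
  }

  return $.post('/endpoint/post', {ids: batch})
    .then(
      function(serverData) {
        // batch succeeded: update a progress bar here, then continue
        return batchPromiseRecursive();
      },
      function(e) {
        // batch failed: record it, then continue with the rest anyway
        failedBatches.push(batch);
        return batchPromiseRecursive();
      }
    );
}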

Or maybe I just don’t understand how to use it effectively/idiomatically, I’m fairly new to this stuff. Other ways to improve this code?

Posted in General | Leave a comment

“Apple Encryption Engineers, if Ordered to Unlock iPhone, Might Resist”

From the NYTimes, “Apple Encryption Engineers, if Ordered to Unlock iPhone, Might Resist”:

SAN FRANCISCO — If the F.B.I. wins its court fight to force Apple’s help in unlocking an iPhone, the agency may run into yet another roadblock: Apple’s engineers.

Apple employees are already discussing what they will do if ordered to help law enforcement authorities. Some say they may balk at the work, while others may even quit their high-paying jobs rather than undermine the security of the software they have already created, according to more than a half-dozen current and former Apple employees.

Do software engineers have professional ethical responsibilities to refuse to do some things even if ordered by their employers?

Posted in General | 1 Comment

Followup: Reliable Capybara JS testing with RackRequestBlocker

My post on Struggling Towards Reliable Capybara Javascript Testing attracted a lot of readers, and some discussion on reddit.

I left there thinking I had basically got my Capybara JS tests reliable enough… but after that, things degraded again.

But now I think I really have fixed it for real, with some block/wait rack middleware based on the original concept by Joel Turkel, which I’ve released as RackRequestBlocker. This is middleware to keep track of ‘outstanding’ requests in your app that were triggered by a feature spec that has finished, and let the main test thread wait until they are complete before DatabaseCleaning and moving on to the next spec.

My RackRequestBlocker implementation is based on the new hotness concurrent-ruby (a Rails 5 dependency, and a great collection of ruby concurrency primitives) instead of the older `atomic` gem Turkel used; it uses actual signal/wait logic instead of polling, and is refactored to have (IMO) a more convenient packaged API. It was also influenced by Dan Dorman’s unfinished attempts to gemify Turkel’s design.

It’s only a few dozen lines of code, check it out for an example of using concurrent-ruby’s primitives to build something concurrent.
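
To give a flavor of the general idea, here’s a hypothetical sketch, not RackRequestBlocker’s actual code (the real gem uses concurrent-ruby primitives; this sketch just uses stdlib Mutex/ConditionVariable and a made-up RequestCounter class):

# Count in-flight requests; a test thread can block until the count hits zero.
class RequestCounter
  def initialize
    @count = 0
    @mutex = Mutex.new
    @zero  = ConditionVariable.new
  end

  def increment
    @mutex.synchronize { @count += 1 }
  end

  def decrement
    @mutex.synchronize do
      @count -= 1
      @zero.broadcast if @count.zero?
    end
  end

  def wait_until_idle
    @mutex.synchronize { @zero.wait(@mutex) until @count.zero? }
  end
end

# Rack middleware that counts every request in and out.
class RequestCountingMiddleware
  COUNTER = RequestCounter.new

  def initialize(app)
    @app = app
  end

  def call(env)
    COUNTER.increment
    @app.call(env)
  ensure
    COUNTER.decrement
  end
end

# Then an after hook for JS feature specs could call
# RequestCountingMiddleware::COUNTER.wait_until_idle before DatabaseCleaner
# cleans and the next spec starts.

(Streaming responses and other edge cases complicate the question of when a request is really done, which is part of why the real gem is worth using instead of a sketch like this.)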

And my Capybara JS feature tests now appear to be very very reliable, and I expect them to stay that way. Woot.

To be clear, I also had to turn off the DatabaseCleaner transactional strategy entirely, even for non-JS tests.  Just RackRequestBlocker wasn’t enough, and neither was just turning off the transactional strategy.  With either one by itself I still had crazy race conditions, including pg deadlocks… and actual segfaults!

Why? I honestly am not sure. There’s no reason the transactional fixture strategy shouldn’t work when used only for non-JS tests, even with RackRequestBlocker.  The segfaults suggest a bug in something C: MRI, pg, poltergeist? (Poltergeist was very unpopular in the reddit thread on my original post, but I still think it’s less bad than other options for my situation.)  A bug of some kind in the test_after_commit gem we were using to make things work even with the transactional fixture strategy? Honestly, I have no idea; I just accepted it, and was happy to have tests that were working.

Try out RackRequestBlocker and see if it helps with your JS Capybara race condition problems; let me know in the comments if you want, I’m curious. I can’t support this super well; I just provide the code as a public service, because I fantasize about the day when nobody has to go through as many hours as I have fighting with JS feature tests.

Posted in General | Leave a comment