Yes, if something’s on the public web, it really will get found

So I’ve written an article on a study comparing various article search services, that will probably be published in the next Code4Lib Journal, when/if the final draft is accepted for publication etc (it’s currently in editorial process).

When writing the draft, I wanted to write it in markdown, because I’ve come to love markdown, and it outputs to nice clean HTML, which will eventually be needed for the final publication. And I thought it would be useful to have it online somewhere for sharing with peer reviewers, and also with vendors I had agreed to get pre-print feedback from.

So it seemed very convenient to use a github repo for this. It’ll keep draft history, which is convenient. It’ll render markdown to HTML. They recently improved their in-browser file editor to even do syntax colorization for markdown, making editing/entry even more convenient. I think maybe a previous C4LJ article once experimented with this too (was Gabe the editor?), and the author/editor even used github’s commit comment feature to discuss the article.

So I did that. Just in a public github repo, I don’t pay for private protected github repos. (If I recall right, the last C4LJ article to experiment with this did it the same way).

I didn’t really want the in-progress draft to get wide distribution, but I figured:

  • If I put it in an out of the way hard to stumble upon path in my github account, it probably wouldn’t get noticed too much.
  • And, in the end, even though this wasn’t especially secure/private, I didn’t really mind if it did end up getting out. (At one point, I even considered publicizing a pre-print draft, when it looked like the journal editorial process might take longer than I wanted, but decided against that given current expected timelines).

It turned out that I was wrong about both of those things. :) Somehow the pre-print non-final draft has gotten out. One vendor staff person let me know that several actual and potential customers had referenced it to him (which means many more have seen it, presumably). I’m not quite sure how it wound up becoming widely known, but there are several quite plausible ways it would have happened.

And then I realized that I did mind it becoming widely distributed. It was a non-final in-progress draft, and one of the things that had been improved at the suggestion of my editor, in more final drafts not represented in the github repo (the editor I was working with ultimately preferred transferring everything to a Word document for editing, go figure), was more polite/accurate/fair/circumspect language in critiques of various vendor products. I realized I really didn’t want the draft prior to these improvements being widely seen.

Oh well, so that was the answer to that experiment. I’ve made it rather harder to find on the public web, although I haven’t scoured everything I could the way I would if, say, I had accidentally put my SSN on the web or something. It’s still not the end of the world, but I realized I’d rather it not get out there that much. But it’s gratifying to see there’s apparently interest in the article; it’ll hopefully be in the next C4LJ issue.

This does again bring up the need for a good software package for supporting edited journal workflow. Of course what’s “good” may depend on the particular journal organization’s needs as well as individual preference — this is why in almost any industry/domain, software which is mostly about workflows tends to be awful, everyone’s needs are so custom. (I think both ILS’s and “ERP” software like PeopleSoft and SAP are examples). My own criteria would include markdown-based editing (why? because…), ability to easily publish as both HTML and PDF (PDF only for a web journal? Come on), good authoring tools, good comment/discussion tools (as good as Word’s comment/track changes but on the web?). And proper access control, including making it as painless as possible for third party reviewers to ‘create accounts’ (or not?) and auth to see what you want to share with them.

(When I last looked, OJS was not what I would be looking for in journal editing workflow support and didn’t seem to me a good match for how C4LJ operates, although I’m sure there are many journals whose workflow is well supported by OJS).

(Oh, and yes, I am both one of the editors of C4LJ and the author of an article likely to appear in it. This is not the first time we’ve published an article by someone who’s also on editorial staff. When that happens, the author does not participate in any back-end editorial discussions or decisions regarding his or her own article).


6 thoughts on “Yes, if something’s on the public web, it really will get found”

  1. What about Google Docs (with an institutional apps account) for restricted sharing and commenting (and light revision control), plus pandoc for format interchange?

  2. Yeah, a lot of people do that. But I want to edit in something more like ascii than formatted text, myself. I am not familiar with pandoc, but I don’t want a workflow that requires me to run things through a converter before showing them to people (are you suggesting writing in actual markdown in a google doc, but running it through pandoc to get html? No way, the point of using something like github is I can _write_ in markdown, but viewers can _see_ the text, at any stage of the game, in prettier html).

    Like I said, workflow tends to be personal.

    But Google Docs works more or less fine for a one-off kind of thing. When I’ve experienced people trying to use it for long-term workflow with a collection of documents, I find it gets kind of out of control organizationally for multi-docs, although the tagging feature can help. Multi-docs where you have a set of docs that all need to be shared with the same group, and the members of the group sometimes change? Nightmare. (This was not my situation in this example, but it’s a nightmare I’ve had with working groups that try to use google docs like this).

  3. I think you’re conflating two issues: workflow and (essentially) editor. And if you’re going to insist on markdown, you’re already screwed, because markdown is too simplistic to expose the semantics of the underlying text (say, code vs. command-line vs. citations vs …) in such a way that you can control how it appears in the eventual output. I love me the markdown for blog posts and such, but it can’t convey any structural information beyond “heading” and “list.” This is why people invented DocType and the like, and why no one has bothered to write a publishing platform that supports markdown.

  4. Not a response to the bigger workflow question but Bitbucket allows for free private repos (five; more if you use a .edu address) that you can share with others. This would provide most of the features you like about Github (BB doesn’t have the nice in browser editor) but only make it available to those you authorize. BB has been quite useful to me for this ability to make a few repos private.

    Also in regards to markdown conveying structural information, a blog post on PLOS recently called for ‘a scholarly markdown’.

    Looking forward to reading the article.

  5. Bill, my supposition is that if your workflow involves several rounds of collaboration with editors marking up text in metaphorical ‘red pen’ and suggesting changes, and author making changes — then it makes a lot of sense for the ‘workflow support’ tool to be good at supporting textual authorship/editing too, because that’s part of the workflow.

    Markdown doesn’t seem to me to be too simplistic for any of the journal articles I’ve written or edited; they’re basically just sections with hierarchical section headings, images with captions, tables with captions, links, occasional bold/italics for emphasis, and footnotes. A couple of those features are handled only a little bit hackily in markdown, but less hackily than trying to go from MS Word to HTML! (Let alone to both HTML _and_ PDF. At the C4LJ we only do HTML, but really both would be ideal. Some platforms’ choice to publish only in PDF, however, seems completely ridiculous to me for a journal published on the web).

    Granted though, the less geekily inclined will probably not find markdown as pleasant as I do.

    Not familiar with DocType, I’ll check it out.

  6. Trying to google for “DocType”, I have trouble finding anything except about the XML/HTML “doctype” declaration. If you’re talking about something else, can you give me a link, Bill?
