deals

In every deal you get some things and don’t get (or give up) others; that’s what makes it a deal.

HathiTrust recently announced a welcome expansion of access to public domain texts, within some significant limits if you aren’t at a HathiTrust institution.

  • All users can now download full PDFs of public domain volumes that were not digitized by Google. This currently includes nearly 100,000 Internet Archive-digitized volumes that were contributed by the University of California and thousands of volumes digitized locally by the University of Michigan.
  • Authenticated users can now download full PDFs of ALL public domain volumes.

http://mblog.lib.umich.edu/blt/archives/2010/07/hathitrust_digi.html
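To make the two rules in that announcement concrete, here is a rough sketch in Python. The `Volume` class and the `can_download_full_pdf` function are names made up for illustration, not anything from HathiTrust's actual code or API. It also assumes that volumes not in the public domain are never downloadable in full, which the announcement doesn't address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    # Hypothetical model of a volume; field names are illustrative only.
    public_domain: bool
    digitized_by_google: bool


def can_download_full_pdf(volume: Volume, authenticated: bool) -> bool:
    """Return True if the announced rules would allow a full-PDF download."""
    if not volume.public_domain:
        # Assumption: the announcement only covers public domain volumes.
        return False
    if authenticated:
        # Authenticated users get ALL public domain volumes.
        return True
    # Everyone else gets only volumes that were not digitized by Google.
    return not volume.digitized_by_google


ia_volume = Volume(public_domain=True, digitized_by_google=False)
google_volume = Volume(public_domain=True, digitized_by_google=True)

print(can_download_full_pdf(ia_volume, authenticated=False))
print(can_download_full_pdf(google_volume, authenticated=False))
print(can_download_full_pdf(google_volume, authenticated=True))
```

The only case where being logged in changes the answer is a public domain volume digitized by Google.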

Seems safe to assume that a contract/license with Google is what prevents them from sharing public domain PDFs digitized by Google with unauthenticated and unaffiliated users.

In fact, it’s kind of surprisingly nice that they can apparently share digitized-by-Google public domain texts with the users of HathiTrust institutions that are not umich, and that may not even be Google partners (you don’t need to be a Google partner to be a HathiTrust partner, do you?).

Thanks, HathiTrust, for actually giving everyone the maximum access you safely can by contract and copyright law (and additionally, not interpreting ‘safely can’ in the absolute most conservative way possible), instead of just saying “Ah, forget it, that’s too hard, let’s only let umich users/HathiTrust partners/Google partners access any of it.”

And HathiTrust is still useful even without full text: being able to search the text, even when you can’t see more than snippets (or in some cases page numbers), has real value. And I’m glad to have a non-profit library consortium in the sector, not just a Google monopoly. HT probably couldn’t have gotten started without a jump start from Google, and that comes with limitations, but it’s a really good platform to start from and the HT folks are doing a good job with it.


5 Responses to deals

  1. YS says:

    If you look at the HT data API, you will see that it provides page-by-page full access even to Google images. The restriction applies only to downloading entire titles in one shot. People have already written utilities to bulk download HT titles, such as:

    http://library.sciencemadness.org/library/hathi/

  2. jrochkind says:

    Huh, nice. They definitely seem to push what they can provide as far as they can, which I appreciate.

    Now the sciencemadness.org folks should upload their compiled titles to the Internet Archive. Having a couple monolithic aggregators of public domain full text is a lot more useful than having a whole bunch of little bitty ones all over the place, easier for me to write services against for my users.

  3. Kat Hagedorn says:

    Jonathan, I think your URLs might have been hacked. Both http://mblog.lib.umich.edu/blt/archives/2010/07/hathitrust_digi.html and http://library.sciencemadness.org/library/hathi/ are taking me to a site security exception which resolves to a spam page.

  4. Kat Hagedorn says:

    FYI, this seems to only be a problem with Firefox. Also, the spam page is actually WordPress’s error page; it just looked like a spam page!

  5. Heather Christenson says:

    they can apparently share digitized-by-google public domain texts with the users of HathiTrust institutions that are not umich

    true

    that may not even be Google partners

    true

    (you don’t need to be a google partner to be a HathiTrust partner, do you?)

    no, you do not need to be a Google partner!
