At work, on my desktop computer (Windows, Firefox), I still get the reader interface I have been used to seeing for the past year.
What the heck is going on? Hey readers, when you look at a ‘look inside’ or ‘search inside’ view from Amazon, which do you get?
If I knew who to contact at Amazon, I might ask them about it. But I’m not sure whether they’d consider my Umlaut screen-scraping welcome, since it drives traffic to Amazon, or something they’d want to prevent, so I’d be hesitant to ask even if I knew who to talk to.
It seems to have something to do with whether the ‘reader’ URL has /gp/ in it or not.
http://www.amazon.com/gp/reader/0896085716 => The ‘good’ reader page when I can get it, but when I can’t, it’s because that URL is just issuing a redirect to:
Confused. This is what you get when you try to screen scrape, alas.
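For anyone who wants to poke at this themselves, here is a minimal Python sketch of the two checks described above: whether a reader URL has a /gp/ segment in its path, and whether a single request to it comes back as a redirect. The function names (`has_gp_segment`, `is_redirect`, `probe`) and the User-Agent string are made up for illustration, and nothing here is Amazon's API or Umlaut's actual code. It just makes one plain GET without following redirects and reports what came back.

```python
import http.client
from urllib.parse import urlsplit

# HTTP status codes that indicate a redirect with a Location header.
REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def has_gp_segment(url):
    """Return True if the URL path contains a 'gp' path segment."""
    return "gp" in urlsplit(url).path.split("/")


def is_redirect(status):
    """Return True if the HTTP status code is a redirect."""
    return status in REDIRECT_STATUSES


def probe(url, user_agent="reader-probe-sketch"):
    """Issue one GET without following redirects.

    Returns (status, location), where location is the Location header
    or None. The user_agent default is an arbitrary placeholder.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    url = "http://www.amazon.com/gp/reader/0896085716"
    status, location = probe(url)
    print("has /gp/:", has_gp_segment(url))
    print("status:", status)
    if is_redirect(status):
        print("redirects to:", location)
```

Running it from two different machines and comparing the status and Location values would at least show whether the difference is in the redirect itself or in something that happens after the page loads.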
Update: Okay, I can’t even tell you how I figured this out, because I don’t really know. It was basically a couple of hours of Googling anything I could find on the Amazon Online Reader and its URLs, not actually finding any explanations, but finding some interesting-looking URLs mentioned in four-year-old pages…
I think if you prefix the URL like this:
You get the page I consider the ‘good’ reader. And it doesn’t insist on redirecting you to the ‘bad’ one. At least that’s what it looks like to me now; further experimentation is required to see how reliable this is. I still have no idea why the other URLs I started with give me one interface (the v3?) on some computers, but another (v4? v5?) on others. And there’s really no telling how long this “v3” one will stay around. I still wish I knew what was going on, but I think I can get the page I need to scrape with ‘sitbv3’.