When someone gives something to someone else, it comes with unspoken limitations attached. Failing to abide by those unwritten rules makes you a rude person and subject to censure. I believe this is as true online as it is offline. When I give someone my phone number offline, I don’t state it specifically, but I do expect that they aren’t going to sell it to telemarketers or use it in a fake Craigslist ad. When I give someone my e-mail address online, I similarly expect that they’re not going to sell it to spammers or post it in a public place.
So let’s look at Plaxo’s Facebook-scraping tool, since Jerry wondered why I objected to it so strenuously. The tool was designed to obtain friends’ e-mail addresses, which are only displayed on Facebook as images and not available through the API. The tool did this through character recognition, taking the image and using it to recreate the text.
There is only one reason for displaying an e-mail as an image – to prevent harvesting by scripts. By displaying an e-mail as an image, I’m therefore entering an implicit contract that says “this is free for you to use, if you’re a human, and if it’s important enough to you that you do the image-to-text conversion yourself, the hard way.” This is just like a restriction on crawling in robots.txt or a CAPTCHA on a form – it says “if you’re human, go for it – but if you’re a script, no thanks.” Personally, I thought this was obvious.
Plaxo built a character-recognition tool that violates the implicit contract made by e-mail-as-image. They know that someone doesn’t want that e-mail address automatically harvested through a script, because that’s the sole reason for turning an e-mail address into an image – there’s no other interpretation. Yet they built their tool anyway. Perhaps I only made my e-mail address public because Facebook displays it as an image. Plaxo has no way of knowing this, and by building this tool, showed that they couldn’t care less about my intentions. In short, Plaxo’s being rude. Scraping an image to retrieve an e-mail address is inherently a dick move.
There are other reasons why the Plaxo hoopla irritated me. The tool is inefficient – it has to scrape a page per friend – and therefore, like most scrapers, is an unfair burden on sites, taking but giving nothing back. And the scraping was carried out by an attention-seeker who (as it turns out, rightly) thought his high profile could protect him from any negative consequences. But primarily the issue is rudeness – a basic disrespect of my obvious if implicit wishes. If an e-mail address is rendered as an image, don’t scrape it.