08 Jul 2008
Name Scrapers

The Slashdot headline “How to Fight Name Scraping Scammers?” caught my eye, and I had to follow it up. Curt Monash discovered his name on a dating site, but he’d never registered there. It appears that the site paired up first and last names at random and then presented them as being registered users. Monash posted an article at NetworkWorld where he says:

I’ve recently discovered a new kind of online scam. It’s elegant in its simplicity. It’s hard to combat, because all it steals is reputation. What’s worse, it preys on ordinary people who, unlike many enterprises, may not have enough of a web presence to easily fight back.

He goes on to emphasize that everyone should have a visible web presence, a personal web page, with friends and/or employers linking to it; basically, the Internet will make note of you, so you have to get your story out there too.

I’m reminded of “The Net of A Million Lies”, the parody(?) of USENET in Vernor Vinge’s excellent book A Fire Upon the Deep. You really shouldn’t believe everything you see on the ‘Net, but there seems to be a human tendency to give excessive weight to anything that’s “published”, both digital and especially hardcopy.

Some may object that they aren’t on the net, so they needn’t worry. However, others who may refer to you are on the net, so at least an occasional ego surf is warranted even if you aren’t registered anywhere and don’t have a site of your own.

A somewhat mitigating argument is that most names are common — not “owned” by any particular individual — so they weren’t targeting him in particular. For example, there are several “Larry Horn”s on the net, and in fact some are (gasp) even more prominent than me. That’s one reason I usually try to use my full name, “Larry Olin Horn”, to avoid confusion or tarnishing their reputations. Getting back to Monash’s idea, having a web presence with pictures, a biography, etc., can help distinguish among the different instances of a name.

Monash also wondered if there was a service that offered some form of identity claiming. The one I’m familiar with is claimID, which is free. Others mentioned QAlias, which charges a monthly fee and provides a “search engine optimized, professional, online profile”.

You may want to read my brief post about why I have all these mostly neglected registrations, then check out my claimID.

A slightly humorous bit of trivia. I was surprised one day to discover that I suddenly had quite a few new hits on the net. It turns out that it was because old USENET archives had suddenly been put online and indexed. The oldest reference I’ve found to myself is a 16-year-old post to comp.os.vms in private_control_code_news.txt; check out this header:

  Newsgroups: comp.os.vms
  Path: utkcs2!darwin.sura.net!mips!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!ucbvax!OKRA.MILLSAPS.EDU!hornlo
  Message-ID: <0095D724.39F46440.13119@okra.millsaps.edu>
  Date: 12 Jul 1992 09:30:11 GMT
  Sender: daemon@ucbvax.BERKELEY.EDU
  From: hornlo@OKRA.MILLSAPS.EDU (Larry Horn)
  Subject: Re: Escape seq. for DECterm window and icon names.
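
For the curious, here is a minimal sketch of how a header like this can be picked apart with Python's standard `email` module. The function name `summarize_header` and the dictionary keys are made up for illustration. It relies on the usual USENET convention that each relaying site prepends its name to `Path`, so the rightmost entry is the poster and the sites read right to left in travel order.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# The header quoted in the post, without the listing's line numbers.
RAW_HEADER = """\
Newsgroups: comp.os.vms
Path: utkcs2!darwin.sura.net!mips!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!ucbvax!OKRA.MILLSAPS.EDU!hornlo
Message-ID: <0095D724.39F46440.13119@okra.millsaps.edu>
Date: 12 Jul 1992 09:30:11 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
From: hornlo@OKRA.MILLSAPS.EDU (Larry Horn)
Subject: Re: Escape seq. for DECterm window and icon names.
"""


def summarize_header(raw):
    """Hypothetical helper: pull a few fields out of a news article header."""
    msg = message_from_string(raw)

    # "Path" is a bang path: sites separated by "!", newest relay first,
    # with the posting user as the final element.
    parts = msg["Path"].split("!")
    poster = parts[-1]
    relays = list(reversed(parts[:-1]))  # origin site first, last relay last

    # "From" uses the old "address (Real Name)" form, which parseaddr handles.
    name, address = parseaddr(msg["From"])

    return {
        "poster": poster,
        "relays": relays,
        "name": name,
        "address": address,
        "posted": parsedate_to_datetime(msg["Date"]),
        "newsgroups": msg["Newsgroups"].split(","),
    }


if __name__ == "__main__":
    info = summarize_header(RAW_HEADER)
    print(info["name"], "posted on", info["posted"].date())
    print(" -> ".join(info["relays"]))
```

Run as a script, this prints the poster's name and date followed by the chain of sites the article passed through, starting at OKRA.MILLSAPS.EDU.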

Finally, let’s not even get into the rat’s nest of privacy, anonymity, and security, except to say that much of your information is: generally public; part of government public records; for sale by many targeted marketing companies; lent, sold, or acquired (via merger or bankruptcy) by corporations; compromised via lost or stolen computers or backup media; even exposed via invasive subpoenas for data. And any of that can “leak” onto the net.

image: Slashdot

Category: Life-Society

2 Comments


  1.  Terri (12 comments)
    Posted 2008-07-09 at 01:23:23

    I’ve also noticed some of my former “specific word for word” search criteria appearing in the “catch all garbage” inserted into some questionable websites to drive traffic to them. I’ve experimented with a number of search engines, so I don’t actually know which ones are “grabbing” and selling our very specific searches. Some of them contain proper names, cities, schools, etc.


  2.  hornlo (19 comments)
    Posted 2008-07-09 at 12:14:23

    Your (anonymous) search text can be discovered from just analyzing search hits, whether the link is followed or not.

    Say you search for “alphabet soup in brass bowls for linux lovers” — some of my linux pages will likely show up somewhere in your search results, and your search string (not identifying you) will show up in my stats because of that match, even though you didn’t follow any of the links.

    I think that’s kind of a weird combination, but there could be a whole subculture into this. Maybe I’ll put “brass” and “alphabet soup” on a few pages to suck in some of them as visitors.

    For instance, after my Jot and Tittle post, I got some really interesting search hits involving “s_k_…” and “animal” (Turkish-specific characters removed so this page doesn’t match).

    For fun, check out Google Trends.

Site last updated 2015-01-12 @ 13:31:07; This content last updated 2008-07-08 @ 04:15:41