I’ve been sick as hell over the past couple of days, likely thanks to one of hundreds of thousands of people on the New York subway system. If this happens again I’m going to start wearing a face mask. I’m not too ill to write a SELECT statement, though, so I’ve been continuing to crunch the AOL data.
Most of my queries have been exercises in confirming the obvious. I remember being told – by an SEO guy who’ll remain nameless – that it was better to be first on the second page of the search results (i.e. #11) than last on the first page of the search results (i.e. #10). I also remember thinking that this guy was full of it, and just trying to say some ‘non-obvious’ things to me to convey a sense of expertise. Anyhow, the data speaks for itself:

See that sharp drop-off there in the center-left of the graph? That’s the page one to page two jump. The tenth-ranked result got 2.97% of all clicks, while the eleventh got 0.66%. First place (not shown on my graph) continues to be king with 42.30% of the clickthroughs.
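For anyone who wants to try this at home, here is a minimal sketch of the kind of query that produces a click share per rank, using Python's built-in sqlite3. It is not necessarily the query behind the graph above: the table name `searches`, the columns `query`, `item_rank` and `click_url`, and the toy rows are all assumptions for illustration, with a NULL `item_rank` standing for a search that got no click.

```python
import sqlite3

def click_share_by_rank(conn):
    """Return {rank: percent of all clicks} over rows that recorded a click."""
    rows = conn.execute(
        """
        SELECT item_rank,
               100.0 * COUNT(*) / (SELECT COUNT(*) FROM searches
                                   WHERE item_rank IS NOT NULL) AS pct
        FROM searches
        WHERE item_rank IS NOT NULL
        GROUP BY item_rank
        ORDER BY item_rank
        """
    ).fetchall()
    return {rank: pct for rank, pct in rows}

# Hypothetical toy data, not real log rows: (query, item_rank, click_url).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE searches (query TEXT, item_rank INTEGER, click_url TEXT)"
)
conn.executemany(
    "INSERT INTO searches VALUES (?, ?, ?)",
    [
        ("ebay", 1, "http://www.ebay.com"),
        ("ebay", 1, "http://www.ebay.com"),
        ("mapquest", 1, "http://www.mapquest.com"),
        ("mapquest", 2, "http://maps.example.com"),
        ("google", None, None),  # searched, never clicked
    ],
)

shares = click_share_by_rank(conn)
for rank, pct in shares.items():
    print(f"rank {rank}: {pct:.2f}% of clicks")
```

On the full logs, an index on `item_rank` would probably help, though I'd treat that as a guess until it's been timed.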
The query I was really interested in – what terms were searched for that didn’t get results – proved to be a monster. Could just be my mad query-writing skills, but it kept my MacBook grinding for over twelve hours. Of course, it’d be worth it if I could uncover what people were searching for unsuccessfully – I’d be more than happy to step into that gap with a little amateur webmastery. But the results of the query were somewhat underwhelming. The top ten terms searched for with no clickthroughs:
- ebay – 67956 queries
- google – 67951 queries
- internet – 38891 queries
- yahoo.com – 36192 queries
- mapquest – 30829 queries
- http – 30039 queries
- google.com – 26787 queries
- yahoo – 23371 queries
- myspace.com – 22842 queries
- adserver.ign.com – 18070 queries
Doh! I didn’t think of all the people who’d get interrupted in the middle of queries (the phone rings, an IM comes in, etc.). It turns out the most-frequent queries with no clickthroughs pretty much match the most-frequent queries overall. (Although that adserver.ign.com result looks suspicious – maybe that’s due to some sort of bot.) My quest for the online El Dorado is going to take some more digging – I’m currently compiling a list, sorted by frequency, of all the queries that did result in clickthroughs. Once I have that, I can compare clickthroughs vs. no-clickthroughs for each query, expressing the results as both an integer difference and a ratio. That way I might identify some truly under-served search terms…
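The comparison described above can be sketched in one pass instead of two separate lists. This is a rough illustration only, again using Python's sqlite3 and an assumed `searches` table with `query` and `item_rank` columns, where a NULL `item_rank` means no click. It also assumes each row counts as one search, which may not hold for the real logs, and the sample rows are invented.

```python
import sqlite3

def click_vs_no_click(conn):
    """Per query: clicked rows, unclicked rows, their difference and ratio."""
    rows = conn.execute(
        """
        SELECT query,
               SUM(item_rank IS NOT NULL) AS clicked,
               SUM(item_rank IS NULL)     AS unclicked
        FROM searches
        GROUP BY query
        """
    ).fetchall()
    result = []
    for query, clicked, unclicked in rows:
        total = clicked + unclicked  # always >= 1 for a grouped query
        result.append({
            "query": query,
            "clicked": clicked,
            "unclicked": unclicked,
            "difference": unclicked - clicked,     # the integer view
            "unclicked_ratio": unclicked / total,  # the ratio view
        })
    # Highest share of unclicked searches first, ties broken by volume.
    result.sort(key=lambda r: (-r["unclicked_ratio"], -r["unclicked"]))
    return result

# Hypothetical toy data, not real log rows: (query, item_rank).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE searches (query TEXT, item_rank INTEGER)")
conn.executemany(
    "INSERT INTO searches VALUES (?, ?)",
    [
        ("ebay", None), ("ebay", None), ("ebay", 1),
        ("obscure widget", None), ("obscure widget", None),
        ("obscure widget", None),
        ("mapquest", 1), ("mapquest", None),
    ],
)

table = click_vs_no_click(conn)
for row in table:
    print(row)
```

Sorting by ratio alone would float one-off typos to the top, so on real data a minimum search volume per query seems like a sensible filter before reading anything into the ranking.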
As an aside, I transferred my gigantic 150 MB of query results from my home computer to my work computer with Pando, which is just BitTorrent reversed to push instead of pull. Seth used it to send me an attachment a while ago and I’ve been loving it since. No more transferring crap from home to office by uploading it to my webserver!
{ 5 comments… read them below or add one }
hmm. you’re doing the same sorts of things i’ve been doing. except none of my queries have finished yet.
damn data is too big. i’m still at the optimizing phase.
I’ve started over and am wondering if you really need all the data to get an idea of the whole picture or if maybe just one or two of the files loaded up will yield a similar spread of terms and behaviors…
if that’s true and all we’re doing is looking for keywords and behaviors then it’s a lot faster to search 2-4 million rows rather than search 36 million.
now here’s something odd. i’ve 7.2 million of the records loaded.
there are 678 searches out of, say, 800 for “eva longoria” that never resulted in a click through…
that’s weird.
Pat – I get 2799 searches without clickthroughs out of 3108 total for ‘Eva Longoria’ in my complete database.
There are things about these logs that we don’t quite know for certain, that I suspect are AOL specific.
For instance, Eva Longoria has a ‘spotlight’ on AOL which shows a few select links deemed appropriate by a human editor above the search results. I suspect that clicks on these spotlights are showing up as searches without clickthroughs.
If my hypothesis is correct then we could use this data to determine just how effective the AOL Spotlight feature really is. (Plenty, I’m guessing.)
By the way, if you can figure out what the “-” results mean I’ll be much appreciative. They’re not just blank searches – from the looks of the click-out URLs that appear beside them they relate to the searches that were made before, but I’m not sure how. Sponsored links, maybe?
Yeah. I noticed that too, but I’m at a loss as to what they are.
Hopefully as people get over the “look at the freakshow” aspect of this and get into the real phenomenon of AOL search and AOL search users, we’ll start finding out what these things mean.
Greg,
Here’s something I just realized. Something to think about.
For the purpose of speeding up query times I only have half the database loaded. I figure that will tell me enough about what I want to know.
So. That gives me 5,384,356 different search phrases. Of those 5,384,356, 5,362,590 had 50 or fewer search attempts.
That’s a pretty long tail.