The Promise & Reality Of Mixing The Social Graph With Search Engines
Robert Scoble used the theme “let’s blame SEO” as a launching pad for a series of videos on how Facebook potentially could be a killer search engine — regardless of the fact he seems to have no clue that “social graph” or social networking mixing has been tried and abandoned with search. Having watched his videos, which have sparked much discussion, I’ll do some debunking, some educating for those who want more history of what’s been done in the area, plus I’ll swing around to that New York Times article that ascribes super-ranking powers to SEO. Plus, I’ll use the F-word along the way.
Robert’s excited about “social graph search,” which is the idea that if you know a network of people, you can use their connection to improve search results. It’s a “revolution” coming in search that will overtake all the major search engines, he says. Maybe, but it’s not like we haven’t heard this before. I’ll go through his arguments, but it really feels like this is more about getting attention to Robert’s videos, period.
Part 1 of Robert’s social graph video series starts off by telling us that there’s no way we’d have gotten to his videos from a search engine. That’s absurd. People write about what’s in video content all the time. Want to see the Lazy Sunday video? Oh, look — I found it number one on Google without Google needing to analyze the words inside the video.
That’s the real point that Robert’s trying to make, of course — that search engines typically don’t analyze all the words within a video in the way they read the words in a web page. Want to understand more about that? My Video Search Challenge Isn’t Speech Recognition, It’s Content Owner Management post from February explains this in more depth as well as why it really hasn’t been an issue. In particular to Robert’s argument, it’s because there are plenty of people who will reference what’s in the video content in more user friendly and search engine friendly HTML text.
The meat of his first part is to talk about three different types of search engines: crawlers (like Google), Techmeme and Mahalo to discuss how they are or are not “SEO resistant,” as if SEO is a bad thing — you know, that SEO equals spam.
SEO is not spam. It’s like saying email is spam. There’s email; there’s email marketing; there’s email spam. These are all different things. You want to better understand why SEO isn’t spam? Then read the posts below:
- Yes Virginia, SEO Is Rocket Science - Defending Search Engine Optimization Once Again
- Defending SEO, Yet Again!
- Why The SEO Folks Were Mad At You, Jason
- From My Inbox: More Defense Of SEO
- SEO: Real Skills That Can Protect Your Traffic
Want to be like Robert — and Jason Calacanis — and keep equating SEO with spam? Then f- off.
I don’t think I’ve ever used the F-word in any of my writing, and my apologies for being so crass. But I’ve had enough of people trying to advance their own personal agendas (Jason hoping someone will care about Mahalo; Robert hoping someone will watch his videos) on the back of an industry that is full of plenty of people who do good work.
Last week, I was part of a meeting at Google along with a number of notable SEOs, being asked about ways Google could be better. This group wasn’t pushing for Google to make it easier for them to spam the listings. A chief concern they had was how Google (along with other major search engines) continues to have difficulties identifying original source documents. You know — you publish your blog post, then some other site with more authority than you picks it up, and then that site gets the top ranking. SEOs are leading the charge to help site owners get a fix for this overall. But all people like Jason and Robert want to do is characterize them as evil comment spammers for their own personal gain.
Back to the video. Robert goes through how search engines make use of “on the page” factors, though he doesn’t call them that, and greatly simplifies the process. But yes, search engines look at the frequency and location of words on a page to determine whether that page is relevant to a query.
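To make the frequency-and-location idea concrete, here is a minimal sketch. It is not how any real search engine scores pages. The function names, the toy pages and the `title_weight` value are all assumptions made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def on_page_score(query, title, body, title_weight=3.0):
    """Score one page for a query using word frequency and word location.

    title_weight is an arbitrary illustrative value: a match in the
    title counts for more than a match in the body text.
    """
    title_words = tokenize(title)
    body_words = tokenize(body)
    score = 0.0
    for term in tokenize(query):
        # Frequency: the share of body words that match the term.
        if body_words:
            score += body_words.count(term) / len(body_words)
        # Location: a bonus when the term appears in the title.
        if term in title_words:
            score += title_weight
    return score


# Hypothetical pages: name -> (title, body).
pages = {
    "guide": ("HDTV buying guide", "What is an HDTV? This guide explains HDTV jargon."),
    "recipes": ("Dinner recipes", "Watch cooking shows on your HDTV while you eat."),
}
ranked = sorted(pages, key=lambda name: on_page_score("hdtv", *pages[name]), reverse=True)
print(ranked)
```

The page that uses the word in its title and more often in its body comes out first. It is also easy to see why this alone is simple to game, which is where the off-the-page factors come in.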
Robert then explains that PageRank is also used, using incorrect shorthand for link analysis that’s part of “off the page” factors that search engines use to rank pages — looking at the quality and the context of links to a page. My What Is Google PageRank? A Guide For Searchers & Webmasters post from April goes into more depth about what exactly PageRank is and how it is not the same as link analysis. Give it a read, Robert.
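For those who want to see the difference, here is a rough sketch of the textbook PageRank calculation on a toy link graph. It follows the published power-iteration formulation, not whatever Google actually runs, and the graph, page names and parameter values are assumptions for illustration. Notice that it produces only a link-popularity score per page. The context of links, such as the words in them, is part of the wider link analysis picture and is not modeled here at all.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", and "c" links to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Page "c" wins because it has the most links pointing at it, and "a" comes second because its single link comes from the strongest page.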
Next, we get the news that paid links are hard for Google to tell apart from “real” links. Actually, Robert says Google can’t tell the difference between them. In reality, it can easily identify many types of paid links. But not all of them.
Apparently, the SEO community feels it’s its “birthright” to stick paid links into pages. Actually, Robert — there’s disagreement about that within the SEO community, and the bigger audience that feels it has a birthright to do what it wants with paid links are the content owners themselves. SEOs aren’t selling the links; they’re buying the real estate that others are selling.
Robert then shifts gears to Techmeme and how, in his view, news won’t get on the site until someone starts to blog about it. Um, yeah — that’s part of the “meme” part of Techmeme. New stuff can (and does) hit the site pretty quickly, too, since important blogs catch things fast and talk about it. FYI, Q&A With Gabe Rivera, Creator Of Techmeme from me in January talks about Techmeme and how it works in more depth.
Techmeme is described as an SEO resistant site. Sure, in the sense that you’re dealing with a smaller source list. By that reasoning, Google News is more resistant, too. Any vertical or specialized search engine that deals with a subset of sites is SEO resistant (or more correctly, spam resistant).
Mahalo comes up next and how by using a small number of human editors, it can be harder to spam. Sure. So’s the Yahoo Directory. You remember the Yahoo Directory, right? It used, um, a small number of human editors to categorize the web. Advances in crawler-based search engines meant you could get really good relevancy and be spam resistant, which caused the Yahoo Directory to effectively be abandoned by Yahoo. Mahalo’s approach to custom-tailor the most popular searches is interesting — but despite heaps and heaps of publicity the new service has had showered upon it, it still hasn’t gained any real traction among searchers. Mahalo Launches With Human-Crafted Search Results from me in May describes the service in more depth.
Finally, Robert turns to Facebook, talking about the “social graph” term that’s now being bandied about as this month’s new Kool-Aid to drink. I’m being harsh — there’s obvious value in being able to look at the connections between people and form a ranking mechanism that can be applied to things. SocialRank, PeopleRank — whatever you want to call it. The idea at the moment is that Facebook especially has it, so everyone else better look out.
That leads to Part 2, where we’re told how adding the social graph to existing search technologies will really change the search game. Wow. Wow! I mean, yeah, never heard that before.
Look, Robert, back in January 2004, Eurekster launched with the promise of mixing the social graph with other search criteria, to improve results. As I wrote back then:
Personalized search? The concept has been that by knowing some things about you, a search engine might refine your results to make them more relevant. A teenager searching for music might get different matches than a senior citizen. A man looking for flowers might see different listings than a woman.
Eurekster’s twist on this concept is to provide personalized results based not on who you are but who you know. Friends, colleagues and anyone in your Eurekster social network will influence the type of results you see….
The potential of using your friends or colleagues is enormous. Imagine Eurekster being used by all the employees of a medical research firm, where many might do similar medical-related queries. With Eurekster, all the employees can be linked together and benefit from the searches and selections made by their colleagues.
Libraries are another institution that might latch on to the Eurekster concept. Librarians are constantly asked by patrons for assistance. Eurekster would allow librarians to collaborate invisibly with each other and share what they’ve found to be the best for various queries.
There are downsides. Not all of my friends have the same interests as me. In addition, as my social network grows — because my friends invite their friends and so on — commonalities that are useful get diluted.
Eurekster is still out there, but the idea of a network of friends influencing search results seems to have died at some point over the years. Viva not the revolution Robert was promising us.
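For the curious, the general idea of letting a network's selections nudge results can be sketched in a few lines. This is purely illustrative and is not Eurekster's actual method. The function name, the scoring formula, the `boost` value and the example URLs are all assumptions.

```python
def rerank_with_friends(results, picks_by_friend, boost=1.0):
    """Re-rank results using the selections made by people in your network.

    results is a list of URLs in the engine's original order.
    picks_by_friend maps each person in the network to the set of URLs
    that person selected for this query.
    """
    n = len(results)
    network_size = len(picks_by_friend)

    def score(position, url):
        # Base score: 1.0 for the top result, falling to 1/n for the last.
        base = (n - position) / n
        if network_size == 0:
            return base
        picked = sum(1 for picks in picks_by_friend.values() if url in picks)
        # Social signal: the fraction of the network that picked this URL.
        return base + boost * picked / network_size

    # The -position term keeps the original order when scores tie.
    scored = [(score(i, url), -i, url) for i, url in enumerate(results)]
    scored.sort(reverse=True)
    return [url for _, _, url in scored]


results = ["general-health.example", "news.example", "research-journal.example"]

# Two colleagues who both picked the same journal page.
colleagues = {
    "ann": {"research-journal.example"},
    "bob": {"research-journal.example"},
}
small_network = rerank_with_friends(results, colleagues)

# The same two colleagues, diluted by 48 contacts with no relevant picks.
diluted = dict(colleagues)
diluted.update({f"contact{i}": set() for i in range(48)})
large_network = rerank_with_friends(results, diluted)

print(small_network)
print(large_network)
```

With two like-minded colleagues, the journal page jumps to the top. Pad the network with 48 contacts who have nothing in common with you and the signal is too weak to change anything, which is the dilution problem I described above.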
Well, maybe Eurekster had no luck with that particular model since the company was small. Well, Yahoo’s not small. And in June 2005, it rolled out Yahoo My Web 2.0, which promised to bring social networks into search. As I wrote then:
After seeing what was planned, I remarked to Yahoo senior vice president of search Jeff Weiner sitting next to me that they were building “an eBay for knowledge.” Jeff was already literally bouncing at times with excitement in showing the new system, and the remark made him smile even more broadly.
He smiled because that’s exactly the Yahoo goal. My Web is Yahoo’s community rating system for information. Just as you buy things on eBay depending on ratings to know if you’ll trust a seller, My Web is what Yahoo hopes will help you choose more wisely the information you receive, whether you actively check reviews, contribute or remain an ordinary searcher who completely ignores the tagging and social search components.
In short, Yahoo’s not banking on tagging — the categorization of material — as a way to help people find things better. It’s banking that the mere act of saving things at all, even without tags, will give them a clue about what are trusted pages across the web. By looking at patterns of saving, Yahoo will have trust networks to tap into….
We’ve had a generation of search engines that depended on on-the-page factors such as word location and frequency. We’ve had a current second generation that tapped into link analysis, looking at how people are linking and what they say in links.
Personal search is that third generational jump, and Yahoo’s flavor of personal search is a social network one that it hopes will improve relevancy in web wide results in the way that link analysis helped drive back spam and improve relevancy years ago.
“We’re creating personal anchor text for pages, but by having a trust network, we can actually pretty much eliminate spamming,” Walther said.
Guess what? Still no revolution. The masses didn’t descend upon Yahoo My Web to form networks and save search results. In fact, Yahoo pretty much pulled back from the product, even dropping inline integration with it from regular search results back in October 2006.
But before Robert gets into applying social networks to search, he prattles on about Mahalo again being so superior to Google. Oddly for a video, he doesn’t show us any of the search results pages he’s talking about, saying at one point he can’t show us them. Apparently the camera he uses can’t be swiveled to show a screen. C’mon, Robert — if we’re investing the time (over a half-hour) to watch your video, make use of the medium.
He talks about a search for HDTV on Google versus Mahalo and pokes at the Google results, despite Google showing Wikipedia, How Stuff Works, CNET as well as a page from Amazon, just as Mahalo does.
I mean seriously, you might want to ding Google for deciding Amazon deserves to have a top listing for HDTV over all the other types of information that could be shown (especially if you’re one of the millions who use Google from outside the United States). But if you ding it for that, what’s up with Mahalo’s supposedly great human method of also deciding Amazon deserves a top spot?
Mahalo lists more than a top seven list, of course — you get a chunk of review sites, manufacturers, retailers and so on. But here’s the deal on Mahalo — it’s not really a search engine. The page it provides is good human crafted content, a good destination page like you might find at some of the other destination pages that Google lists. Mahalo — as Jason Calacanis himself will tell you — is a great place to start searching if your searches involve very popular queries. But if you want to hit those “search tail” terms that people always encounter? It’s not going to help.
My desktop computer crashed just when I got back from a trip, and since it’s likely to be down for a day or two, I decided to start using a new Vista laptop I purchased until I can buy a Mac replacement (heh — well, maybe not). But ZoneAlarm for Vista doesn’t block http referrer information as it does on Windows XP. That led me to do a tail search — block referrer plugin for firefox — something that’s only going to happen a few times per month, relatively speaking. Try it on Google, then on Mahalo. Plenty of good solutions right at the top on Google. On Mahalo, I have to wade through four “related” links that aren’t relevant (Ryan Block, Sunscreen, Netscape and Jet), then I get Google results.
Robert also tells us that Mahalo rocks because you know, the first thing you do in the search process for HDTVs is want to know the manufacturers. Bull.
First of all, no one can predict how someone else will go through a search process, so that’s bull strike one. But if I want to play magic mindreader like Robert, I’ll say the first thing people want are some guides to HDTVs. What is an HDTV? Is 720p enough to make a TV HD quality? Does it have to have HDMI?
Saying you first go to a manufacturer site in the process is like saying that if you want to buy a new car, just go visit some car dealerships. Me — I go to Consumer Reports, figure out the cars I might want from a third party trusted resource, understand the jargon I might encounter, and then I go to the dealership. And as someone who bought an HDTV last year, I also remember going to the horrible manufacturer web sites where they often provided only sparse info about their own products and certainly didn’t compare them to competing products.
We then learn from Robert that Google can’t change to be like Mahalo because it has algorithms that are “stuck in sand, stuck in cement” and shifting will “prove impossible.”
Insane. Seriously, like you want to scream stop talking. Robert’s a personally likable guy, but watching him make statements like this is like watching someone driving a car full speed toward a concrete wall while yelling “It’ll be OK — we’ll get through.”
Google has constantly changed its ranking algorithm over the years and will continue doing so. If Robert knew any SEOs, they’d tell him this firsthand. But more to the point, Google can’t change the ranking algorithm to be more like Mahalo because Mahalo isn’t using an algorithm to rank web pages — it’s using human editors. Maybe Google someday might get an algorithm to mimic much of what Mahalo does, but that still wouldn’t be the same as using actual humans.
So is it impossible for Google to change? Maybe in terms of the algorithm, but Google could easily hire editors of its own, pay them more than Jason does and do what Mahalo is doing if that model takes off, which so far hasn’t happened.
Robert then jumps into the idea that Google also can’t integrate social networking into its algorithms, pointing to Google’s largely failed Orkut social networking site as an example. He completely overlooks the fact that Google is playing the human/social aspect on a different level — personalized search, where results are refined based on what you as an individual seem to like. That’s a major shift for Google, and it’s also one that I’ve found personally compelling. For more on the service, see:
- Google Ramps Up Personalized Search
- Just Behave: Google’s Marissa Mayer on Personalized Search
- Google Search History Expands, Becomes Web History
In particular, Google has been talking about how personalized search allows for creating personalized PageRank (and see here for a patent look), a way where rankings revolve around what you personally like. It’s not a hard leap to extend that into a “social network PageRank” model, where if you define a social network, the collective interests of that network could be used to model the rankings. Google’s not doing that now, but to suggest that the mechanisms are somehow impossible from either a company attitude or technological model is simply being ignorant of Google.
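To show how short that leap is, here is a hedged sketch. It uses the textbook personalized PageRank formulation, where the random jump favors pages a person likes, and then averages the preferences of a defined network. This is not Google's implementation, and as noted, Google isn't doing the social version. The function names, toy graph and preference values are all assumptions for illustration.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=50):
    """Textbook personalized PageRank by power iteration.

    links maps each page to the pages it links to. preference maps pages
    to non-negative weights (with a positive total) saying how much the
    searcher likes them; the random jump lands on pages in proportion
    to those weights.
    """
    pages = sorted(
        set(links) | {t for targets in links.values() for t in targets} | set(preference)
    )
    total = sum(preference.values())
    teleport = {p: preference.get(p, 0.0) / total for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links hands its rank back
                # to the preferred pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


def social_preference(member_preferences):
    """Average the normalized preferences of everyone in a network."""
    combined = {}
    for prefs in member_preferences:
        total = sum(prefs.values())
        for page, weight in prefs.items():
            share = weight / total / len(member_preferences)
            combined[page] = combined.get(page, 0.0) + share
    return combined


# Hypothetical toy web with two separate clusters of pages.
toy_web = {
    "tv-reviews": ["tv-maker"],
    "tv-maker": ["tv-reviews"],
    "car-reviews": ["car-dealer"],
    "car-dealer": ["car-reviews"],
}

# One searcher who only cares about TVs.
solo = personalized_pagerank(toy_web, {"tv-reviews": 1.0})

# A network of three: two TV fans and one car fan.
network = [{"tv-reviews": 1.0}, {"tv-reviews": 1.0}, {"car-reviews": 1.0}]
group = personalized_pagerank(toy_web, social_preference(network))

print(round(solo["car-reviews"], 3), round(group["car-reviews"], 3))
```

The only change between the personal and social versions is the preference vector fed into the same calculation. That is why claiming it's technologically impossible doesn't hold up.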
Finally, halfway into Part 2 of his video, after 23 minutes of covering all that’s “wrong” with some existing players, Robert unveils how social might be blended into Facebook, giving you the impression this is simply a “please hire me” pitch to Facebook itself.
First step from Robert: use old-style on-the-page ranking. Yeah, there’s a waste of time.
Here’s a thought — why not just license an existing search engine period? I mean, how do you search the web when on the Facebook site itself? You don’t. The Facebook search box only searches within Facebook (and despite claims from Facebook itself that it is some type of fantastic people search engine, I’ve found the search less than compelling).
Facebook is partnered with Microsoft, so it’s somewhat amazing (if not telling) that there’s no ability to search using Microsoft’s Live Search. Hit MySpace, in contrast, and you’ll see how the Google partnership has Google web search over there.
Facebook doesn’t need to build on-the-page ranking from scratch, not to mention the nightmare situation of trying to crawl billions of pages.
Next, Robert gets into the social network and trust aspect. Sure, and exactly like what Eurekster and Yahoo already promised. There’s nothing new here. Well, there is. As Facebook has grown, we’ve also had frustration grow, including the famed Facebook Bankruptcy that Jason Calacanis declared last month.
People have friends on Facebook who aren’t friends at all. It’s just easier to accept them. Robert, at the time of this writing, has 4,875 “friends” in the system. Really — he knows all of these people? And wants all of them influencing his searches?
Ah — but see, Facebook knows how to “lock out” the SEOs, Robert tells us, so he’s not overwhelmed by noise. Sure — but on the flipside, Robert is one of the top FBOs out there, Facebook Optimizers, to the degree people have been complaining about how his activities dominate the news updates that Facebook sends out to those with him as a friend.
There will be more FBOs, no doubt about that. Any system that has lots of traffic will attract people who will study ways to tap into that traffic. That’s good and bad. It’s good in that since it’s going to happen, you want people to learn appropriate ways to do this. It’s bad in that there will no doubt be spamming that comes along with it.
For all the hype about his video, I was pretty much left with an “is that it?” response. Facebook will get pages, then look at a social network and hopefully get those people to proactively rank pages when they search. That’s despite the fact that the Yahoo My Web experience tells me people don’t want to build search results. They just want to search.
Social network data applied to search does have promise. But to assume that social networks can’t be spammed and lack noise is foolish. To assume that people want to participate in actively shaping results is also mistaken, in my view. To also assume that major players like Google or Yahoo can’t tap into things that make Mahalo, Techmeme or Facebook good is shortsighted. Yahoo Answers is akin to Mahalo. Google News is Techmeme across multiple subjects.
By the way, Robert, if you’re tired of the SEO “noise” you think screws up results, then do this. In a search for scoble, for well over a year now, you’ve crowded out variety in the results by not redirecting your scoble.weblogs.com address to your new home at scobleizer.com. That means you get both results 1 & 2 for your new place as well as results 3 & 4. You also have no content at scobelizer.wordpress.com, plus another version of your old place at radio.weblogs.com.
You have contacts with the Weblogs folks — getting a redirect should be easy for you. You can kill or block that Wordpress site that you no longer use. I assume you maintain these other sites simply so that when people search for you by name, you crowd out anyone else from ranking well, perhaps people who might disagree with you on topics. That’s an aspect of SEO — it’s a tactic used as part of public relations in SEO. It introduces the same “noise” into the results that you cheered about not being present in Mahalo. So clean it up or cut it out with the SEO slams. You’re doing SEO yourself.
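If it helps, the redirect logic amounts to something like the sketch below. The host names come from the addresses mentioned above, but the function name and the one-to-one path mapping are my assumptions. In practice this would be a permanent (301) redirect set up in the configuration of the old host's web server, not a Python function.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "scobleizer.com"
# Old addresses named in this post that should point at the new home.
OLD_HOSTS = {"scoble.weblogs.com", "radio.weblogs.com"}


def redirect_for(url):
    """Return (status, location) for an old URL, or None if no redirect applies.

    Assumption: each old path maps to the same path on the new host.
    """
    parts = urlsplit(url)
    if parts.hostname in OLD_HOSTS:
        target = urlunsplit(
            (parts.scheme or "http", CANONICAL_HOST, parts.path, parts.query, "")
        )
        # 301 tells search engines the move is permanent, so the old
        # address can drop out of the results over time.
        return 301, target
    return None


print(redirect_for("http://scoble.weblogs.com/2005/01/post.html"))
print(redirect_for("http://scobleizer.com/"))
```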
What about that New York Times article I mentioned? When Bad News Follows You is the article, another amazing “SEO sucks” story. The New York Times has opened its archives for crawling, which apparently is causing people to come forward at the not so astounding rate of one person per day complaining about articles casting them in a bad light. Blame SEO:
Technically complex, search engine optimization pushes Times content to or near the top of search results, regardless of its importance or accuracy.
Wow, seriously, did I just read that in the New York Times? SEO just shoves whatever crud it wants to the top of Google. Hey, think there are some SEOs out there that would like to rank for “new york times”? I guess they just need to SEO up some pages and they get there. Not.
Geez. The rest of the article does some hand-wringing on what to do to make the “right” articles appear at the top of the results (why not just sprinkle some SEO fairy dust on them?).
Insane. If an article is factually incorrect, then correct it. If an article casts someone in a negative light and a later article comes out updating the story, then link prominently from the top of the negative article to the latest version of the story. It’s called online journalism in the 2000s.
Postscript: I purposely didn’t read any of the other commentary on Robert’s post until I could brain dump my own thoughts. Since then, I’ve read Rand Fishkin’s I Used to Respect Robert Scoble’s Opinion post, and he does a great job of poking back at the SEO attack as well as debunking Robert’s ideas on that Mahalo-Google search shoot-out I mentioned.
Google and search from Dave Winer points out that spam is not the problem that both Jason and Robert like to paint it as. There’s spam, but there are lots and lots of great results, too.
Why Google Should be Scared of Facebook from Dare Obasanjo highlights that Facebook’s wall around its content is a threat to Google, but that’s a wall I think will get ripped down sooner rather than later, if only when Facebook decides it needs to show more ads on those pages and so needs more traffic.
Techmeme has much more commentary.
