The big buzz in the search world this week has been Google’s allegation that it caught Bing copying Google’s search results and using them on Bing. Microsoft at first seemed to deny the accusation, but later came out with a statement that rather circumspectly admitted the possibility that it was true.
Google’s Matt Cutts has been sounding rather indignant since the revelation was made. “We don’t use clicks on Bing’s users in Google ranking,” he said at the Farsight 2011 event (an event sponsored by…wait for it…Bing!). Later he added, “I’ve been doing search for a decade and never seen anything like this.” Google’s official blog detailed the sting operation they used to catch Bing in the act, and called Bing a “cheap imitation” of Google.
So is Bing’s use of Google results in its own results the moral equivalent of a high school student leaning across the aisle to copy test answers from his smart nerd classmate? I say not at all.
As Microsoft responded in its own blog post, “We use over 1,000 different signals and features in our ranking algorithm. A small piece of that is clickstream data we get from some of our customers, who opt-in to sharing anonymous data as they navigate the web in order to help us improve the experience for all users.” This “opt-in sharing” comes from users of the Internet Explorer toolbar. Microsoft uses the aggregated data from their surfing of the web as one factor in its Bing search algorithm. Inevitably, some of that data is going to come from those users searching on Google.
The fact is, Google already does something very similar, albeit a bit more openly. Google watches user behavior on a number of different platforms, social media sites such as Twitter being most prominent, and has stated that it is factoring those observed behaviors into search results. Should Twitter get on a high horse and proclaim that Google is stealing their results and is therefore a “cheap imitation” of Twitter? Quite the opposite. Twitter seems flattered that their little experiment has grown to the point where the Mega Shark of the Internet is paying attention. Ultimately, Google ought to be flattered that its struggling competitor has been forced to acknowledge that Google’s results are an important factor in determining what matters on the web.
There’s one more little twist in the plot here. I’m not alone in observing the possibly advantageous timing of Google’s release of these accusations. It comes all too close on the heels of a wave of bad publicity alleging that Google’s search results were getting spammier and spammier. It’s all the more curious that the announcement of the accusations against Bing was made by Matt Cutts, Google’s chief anti-spam officer. Hmmmmm.
UPDATE (2/2/11 16:38 EST) Search Engine Land reports that Bing has issued a very strong denial of Google’s charges. “We do not copy results from any of our competitors. Period. Full stop.” Yusuf Mehdi, Microsoft’s Senior VP of Online Services, goes on to imply that Google engaged in a kind of click fraud known as a “honey pot” to intentionally mislead Bing. The search engine war continues to heat up! (Read “Bing: ‘We Do Not Copy Results. Period’”)
1. Bing also takes social platforms into consideration. Both search engines are right to do so: it’s been openly discussed and verified independently, and it provides a better search experience on freshness, quality, and time-sensitive searches. Comparing that to what Bing is doing is such a stretch that I can’t believe you even mentioned it, especially with a negative connotation.
2. To call Google’s proof-of-concept experiment “click fraud”, while technically true, is a weak accusation. The experiments were conducted on terms that are extremely unlikely to occur naturally (random letter sequences, long strings of non-spaced words). The queries were designed to take any naturally occurring traffic out of the equation. At best, Bing is negatively skewing their search results for their users. Bing’s demographic isn’t Google’s, and to mimic Google’s results for their audience is a disservice. They should concentrate on serving their audience instead of being the next Google.
Here’s where it gets scary for IE users: “Information associated with the web address, such as search terms or data you entered in forms might be included.”
How many times have you come across a non-secured web contact form, or even a non-secure shopping cart? Typical IE users aren’t savvy enough to know the difference, and you can bet that Bing is caching that data if they’re extrapolating user click behavior already. Where is that form data going? Are they pushing that out to their other properties for marketing purposes?
3. Of course Cutts announced the Bing issue. Who did you expect to do it? Larry Page and Sergey Brin? Matt Cutts is the face of Google, spam-related or otherwise.
Danny Sullivan sums it up nicely: “I think Bing should develop its own search voice without using Google’s as a tuning fork.” A cheaper imitation will never be anything more than a cheaper imitation.
It took me a few reads to figure out whether you were agreeing or disagreeing with me, but I think it’s the latter.
First of all, I’m not accusing Google of “click fraud.” I was reporting what a Microsoft spokesperson said.
As to your argument, I don’t understand how you can defend the practice of monitoring many “social signals” and then condemn Microsoft for monitoring and taking into consideration searches that happen to have taken place on Google. (BTW, to be clear, I also have no problem with monitoring social signals to improve search results.) As I said above, MS’s only fault here was that they didn’t have good enough filters in place to recognize what might be an outlier result and confirm it through other signals before using it. I’m sure they’ve corrected that by now.
Raising unsecured data capturing is kind of off topic here, so I’m not going to get into that.
Finally, I certainly can’t prove that having Cutts make the announcement was suspicious, but it still smells to me. First, many may still think of Cutts as “the face of Google,” but Google has recently narrowed his role considerably, to spam prevention. Second, there still is that issue of timing: this announcement came right at the height of all the bad PR Google was getting about spam, was made by Google’s chief of spam, and was made by surprise at an MS-sponsored conference while sitting right next to an MS representative.
I want to be clear that I’m not all “Google was evil and MS pure as the driven snow.” My intention was just to point out that I think Google was trying to make a mountain out of a molehill. At the same time, it is undeniable that MS needed a towel to get some egg off its face.
http://aclweb.org/anthology/P/P10/P10-1028.pdf (page 268, paragraph 3)
“The clickthrough data of the second type consists of a set of query reformulation sessions extracted from 3 months of log files from a commercial Web browser.”
Parsing the logs of search results on misspellings, finding the destination URL, and then promoting that result artificially sure looks like they are replicating Google’s results. In Figure 1, they are specifically looking at the “&spell=1” parameter (which indicates a click on the “did you mean?” link on a misspelled result). If you know this, you can jump a step ahead of the user and serve the highest-CTR result on the misspelled SERP.
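To make that concrete, here is a rough sketch of how a "did you mean?" click could be pulled out of a browsing session. This is only an illustration, not the paper's method and not anything Bing is confirmed to run: the session format (an ordered list of visited URLs), the function name, and the example URLs are all made up for the example. The only detail taken from the discussion above is the "spell=1" parameter.

```python
from urllib.parse import urlparse, parse_qs


def extract_spelling_reformulations(session_urls):
    """Return (original_query, corrected_query) pairs from one session.

    session_urls is a hypothetical log format: the URLs a user visited,
    in order. A pair is recorded when a search URL carrying spell=1
    follows an earlier search URL with a different query.
    """
    pairs = []
    previous_query = None
    for url in session_urls:
        params = parse_qs(urlparse(url).query)
        query = params.get("q", [None])[0]
        if query is None:
            # Not a search URL (no q parameter), so skip it.
            continue
        is_spell_click = params.get("spell") == ["1"]
        if is_spell_click and previous_query and previous_query != query:
            pairs.append((previous_query, query))
        previous_query = query
    return pairs


# Made-up session for illustration only.
session = [
    "http://www.google.com/search?q=britny+spears",
    "http://www.google.com/search?q=britney+spears&spell=1",
    "http://www.example.com/some-result",
]
print(extract_spelling_reformulations(session))
```

Once you have those pairs, mapping a misspelling to the corrected query (and from there to whatever the user clicked next) is a small step, which is the point being made above.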
Cutts made a great point in his latest posts about this. If Bing supposedly has an excellent team of engineers, then they should prove that by doing away with this method of calculating results for misspellings and working on their own methods more intently.
As for social signals vs. this…anyone can mine social data (given the time/money/knowledge). If I wanted to build an app that tells me the most talked about movie, or the restaurant that has the most negative mentions, *I* can mine the social data to do that. Can I go to MS and get access to the data they’re collecting from IE? I’m going to guess that they aren’t keen to let anyone have access to that, especially Google.
There’s overwhelming proof that Bing knows how to do this, and it has been caught doing it. Issuing such a strong denial, as Mehdi did, reeks of guilt. I would’ve been less annoyed with “We’ll look into this” or “Yes, we do it because of $reason”.
I just think that using data from another search engine seems cheap.
I’ll go so far as to agree that Bing was very, very sloppy in this.
I won’t agree that “using data from another search engine seems cheap” (except insofar as it doesn’t cost anything!). I think that unless Google has found some way to either block the availability of the data or some way to legally protect it, it makes sense for Bing to sample that data if it’s able to. Again, Bing just has to be more careful about how it filters what it receives and how strongly it factors it into its own results.
Bing is simply awful. It can’t stand anywhere near Google.
Here is a great example…
I’m looking for a stapler called the “Rapid Classic 1 Stapler”.
I go to Bing and type in:
Rapic Classic 1 Stapler
It returns with “We didn’t find any results for rapic clasic 1 stapler. You might try one of these suggestions…”
And gives me the following…
5 Year Old In Tanning Bed
2012 Kentucky Derby Horses
Jr Seau Drives Off Cliff
Colombia Secret Service
Genevieve Cook Obama
Cameron Diaz Golf Like Crack
I put the same EXACT search in Google. It immediately and automatically corrected the spelling of stapler, then proceeded to give me 227,000 results on the stapler and where to purchase it.
Wow, and Bing wonders why they suck so bad.