Comment Google built a multi-billion-dollar advertising empire atop a service that does little more than copy information from other sources. And yet it chastises others when they do the copying.
It’s an irony that could land the company in some very hot water.
Google made (countless) headlines last week when, after an intricate “sting operation”, it accused Microsoft of “copying” its search results. Many missed the irony of “Copygate”, but others quickly picked up on what can only be described as painfully obvious. It’s not just that Google has made multi-billions selling ads alongside content copied from across the web. The company has also been known to “copy” Bing’s iconic background images – if only briefly. And we would argue it quite blatantly copied the iPhone in building Android.
No doubt, Google would deny such things. And even if it didn’t, it would argue that Bing’s copying is different. According to Google’s sting, when netizens search Google on certain Internet Explorer browsers, there are cases where Microsoft automatically lifts the results and plugs them straight into its own search engine. Granted, this isn’t the wisest move. At the very least, it gives miscreants an easy way of gaming Microsoft’s search engine.
But the irony is still there. Google is a company built on all sorts of copying. With its Google Books project, Mountain View copied millions of library books without asking permission from the authors and publishers.
On the surface, this seems little more than a source of amusement. But we’ve seen this irony before. The same Google attitude plays a significant role in the European Union’s ongoing antitrust investigation into the company’s search and ad practices.
The European Commission is probing Google after receiving complaints from a trio of companies that includes Foundem, a UK-based vertical search engine that focuses on comparison shopping. Foundem accuses Google of “exploiting its dominance of search in ways that stifle innovation, suppress competition, and erode consumer choice”.
The company’s complaint makes two overarching claims. It says that in some cases, Google uses “discriminatory penalties” to remove sites from its search-results engine regardless of how relevant they are to a user’s query, and it says that Google’s Universal Search setup is unfairly promoting the company’s own services – including Google Maps, YouTube, and Google Product Search – over those of its competitors.
In 2006, Google effectively removed Foundem from its “organic” search results, and all but barred the company from purchasing search ads on Google AdWords. For more than three years, Foundem fought for a return to Google’s search engine, and Google obliged only after Foundem took its story public in late 2009. Despite being reinstated, Foundem went ahead with its EU complaint.
Before the EU formally announced its investigation in November, Google made light of Foundem’s complaint without actually addressing the issues at hand, subtly criticizing the makeup of the company’s site. But then, the day the probe was announced, the company rolled out a new tactic. A Google spokesman told us that Foundem’s site was a problem because 79 per cent of its content is “duplicated” from other sites. And the company told The Guardian something similar, saying the site was de-indexed because about 87 per cent was “copied” from elsewhere.
There’s that word again.
According to The Guardian, Google explained that a high level of copying “leads to automatic downgrading in its search results”. This seems rather odd, however, when you consider that Google copies its content from elsewhere. Part of Google’s defense of Universal Search is that it’s not showing its own content, only the content of others. (This isn’t true, but it’s the company’s defense nonetheless.)
When we pointed out that he was criticizing Foundem for “un-original” content while arguing that Google was immune to criticism because of un-original content, the company spokesman told us that Google’s situation was different. But he didn’t exactly say how it was different.
Certainly, there are some types of copied content you don’t want on a search results page. Just before accusing Microsoft of copying its search engine, Google rolled out a new algorithm designed to reduce “webspam”. Google search guru Matt Cutts pointed to a pair of programming-centric queries where the change had an effect. Originally, both were giving preference to a site called efreedom that had copied content from stackoverflow.com. But after the change, the original stackoverflow links rose to the top.
This is only reasonable. It’s welcome, in fact. A search engine, by design, should limit the sort of shamelessly pilfered content efreedom is throwing at people.
But in describing the webspam he was going after, Cutts used much of the same language Google has used to describe Foundem. “The algorithm change,” Cutts said on his personal blog, “primarily affects sites that copy others’ content and sites with low levels of original content.”
This wasn’t lost on Foundem, which – in a blog post of its own – was quick to point out that while some copied content is unwanted, other copied content can be very useful indeed. Foundem does copy a majority of its content, but it’s a search engine. “Copying, organising, and presenting the content of others is a defining characteristic of any search service,” Foundem said, “including Google’s own.”
Google is well aware of the distinction between webspam and a vertical search engine. After all, the company offers its own price-comparison engine, Google Product Search, and it received prominent placement on the company’s primary search engine thanks to Universal Search. And Google indexes various other vertical search engines, including – as of the end of 2009 – Foundem. This despite its somewhere between 74 and 87 per cent copied content, or whatever it is.
“The difference here is between service and content. Clearly, there are all kinds of services that aren’t required to author content,” Foundem cofounder and CTO Adam Raff tells The Reg. “Google used to talk about a ‘lack of original content’, but lately seems to have made a strategic shift to calling it ‘copying content’,” he said.
“‘87 per cent of their content is copied from other sources’, [Google will say of Foundem],” Raff told us. “They make it sound like a cheap form of spam where a site simply copies somebody else’s content wholesale and runs Google ads on it to monetize it. But of course, for any legitimate search service, the vast majority of its content will have been copied from others.”
Raff actually defends Google’s claims that Microsoft is copying its search results, denying there’s any hypocrisy at play. “The word ‘copy’ has a lot of different meanings,” he says. “Using clickstream data to effectively copy a result from Google to Bing … is, at the very least, a mistake. Apart from anything else, it’s a way to game Bing’s results.”
The real hypocrisy, he says, lies elsewhere. “Another kind of copying altogether is the copying that any search engine does,” he explains. “In the context of a search engine, this kind of copying is not only legitimate, it is essential. The real hypocrisy here is that Google has started attacking vertical search services by suggesting that this perfectly legitimate form of copying is somehow illegitimate for all vertical search services other than its own.”
It’s a hypocrisy that may not sit well with the European Commission. Google is keen to end the Commission’s investigation, but it continues apace. ®