Mind what you say in Facebook comments: Google will soon be indexing them and serving them up as part of the company’s standard search results. Google’s all-seeing search robots still can’t find comments on private pages within Facebook, but now any time you use a Facebook comment form on another site, or on a public page within Facebook, those comments will be indexed by Google.
The new indexing plan isn’t just about Facebook comments; it applies to nearly any content that was previously accessible only through an HTTP POST request. Google’s goal is to include anything “hiding” behind a form: comment systems like Disqus or Facebook, as well as other JavaScript-based sites and forms.
Typically when Google announces it’s going to expand its search index in some way everyone is happy — sites get more searchable content into Google and users can find more of what they’re looking for — but that’s not the case with the latest changes to Google’s indexing policy.
Developers are upset because Google is no longer the passive crawler it once was, and users will likely become upset once they realize that comments about drunken parties, embarrassing moments or what they thought were private details are going to start showing up next to their names in Google’s search results.
For now, most of the ire seems limited to web developers worried that Google’s new indexing plan ignores the HTML specification and breaks the web’s underlying architecture. To understand what Google is planning to do and why it breaks one of the fundamental gentlemen’s agreements of the web, you first have to understand how various web requests work.
There are two primary request methods you can initiate on the web: GET and POST. In a nutshell, GET requests are intended for reading data, POST requests for changing or adding data. That’s why search engine robots like Google’s have always stuck to GET crawling. As long as a site follows that convention, there’s no danger of the Googlebot altering its data with GET; the robot just reads the page without ever touching the underlying data. Now that Google is crawling POST pages, the Googlebot is no longer a passive observer. It’s actually interacting with, and potentially altering, the websites it crawls.
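To make the distinction concrete, here is a toy Python model of a comment endpoint. It is only a sketch under stated assumptions: the `handle_request` function and the in-memory `comments` list are hypothetical stand-ins for a real web server and its database, and nothing here reflects how Google or Facebook actually implement anything.

```python
# Toy model of a comment endpoint (hypothetical; no real HTTP involved).
comments = ["Nice article!"]  # stands in for the site's stored data


def handle_request(method, body=None):
    """Return the visible comments, changing stored data only on POST."""
    if method == "GET":
        # Safe by convention: reading never changes what is stored.
        return list(comments)
    if method == "POST":
        # Unsafe by convention: the request body is written to storage.
        comments.append(body)
        return list(comments)
    raise ValueError(f"unsupported method: {method}")


# A GET-only crawler can visit as often as it likes without side effects.
handle_request("GET")
handle_request("GET")
stored_after_gets = len(comments)

# A crawler that submits the form leaves a trace in the site's data.
handle_request("POST", "submitted by a crawler")
stored_after_post = len(comments)
```

After the two GET calls the list still holds a single comment; after the one POST it holds two. That difference is what developers mean when they say a POST-capable crawler is no longer a passive reader.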
While it’s unlikely that the new Googlebot will alter a site’s data — as the Google Webmaster Blog writes, “Googlebot may now perform POST requests when we believe it’s safe and appropriate” — it’s certainly possible now and that’s what worries some developers. As any webmaster knows, mistakes happen, especially when robots are involved, and no one wants to wake up one day to discover that the Googlebot has wreaked havoc across their site.
If you’d like to stop the Googlebot from crawling your site’s forms, Google suggests using the robots.txt file to disallow the Googlebot on any POST URLs your site might have. So long as you’re surfacing your content in other ways — and you should be, provided you want it indexed — there shouldn’t be any harm in blocking the Googlebot from POST requests.
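As a rough illustration of that advice, the sketch below feeds a hypothetical robots.txt to Python's standard `urllib.robotparser` module and checks which URLs the Googlebot would be permitted to request. The paths `/comments/post` and `/search/submit` and the `example.com` host are placeholders invented for this example; substitute whatever URLs your own forms submit to. Note that robots.txt rules match URLs rather than HTTP methods, so a disallowed path is off limits for GET and POST alike.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the two paths are placeholders for the
# URLs that a site's forms submit to.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /comments/post
Disallow: /search/submit
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def googlebot_allowed(path):
    """Report whether the rules above let Googlebot request this path."""
    return parser.can_fetch("Googlebot", "https://example.com" + path)


form_blocked = not googlebot_allowed("/comments/post")
article_allowed = googlebot_allowed("/articles/google-indexing")
```

With these rules the form endpoint is blocked while ordinary article pages stay crawlable, which matches the goal of keeping content indexed through its regular URLs.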
If, on the other hand, you’d like to stop the Googlebot from indexing any embarrassing comments you may have left on the web, well, you’re out of luck.
[Photo by Glen Scott/Flickr/CC]