Monday, March 8, 2010

Google PubSubHubbub (PuSH)

Google is developing a system that will enable web publishers of any size to automatically submit new content to Google for indexing within seconds of that content being published. Search industry analyst Danny Sullivan told us today that this could be "the next chapter" for Google.

Google could some day use PuSH to index the web instead of crawling links, which is how search engines have indexed the web for years. Google senior product manager Dylan Casey said yesterday at Sullivan's Search Marketing Expo in Santa Clara, California, that the company plans to soon publish a standard way for site owners to participate in a program much like that.


PuSH is a syndication system based on the Atom format in which a Publisher tells the world about a Hub that it will notify every time new content is published. Subscribers then tell the Hub, "When this Publisher posts new content, please deliver it to me right away." So instead of checking back with the Publisher all the time to see if there's new content, Subscribers just sit and wait for the Hub to tell them. The Publisher publishes something and tells the Hub that it's available, and the Hub then delivers it to all the Subscribers. This can take as little as a few seconds.
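To make the flow above concrete, here is a minimal in-memory sketch of the Publisher, Hub, and Subscriber roles. It is a toy model, not the wire protocol: real PuSH parties talk to each other over HTTP, and the `Hub` class, its method names, and the example URL are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Toy in-memory model of the Publisher -> Hub -> Subscriber flow.
# Real PuSH uses HTTP requests between the parties; every name here
# is hypothetical and exists only to illustrate the idea.

Callback = Callable[[str, str], None]


class Hub:
    def __init__(self) -> None:
        # Maps a topic (feed URL) to the callbacks of its subscribers.
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """A Subscriber asks to be told about new content on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, content: str) -> int:
        """A Publisher announces new content; the Hub fans it out.

        Returns the number of subscribers that were notified.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, content)
        return len(callbacks)


def demo() -> List[str]:
    hub = Hub()
    inbox: List[str] = []
    topic = "http://publisher.example/feed.atom"  # assumed example URL

    # The Subscriber registers once, then simply waits.
    hub.subscribe(topic, lambda t, c: inbox.append(f"{t}: {c}"))

    # The Publisher posts something and tells the Hub about it.
    hub.publish(topic, "New post: Hello world")
    return inbox
```

Calling `demo()` returns the Subscriber's inbox with the one delivered item. The point of the design is that the Subscriber never polls: it does nothing between `subscribe` and the moment the Hub calls it back.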

If Google implemented an indexing-by-PuSH program, it would ask every website to adopt the technology and declare which Hub it pushes to at the top of each document, just as sites declare where the RSS feeds they publish can be found. Google would then subscribe to those PuSH feeds to discover new content when it's published.
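As a rough sketch of what that discovery and subscription step could look like, the snippet below reads an Atom feed, collects any `<link rel="hub">` elements, and builds the form body a subscriber might send to the Hub. The feed contents, URLs, and function names are assumptions made up for this example, and the `hub.*` parameter names follow the PubSubHubbub draft spec as I understand it, so check them against the spec before relying on them. Nothing here reflects how Google would actually implement it.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed; the URLs are placeholders, not real services.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <link rel="self" href="http://publisher.example/feed.atom"/>
  <link rel="hub" href="http://hub.example/"/>
  <entry><title>Hello world</title></entry>
</feed>"""


def find_hubs(feed_xml: str) -> List[str]:
    """Return the href of every feed-level <link rel="hub"> element."""
    root = ET.fromstring(feed_xml)
    return [
        link.get("href")
        for link in root.findall(ATOM_NS + "link")
        if link.get("rel") == "hub" and link.get("href")
    ]


def subscription_body(topic: str, callback: str) -> str:
    """Build a form-encoded subscription request body.

    Parameter names are taken from the PubSubHubbub draft spec
    (assumption: verify against the version you target).
    """
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic,        # the feed being subscribed to
        "hub.callback": callback,  # where the Hub should deliver updates
        "hub.verify": "sync",
    })
```

For the sample feed, `find_hubs(SAMPLE_FEED)` yields the single Hub URL declared in the feed, and `subscription_body(...)` produces the string a subscriber would POST to that Hub. No network request is made in this sketch.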

PuSH likely wouldn't replace crawling entirely. In fact, a crawl would still be needed to discover PuSH feeds to subscribe to, and the real-time format would be used to augment Google's existing index.


PuSH is much more computationally efficient for Google, but Brett Slatkin, a Google engineer and co-author of the PuSH specification, says the impact of such a move on small publishers is even more important. Right now many small sites get visited by Google maybe once a week. With a PuSH system in place, they would be able to get their content to Google automatically and right away.

A richer, faster, more efficient internet would be good for everyone, and the benefits in search wouldn't be limited to Google. PubSubHubbub is an open protocol, and the feeds would be as visible to Yahoo and Bing as they would be to Google.
