Tuesday, 11 March 2008

Splogging

Sploggers are one of the most common sources of plagiarism on the Internet. A small number of determined and capable Sploggers can steal content from thousands of different sites by scraping their RSS feeds. What has changed is that many “black hats” have taken up the art. The profit motive of Sploggers is obvious; how they actually make a profit is less so.
Splogs were certainly not intended for humans to read. Human-visited Splogs are high risk with little prospective gain. Rather, Splogs consist of links to other sites, which more often than not are long junk domains burdened with keywords and metatags. The idea is to have search engines pick up those sites. A Splogger’s site will typically consist of nothing but keywords and metatags loaded into the HTML head, a small amount of random text (usually copied from another site) and numerous groups of text ads arranged to look alternately like search results or regular links. By the time the site is ready to be used, over 90% of it consists of ads from Adsense or a comparable service.

With enough spam links pointing to the site, the Splogger expects it to rank highly in search results and to be flooded with visitors (Note: According to most SEO experts and my own research, this does NOT work. You can only expedite getting listed, not drastically improve your ranking, so hundreds of junk posts are a waste). The hope is that these visitors will then click on the ads, either out of curiosity or in the mistaken belief that they are regular links. Splogging is a classic example of black hat search engine optimisation (SEO), one that simply relies on extensive plagiarism to make it work.

The expression “splog” was popularized in August 2005 when it was used publicly by Mark Cuban. The name had been used sporadically before then to describe spam blogs, going back to at least 2003. The “art” developed from the many linkblogs that were attempting to manipulate search indexes and from others attempting to Google-bomb every word in the dictionary.
It has been estimated that about one in five blogs is a spam blog. These fake blogs waste disk space and bandwidth, pollute search engine results, ruin blog search engines and are detrimental to a blogger’s community networking. Google's search engine uses PageRank, which is susceptible to link flooding, especially from highly weighted bloggers.
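
To make the link-flooding point concrete, here is a minimal sketch of the textbook PageRank iteration run on a tiny made-up link graph. It is not Google's production ranking, and the page names (A, B, C, T, S0 to S19), the damping factor of 0.85 and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Pages without outgoing links share their rank evenly with everyone.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: A, B and C are genuine blogs, T is the Splogger's target.
honest = {"A": ["B", "T"], "B": ["C"], "C": ["A"], "T": ["A"]}

# The same graph after 20 hypothetical spam blogs each add a link to T.
flooded = dict(honest)
for i in range(20):
    flooded[f"S{i}"] = ["T"]

before = pagerank(honest)
after = pagerank(flooded)
print(f"T before flooding: {before['T']:.3f}, B: {before['B']:.3f}")
print(f"T after flooding:  {after['T']:.3f}, B: {after['B']:.3f}")
```

In this toy graph T and B start out with the same score, and the spam links are enough to push T ahead of B. Real search engines apply many more signals, which is consistent with the observation above that junk links rarely deliver the ranking a Splogger hopes for.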

RSS abuse
Full-content RSS feeds make the Splog problem worse, as an RSS feed simplifies the copying of content from genuine blogs. Splog RSS feeds pollute RSS search engines, and are reproduced and propagated throughout the Internet.

Defences
A number of Splog reporting services have arisen that allow Internet users to report Splogs, with plans to offer the reported URLs to search engines so that they can be excluded from search results. These services started with Splog Reporter. Some of the main services and tools include:

  • SplogSpot, which maintains a large database of Splogs and makes it available to the public via APIs.
  • A2B, which blocks the web server IP addresses that Splog URLs resolve to.
  • A Feed Copyrighter plugin (for WordPress), which automatically adds copyright messages to feeds so that Splogs can be easily spotted and reported by visitors or through Google search.
  • TrustRank, which attempts to find Splogs automatically.
  • Blogger, which has implemented a system that detects Splogs and then forces them to take a Captcha 'spell this word' test.
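
The feed copyright idea from the list above can be sketched in a few lines of Python. This is not the code of the WordPress plugin; it is a minimal illustration that assumes a plain RSS 2.0 feed, and the function names, the sample feed, the notice text and the example.com domain are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def add_copyright_notice(rss_xml: str, notice: str) -> str:
    """Append `notice` to the <description> of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        description = item.find("description")
        if description is None:
            description = ET.SubElement(item, "description")
        text = description.text or ""
        if notice not in text:  # do not stamp the same item twice
            description.text = f"{text} {notice}".strip()
    return ET.tostring(root, encoding="unicode")


def found_on_other_site(page_url: str, page_text: str, notice: str, own_host: str) -> bool:
    """True if `notice` shows up on a page that is not hosted on `own_host`."""
    host = urlparse(page_url).hostname or ""
    is_own = host == own_host or host.endswith("." + own_host)
    return notice in page_text and not is_own


# Hypothetical feed and notice, for illustration only.
NOTICE = "Originally published at example.com. Copying without permission is not allowed."
FEED = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>Post one</title><description>Full text of post one.</description></item>
<item><title>Post two</title></item>
</channel></rss>"""

stamped = add_copyright_notice(FEED, NOTICE)
print(stamped)
```

Because the notice travels with every item, a scraped copy carries the original site's name with it. Searching for the exact notice text, or running `found_on_other_site` over pages you come across, then gives a quick way to spot copies worth reporting.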
