Saturday, April 11, 2009
Blocking the *.amazonaws.com domain with ZB Block, and why.
This domain has been a continual source of content theft and hacking attempts.
Now first, I must admit that I have seen a couple of good services using a *.amazonaws.com domain name, but all of the domain names are cryptic, and you just can't be sure you aren't dealing with a spoofed user agent string. Now on to some finds!
Tynted
Host:
ec2-67-202-60-246.compute-1.amazonaws.com
User Agent: Java/1.6.0_02
Here's the most egregious of the lot, tynt.com. This site claims straight out that it's copying the content of your site. Who da #&*%! gave them that right, especially when I claim copyright? Also, they will cause duplicate content to appear on the web, and in the eyes of Google, this messes up your page rank, badly! But that's not the worst thing...
EVEN WORSE, tynt.com / tynted.net act as a no-registration-required proxy server! This allows previously blocked hackers to come right back in and start pushing, pulling, tweaking, and investigating your site. This bad behaviour was the genesis of my blocking them. This by itself is bad, but wait, there's MORE...
REDIFF
Host:
ec2-72-44-45-196.compute-1.amazonaws.com
User Agent: rdfbot/1.0
(Indian Language Web Search Engine; Rediff.com; rdfbotsupport AT
rediffmailpro DOT com)
No habla Hindi, señor! This is actually a content scraper, and their site seemed to be in English.
SimilarPages
Host:
ec2-174-129-187-47.compute-1.amazonaws.com
User Agent:
SimilarPages/Nutch-1.0-dev (SimilarPages Nutch Crawler;
http://www.similarpages.com; info@similarpages.com)
If this isn't saying "Hi, I'm an SEO scraper!" I don't know what it's saying. Buhbyenow. Usually Nutch is used by scrapers.
Conductor
Host:
ec2-72-44-52-94.compute-1.amazonaws.com
User Agent: Caliperbot/1.0
(+http://www.conductor.com/caliperbot)
They say (here): "Perfect ads are only possible when the publisher retains 100% editorial control over content and advertising. It's possible with Conductor. If interested, first review our publisher requirements and then submit your site for review."
I say: "I never submitted my site for review, so why are you here? I use, and am happy with, AdSense."
They say (here): "So if you can compete with those other articles, other competitors, those other affiliates and aggregators that are in front of you - you can discover millions of dollars of revenue every year - without even taking into consideration brand value or the synergy that results when you appear on the first page in both paid and natural search."
I say: "So you're really keyword-spamming SEO scum. Get lost. My site is highly ranked for content, not stolen words."
***
I am sure there will be more as time goes on. The next version of ZB Block's signatures should have bypasses for the valid bots (currently under test), but for now, the AmazonAWS cloud is banned.
Zap.
UPDATE: The bypasses are in. Amazon AWS can be blocked from your site with impunity, without harming any valid search engines.
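For readers curious what a domain block with bypasses might look like in principle, here is a rough Python sketch. It is not ZB Block's actual code or signature format. The names `BLOCKED_HOST_SUFFIXES`, `BYPASS_USER_AGENT_TOKENS`, `host_is_blocked`, and `should_block` are made up for illustration, and the bypass token is a placeholder, since this post does not say which bots get a pass.

```python
# Hypothetical sketch only: this is NOT how ZB Block stores or applies signatures.

# Reverse-DNS suffixes to refuse (the domain discussed in this post).
BLOCKED_HOST_SUFFIXES = (".amazonaws.com",)

# Placeholder bypass token; the real list of valid bots is not given in the post.
BYPASS_USER_AGENT_TOKENS = ("examplegoodbot",)


def host_is_blocked(hostname: str) -> bool:
    """Return True if the hostname is a blocked domain or a subdomain of one."""
    host = hostname.strip().rstrip(".").lower()
    for suffix in BLOCKED_HOST_SUFFIXES:
        # Match the bare domain itself, or anything ending in ".domain".
        if host == suffix.lstrip(".") or host.endswith(suffix):
            return True
    return False


def should_block(hostname: str, user_agent: str) -> bool:
    """Block visitors from a blocked host unless a bypass token matches."""
    if not host_is_blocked(hostname):
        return False
    ua = user_agent.lower()
    return not any(token in ua for token in BYPASS_USER_AGENT_TOKENS)


if __name__ == "__main__":
    print(should_block("ec2-67-202-60-246.compute-1.amazonaws.com", "Java/1.6.0_02"))
```

Keep in mind that a user agent can be spoofed, so a bypass keyed on the user agent alone is only as trustworthy as the string the visitor sends. Matching on the leading dot keeps look-alike names such as notamazonaws.com from being caught by accident.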
Edited on: Tuesday, June 02, 2009 3:07 PM Mountain Daylight Time
Categories: Content Thieves, Odd Bot, Scrape Bot
Wednesday, April 08, 2009
Stop Keyword Poaching - It's mutiny on your bounty!
You may notice that ZB Block is now blocking SEO keyword scrapers. You may ask just what they are, and why I am directing your site to block them. Well, I will do my best to fill you in on the scoop.
First off, no keyword scraping SEO robot ever drove traffic to YOUR site. Quite the opposite, they attempt to tear traffic away from your site. Worse, they try to do this by fooling the legitimate search engines, and they make money in the process. Even beyond this, some of these are known to feed the Russian Business Network (a giant cybercrime conglomerate). The RBN is interested in this so they can make bogus pages (especially security related) that have high page ranks, to lure those with legitimate interest away to pages with bogus scam software (like the very evil AntiVirusPro XP 2010, otherwise known as Troj/FakeXPA).
Let's use a probable hypothetical example, one that happens far too often, to describe this:
*John, an expert in the field of wonder widgets, decides to share his knowledge with the world on the best way to care for and maintain wonder widgets. He works long and hard on a site describing how to do this, and even how you can make your own wonder widget if you can't afford to buy one. His site is very informative and well written, and the great Google gods decide to give him a good page rank as a reward for his hard labor.
The SEO botmasters notice his up and coming star, and decide to scrape his site for keyword content, and build a profile of his site.
Then, Gidget's Gadgets notices that their business is failing a little, and hires an SEO firm to find out why. The SEO firm compares keywords in her site to known profiles of other sites, and finds that John's site, and wonder widgets, have a lot in common with the gadgets that Gidget sells. Not caring that they aren't the same product, and that each one fills a different, but related, niche, they then sell the keywords that John has to Gidget. Gidget adds these keywords into her site, and her page rank goes up a bit on these words, and John's page rank gets diluted.
Now John's visits drop, and people are no longer getting helped. Gidget's site gets much more traffic, but she isn't making sales, because people really want wonder widgets, and her drop in sales was due to market saturation of gadgets, not a competing site. Now no one is happy... except the SEO company that has Gidget's money.*
This sort of behavior is in the realm of keyword spamming, and it helps no one. Keyword spam turns the internet into a sargassosistic morass of false leads generated by tricked search engines, which just causes more traffic overload and more confused, frustrated innocent victims.
Someday, search engines may find a way to stop this, but for now, until the expiration of P.T. Barnum's maxim ("You can fool all of the people some of the time, some of the people all of the time, but not all of the people all of the time.") and until the invention of decent AI, keyword spam will be a threat. Your best defense is to send the SEO bots packing with something like ZB Block, while welcoming legitimate search bots with open arms.
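To give a feel for the idea of sending SEO bots packing while welcoming search bots, here is a small Python sketch of user agent signature matching. This is not ZB Block's actual code. The scraper signatures are taken from the user agents quoted in the amazonaws.com post on this page, the "googlebot" token is an assumed example of a welcome search bot, and the names `SEO_BOT_SIGNATURES`, `SEARCH_BOT_TOKENS`, and `classify_user_agent` are hypothetical.

```python
# Hypothetical sketch only: NOT ZB Block's real signature list or logic.

# Lowercased fragments of the scraper user agents quoted elsewhere on this page.
SEO_BOT_SIGNATURES = ("rdfbot", "similarpages", "nutch", "caliperbot")

# Assumed example of a search bot to welcome; the post does not name any.
SEARCH_BOT_TOKENS = ("googlebot",)


def classify_user_agent(user_agent: str) -> str:
    """Return "block", "welcome", or "unknown" for a user agent string."""
    ua = user_agent.lower()
    # Check scraper signatures first so a scraper cannot ride in on a welcome token.
    if any(signature in ua for signature in SEO_BOT_SIGNATURES):
        return "block"
    if any(token in ua for token in SEARCH_BOT_TOKENS):
        return "welcome"
    return "unknown"


if __name__ == "__main__":
    print(classify_user_agent("Caliperbot/1.0 (+http://www.conductor.com/caliperbot)"))
```

Substring matching like this is crude, and it trusts whatever string the visitor sends, so treat it as an illustration of the principle rather than a complete defense.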
~Zaphod
P.S. Thanks WY G&F for a title idea. To be honest, it fits!
Edited on: Friday, May 22, 2009 12:27 PM Mountain Daylight Time
Categories: Content Thieves, Scrape Bot, Spam Bot
