80legs.com provides a service for web crawling.
In their own words: "We put over 50,000 computers to work for you to deliver exceptional crawling performance at incredibly low costs. Our service is easy to use and completely customizable, so you can crawl and process web content however you want, whenever you want."
They advertise the ability to process "up to 2 billion web pages per day". That is pretty cool if you want to fetch and process a large amount of data from dozens of web sites. But it will not work well when you need to scrape data from just one site. Worse, in that case you could end up launching a real DDoS attack against that lonely web site. Not very good, is it?
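To put those numbers in perspective, here is a rough Python sketch. It only uses the two advertised figures (2 billion pages per day, 50,000 computers) and shows what they mean per second and per machine. The `HostThrottle` class is my own hypothetical illustration of a per-site politeness delay, with an assumed minimum of one second between requests to the same host. It is not how 80legs actually schedules its crawlers, which I do not know.

```python
import time
from urllib.parse import urlparse

SECONDS_PER_DAY = 24 * 60 * 60


def aggregate_rate(pages_per_day, machines):
    """Return (pages per second overall, pages per day per machine)."""
    return pages_per_day / SECONDS_PER_DAY, pages_per_day / machines


class HostThrottle:
    """Hypothetical per-host limiter: spaces out requests to one site.

    min_delay is an assumed value, not something 80legs documents.
    """

    def __init__(self, min_delay=1.0, clock=time.monotonic):
        self.min_delay = min_delay
        self.clock = clock
        self.next_allowed = {}  # host -> earliest time of the next request

    def wait_time(self, url):
        """Reserve the next slot for this URL's host and return how many
        seconds the caller should sleep before sending the request."""
        host = urlparse(url).netloc
        now = self.clock()
        start = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = start + self.min_delay
        return start - now


# Advertised figures from the post: 2 billion pages/day, 50,000 computers.
per_second, per_machine = aggregate_rate(2_000_000_000, 50_000)
```

Each machine on its own is quite modest, but the whole network together fetches tens of thousands of pages every second. Pointed at a single site without something like the per-host delay above, that is exactly the DDoS scenario.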
Another question is: what is the nature of their machine network? It may not be easy to set up such a big network all at once. Even if they could, the next problem is to spread that network geographically, because otherwise they would need really high-bandwidth channels. Maybe they use a network of zombie machines? Anyway, that is just my guess and nothing more.
One more attractive thing they can offer is anonymity for your crawlers. If some web site tries to forbid scraping of its pages, the 80legs network could get around that at once, because if each machine in their network has its own IP address, the traffic would look like just dozens of ordinary human users! So if you have some programming skills and some money, you can set up highly secure and anonymous access to the WWW.
Posted by olexii 