<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Olexii&#039;s Blog</title>
	<atom:link href="http://olexii.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://olexii.wordpress.com</link>
	<description>Intelligent monsoon</description>
	<lastBuildDate>Mon, 09 Nov 2009 14:16:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='olexii.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Olexii&#039;s Blog</title>
		<link>http://olexii.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://olexii.wordpress.com/osd.xml" title="Olexii&#039;s Blog" />
	<atom:link rel='hub' href='http://olexii.wordpress.com/?pushpress=hub'/>
		<item>
		<title>80legs</title>
		<link>http://olexii.wordpress.com/2009/11/09/80legs/</link>
		<comments>http://olexii.wordpress.com/2009/11/09/80legs/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 14:11:12 +0000</pubDate>
		<dc:creator>olexii</dc:creator>
				<category><![CDATA[anonymous internet]]></category>
		<category><![CDATA[crawlers]]></category>
		<category><![CDATA[crawler]]></category>

		<guid isPermaLink="false">http://olexii.wordpress.com/?p=16</guid>
		<description><![CDATA[80legs.com provides a service for web crawling. We put over 50,000 computers to work for you to deliver exceptional crawling performance at incredibly low costs. Our service is easy to use and completely customizable, so you can crawl and process web content however you want, whenever you want. They claim to be able to process &#8220;up [...]]]></description>
			<content:encoded><![CDATA[<p><a title="80legs" href="http://www.80legs.com/" target="_self">80legs.com</a> provides a service for web crawling.</p>
<blockquote><p>We put over 50,000 computers to work for you to deliver exceptional crawling performance at incredibly low costs. Our service is easy to use and completely customizable, so you can crawl and process web content however you want, whenever you want.</p></blockquote>
<p>They claim the service can process &#8220;<em>up to 2 billion web pages per day</em>&#8221;. That is pretty cool if you want to fetch and process a large amount of data from dozens of web sites. It will not work well, however, when you need to scrape data from just one site: pointing that many machines at a single site could amount to a real DDoS attack on it. Not very good, is it?</p>
<p>Another question is: what is the nature of their machine network? It is probably not easy to build such a big network all at once. Even if they could, the next problem is spreading that network geographically, because otherwise they would need very high-bandwidth channels. Maybe they use a network of zombie machines? Anyway, this is just my speculation and nothing more.</p>
<p>One more benefit they can offer is anonymity for your crawlers. If a web site tries to block scraping of its pages, the 80legs network could get around that at once: if each machine in their network has its own IP address, the traffic would look like dozens of ordinary human users. So if you have some programming skills and some money, you can get highly secure and anonymous access to the WWW.</p>
]]></content:encoded>
			<wfw:commentRss>http://olexii.wordpress.com/2009/11/09/80legs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/a3c629d24d6bcf57065c7dc9e89a8f58?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">olexii</media:title>
		</media:content>
	</item>
		<item>
		<title>Tor project</title>
		<link>http://olexii.wordpress.com/2009/11/09/tor-project/</link>
		<comments>http://olexii.wordpress.com/2009/11/09/tor-project/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 13:26:20 +0000</pubDate>
		<dc:creator>olexii</dc:creator>
				<category><![CDATA[anonymous internet]]></category>
		<category><![CDATA[tor]]></category>

		<guid isPermaLink="false">http://olexii.wordpress.com/?p=9</guid>
		<description><![CDATA[The Tor project is yet another way to get anonymity on the WWW. Here is what they say: Tor is free software and an open network that helps you defend against a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships. Tor protects you by bouncing your communications around a distributed [...]]]></description>
			<content:encoded><![CDATA[<p>The <a title="tor project" href="http://www.torproject.org/" target="_self">Tor project</a> is yet another way to get anonymity on the WWW. Here is what they say:</p>
<blockquote><p>Tor is free software and an open network that helps you defend against a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships.</p>
<p>Tor protects you by bouncing your communications around a distributed network of relays run by volunteers all around the world: it prevents somebody watching your Internet connection from learning what sites you visit, and it prevents the sites you visit from learning your physical location. Tor works with many of your existing applications, including web browsers, instant messaging clients, remote login, and other applications based on the TCP protocol.</p></blockquote>
<p>I have never used Tor myself, but the idea as I understand it is very good, and the way it is implemented is interesting. You can read more about it on <a title="Tor overview" href="http://www.torproject.org/overview.html.en" target="_self">Tor&#8217;s overview page</a>. In a few words, it works like this. There is a set of computers that have Tor installed, and all of them are joined into one network. Instead of establishing a direct connection to a web site, you connect through that network, so your request and the site&#8217;s response pass through several of its computers. The path is not fixed either: Tor switches to a new one regularly (according to the overview page, a path is reused only for connections made within about ten minutes). So it looks like a fluid proxy network.</p>
<div id="attachment_11" class="wp-caption aligncenter" style="width: 428px"><img class="size-full wp-image-11" title="Tor work process" src="http://olexii.files.wordpress.com/2009/11/htw2.png?w=418&#038;h=267" alt="Tor work process" width="418" height="267" /><p class="wp-caption-text">Tor work process</p></div>
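<p>The idea can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the relay names are made up, and choosing relays uniformly at random is a simplification, since the real Tor client learns relays from its directory and picks them by its own rules. It shows one point only: each newly built path is drawn independently, so consecutive paths usually go through different machines.</p>
<pre>
```python
import random

# Hypothetical relay names, for illustration only.
RELAYS = ["relay-a", "relay-b", "relay-c", "relay-d", "relay-e", "relay-f"]


def build_path(relays, hops=3, rng=random):
    """Pick `hops` distinct relays, in order, to form one path."""
    return rng.sample(relays, hops)


def route(path, destination):
    """Return every stop a request visits, from the client to the site."""
    return ["client"] + list(path) + [destination]


if __name__ == "__main__":
    # Build a few paths in a row; each one is an independent random draw.
    for _ in range(3):
        path = build_path(RELAYS)
        print(" -> ".join(route(path, "example.com")))
```
</pre>
<p>The web site only ever sees the last relay in the list, never the client, which is the property the quote above describes.</p>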
]]></content:encoded>
			<wfw:commentRss>http://olexii.wordpress.com/2009/11/09/tor-project/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/a3c629d24d6bcf57065c7dc9e89a8f58?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">olexii</media:title>
		</media:content>

		<media:content url="http://olexii.files.wordpress.com/2009/11/htw2.png" medium="image">
			<media:title type="html">Tor work process</media:title>
		</media:content>
	</item>
		<item>
		<title>SwissVPN</title>
		<link>http://olexii.wordpress.com/2009/11/09/swissvpn/</link>
		<comments>http://olexii.wordpress.com/2009/11/09/swissvpn/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 13:11:05 +0000</pubDate>
		<dc:creator>olexii</dc:creator>
				<category><![CDATA[anonymous internet]]></category>
		<category><![CDATA[vpn]]></category>

		<guid isPermaLink="false">http://olexii.wordpress.com/?p=4</guid>
		<description><![CDATA[SwissVPN is a service that provides VPN access to the World Wide Web. It is a fairly easy and convenient way to get on the Internet anonymously, and it is also quite cheap (just $5 per month). Every time you establish a connection to the service, it gives you a different IP address. I used it [...]]]></description>
			<content:encoded><![CDATA[<p><a title="SwissVPN" href="http://www.swissvpn.net/" target="_blank">SwissVPN</a> is a service that provides VPN access to the World Wide Web. It is a fairly easy and convenient way to get on the Internet anonymously, and it is also quite cheap (just $5 per month). Every time you establish a connection to the service, it gives you a different IP address.</p>
<p>I have used it for several crawler projects. If, for instance, a web site shows a CAPTCHA after intensive usage to make sure that the user is human, you can have your robot reconnect to that site through a new VPN connection and get another IP address.</p>
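<p>The reconnect logic can be sketched roughly like this. It is not the code from my projects, just a minimal outline: the three callables (<code>fetch</code>, <code>looks_like_captcha</code>, <code>reconnect</code>) are hypothetical placeholders that you would have to implement for your own HTTP client and VPN setup.</p>
<pre>
```python
def fetch_with_reconnect(url, fetch, looks_like_captcha, reconnect, max_reconnects=3):
    """Fetch `url`; if a CAPTCHA page comes back, reconnect and try again.

    All three callables are hypothetical and supplied by the caller:
      fetch(url) returns the page body as a string,
      looks_like_captcha(body) returns True for a CAPTCHA page,
      reconnect() re-establishes the VPN connection (and so changes the IP).
    """
    body = fetch(url)
    for _ in range(max_reconnects):
        if not looks_like_captcha(body):
            return body
        reconnect()
        body = fetch(url)
    if looks_like_captcha(body):
        raise RuntimeError("still blocked after %d reconnects" % max_reconnects)
    return body


if __name__ == "__main__":
    # Fake site: two CAPTCHA pages, then the real content.
    pages = iter(["captcha", "captcha", "real content"])
    reconnects = []
    body = fetch_with_reconnect(
        "http://example.com/",
        fetch=lambda url: next(pages),
        looks_like_captcha=lambda text: text == "captcha",
        reconnect=lambda: reconnects.append(1),
    )
    print(body, len(reconnects))
```
</pre>
<p>Passing the three steps in as functions keeps the loop independent of any particular VPN client, and the limit on reconnects stops the robot from cycling forever when the site keeps blocking it.</p>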
]]></content:encoded>
			<wfw:commentRss>http://olexii.wordpress.com/2009/11/09/swissvpn/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/a3c629d24d6bcf57065c7dc9e89a8f58?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">olexii</media:title>
		</media:content>
	</item>
	</channel>
</rss>
