Tor is a network of volunteer-run servers that lets people reach the internet from a different IP address. This makes it useful for programmers who want their apps or requests to go out from different IPs, either to stay private or to avoid being blocked.
Web scraping is a lot of fun, but make sure you are following the commonly accepted rules of web scraping:
- Make sure you’re following the target site’s Terms of Service. This means respecting robots.txt and any other restrictions there may be.
- Limit your requests. Scraping bots can navigate webpages much faster than humans, and you don’t want to accidentally DoS a site with an out-of-control scraper.
- Be nice to the server. If you don’t need images, modify your scraper so it doesn’t download images. If you want to be really nice to the server, put your e-mail address in the scraper’s HTTP headers so the server admin can contact you if your scraper is giving them a problem.
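As a rough illustration of the last two rules, here is a minimal sketch that builds request options identifying the scraper and spaces requests out. The helper names (`politeOptions`, `runSequentially`) and the user agent string are my own assumptions for illustration and do not come from any library.

```javascript
// Hypothetical helper: build options for a request-style HTTP client.
// "From" is the standard HTTP header for a contact e-mail address.
function politeOptions(url, contactEmail) {
  return {
    url: url,
    headers: {
      'User-Agent': 'my-scraper/1.0 (contact: ' + contactEmail + ')',
      'From': contactEmail
    }
  };
}

// Hypothetical helper: run one task per item, strictly one at a time.
// After each task except the last it waits delayMs before starting the next
// (no wait when delayMs is 0 or less). task(item, next) must call next()
// when it has finished; done() is called after the last task.
function runSequentially(items, delayMs, task, done) {
  var index = 0;
  function step() {
    if (index >= items.length) {
      if (done) done();
      return;
    }
    var item = items[index++];
    task(item, function () {
      if (index >= items.length || delayMs <= 0) step();
      else setTimeout(step, delayMs);
    });
  }
  step();
}

// Usage idea: fetch a list of URLs with at least one second between requests.
//   runSequentially(urls, 1000, function (url, next) {
//     var options = politeOptions(url, 'you@example.com');
//     /* pass options to your HTTP client here, then call next() */
//   });
```

The delay value is yours to pick; one second is only a starting point, and a slower pace is kinder to small sites.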
In this post, I will share the steps to run your Node.js app through Tor and change its IP at run time.
First, make sure Node.js version 6 is installed. If it is not, you can upgrade the current version with the following commands:
curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
sudo apt-get install -y nodejs
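If you want the app itself to confirm which version it is running on, a small check like the one below could go at the top of your script. This is only a sketch; `hasMajorVersion` is a hypothetical helper name, and the minimum of 6 simply mirrors the version mentioned above.

```javascript
// Hypothetical helper: true if a version string such as "6.17.1" or "v6.17.1"
// has a major version of at least minMajor.
function hasMajorVersion(version, minMajor) {
  var major = parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return !isNaN(major) && major >= minMajor;
}

// process.versions.node holds the running Node.js version without a "v" prefix.
if (!hasMajorVersion(process.versions.node, 6)) {
  console.error('Node.js 6 or newer is required, found ' + process.versions.node);
}
```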
Update npm to the latest version:
sudo npm install npm@latest -g
Install and configure Tor
Install the Tor service with the following command:
sudo apt-get install tor
Also install the tor-request library for Node.js:
sudo npm install tor-request
Generate the hashed password by running the following command:
tor --hash-password mypassword
The last line of the output contains the hashed password, which you copy and paste into torrc.
Next, open torrc so you can set the control port and the hashed password:
sudo nano /etc/tor/torrc
Set the control port with the ControlPort option (9051 is the conventional value) and paste the hash as the value of HashedControlPassword.
Restart the Tor service so the new configuration takes effect.
Tor should now be up and running, so we can write our Node.js app.
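Before writing the app, it can help to confirm that the control port accepts your password. The sketch below talks to the control port directly with Node's built-in `net` module. It assumes the control port is 9051 on 127.0.0.1 and that the password is the plain-text one you hashed earlier; `checkControlPort` and the other function names are hypothetical helpers of mine.

```javascript
var net = require('net');

// Build the control-protocol AUTHENTICATE line for a plain-text password.
// Backslashes and double quotes are escaped inside the quoted string.
function authCommand(password) {
  var escaped = String(password).replace(/(["\\])/g, '\\$1');
  return 'AUTHENTICATE "' + escaped + '"\r\n';
}

// Tor answers a successful command with a line starting with "250".
function isSuccessReply(reply) {
  return /^250[ -]/.test(String(reply));
}

// Hypothetical helper: connect, authenticate, and report through callback(err).
// host and port are assumptions and must match the ControlPort set in torrc.
function checkControlPort(password, callback, host, port) {
  var finished = false;
  function finish(err) {
    if (finished) return;
    finished = true;
    socket.destroy();
    callback(err || null);
  }
  var socket = net.connect({ host: host || '127.0.0.1', port: port || 9051 }, function () {
    socket.write(authCommand(password));
  });
  socket.once('data', function (data) {
    var reply = data.toString();
    finish(isSuccessReply(reply) ? null : new Error('Tor replied: ' + reply.trim()));
  });
  socket.once('error', finish);
  socket.once('close', function () {
    finish(new Error('Connection closed before Tor replied'));
  });
}

// Usage:
//   checkControlPort('mypassword', function (err) {
//     console.log(err ? 'Control port check failed: ' + err.message : 'Authenticated');
//   });
```

If the check fails with a connection error, the ControlPort line in torrc is probably still commented out or Tor was not restarted.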
Node.js App
The following app gives a quick view of how we can make a request and change the IP at run time.
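Here is a minimal sketch of such an app, written under a few assumptions: it uses the `request`, `newTorSession` and `TorControlPort.password` members of tor-request, it uses https://api.ipify.org as an IP echo service (any similar service would do), and `fetchIp` and `showIpChange` are names I made up. The tor-request module is passed in as `tr` so the functions can also be tried with a stand-in object.

```javascript
// Assumption: any service that returns the caller's public IP as plain text.
var IP_ECHO_URL = 'https://api.ipify.org';

// Fetch the public IP as seen through Tor and pass it to callback(err, ip).
function fetchIp(tr, callback) {
  tr.request(IP_ECHO_URL, function (err, res, body) {
    if (err) return callback(err);
    if (res.statusCode !== 200) return callback(new Error('HTTP ' + res.statusCode));
    callback(null, String(body).trim());
  });
}

// Fetch the IP, ask Tor for a new session, then fetch the IP again.
function showIpChange(tr, callback) {
  fetchIp(tr, function (err, before) {
    if (err) return callback(err);
    // newTorSession asks Tor, through the control port, for a fresh circuit.
    tr.newTorSession(function (err) {
      if (err) return callback(err);
      fetchIp(tr, function (err, after) {
        if (err) return callback(err);
        callback(null, { before: before, after: after });
      });
    });
  });
}

// Usage, assuming Tor is running and tor-request is installed:
//   var tr = require('tor-request');
//   tr.TorControlPort.password = 'mypassword'; // the plain-text password, not the hash
//   showIpChange(tr, function (err, ips) {
//     if (err) return console.error(err);
//     console.log('IP before: ' + ips.before + ', IP after: ' + ips.after);
//   });
```

Tor can rate-limit requests for new circuits, so the two addresses are not guaranteed to differ on every run; waiting a few seconds between session changes usually helps.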
Until the next post, enjoy!