Differences

This shows you the differences between two versions of the page.

email_spider:methods_to_collect_emails [2013-09-05 07:55] svenemail_spider:methods_to_collect_emails [2015-07-15 13:28] – external edit 127.0.0.1
Line 1: Line 1:
 +{{indexmenu_n>1}}
 ====== Methods to collect E-Mails ====== ====== Methods to collect E-Mails ======
  
Line 30: Line 31:
 Whatever you plan to parse you can restrict it to one or the other setting. Whatever you plan to parse you can restrict it to one or the other setting.
  
 +==== Place holders ====
 +
 +You can use special place holders (variables) in the URL field. When you enter **%nr%** or **%string%** inside the URL, you can define the range of values that will replace it.
 +
 +{{ :email_spider:email_spider_place_holders.png |}}
 +
 +If you enter e.g. %%http://www.some-site.com%%/?page=**%nr%**, the program replaces 
 +**%nr%** with the numbers that you define and then adds multiple URLs to parse. The same applies to the place holder **%string%**.
 +
 +**%nr%** stands for "number" and inserts a number counting from the //start number// to the //end number// that you define, whereas **%string%** inserts a randomly generated string.
 +The result is a set of generated URLs, all of which the program parses. In most cases you don't need **%string%**. **%nr%** is the useful one, because it can speed things up when you have found a URL that holds all the data you need and takes a numeric parameter. Usually this is a database ID.
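The expansion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's actual code: the function name ''expand_url'', the length and character set of the random string, and the choice to generate one random string per URL are all assumptions.

```python
import random
import string


def expand_url(template, start, end, string_length=8, rng=None):
    """Return one URL for each number from start to end (inclusive).

    %nr% is replaced by the current number. %string% is replaced by a
    random lowercase string (length and alphabet are assumptions).
    """
    rng = rng or random.Random()
    urls = []
    for nr in range(start, end + 1):
        url = template.replace("%nr%", str(nr))
        if "%string%" in url:
            token = "".join(
                rng.choice(string.ascii_lowercase) for _ in range(string_length)
            )
            url = url.replace("%string%", token)
        urls.append(url)
    return urls


if __name__ == "__main__":
    for url in expand_url("http://www.some-site.com/?page=%nr%", 1, 3):
        print(url)
```

With a start number of 1 and an end number of 3, the template ''%%http://www.some-site.com/?page=%nr%%%'' yields three URLs, one per page number.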
 +==== Login Required Websites ====
 +
 +Some websites require you to enter a login and password to get to a certain point of interest. There are basically two types of login.
 +
 +=== Auth-Http-Header ===
 +
 +This method is a server-based authorization: your browser asks you to enter a login/password before it opens the website. If this is the case, you can use the following format when entering the URL:
 +
 +%%http://%%<color red>login</color>:<color red>password</color>@%%www.some-site.com/page.html%%
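As a rough sketch of what this URL format stands for: the login and password are placed in front of the host name, and an HTTP client typically turns them into a Basic ''Authorization'' header. The helper names below are hypothetical, and percent-encoding special characters in the credentials is a general URL rule, not something this page states about the program.

```python
import base64
from urllib.parse import quote, urlsplit, urlunsplit


def url_with_credentials(url, login, password):
    """Insert login:password@ in front of the host part of the URL."""
    parts = urlsplit(url)
    # Characters such as "@" or ":" in the credentials must be percent-encoded,
    # otherwise they would be confused with the URL's own separators.
    userinfo = f"{quote(login, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def basic_auth_header(login, password):
    """Build the value of the Authorization header for HTTP Basic auth."""
    token = base64.b64encode(f"{login}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    print(url_with_credentials("http://www.some-site.com/page.html", "login", "password"))
    print(basic_auth_header("login", "password"))
```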
 +
 +=== Website-Form ===
 +
 +If you have to open a special page, enter a login and password there, and then press a button on the page, click on the label "**Requires Login**" as seen in the screenshot below.
 +
 +{{ :email_spider:email_spider_requires_login.png |}}
 +
 +A new form opens where you can simulate the login.
 +
 +{{ :email_spider:email_spider_login_form.png |}}
 +
 +  * Enter the URL and click on **1. Parse**
 +  * Choose the login form in the box below (usually detected automatically)
 +  * Fill the fields with login/password
 +  * Click on **2. Submit**
 +
 +A page should open in your browser, hopefully showing some indication that the login was successful. Usually you will see a "Logout" link or button, or a welcome message.
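The parse, choose, fill and submit steps above can be illustrated with a small Python sketch. It is not how the program itself is implemented: the class and function names are hypothetical, and "the login form is the first form with a password field" is an assumed detection rule. The sketch only prepares the request data and does not send anything over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormCollector(HTMLParser):
    """Collect every <form> on a page together with its named <input> fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
                "has_password": False,
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and attrs.get("name"):
            # Keep default values so hidden fields (e.g. tokens) are sent back.
            self._current["fields"][attrs["name"]] = attrs.get("value") or ""
            if (attrs.get("type") or "").lower() == "password":
                self._current["has_password"] = True

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_login_form(html):
    """Step 1 and 2: parse the page and pick the first form with a password field."""
    collector = FormCollector()
    collector.feed(html)
    for form in collector.forms:
        if form["has_password"]:
            return form
    return None


def build_submission(page_url, form, values):
    """Step 3 and 4: fill in the fields and prepare the request to submit."""
    data = dict(form["fields"])
    data.update(values)
    return urljoin(page_url, form["action"]), form["method"], urlencode(data)


if __name__ == "__main__":
    page = (
        '<form action="/search"><input name="q"></form>'
        '<form action="/login.php" method="post">'
        '<input type="hidden" name="token" value="abc">'
        '<input type="text" name="user"><input type="password" name="pass">'
        "</form>"
    )
    form = find_login_form(page)
    print(build_submission("http://www.some-site.com/account/", form,
                           {"user": "login", "pass": "password"}))
```

The field names ''user'' and ''pass'' in the sample page are made up; a real site uses its own names, which is why the form has to be parsed first.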