Differences

This shows you the differences between two versions of the page.

email_spider:methods_to_collect_emails [2013-09-05 08:13] sven
email_spider:methods_to_collect_emails [2021-12-28 09:02] (current) – [Use URL as Start] sven
Line 1: Line 1:
 +{{indexmenu_n>1}}
 ====== Methods to collect E-Mails ======
  
Line 15: Line 16:
 Use this option if you already have a website from which you want to extract emails.
  
-The option „Page has to include“ will check the downloaded website for this keyword and will only continue to parse if the keyword is present on the website.
+The option „Page must include“ will check the downloaded website for this keyword and will only continue to parse if the keyword is present on the website.
  
 ==== How deep to parse this site ====
  
-This will let you chose how this URL gets parsed and how deep. All links from that same domain get seen as a certain level of sublink from that site.
+This will let you choose how this URL gets parsed and how deep. All links from that same domain get seen as a certain level of sublink from that site.
  
 ==== How deep to parse external sites ====
Line 39: Line 40:
 **%nr%** with numbers that you define. It will then add multiple URLs to parse. The same applies to the placeholder **%string%**.
  
 +**%nr%** stands for "number" and inserts each number from the //start number// to the //end number// that you define, whereas **%string%** inserts a randomly generated string.
 +The result is a set of generated URLs that are all parsed by the program. In most cases you don't need **%string%**. **%nr%** is the more useful placeholder: it can speed things up when you have found a URL that holds all the data you need and takes a numeric parameter, which is usually a database ID.
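As a rough illustration of the **%nr%** placeholder, the following Python sketch expands a URL template into one URL per number. It is not the program's own code: the function name `expand_nr`, the example URL, and the assumption that the //end number// is included in the range are all hypothetical.

```python
def expand_nr(template: str, start: int, end: int) -> list[str]:
    """Return one URL per number, replacing %nr% in the template.

    Assumption: the range runs from start to end inclusive.
    """
    return [template.replace("%nr%", str(n)) for n in range(start, end + 1)]


# Hypothetical example: a member page addressed by a numeric database ID.
urls = expand_nr("http://example.com/member.php?id=%nr%", 1, 3)
for url in urls:
    print(url)
```

Each generated URL would then be queued for parsing just like a URL entered by hand.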
 ==== Login Required Websites ==== ==== Login Required Websites ====