Differences

This shows you the differences between two versions of the page.

proxy_scraper:url_metrics [2016-05-18 19:01] sven proxy_scraper:url_metrics [2020-05-12 14:46] sven
Line 1: Line 1:
 ====== URL Metrics Scanner ====== ====== URL Metrics Scanner ======
  
-This tool allows you to get details about imported Domains or URLs. It shows you all kinds of details that [[https://moz.com/products/api|Mozscape]] and [[https://majestic.com/|Majestics]] offer. You can access this module by clicking the Tools button on the main dialog:+This tool allows you to get details about imported Domains or URLs. It shows you all kinds of details. You can access this module by clicking the Tools button on the main dialog:
  
 {{ :proxy_scraper:ps_tools.png |}} {{ :proxy_scraper:ps_tools.png |}}
Line 13: Line 13:
 After resolving the data, you can export or filter them. After resolving the data, you can export or filter them.
  
-Please note that unlike other tools, **this module does not require any accounts** to get the data. All you need are proxies which the program is of course delivering to you.+Here is a list of all possible metrics you can extract:
  
-At startup it might be a bit slow as the software has to find proxies that would work. But once found they get cached and processing should be faster. Even after a program restart it should be faster.+===== Metrics accessible without any API-Key ===== 
 + 
 +^Column^Description^ 
 +|Yandex TIC URL|Yandex TIC for URL| 
 +|Yandex TIC SubDomain|Yandex TIC for SubDomain| 
 +|Yandex TIC RootDomain|Yandex TIC for RootDomain| 
 +|Yandex AGS SubDomain|Yandex AGS for SubDomain| 
 +|Yandex AGS RootDomain|Yandex AGS for RootDomain| 
 +|Alexa Traffic Rank|Alexa Traffic Rank| 
 +|WebArchive Date|Oldest Webarchive Date| 
 +|URL Status|Information about the URL| 
 +|Google Indexed Pages|How many URLs are indexed for this SubDomain| 
 +|Google Indexed URL|Is this URL indexed in Google?| 
 +|Country|Country that this website is for| 
 +|Language|Language that this website is in| 
 + 
 +===== Metrics accessible only via PR Emulation Software ===== 
 + 
 +^Column^Description^ 
 +|PR URL|PageRank by Google for URL| 
 +|PR SubDomain|PageRank by Google for SubDomain| 
 +|PR RootDomain|PageRank by Google for RootDomain| 
 + 
 +===== Majestic (requires API-key) =====
  
-Here is a list of all possible metrics you can extract: 
 ^Column^Description^ ^Column^Description^
-|Title|The title of the page, if available| 
-|Canonical URL|The canonical form of the URL| 
-|Subdomain|The subdomain of the URL (for example, blog.moz.com)| 
-|Root Domain|The root domain of the URL (for example, moz.com)| 
-|External Equity Links|The number of external equity links to the URL| 
-|Subdomain External Links|The number of external equity links to the subdomain of the URL| 
-|Root Domain External Links|The number of external equity links to the root domain of the URL| 
-|Equity Links|The number of equity links (internal or external) to the URL| 
-|Subdomains Linking|The number of subdomains with any pages linking to the URL| 
-|Root Domains Linking|The number of root domains with any pages linking to the URL| 
-|Links|The number of links (equity or nonequity or not, internal or external) to the URL| 
-|Subdomain, Subdomains Linking|The number of subdomains with any pages linking to the subdomain of the URL| 
-|Root Domain, Root Domains Linking|The number of root domains with any pages linking to the root domain of the URL| 
-|MozRank: URL|The MozRank of the URL, in the normalized 10-point score| 
-|MozRank: URL (RAW)|The MozRank of the URL, in the raw score| 
-|MozRank: Subdomain|The MozRank of the URLs subdomain, in normalized 10-point score | 
-|MozRank: Subdomain (RAW)|The MozRank of the URLs subdomain, in the raw score| 
-|MozRank: Root Domain|The MozRank of the URLs root domain, in both the normalized 10-point score| 
-|MozRank: Root Domain (RAW)|The MozRank of the URLs root domain, in the raw score| 
-|MozTrust|The MozTrust of the URL, in the normalized 10-point| 
-|MozTrust (RAW)|The MozTrust of the URL, in the raw score| 
-|MozTrust: Subdomain|The MozTrust of the subdomain of the URL, in the normalized 10-point score| 
-|MozTrust: Subdomain (RAW)|The MozTrust of the subdomain of the URL, in the raw score| 
-|MozTrust: Root Domain|The MozTrust of the root domain of the URL, in the normalized 10-point score| 
-|MozTrust: Root Domain (RAW)|The MozTrust of the root domain of the URL, in the raw score| 
-|MozRank: External Equity|The fraction of the URLs MozRank derived solely from external links, in the normalized 10-point score| 
-|MozRank: External Equity (RAW)|The fraction of the URLs MozRank derived solely from external links, in the raw score| 
-|MozRank: Subdomain, External Equity|The fraction, derived solely from external links, of the composite MozRank of all pages on the URLs subdomain, in the normalized 10-digit score| 
-|MozRank: Subdomain, External Equity (RAW)|The fraction, derived solely from external links, of the composite MozRank of all pages on the URLs subdomain, in the raw score| 
-|MozRank: Root Domain, External Equity|The fraction, derived solely from external links, of the composite MozRank of all pages on the URLs root domain, in the normalized 10-digit score| 
-|MozRank: Root Domain, External Equity (RAW)|The fraction, derived solely from external links, of the composite MozRank of all pages on the URLs root domain, in the raw source| 
-|MozRank: Subdomain Combined|The combined MozRank of all pages on the root domain, in the normalized 10-point score| 
-|MozRank: Subdomain Combined (RAW)|The combined MozRank of all pages on the root domain, in both the raw score| 
-|MozRank: Root Domain Combined|The combined MozRank of all pages on the subdomain, in the normalized 10-point score| 
-|MozRank: Root Domain Combined (RAW)|The combined MozRank of all pages on the subdomain, in both the raw score| 
-|Subdomain Spam Score|Spam score for the pages subdomain| 
-|Spam Flags|Bit field of triggered spam flags| 
-|Language|Language of the subdomain| 
-|HTTP Status Code|HTTP status code of the spam crawl| 
-|Last Crawl|Epoch time when the subdomain was last crawled| 
-|Spam Crawl Pages|List of pages used for the subdomains spam crawl| 
-|Facebook account|Facebook account| 
-|Twitter handle|Twitter handle| 
-|Google+ account|Google+ account| 
-|Email address|Email address| 
-|HTTP Status Code|The HTTP status code recorded by Mozscape for this URL, if available| 
-|Links to Subdomain|The total number of links (including internal and nofollow links) to the subdomain of the URL| 
-|Links to Root Domain|The total number of links, including internal and nofollow links, to the root domain of the URL| 
-|Root Domains Linking to Subdomain|The number of root domains with at least one link to the subdomain of the URL| 
-|Page Authority|A normalized 100-point score representing the likelihood of a page to rank well in search engine results| 
-|Domain Authority|A normalized 100-point score representing the likelihood of a domain to rank well in search engine results| 
-|External links|The number of external links to the URL, including nofollowed links| 
-|External links to subdomain|The number of external links to the subdomain, including nofollowed links| 
-|External links to root domain|The number of external links to the root domain, including nofollowed links| 
-|Linking C Blocks|The number of links from the same C class IP addresses.| 
-|Time last crawled|The time and date on which Mozscape last crawled the URL, returned in Unix epoch format| 
 |TrustFlow URL|Number predicting how trustworthy a page is based on how trustworthy sites tend to link to trustworthy neighbors| |TrustFlow URL|Number predicting how trustworthy a page is based on how trustworthy sites tend to link to trustworthy neighbors|
 |TrustFlow SubDomain|Number predicting how trustworthy a page is based on how trustworthy sites tend to link to trustworthy neighbors| |TrustFlow SubDomain|Number predicting how trustworthy a page is based on how trustworthy sites tend to link to trustworthy neighbors|
Line 87: Line 53:
 |RefDomains SubDomain|Number of unique domains linking to this SubDomain| |RefDomains SubDomain|Number of unique domains linking to this SubDomain|
 |RefDomains RootDomain|Number of unique domains linking to this RootDomain| |RefDomains RootDomain|Number of unique domains linking to this RootDomain|
-|Yandex TIC URL|Yandex TIC for URL| +|ACRank RootDomain|CitationFlow like Rank (but less relevant)| 
-|Yandex TIC SubDomain|Yandex TIC for SubDomain| +|ACRank SubDomain|CitationFlow like Rank (but less relevant)| 
-|Yandex TIC RootDomain|Yandex TIC for RootDomain| +|ACRank URL|CitationFlow like Rank (but less relevant)| 
-|SEMRush Rank-SubDomain|SEMRush Rank for SubDomain| +|CrawledFlag RootDomain|Was it actually crawled| 
-|SEMRush Rank-RootDomain|SEMRush Rank for RootDomain| +|CrawledFlag SubDomain|Was it actually crawled| 
-|SEMRush Cost-SubDomain|how much need to spend for get same number of visitors from PPC for SubDomain| +|CrawledFlag URL|Was it actually crawled| 
-|SEMRush Cost-RootDomain|how much need to spend for get same number of visitors from PPC for RootDomain| +|CrawledURLs RootDomain|| 
-|SEMRush Traffic-SubDomain|estimated number of visitors coming from search engines for SubDomain| +|CrawledURLs SubDomain|| 
-|SEMRush Traffic-RootDomain|estimated number of visitors coming from search engines for RootDomain| +|CrawledURLs URL|| 
-|SEMRush Keywords-SubDomain|number of ranked keywords in google ranking for SubDomain| +|ExtBackLinksEDU RootDomain|Number of external backlinks from .EDU/.AC/.EDU.xx domains| 
-|SEMRush Keywords-RootDomain|number of ranked keywords in google ranking for RootDomain| +|ExtBackLinksEDU SubDomain|Number of external backlinks from .EDU/.AC/.EDU.xx domains| 
-|Alexa Traffic Rank|Alexa Traffic Rank| +|ExtBackLinksEDU URL|Number of external backlinks from .EDU/.AC/.EDU.xx domains| 
 +|ExtBackLinksGOV RootDomain|Number of external backlinks from .GOV/.MIL/.GOV.xx domains| 
 +|ExtBackLinksGOV SubDomain|Number of external backlinks from .GOV/.MIL/.GOV.xx domains| 
 +|ExtBackLinksGOV URL|Number of external backlinks from .GOV/.MIL/.GOV.xx domains| 
 +|FinalRedirectResult RootDomain|Describes the response from attempting to crawl the page after all redirects are followed| 
 +|FinalRedirectResult SubDomain|Describes the response from attempting to crawl the page after all redirects are followed| 
 +|FinalRedirectResult URL|Describes the response from attempting to crawl the page after all redirects are followed| 
 +|IndexedURLs RootDomain|Number of indexed URLs| 
 +|IndexedURLs SubDomain|Number of indexed URLs| 
 +|IndexedURLs URL|Number of indexed URLs| 
 +|Language RootDomain|For URLs, this is the language code of the source page. For subdomains and root domains, these are the languages detected across the pages on that domain. This may be a comma-delimited string to support multiple languages (e.g. en,de).| 
 +|Language SubDomain|For URLs, this is the language code of the source page. For subdomains and root domains, these are the languages detected across the pages on that domain. This may be a comma-delimited string to support multiple languages (e.g. en,de).| 
 +|Language URL|For URLs, this is the language code of the source page. For subdomains and root domains, these are the languages detected across the pages on that domain. This may be a comma-delimited string to support multiple languages (e.g. en,de).| 
 +|LanguageConfidence RootDomain|Percentages indicating the confidence of the language detection. This may be a comma-delimited string to support multiple languages (e.g. 90,80).| 
 +|LanguageConfidence SubDomain|Percentages indicating the confidence of the language detection. This may be a comma-delimited string to support multiple languages (e.g. 90,80).| 
 +|LanguageConfidence URL|Percentages indicating the confidence of the language detection. This may be a comma-delimited string to support multiple languages (e.g. 90,80).| 
 +|LanguageDesc RootDomain|The English names of the language codes. This may be a comma-delimited string to support multiple languages (e.g. English,German).| 
 +|LanguageDesc SubDomain|The English names of the language codes. This may be a comma-delimited string to support multiple languages (e.g. English,German).| 
 +|LanguageDesc URL|The English names of the language codes. This may be a comma-delimited string to support multiple languages (e.g. English,German).| 
 +|LanguagePageRatios RootDomain|| 
 +|LanguagePageRatios SubDomain|| 
 +|LanguagePageRatios URL|| 
 +|LanguageTotalPages RootDomain|| 
 +|LanguageTotalPages SubDomain|| 
 +|LanguageTotalPages URL|| 
 +|LastCrawlDate RootDomain|Date when it was crawled| 
 +|LastCrawlDate SubDomain|Date when it was crawled| 
 +|LastCrawlDate URL|Date when it was crawled| 
 +|LastCrawlResult RootDomain|Status code when being crawled| 
 +|LastCrawlResult SubDomain|Status code when being crawled| 
 +|LastCrawlResult URL|Status code when being crawled| 
 +|LastSeen RootDomain|| 
 +|LastSeen SubDomain|| 
 +|LastSeen URL|| 
 +|NonUniqueLinkTypeDeleted RootDomain|| 
 +|NonUniqueLinkTypeDeleted SubDomain|| 
 +|NonUniqueLinkTypeDeleted URL|| 
 +|NonUniqueLinkTypeFrame RootDomain|| 
 +|NonUniqueLinkTypeFrame SubDomain|| 
 +|NonUniqueLinkTypeFrame URL|| 
 +|NonUniqueLinkTypeHomepages RootDomain|| 
 +|NonUniqueLinkTypeHomepages SubDomain|| 
 +|NonUniqueLinkTypeHomepages URL|| 
 +|NonUniqueLinkTypeImageLink RootDomain|| 
 +|NonUniqueLinkTypeImageLink SubDomain|| 
 +|NonUniqueLinkTypeImageLink URL|| 
 +|NonUniqueLinkTypeIndirect RootDomain|| 
 +|NonUniqueLinkTypeIndirect SubDomain|| 
 +|NonUniqueLinkTypeIndirect URL|| 
 +|NonUniqueLinkTypeNoFollow RootDomain|| 
 +|NonUniqueLinkTypeNoFollow SubDomain|| 
 +|NonUniqueLinkTypeNoFollow URL|| 
 +|NonUniqueLinkTypeProtocolHTTPS RootDomain|| 
 +|NonUniqueLinkTypeProtocolHTTPS SubDomain|| 
 +|NonUniqueLinkTypeProtocolHTTPS URL|| 
 +|NonUniqueLinkTypeRedirect RootDomain|| 
 +|NonUniqueLinkTypeRedirect SubDomain|| 
 +|NonUniqueLinkTypeRedirect URL|| 
 +|NonUniqueLinkTypeTextLink RootDomain|| 
 +|NonUniqueLinkTypeTextLink SubDomain|| 
 +|NonUniqueLinkTypeTextLink URL|| 
 +|OutDomainsExternal RootDomain|For URLs, this is the number of outgoing links to unique domains for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutDomainsExternal SubDomain|For URLs, this is the number of outgoing links to unique domains for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutDomainsExternal URL|For URLs, this is the number of outgoing links to unique domains for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutLinksExternal RootDomain|For URLs, this is the number of outgoing links for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutLinksExternal SubDomain|For URLs, this is the number of outgoing links for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutLinksExternal URL|For URLs, this is the number of outgoing links for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutLinksInternal RootDomain|For URLs, this is the number of internal links for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutLinksInternal SubDomain|For URLs, this is the number of internal links for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutLinksInternal URL|For URLs, this is the number of internal links for that page. For subdomains and root domains, average numbers are shown.| 
 +|OutLinksPages RootDomain|| 
 +|OutLinksPages SubDomain|| 
 +|OutLinksPages URL|| 
 +|RedirectFlag RootDomain|Was it redirecting| 
 +|RedirectFlag SubDomain|Was it redirecting| 
 +|RedirectFlag URL|Was it redirecting| 
 +|RedirectTo RootDomain|Final redirecting URL| 
 +|RedirectTo SubDomain|Final redirecting URL| 
 +|RedirectTo URL|Final redirecting URL| 
 +|RefDomainsEDU RootDomain|Number of referring .EDU/.AC/.EDU.xx domains| 
 +|RefDomainsEDU SubDomain|Number of referring .EDU/.AC/.EDU.xx domains| 
 +|RefDomainsEDU URL|Number of referring .EDU/.AC/.EDU.xx domains| 
 +|RefDomainsGOV RootDomain|Number of referring .GOV/.MIL/.GOV.xx domains.| 
 +|RefDomainsGOV SubDomain|Number of referring .GOV/.MIL/.GOV.xx domains.| 
 +|RefDomainsGOV URL|Number of referring .GOV/.MIL/.GOV.xx domains.| 
 +|RefDomainTypeDirect RootDomain|| 
 +|RefDomainTypeDirect SubDomain|| 
 +|RefDomainTypeDirect URL|| 
 +|RefDomainTypeFollow RootDomain|| 
 +|RefDomainTypeFollow SubDomain|| 
 +|RefDomainTypeFollow URL|| 
 +|RefDomainTypeHomepageLink RootDomain|| 
 +|RefDomainTypeHomepageLink SubDomain|| 
 +|RefDomainTypeHomepageLink URL|| 
 +|RefDomainTypeLive RootDomain|| 
 +|RefDomainTypeLive SubDomain|| 
 +|RefDomainTypeLive URL|| 
 +|RefDomainTypeProtocolHTTPS RootDomain|| 
 +|RefDomainTypeProtocolHTTPS SubDomain|| 
 +|RefDomainTypeProtocolHTTPS URL|| 
 +|RefIPs RootDomain|Number of referring IP addresses pointing to this URL| 
 +|RefIPs SubDomain|Number of referring IP addresses pointing to this URL| 
 +|RefIPs URL|Number of referring IP addresses pointing to this URL| 
 +|RefLanguage RootDomain|| 
 +|RefLanguage SubDomain|| 
 +|RefLanguage URL|| 
 +|RefLanguageConfidence RootDomain|| 
 +|RefLanguageConfidence SubDomain|| 
 +|RefLanguageConfidence URL|| 
 +|RefLanguageDesc RootDomain|| 
 +|RefLanguageDesc SubDomain|| 
 +|RefLanguageDesc URL|| 
 +|RefLanguagePageRatios RootDomain|| 
 +|RefLanguagePageRatios SubDomain|| 
 +|RefLanguagePageRatios URL|| 
 +|RefLanguageTotalPages RootDomain|| 
 +|RefLanguageTotalPages SubDomain|| 
 +|RefLanguageTotalPages URL|| 
 +|RefSubNets RootDomain|Number of referring IP Class C subnets pointing to this URL| 
 +|RefSubNets SubDomain|Number of referring IP Class C subnets pointing to this URL| 
 +|RefSubNets URL|Number of referring IP Class C subnets pointing to this URL| 
 +|RootDomainIPAddress RootDomain|| 
 +|RootDomainIPAddress SubDomain|| 
 +|RootDomainIPAddress URL|| 
 +|TotalNonUniqueLinks RootDomain|| 
 +|TotalNonUniqueLinks SubDomain|| 
 +|TotalNonUniqueLinks URL|| 
 +|TrustMetric RootDomain|| 
 +|TrustMetric SubDomain|| 
 +|TrustMetric URL|| 
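
The Language, LanguageDesc and LanguageConfidence columns above may each hold a comma-delimited string. The following is a minimal Python sketch of how such cells could be combined after export. It assumes that the entries of the three columns line up by position, which this page does not state explicitly, and the names ''DetectedLanguage'', ''split_field'' and ''parse_languages'' are hypothetical helpers, not part of the program.

```python
from typing import List, NamedTuple


class DetectedLanguage(NamedTuple):
    code: str          # language code, e.g. "en"
    description: str   # English name, e.g. "English"
    confidence: float  # confidence percentage, e.g. 90.0


def split_field(value: str) -> List[str]:
    """Split a comma-delimited cell into trimmed, non-empty parts."""
    return [part.strip() for part in value.split(",") if part.strip()]


def parse_languages(language: str,
                    language_desc: str,
                    language_confidence: str) -> List[DetectedLanguage]:
    """Pair up the entries of the three language columns by position.

    Assumption: the n-th code, the n-th description and the n-th
    confidence value describe the same language.
    """
    codes = split_field(language)
    descriptions = split_field(language_desc)
    confidences = split_field(language_confidence)
    if not (len(codes) == len(descriptions) == len(confidences)):
        raise ValueError("language columns have different numbers of entries")
    return [
        DetectedLanguage(code, desc, float(conf))
        for code, desc, conf in zip(codes, descriptions, confidences)
    ]
```

Called with the example values from the table, ''parse_languages("en,de", "English,German", "90,80")'' yields one entry for English and one for German. Raising an error on mismatched lengths is a deliberate choice here, so that misaligned cells are noticed instead of being silently truncated.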
 + 
 +===== DomDetailer (requires API-Key) ===== 
 + 
 +^Column^Description^ 
 +|DD MozLinks|The number of links (equity or nonequity, internal or external) to the URL| 
 +|DD MozPage Authority|A normalized 100-point score representing the likelihood of a page to rank well in search engine results| 
 +|DD MozDomain Authority|A normalized 100-point score representing the likelihood of a domain to rank well in search engine results| 
 +|DD MozRank|The MozRank of the URL, in the normalized 10-point score| 
 +|DD MozTrust|The MozTrust of the URL, in the normalized 10-point score| 
 +|DD MozSpam|Bit field of triggered spam flags| 
 +|DD Facebook Comments|Facebook Comments| 
 +|DD Facebook Shares|Facebook Shares| 
 +|DD Google+ Shares|Google+ shares| 
 +|DD Pinterest Pins|Pinterest pins| 
 +|DD LinkedIn Shares|LinkedIn shares| 
 +|DD majesticLinks|Number of backlinks to this URL| 
 +|DD majesticRefDomains|Number of unique domains linking to this URL| 
 +|DD majesticRefDomainsEDU|Number of unique .edu domains linking to this URL| 
 +|DD majesticRefDomainsGOV|Number of unique .gov domains linking to this URL| 
 +|DD majesticRefSubnets|Number of unique subnets linking to this URL| 
 +|DD majesticTrustFlow|Number predicting how trustworthy a page is based on how trustworthy sites tend to link to trustworthy neighbors| 
 +|DD majesticCitationFlow|Number predicting how influential a URL might be based on how many sites link to it| 
 +|DD majesticCat|Suggested Category of this site.| 
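
The DD MozSpam column is described above as a bit field of triggered spam flags. As a hedged illustration of what reading a bit field means, the Python sketch below lists which bit positions are set in an integer value. The meaning of each individual position is not documented on this page, so the hypothetical helper ''triggered_flag_bits'' only reports positions and does not name any flags.

```python
from typing import List


def triggered_flag_bits(bit_field: int) -> List[int]:
    """Return the zero-based positions of all bits set in bit_field."""
    if bit_field < 0:
        raise ValueError("bit field must be a non-negative integer")
    return [
        position
        for position in range(bit_field.bit_length())
        if (bit_field >> position) & 1  # test one bit at a time
    ]


# 0b10110 has bits 1, 2 and 4 set.
print(triggered_flag_bits(0b10110))  # prints [1, 2, 4]
```

The number of set positions, ''len(triggered_flag_bits(value))'', gives a simple count of triggered flags that can be used for filtering.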
 + 
 +===== SEORank (needs API-Key) ===== 
 + 
 +^Column^Description^ 
 +|SEORank Domain Authority|number from 1 to 100 (higher is better) provided by Moz that predicts how well the domain will rank in search engines; DA is based on link counts, MozRank, MozTrust scores and other data.| 
 +|SEORank Page Authority|number from 1 to 100 (higher is better) provided by Moz that predicts how well the URL will rank in search engines; PA is based on link counts, MozRank, MozTrust and other data.| 
 +|SEORank Moz Rank|link popularity score showing the importance of a given web page on the Internet.| 
 +|SEORank Links In|number of links (equity or nonequity, internal or external) to the URL (data provided by Moz).| 
 +|SEORank External Equity Links|number of external equity links to the URL (data provided by Moz).| 
 +|SEORank Alexa Rank|global Alexa rank of the domain.| 
 +|SEORank ALexa Links number|number of links to the domain.| 
 +|SEORank Alexa Country|ISO2 code of the country where the domain is most popular.| 
 +|SEORank Alexa Country Rank|Alexa rank of the domain in the country where it is most popular.| 
 +|SEORank SemRush Domain|domain that SemRush extracted from the URL for analysis.| 
 +|SEORank SemRush Rank|rank of the domain according to SemRush.| 
 +|SEORank SemRush Keywords number|number of keywords for which the site ranks in Google.| 
 +|SEORank SemRush Traffic|number of users expected to visit the website during the following month.| 
 +|SEORank SemRush Costs|estimated price of organic keywords in Google AdWords.| 
 +|SEORank SemRush URL Links number|number of links to the URL according to SemRush.| 
 +|SEORank SemRush HOSTNAME Links number|number of links to the HOSTNAME.| 
 +|SEORank SemRush DOMAIN Links number|number of links to the SemRush Domain.| 
 +|SEORank Spam Score|number from 0 to 17; a higher number means a higher percentage of sites that link to the URL but are penalized or banned by Google.| 
 +|SEORank Citation Flow|number from 0 to 100 (higher is better) indicating how influential a URL might be based on how many sites link to it.| 
 +|SEORank Trust Flow|number from 0 to 100 (higher is better) indicating how trustworthy a page is based on how trustworthy sites tend to link to trustworthy neighbors.| 
 +|SEORank ExtBackLinks|number of external backlinks to the current URL (data provided by Majestic).| 
 +|SEORank Refered Domains|number of domains with pages that refer to the URL| 
  
 ====== Google-PR Emulation ====== ====== Google-PR Emulation ======