As a standalone product, GSA Proxy Scraper can find a lot more proxies in a shorter time than the basic scraper included in other products. In detail…
Please try the demo version of GSA Proxy Scraper and see for yourself.
All GSA tools can use so-called CONNECT proxies. Those proxies are usually anonymous and work perfectly. Other tools do not understand this type of proxy and use it as a normal WEB proxy. In that case the proxy sometimes does not work, or it behaves differently (no longer anonymous, blocked by Google, requires authorization…). If you want to rule out this problem, you need to uncheck the testing for CONNECT proxies in options->filter.
Tools like ScrapeBox cannot handle CONNECT proxies, so you will see different results when using them in that software.
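The difference between the two proxy styles can be sketched at the request level. The following Python is only a minimal illustration and not how GSA Proxy Scraper or ScrapeBox are implemented; the function names and the example host are made up for this sketch. A WEB proxy receives a normal request with the full URL, while a CONNECT proxy is first asked to open a tunnel to host:port.

```python
def web_proxy_request(host: str, path: str = "/") -> bytes:
    """Request as sent to a plain WEB (HTTP) proxy: full URL in the request line."""
    return (
        f"GET http://{host}{path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def connect_proxy_request(host: str, port: int = 443) -> bytes:
    """Request as sent to a CONNECT proxy: ask for a raw tunnel to host:port."""
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(status_line: str) -> bool:
    """A CONNECT proxy answers with a 2xx status once the tunnel is open."""
    parts = status_line.split()
    return (
        len(parts) >= 2
        and parts[0].startswith("HTTP/")
        and parts[1].startswith("2")
    )
```

A tool that only ever sends the first form to a proxy that expects the second will see failures or different behaviour, which matches the differing results described above.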
There was something strange that I discovered when building the tool. When the program searches for new sources, it can happen that it finds proxies that are on port 80 but only seem to work for bing.com. You see them in your list with that TAG and with the anonymous level shown in gray, because it could not be checked.
I have looked at that closely and really was not able to test such a proxy successfully against anything else but Bing. Sometimes I was able to access facebook.com or ebay.com with it, sometimes even Twitter.com, but that's it. Those proxies are only useful if you want to get access to those sites.
So what type is that, I asked myself, and opened the IP directly in a browser. This is what you get:
Invalid URL The requested URL "[no URL]", is invalid. Reference #9.5c957441.1464336353.8984bf0
Having a look at the header showed:
HTTP/1.0 400 Bad Request
Server: AkamaiGHost
Mime-Version: 1.0
Content-Type: text/html
Content-Length: 208
Expires: Fri, 27 May 2016 07:54:48 GMT
Date: Fri, 27 May 2016 07:54:48 GMT
Connection: close
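A check like this can be automated. The sketch below parses a raw response head and looks at the Server header; the function names are invented for illustration, and matching on the exact value AkamaiGHost is an assumption based only on the response shown here, not a documented way to identify such servers.

```python
RAW_HEAD = (
    "HTTP/1.0 400 Bad Request\r\n"
    "Server: AkamaiGHost\r\n"
    "Mime-Version: 1.0\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 208\r\n"
    "Connection: close\r\n"
    "\r\n"
)


def parse_response_head(raw: str) -> tuple[int, dict[str, str]]:
    """Split a raw HTTP response head into (status code, lower-cased headers)."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    status_code = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def looks_like_akamai_edge(raw: str) -> bool:
    """Assumption: treat 'Server: AkamaiGHost' as the marker seen in this case."""
    _, headers = parse_response_head(raw)
    return headers.get("server", "").lower() == "akamaighost"
```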
OK, so Server: AkamaiGHost. This was strange, as I had never heard anything about it. I searched for information and, yes, Wikipedia had something for me. The last sentence should get you worried:
The National Security Agency and Federal Bureau of Investigation have reportedly used Facebook's Akamai content delivery network (CDN) to collect information on Facebook users. Akamai has been accused of blocking access to web sites for visitors using Tor.
You had better not use those proxies for anything other than Bing scraping. I would never use them for Facebook, and you should stay away from that as well. To keep them out of export files, just uncheck UNKNOWN as anonymous level and they are ruled out.
Well yes, that's possible. However, keep in mind that the proxies are still public and might be banned quickly. To get only proxies that can be used for E-Mail Verification, do the following:
Now you will see the program searching for proxies and testing them only for email verification usage. The proxies can later be exported to a file and, for example, imported into GSA Email Verifier (if you still own it).
Open Regedit.exe and go to
HKEY_LOCAL_MACHINE → SYSTEM → CurrentControlSet → services → Tcpip → Parameters
and change settings…
TcpTimedWaitDelay - 30 (decimal)
MaxUserPort - 65534 (decimal)
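As a rough back-of-the-envelope illustration of what these two values control: TcpTimedWaitDelay is the number of seconds a closed connection stays in TIME_WAIT, and MaxUserPort is the highest port number handed out for outgoing connections. If each closed connection blocks its local port for the whole delay, the number of usable ports divided by the delay gives an upper bound on sustained new connections per second. The sketch assumes the outgoing port range starts at 1025, as on older Windows versions; newer versions manage the dynamic port range differently, so treat the result as an estimate only. The function names are made up for this example.

```python
def usable_ephemeral_ports(max_user_port: int, first_port: int = 1025) -> int:
    """Ports available for outgoing connections (first_port is an assumption)."""
    return max_user_port - first_port + 1


def max_new_connections_per_second(max_user_port: int, timed_wait_delay_s: int) -> float:
    """Upper bound if every closed connection holds its port for the full delay."""
    return usable_ephemeral_ports(max_user_port) / timed_wait_delay_s


ports = usable_ephemeral_ports(65534)
rate = max_new_connections_per_second(65534, 30)
print(ports, round(rate))  # prints "64510 2150"
```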
This may sound more complicated than it really is. There are nice programs like SocksCap that can hook into any executable and add proxy support to it. Together with GSA Proxy Scraper, you can easily have rotating proxies for any application.
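The rotation idea itself is simple. The following is only a sketch of round-robin rotation over an exported proxy list; the class name and the placeholder addresses are invented for illustration, and it says nothing about how SocksCap or GSA Proxy Scraper handle rotation internally.

```python
from itertools import cycle


class ProxyRotator:
    """Hands out proxies from a fixed list in round-robin order."""

    def __init__(self, proxies):
        proxies = list(proxies)
        if not proxies:
            raise ValueError("need at least one proxy")
        self._pool = cycle(proxies)

    def next_proxy(self) -> str:
        """Return the next proxy, wrapping around at the end of the list."""
        return next(self._pool)


# Placeholder addresses from the documentation ranges, not real proxies.
rotator = ProxyRotator(["192.0.2.10:8080", "198.51.100.7:3128"])
first_three = [rotator.next_proxy() for _ in range(3)]
```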
The product scans all kinds of different websites for proxies (an IP followed by a port). Scraping for proxies is usually not a problem, but the later testing could turn into one. Some VPS providers, or even ISPs, wrongly think that this testing is some kind of port or network scan. This is of course not the case, and the program tries to randomize the testing as well as possible. However, it can happen, and if you get into such a situation, you might need to use a proxy for testing and maybe also for scraping. That can be turned on with two options as seen below:
A public proxy is one that is listed on freely accessible websites where everyone can see and use it. That is also the reason why public proxies are usually slow and unstable: you don't know how many people are using them. Also keep in mind that a public proxy is often banned on exactly those websites where proxies would help to get data faster (e.g. Google or other search engines).
A private proxy, however, is the total opposite of a public proxy. It is usually accessible only to the person who bought it and is protected by a login/password. If it is not protected that way, you usually have to define, on the proxy provider's web interface, the IP range from which you plan to use it. Because only you can access this proxy, it is considered to be very fast and not blocked on search engines.
There are many other proxy types among private proxies such as:
No, all proxies found with GSA Proxy Scraper are considered to be public proxies, as everyone can see them on the webpages they are scraped from. However, there is a way to get proxies that are not listed anywhere, and that's by using the included Proxy Scanner. With that tool you can find open proxy servers in a given IP and port range by scanning, probing and testing.
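To make the idea of an IP and port range concrete, here is a minimal sketch that only enumerates the candidates such a scan would have to look at. It is not the Proxy Scanner's actual logic; the function name is hypothetical, the addresses are placeholders, and the probing and testing steps are left out entirely.

```python
import ipaddress


def scan_candidates(first_ip: str, last_ip: str, ports):
    """Yield every (ip, port) pair between first_ip and last_ip, inclusive."""
    start = int(ipaddress.IPv4Address(first_ip))
    end = int(ipaddress.IPv4Address(last_ip))
    if end < start:
        raise ValueError("last_ip must not be lower than first_ip")
    ports = list(ports)  # allow any iterable, and reuse it for every address
    for number in range(start, end + 1):
        ip = str(ipaddress.IPv4Address(number))
        for port in ports:
            yield ip, port


candidates = list(scan_candidates("192.0.2.1", "192.0.2.2", [80, 8080]))
```

The number of candidates grows as addresses times ports, which is why the range is best kept small.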