FAQ

GSA Proxy Scraper as a standalone product can find you a lot more proxies in a shorter time than the basic scraper included in other products. In detail…

  • It also has the ability to monitor the found proxies in a better way to separate the good from the bad/unstable proxies.
  • It has a lot more sources to scrape from, including ones where the basic scraper is unable to find the proxies.
  • It has its own internal proxy server that can be used as a rotating proxy server.
  • It comes with a proxy/port scanner to find proxies listed nowhere else.
  • It comes with a PR-Emulation module included.
  • It has many other tools like the URL Metric Scanner or Search Engine Parser.
  • It can test the proxies against many different sites and also test against SMTP usage (used for email verification).

Please try the demo version of GSA Proxy Scraper and see for yourself.

All GSA Tools can use so-called CONNECT-Proxies. Those proxies are usually anonymous and work perfectly. Other tools do not understand this type of proxy and use it as a normal WEB-Proxy. Used that way, the proxy sometimes does not work or it reacts differently (no longer anonymous, blocked by Google, requires authorization…). If you want to rule out this problem, you need to uncheck the testing for CONNECT proxies in options->filter.

Tools like ScrapeBox cannot handle CONNECT proxies, so you will see different results when using them in that software.
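To illustrate the difference, here is a minimal Python sketch of the first request a client sends in each mode. The function names and the example host are assumptions made for illustration and are not part of any GSA tool; only the request formats follow standard HTTP.

```python
def connect_request(host: str, port: int = 443) -> bytes:
    """First request sent to a CONNECT proxy: ask for a raw tunnel to host:port."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def web_proxy_request(host: str, path: str = "/") -> bytes:
    """First request sent to a normal web proxy: a GET with the absolute URL."""
    return (
        f"GET http://{host}{path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


if __name__ == "__main__":
    # example.com is only a placeholder target
    print(connect_request("example.com").decode("ascii"))
    print(web_proxy_request("example.com").decode("ascii"))
```

A tool that only knows the second form will send a GET to a proxy that may expect a CONNECT, which is one plausible reason for the different results.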

There was something strange that I discovered when building the tool. When the program searches for new sources, it can happen that it finds proxies that are on port 80 but only seem to work for bing.com. You see them in your list with that TAG and the anonymous level shown in gray because it could not be checked.

I have checked that closely and really was not able to test such a proxy successfully against anything else but Bing. Sometimes I was able to access facebook.com or ebay.com with it, sometimes even Twitter.com, but that's it. Those proxies are only useful if you want to get access to those sites.

So what type is that, I asked myself, and opened that IP directly in a browser. This is what you get:

Invalid URL
The requested URL "[no URL]", is invalid.

Reference #9.5c957441.1464336353.8984bf0 

Having a look at the header showed:

HTTP/1.0 400 Bad Request
Server: AkamaiGHost
Mime-Version: 1.0
Content-Type: text/html
Content-Length: 208
Expires: Fri, 27 May 2016 07:54:48 GMT
Date: Fri, 27 May 2016 07:54:48 GMT
Connection: close

OK so Server: AkamaiGHost. This is strange as I never heard anything about this. I searched for information about it and YES! Wikipedia had something for me. And the last sentence on this should get you worried:

The National Security Agency and Federal Bureau of Investigation have reportedly used Facebook's Akamai content delivery network (CDN) to collect information on Facebook users. Akamai has been accused of blocking access to web sites for visitors using Tor.

You better not use those proxies for anything other than Bing scraping. I would never use them for Facebook and you should also stay away from that. To keep them out of export files, just uncheck UNKNOWN as anonymous level and you rule them out.
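If you want to repeat the header check yourself, a minimal Python sketch could look like the following. It only parses a raw response that you already captured; the function names are made up for this example, and the sample is a shortened copy of the header shown earlier.

```python
def server_header(raw_response: str) -> str:
    """Return the value of the Server header, or '' if there is none."""
    lines = raw_response.splitlines()
    for line in lines[1:]:  # skip the status line
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        if name.strip().lower() == "server":
            return value.strip()
    return ""


def looks_like_akamai(raw_response: str) -> bool:
    """True if the Server header names AkamaiGHost (case-insensitive)."""
    return "akamaighost" in server_header(raw_response).lower()


# Shortened copy of the response header quoted in this FAQ
SAMPLE = (
    "HTTP/1.0 400 Bad Request\r\n"
    "Server: AkamaiGHost\r\n"
    "Content-Type: text/html\r\n"
    "Connection: close\r\n"
    "\r\n"
)

if __name__ == "__main__":
    print(server_header(SAMPLE), looks_like_akamai(SAMPLE))
```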

Well yes, that's possible. However, keep in mind that the proxies are still public and might be banned quickly. To get only proxies that can be used for E-Mail Verification, do the following:

  1. On Main GUI click: Settings
  2. Click on Automatic Search
  3. Right click on the listing with the Test Scripts and click Check None
  4. On that same listing find the test scenario called Email Verification (gmail.com) and check it.

Now you will see the program searching for proxies and testing them only for email verification usage. The proxies can later be exported to a file and e.g. imported to GSA Email Verifier (if you still own it).

Open Regedit.exe and go to

HKEY_LOCAL_MACHINE → SYSTEM → CurrentControlSet → services → Tcpip → Parameters

and change settings…
TcpTimedWaitDelay - 30 (decimal)
MaxUserPort - 65534 (decimal)
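
If you prefer not to edit the values by hand, the same two settings can be written into a .reg file and imported. The following Python sketch only builds the text of such a file; the function name is made up for this example, and it assumes both values are DWORD entries. Check the generated file before importing it, and note that changing these values needs administrator rights.

```python
# Key path and values exactly as listed above (registry paths are case-insensitive)
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters"
SETTINGS = {"TcpTimedWaitDelay": 30, "MaxUserPort": 65534}  # decimal values


def build_reg_file(key: str, settings: dict[str, int]) -> str:
    """Build .reg file text; assumes every value is a DWORD (written as 8 hex digits)."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in settings.items():
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_reg_file(KEY, SETTINGS))
```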

This may sound more complicated than it really is. There are nice programs like sockscap that can hook into any executable and add proxy support to it. Together with GSA Proxy Scraper, you can easily have rotating proxies for any application.
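
For your own scripts no hooking tool is needed, because most HTTP libraries accept a proxy address directly. Here is a minimal Python sketch, assuming the internal proxy server listens on 127.0.0.1:8080. That address is only a placeholder; use the address and port shown in your own setup.

```python
import urllib.request

# Assumption: address of the local rotating proxy server; adjust to your setup
LOCAL_PROXY = "http://127.0.0.1:8080"


def make_opener(proxy_url: str = LOCAL_PROXY) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    opener = make_opener()
    # example.com is only a placeholder target
    with opener.open("http://example.com/", timeout=15) as response:
        print(response.status)
```

If the local server rotates the outgoing proxy on its side, the script itself does not need to know which proxy is currently in use.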

The product scans for proxies (an IP followed by a port) on all kinds of different websites. Scraping for proxies is usually not a problem, but the testing afterwards could turn into one. Some really stupid VPS or even ISP providers think that this testing is some kind of a port / net scan. This is of course not the case and the program tries to randomize this testing as best as possible. However, it can happen, and if you get into such a situation, you might need to use a proxy for testing and maybe also for scraping. That can be turned on with two options.

A public proxy is one that is listed on freely accessible websites where everyone can see and use that proxy. That is also the reason why public proxies are usually slow and unstable, as you don't know how many people are using them. Also keep in mind that a public proxy is often banned on certain websites where proxies would help to get data faster (e.g. Google or other search engines).

A private proxy, however, is the total opposite of a public proxy. It's usually accessible only to the person who bought this proxy and is also protected by a login/password. If it's not protected, then you usually have to define an IP-Range from where you plan to use it on the proxy provider's web interface. Because only you can access this proxy, it is considered to be very fast and not blocked on search engines.

There are many other proxy types among private proxies such as:

  • Residential Proxies use real people's IPs (their computers, mobile phones, and other devices on WiFi). They are also called peer-to-peer proxies. Such proxies are much harder to detect than any other proxies. They usually offer more locations and more precise targeting options. Residential proxies are often shared and have to rotate after a while.
  • Rotating Proxies automatically give you a new IP after a certain time without you having to care about this. You get access to one or more backconnect gateway servers that connect you to a provider’s IP pool.
  • Shared Proxies have IPs that multiple people use at the same time. They are shared far less than pure public proxies, though, so they are still fast and stable. They are the cheapest option among private proxies. An IP costs 2-3 times less compared to dedicated proxies. When priced by traffic, you’ll pay 5-20 times less compared to residential addresses.
  • Dedicated Proxy is the complete opposite of public proxies and also of private, shared proxies. It is a specific IP address used only by a single user. Since no one else shares the same IP address, you have full control over its activity.

No, all proxies found with GSA Proxy Scraper are considered to be public proxies, as everyone can see them on the webpages where it is scraping from. However, there is a way to get proxies not listed anywhere, and that's by using the included Proxy Scanner. With that tool you can find open proxy servers in a given IP- and Port-Range by scanning, probing and testing.
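
To give an idea of what "a given IP- and Port-Range" means in practice, here is a minimal Python sketch that only lists the IP/port pairs such a range covers. It does not probe anything and it is not how the included Proxy Scanner works internally; the function name and the sample range (a reserved documentation range) are assumptions for illustration.

```python
import ipaddress
from typing import Iterator


def candidates(first_ip: str, last_ip: str, ports: list[int]) -> Iterator[tuple[str, int]]:
    """Yield every (ip, port) pair between first_ip and last_ip, inclusive."""
    start = int(ipaddress.IPv4Address(first_ip))
    end = int(ipaddress.IPv4Address(last_ip))
    for number in range(start, end + 1):
        ip = str(ipaddress.IPv4Address(number))
        for port in ports:
            yield ip, port


if __name__ == "__main__":
    # 192.0.2.0/24 is reserved for documentation, so nothing real is addressed here
    for ip, port in candidates("192.0.2.1", "192.0.2.3", [80, 8080]):
        print(f"{ip}:{port}")
```

Even small ranges grow quickly: 256 addresses with 10 ports each already give 2560 candidates to test.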