
I have some code that uses mechanize and BeautifulSoup to scrape data from the web. The code works fine on a test machine, but the production machine is blocking the connection. The error I get is:

urlopen error [Errno 10053] An established connection was aborted by the software in your host machine

I have read through similar posts and cannot find this exact error. The site I am trying to scrape uses HTTPS, but I have also had the same error occur with an HTTP site. I am using Python 2.6 and mechanize 0.2.4.

Is this due to the proxy or, as the error says, something on my local machine? I've set mechanize up to use the system's proxy:

import mechanize
import BeautifulSoup

br = mechanize.Browser()
br.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1)')]
br.set_proxies({}) #will use system default proxy
page = br.open(url)
html = page.read()
soup = BeautifulSoup.BeautifulSoup(html)

Again, this all works on my test machine, but the production machine gives that Error 10053.

  • "The issue here was a host based IDS was preventing the connection out. Problem solved." - Could you please explain in detail how the problem was solved and what changes you needed to make? I'm facing a similar problem and am not sure how I can fix it. Many thanks. Commented Jun 29, 2011 at 13:58
  • I added my Python script to the HIDS exception list, which is the list of files allowed to connect out to the internet. Once the script was on the list, it had network connectivity and I had no further problems. The test machine did not have a HIDS client installed, which is why it allowed the script to connect out. FYI, both machines had firewalls, but only the production machine had the HIDS. Commented Jun 29, 2011 at 14:09
  • Hi, thanks for the answer, but pardon my ignorance: what does HIDS stand for? I don't think I have any such client installed on my system. Where can I check to be sure I don't have anything similar installed? My network security is administered by my company's network security team. Do I need their help to get my script onto the allowed-access list? Commented Jun 29, 2011 at 14:27
  • HIDS stands for Host-based Intrusion Detection System. If the network security team has hidden the HIDS from you, you might not know where to find it. Even if you do find it, you will not be able to disable it. You can ask your security team to add an exception for your script. Another sneaky way around the HIDS is to build your script into an exe (using py2exe) and rename the resulting executable to something already on the HIDS exception list. Your browser is a good candidate: if Firefox is allowed internet access, rename your exe to firefox.exe. Commented Jun 29, 2011 at 17:22
  • This may not work if the HIDS is smart and recognizes that a program is being run from an unknown location. Ex: You rename your program to firefox.exe and run it from the desktop, but the actual path that Firefox should be run from is C:\Programs\Firefox\. This may raise a few eyebrows as to why you have the program in an unknown path. Commented Jun 29, 2011 at 17:24

1 Answer


The issue here was that a host-based IDS was preventing the connection out. Problem solved.

I added my Python script to the HIDS exception list, which is the list of files allowed to connect out to the internet. Once the script was on the list, it had network connectivity and I had no further problems. The test machine did not have a HIDS client installed, which is why it allowed the script to connect out. FYI, both machines had firewalls, but only the production machine had the HIDS.

HIDS stands for Host-based Intrusion Detection System. If the network security team has hidden the HIDS from you, you might not know where to find it. Even if you do find it, you will not be able to disable it. You can ask your security team to add an exception for your script. Another sneaky way around the HIDS is to build your script into an exe (using py2exe) and rename the resulting executable to something already on the HIDS exception list. Your browser is a good candidate: if Firefox is allowed internet access, rename your exe to firefox.exe.
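If you want to check whether the abort comes from the host rather than from mechanize, one rough approach is to look for Winsock code 10053 (WSAECONNABORTED) on the exception and then retry with a bare socket. The sketch below is only an illustration, not what I actually ran: `is_connection_aborted` and `probe` are hypothetical helper names, and the probe assumes the target also answers plain HTTP on port 80. If the bare socket is aborted as well, host software such as a HIDS is a more likely culprit than the scraping library.

```python
import socket

# Winsock code reported in the question's error message (WSAECONNABORTED).
WSAECONNABORTED = 10053


def is_connection_aborted(exc):
    """Return True if exc, or an error it wraps in .reason, carries code 10053."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        if getattr(exc, "errno", None) == WSAECONNABORTED:
            return True
        if getattr(exc, "winerror", None) == WSAECONNABORTED:
            return True
        # urllib-style URLError objects keep the underlying socket error here.
        exc = getattr(exc, "reason", None)
    return False


def probe(host, port=80, timeout=10):
    """Send a minimal HTTP request over a bare socket, with no HTTP library."""
    try:
        sock = socket.create_connection((host, port), timeout)
        try:
            request = "HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n" % host
            sock.sendall(request.encode("ascii"))
            sock.recv(1024)
        finally:
            sock.close()
    except socket.error as exc:
        if is_connection_aborted(exc):
            return "aborted by host software (10053)"
        return "failed: %r" % (exc,)
    return "connected"
```

With mechanize you would wrap `br.open(url)` in a `try`/`except` and pass the caught exception to `is_connection_aborted`; running `probe` on both the test and the production machine shows whether only one of them aborts the connection.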
