Using httplib to connect to a website in Python

Question

tl;dr: I tried to use httplib to create a connection to a site and failed. I'd love some guidance!

I've run into some trouble. I've read about Python's socket and httplib modules, although it seems I have some problems with the syntax.

Here it is:

connection = httplib.HTTPConnection('www.site.org', 80, timeout=10, 1.2.3.4)

The syntax is this:

httplib.HTTPConnection(host[, port[, strict[, timeout[, source_address]]]])

How does "source_address" behave? Can I make requests from any IP with it? Wouldn't I need a User-Agent for it?

Also, how do I check whether the connection was successful?

if connection:
    print "Connection Successful."

(As far as I know, HTTP doesn't need an "are you alive" ping every second: as long as both client and server are okay, a request will be processed when it is made. So I can't constantly ping.)

Hey - I didn't have it, I'm using 2.7.5 but I'll dig into urllib and see how it goes! Thank you! Also, I didn't know about "requests"! I'll look them up, thank you.
– Daniel Crangu, Commented Oct 23, 2013 at 17:24

Sam van Kampen · Accepted Answer · 2013-10-23 17:56:11Z


Creating the object does not actually connect to the website:
HTTPConnection.connect(): Connect to the server specified when the object was created.

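source_address is not sent to the server as part of the request. It is an optional (host, port) tuple naming the local address the socket binds to before connecting, so it has to be an address assigned to one of your own network interfaces; you can't make requests from an arbitrary IP such as 1.2.3.4 with it. It is also unrelated to User-Agent, which is an ordinary HTTP request header. In your call it has to be passed by keyword, e.g. source_address=('1.2.3.4', 0), because a positional argument can't follow timeout=10.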
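
To see what source_address does in practice, here is a minimal sketch that connects to a throwaway listening socket on the loopback interface and prints the local address the client socket ended up bound to. The helper name local_endpoint and the stand-in listener are my own, not part of httplib, and the import fallback is only there so the same snippet also runs on Python 3, where the module is called http.client.

```python
import socket

try:
    from httplib import HTTPConnection      # Python 2, as in the question
except ImportError:
    from http.client import HTTPConnection  # Python 3 name of the same module


def local_endpoint(host, port, source_ip, timeout=5):
    """Connect from source_ip and return the (ip, port) the socket is bound to.

    Hypothetical helper for illustration only.
    """
    conn = HTTPConnection(host, port, timeout=timeout,
                          source_address=(source_ip, 0))  # port 0: OS picks one
    try:
        conn.connect()
        return conn.sock.getsockname()[:2]
    finally:
        conn.close()


# Stand-in "server": a plain listening socket on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_port = listener.getsockname()[1]

bound_ip, bound_port = local_endpoint("127.0.0.1", server_port, "127.0.0.1")
print(bound_ip)

listener.close()
```

Passing an address that is not assigned to the machine (like 1.2.3.4) would typically make connect() fail with a socket.error while binding, which is why it can't be used to pick an arbitrary IP.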

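There is no dedicated method for checking whether a connection was made, but connect() raises socket.error if it fails, so if it returns without an exception the connection is open. Note that "if connection:" only tests the object itself, which is always truthy.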
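
One way to get a yes/no answer is to treat an exception-free connect() as success. This is a rough sketch under that assumption; try_connect is a hypothetical helper name, and it only tells you that the TCP connection opened, not that the server will answer a request sensibly.

```python
import socket

try:
    from httplib import HTTPConnection      # Python 2
except ImportError:
    from http.client import HTTPConnection  # Python 3


def try_connect(host, port=80, timeout=10):
    """Return (True, None) if the TCP connection opens, else (False, reason)."""
    conn = HTTPConnection(host, port, timeout=timeout)
    try:
        conn.connect()
    except socket.error as exc:
        # Covers refused connections, timeouts and DNS failures alike.
        return False, str(exc)
    finally:
        conn.close()
    return True, None
```

Usage would look like ok, reason = try_connect("www.site.org"), followed by printing "Connection Successful." only when ok is true.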

Assuming what you want to do is get the contents of the website root, you can use this:

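from httplib import HTTPConnection

# Creating the object does not open a socket yet.
conn = HTTPConnection("www.site.org", 80, timeout=10)
conn.connect()  # raises socket.error if the connection fails

# Pass the path, not the full URL; the Host header is added automatically.
conn.request("GET", "/")
resp = conn.getresponse()
print("%s %s" % (resp.status, resp.reason))

data = resp.read()
print(data)
conn.close()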

(slammed together from the HTTPConnection documentation)

Honestly though, you should not be using httplib, but instead urllib2 or another HTTP library that is less... low-level.
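
For comparison, a higher-level fetch with urllib2 could look roughly like this. It is a sketch, not the one true way: fetch is a hypothetical helper name, and the import fallback is only there so the snippet also runs on Python 3, where the same functions live in urllib.request and urllib.error.

```python
try:
    from urllib2 import urlopen, HTTPError        # Python 2
except ImportError:
    from urllib.request import urlopen            # Python 3
    from urllib.error import HTTPError


def fetch(url, timeout=10):
    """Return (status_code, body) for url; redirects are followed for you."""
    try:
        resp = urlopen(url, timeout=timeout)
    except HTTPError as err:
        # 4xx/5xx responses arrive as exceptions that still carry the code.
        return err.code, err.read()
    # Connection-level failures (URLError) are left to propagate to the caller.
    try:
        return resp.getcode(), resp.read()
    finally:
        resp.close()
```

Calling fetch("http://www.site.org/") would then give you the status code and the page body in one step, without managing the connection yourself.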

edited Oct 23, 2013 at 17:56

answered Oct 23, 2013 at 17:34


4 Comments

Daniel Crangu Over a year ago

Amazing! It definitely works, but I get this error: prntscr.com/1z83aw My noob guess is that I need a UA so the server would understand where I am requesting from. What do you think?

Sam van Kampen Over a year ago

That's odd - why would it send malformed requests? Are you using source_address? If so, are you sending a legitimate IP instead of 1.2.3.4?

Daniel Crangu Over a year ago

It seems that it gives me different errors on different hosts (404, 400, 302 and so on). It means it's doing its job. But no, I am not using source_address for now!

Daniel Crangu Over a year ago

Thank you a lot for your help. I'll further research those libraries!
