
I am looking to open a connection with python to http://www.horseandcountry.tv which takes my login parameters via the POST method. I would like to open a connection to this website in order to scrape the site for all video links (this, I also don't know how to do yet but am using the project to learn).

My question is how do I pass my credentials to the individual pages of the website? For example if all I wanted to do was use python code to open a browser window pointing to http://play.horseandcountry.tv/live/ and have it open with me already logged in, how do I go about this?

3 Answers


As far as I know, you have two options, depending on how you want to crawl and what you need to crawl:

1) Use urllib. You can do your POST request with the necessary login credentials. This is the low-level solution, which means it is fast, but it doesn't handle high-level features such as JavaScript.

2) Use selenium. With that you can simulate a browser (Chrome, Firefox, others) and run actions via your Python code. It is much slower, but it works well with more "sophisticated" websites.
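A minimal sketch of option 1 with `urllib` (Python 3). The form field names and login URL below are assumptions borrowed from the other answer on this page; you would need to inspect the site's actual login form to get the real ones. Sending the request is left commented out:

```python
import http.cookiejar
from urllib import parse, request

# Hypothetical form fields -- inspect the real login form for the actual names.
payload = parse.urlencode({
    'account_email': 'your_email',
    'account_password': 'your_password',
}).encode('utf-8')

# Supplying a data argument makes this a POST request.
req = request.Request(
    'https://play.horseandcountry.tv/login/',
    data=payload,
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)

# A cookie jar keeps the session cookie, so later requests through this
# opener stay logged in.
jar = http.cookiejar.CookieJar()
opener = request.build_opener(request.HTTPCookieProcessor(jar))

# Actually sending it would be:
# response = opener.open(req)

print(req.get_method())  # POST
```

The cookie jar is the important part: without it, each `urlopen` call is a fresh, logged-out session.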

What I usually do: I try the first option, and if I encounter a problem like a JavaScript security layer on the website, I go for option 2. Moreover, selenium can open a real web browser on your desktop and give you a visual of your scraping.

In any case, just google "urllib/selenium login to website" and you'll find what you need.


2 Comments

It sounds like selenium would do exactly what I am after, so this is definitely the correct answer. Eventually I would like to run it as a Kodi addon on a Raspberry Pi, so I will not have access to this library! I'll research urllib more thoroughly, as hopefully I can achieve what I'm looking for with that alone! Thanks
As the others mentioned, requests does the job as well (like urllib). I have already run selenium on a Raspberry Pi; try using PhantomJS as the web browser (no graphical interface) to spare some computational resources.

If you want to avoid using Selenium (opening web browsers), you can go with requests; it can log in to the website and grab anything you need in the background.

Here is how you can log in to that website with requests:

import requests
from bs4 import BeautifulSoup

# Login form data
payload = {
    'account_email': 'your_email',
    'account_password': 'your_password',
    'submit': 'Sign In'
}

with requests.Session() as s:
    # Log in to the website.
    response = s.post('https://play.horseandcountry.tv/login/', data=payload)

    # Check if logged in successfully.
    soup = BeautifulSoup(response.text, 'lxml')
    logged_in = soup.find('p', attrs={'class': 'navbar-text pull-right'})
    print(s.cookies)
    print(response.status_code)
    if logged_in is not None and logged_in.text.startswith('Logged in as'):
        print('Logged In Successfully!')

If you need explanations for this, you can check this answer or the requests documentation.
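Once logged in, the same session object can fetch the protected pages and you can scrape the video links from them, which is the asker's end goal. A sketch of that step (the HTML and the `video-link` class below are made up for illustration; the real markup of the site will differ, and in practice `html` would come from `s.get(page_url).text` inside the session above):

```python
from bs4 import BeautifulSoup

# Illustrative HTML standing in for a page fetched with the logged-in session.
html = '''
<div class="video-list">
  <a class="video-link" href="/video/101">Dressage Masterclass</a>
  <a class="video-link" href="/video/102">Show Jumping Highlights</a>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')
links = [a['href'] for a in soup.find_all('a', class_='video-link')]
print(links)  # ['/video/101', '/video/102']
```

Use your browser's developer tools to find which tag and class the real video links live under, then adjust the `find_all` call accordingly.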



You could also use the requests module. It is one of the most popular HTTP libraries for Python. Here are some questions that relate to what you would like to do:

Log in to website using Python Requests module

logging in to website using requests

