2

I am trying to download a text file given a URL and port. This works on Python 2:

Python 2.7.12 (default, Sep 26 2016, 09:46:23)  [GCC 4.2.1 Compatible
Apple LLVM 7.3.0 (clang-703.0.31)] on darwin Type "help", "copyright",
"credits" or "license" for more information.
>>> import urllib
>>> foo = urllib.urlopen("http://catnet-ip.icc.cat:8080/")
>>> foo.read() 
'SOURCETABLE 200 OK\r\nServer: NTRIP Trimble NTRIP Caster\r\nContent-Type: text/plain\r\nContent-Length: 2884\r\nDate:
02/Nov/2016:12:52:19 UTC\r\n\r\nSTR;VRS_RTK_2_3;Virtual RTK ver RTCM
2.3;RTCM 2.3;1(1),3(6),18(1),19(1),23(5),24(5);2;GPS;Catnet;ESP;41.3;2.09;1;1;Trimble
GPSNet;None;B;N;3900;;\r\nSTR;VRS_RTK_3_0;Virtual RTK ver RTCM
3.0;RTCM 3;1004(1),1005/1007(5),PBS(10);2;GPS;Catnet;ESP;41.3;2.09;1;1;Trimble
GPSNet;None;B;N;1100;;\r\nSTR;VRS_DGPS;Virtual DGPS ver RTCM 2.3;RTCM
2.3;1(1),3(6),22(6),23/24(5),16(59);0;GPS;Catnet;ESP;41.3;2.09;1;1;Trimble
GPSNet;None;B;N;640;;\r\n 
...

Similarly with wget:

Python 2.7.12 (default, Sep 26 2016, 09:46:23)  [GCC 4.2.1 Compatible
Apple LLVM 7.3.0 (clang-703.0.31)] on darwin Type "help", "copyright",
"credits" or "license" for more information.
>>> import wget
>>> foo = wget.download("http://catnet-ip.icc.cat:8080/", bar=None)
>>> foo
' (1).'
>>> exit()
$ less \ \(1\).
SOURCETABLE 200 OK\r\nServer: NTRIP Trimble NTRIP Caster\r\nContent-Type: text/plain\r\nContent-Length: 2884\r\nDate:
02/Nov/2016:12:52:19 UTC\r\n\r\nSTR;VRS_RTK_2_3;Virtual RTK ver RTCM
2.3;RTCM 2.3;1(1),3(6),18(1),19(1),23(5),24(5);2;GPS;Catnet;ESP;41.3;2.09;1;1;Trimble
GPSNet;None;B;N;3900;;\r\nSTR;VRS_RTK_3_0;Virtual RTK ver RTCM
3.0;RTCM 3;1004(1),1005/1007(5),PBS(10);2;GPS;Catnet;ESP;41.3;2.09;1;1;Trimble
GPSNet;None;B;N;1100;;\r\nSTR;VRS_DGPS;Virtual DGPS ver RTCM 2.3;RTCM
2.3;1(1),3(6),22(6),23/24(5),16(59);0;GPS;Catnet;ESP;41.3;2.09;1;1;Trimble
GPSNet;None;B;N;640;;\r\n 
...

But both fail on Python 3 with the error "http.client.BadStatusLine: SOURCETABLE 200 OK":

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25)  [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> foo = urllib.request.urlopen("http://catnet-ip.icc.cat:8080/") 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1282, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1257, in do_open
    r = h.getresponse()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1197, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 279, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine: SOURCETABLE 200 OK

and:

Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25)  [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information.
>>> import wget
>>> wget.download("http://catnet-ip.icc.cat:8080/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/toni/Downloads/wget-2.0/wget.py", line 308, in download
    (tmpfile, headers) = urllib.urlretrieve(url, tmpfile, callback)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 188, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1282, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1257, in do_open
    r = h.getresponse()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1197, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 279, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine: SOURCETABLE 200 OK

From the Python docs on the HTTP protocol, I guess this happens because urllib and wget try to read the "SOURCETABLE" tag at the very start of the file as part of an HTTP status line. This tag is always present in the files I want to download (from NTRIP casters), and I can't find a workaround.
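The failure can probably be reproduced without any network access by feeding the same first line to the standard library's response parser. This is only a sketch: `FakeSocket` and `describe` are made-up helper names, and the two byte strings are hand-written stand-ins for a caster reply and an ordinary HTTP reply.

```python
import io
from http.client import BadStatusLine, HTTPResponse


class FakeSocket:
    """Hypothetical stand-in exposing the one method HTTPResponse calls."""

    def __init__(self, payload):
        self._file = io.BytesIO(payload)

    def makefile(self, *args, **kwargs):
        return self._file


def describe(raw):
    """Report whether http.client accepts the first line of `raw`."""
    response = HTTPResponse(FakeSocket(raw))
    try:
        response.begin()  # reads the status line and the headers
    except BadStatusLine as exc:
        return "rejected: " + exc.line.strip()
    return "accepted: %d" % response.status


print(describe(b"SOURCETABLE 200 OK\r\n\r\n"))
print(describe(b"HTTP/1.1 200 OK\r\n\r\n"))
```

The first call should be rejected for the same reason as in the tracebacks above, while the second is parsed normally.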

3 Answers

1

I faced this exact problem with a different NTRIP server. "SOURCETABLE 200 OK" is not a valid HTTP status line according to RFC 2616, which requires the line to begin with an HTTP version such as HTTP/1.1. Sigh. The workaround for me: curl, specifically pycurl.

For example:

import sys
import pycurl

def handle_write(buf):
    # pycurl passes bytes to its callbacks; decode before printing
    sys.stdout.write(buf.decode("iso-8859-1"))

host = "http://catnet-ip.icc.cat:8080/"
curl = pycurl.Curl()
curl.setopt(pycurl.URL, host)
curl.setopt(pycurl.TIMEOUT, 20)         # overall limit, in seconds
curl.setopt(pycurl.CONNECTTIMEOUT, 3)   # connection limit, in seconds
# Header lines also arrive as bytes, so reuse the decoding callback
curl.setopt(pycurl.HEADERFUNCTION, handle_write)
curl.setopt(pycurl.WRITEFUNCTION, handle_write)

curl.perform()
curl.close()

Results in:

SOURCETABLE 200 OK
Server: GNSS Spider 7.4.0.8125/1.0
Date: Wed, 18 Dec 2019 21:29:11 GMT Standard Time
Content-Type: text/plain
Content-Length: 2667

STR;VRS_RTK_3_0;VRS_RTK_3_0;RTCM 
...
ENDSOURCETABLE 

1
  1. Via urllib3:

    import urllib3
    http = urllib3.PoolManager()
    foo = http.request('GET', "http://example.com:2101/")
    print(foo.data)

  2. Via urllib. Edit the files as described in the next method, then try this:

    import urllib.request
    f = urllib.request.urlopen('http://splcare.in:2101')
    print(f.read())

  3. Via the http module. You can modify Python's http library to accept the extra status line. To do so, edit the file {python program folder}/lib/http/client.py.

Find these lines:

if not version.startswith("HTTP/") :
    self._close_conn()
    raise BadStatusLine(line) 

and change them to:

if not (version.startswith("HTTP/") or version.startswith("SOURCETABLE")):
    self._close_conn()
    raise BadStatusLine(line)

Make another modification at the following lines:

elif version.startswith("HTTP/1."):
    self.version = 11   # use HTTP/1.1 code for HTTP/1.x where x>=1
else:
    raise UnknownProtocol(version)

Change them to:

elif version.startswith("HTTP/1."):
    self.version = 11   # use HTTP/1.1 code for HTTP/1.x where x>=1
elif version.startswith("SOURCETABLE"):
    self.version = 12   # Use NTRIP
else:
    raise UnknownProtocol(version)

Now you can get the source table in the Python console using the following code, but you would need to edit more files to fully support NTRIP.

import requests
url = "http://example.com:2101"
username = "xxxx"
password = "xxxx"
response = requests.get(url, auth=(username, password))
print(response.status_code)
print(response.content)


0

Another approach is to use raw sockets from the standard library:

import socket

with socket.create_connection(("catnet-ip.icc.cat", 2101)) as sock:
    req_payload = b"GET / HTTP/1.1\r\nHost: catnet-ip.icc.cat\r\n\r\n"
    sock.sendall(req_payload)

    # Read until the caster closes the connection (":=" needs Python 3.8+)
    resp_payload = sock.recv(4096)
    while data := sock.recv(4096):
        resp_payload += data

Now you can parse resp_payload yourself, or have http.client.HTTPResponse do it as mentioned here, with a small modification to get past the SOURCETABLE status line:

SOURCETABLE = b"SOURCETABLE"

http_payload = (
    resp_payload.replace(SOURCETABLE, b"HTTP/1.1".ljust(len(SOURCETABLE)), 1)
    if resp_payload.startswith(SOURCETABLE)
    else resp_payload
)
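
For completeness, here is a minimal sketch of how the patched bytes might then be handed to http.client.HTTPResponse. It relies on the common trick of wrapping the bytes in an object with a makefile() method; `FakeSocket`, `parse_ntrip_response` and the `sample` payload are illustrative names and data, not something the caster or the standard library provides. For brevity it swaps the tag for a plain b"HTTP/1.1" without padding.

```python
import io
from http.client import HTTPResponse

SOURCETABLE = b"SOURCETABLE"


class FakeSocket:
    """Hypothetical stand-in exposing the one method HTTPResponse calls."""

    def __init__(self, payload):
        self._file = io.BytesIO(payload)

    def makefile(self, *args, **kwargs):
        return self._file


def parse_ntrip_response(resp_payload):
    """Return (status, headers, body) parsed from raw caster bytes."""
    if resp_payload.startswith(SOURCETABLE):
        # Swap the NTRIP tag for a version string http.client accepts
        resp_payload = resp_payload.replace(SOURCETABLE, b"HTTP/1.1", 1)
    response = HTTPResponse(FakeSocket(resp_payload))
    response.begin()  # parses the status line and the headers
    return response.status, dict(response.getheaders()), response.read()


# Made-up payload shaped like the replies shown in the question
sample = (
    b"SOURCETABLE 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 16\r\n"
    b"\r\n"
    b"ENDSOURCETABLE\r\n"
)
status, headers, body = parse_ntrip_response(sample)
print(status, headers["Content-Type"], body)
```

With a real caster you would pass the resp_payload collected from the socket instead of `sample`.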
