I'm using Python to scrape data from a Japanese website that offers both English and Japanese versions (the URL is in the code below).
The problem is that I get the data I need, but in the wrong language (both language versions share the same URL). Inspecting the HTML, I saw that the 'lang' attribute on the root element differs between the two versions, as follows:
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<html xmlns="http://www.w3.org/1999/xhtml" lang="ja" xml:lang="ja" class="">
Here is the code I used:
import requests
import lxml.html as lh
import pandas as pd
url='https://data.j-league.or.jp/SFMS01/search?team_ids=33&home_away_select=0'
page = requests.get(url)
doc = lh.fromstring(page.content)
tr_elements = doc.xpath('//tr')
col = []
i = 0
for t in tr_elements[0]:
    i += 1
    name = t.text_content()
    print("{}".format(name))
    col.append((name,[]))
At this point I get the header row of the table, but in the Japanese version. I'm new to Python and to web scraping. Is there a way to get the data in English? Pointers to existing examples, templates, or other resources would be even better.
Thanks in advance!
Set-Cookie: SFCM01LANG=en;
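The Set-Cookie header above suggests that the site stores the language choice in a cookie named SFCM01LANG rather than in the URL. Here is a minimal sketch of sending that cookie along with the request, assuming the server actually honors it on the search page (I have not verified this). The helper names language_cookies and fetch_page are made up for illustration, and "en" is the only value that appears in the header above.

```python
import requests

URL = "https://data.j-league.or.jp/SFMS01/search?team_ids=33&home_away_select=0"


def language_cookies(lang="en"):
    """Build the cookie dict for the requested language.

    Assumption: the site reads the language from the SFCM01LANG cookie.
    Only the value "en" is known from the Set-Cookie header.
    """
    return {"SFCM01LANG": lang}


def fetch_page(url, lang="en"):
    """Fetch the page with the language cookie attached; return raw bytes."""
    response = requests.get(url, cookies=language_cookies(lang), timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.content


if __name__ == "__main__":
    content = fetch_page(URL)
    print(len(content), "bytes received")
```

The bytes returned by fetch_page(url) can be passed to lh.fromstring() in place of page.content, so the rest of the parsing code stays unchanged. If the header row still comes back in Japanese, the cookie is probably not the only thing the server checks.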