It is not UTF-8 encoding but URL escaping (also called URL quoting or percent-encoding): every `%XX` stands for one byte of the UTF-8 encoded text.
import urllib.parse
print(urllib.parse.unquote(u'/wiki/Category:%E6%89%93%E7%A3%9A%E5%A1%8A'))
Result
/wiki/Category:打磚塊
Python 3.x doc: urllib.parse
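To make the relation between the escapes and UTF-8 visible, here is a minimal Python 3 sketch that does the unquoting in two separate steps. The variable names (`quoted`, `raw`, `text`) are only for illustration, and the `safe='/:'` argument is my assumption about which characters should stay unescaped in this path.

```python
import urllib.parse

quoted = '/wiki/Category:%E6%89%93%E7%A3%9A%E5%A1%8A'

# Step 1: turn every %XX escape back into the single byte it stands for.
raw = urllib.parse.unquote_to_bytes(quoted)

# Step 2: decode those bytes as UTF-8 to get text.
text = raw.decode('utf-8')
print(text)

# unquote() does both steps at once (its default encoding is 'utf-8').
print(text == urllib.parse.unquote(quoted))

# Going the other way: quote() escapes ':' by default, so it is added to `safe`.
print(urllib.parse.quote(text, safe='/:') == quoted)
```

So the text itself is UTF-8 underneath, and the `%XX` layer is only a way to write those bytes using plain ASCII characters.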
EDIT:
Python 2.7 has it in module urlparse
import urlparse
print(urlparse.unquote(u'/wiki/Category:%E6%89%93%E7%A3%9A%E5%A1%8A'))
Python 2.7 doc: urlparse
EDIT:
After testing with Python 2.7: the URL needs encode() before unquote(), so that unquote() works on a `str` (a plain byte string) instead of `unicode`
#-*- coding: utf-8 -*-
import urlparse
url = u'/wiki/Category:%e6%89%93%E7%A3%9A%E5%A1%8A'
url = url.encode('utf-8') # convert `unicode` to `str`
url = urlparse.unquote(url) # convert `%e6%89%93%E7%A3%9A%E5%A1%8A` to `打磚塊`
print url
print type(url)
print '打磚塊' in url
Result
/wiki/Category:打磚塊
<type 'str'>
True
BTW: The same code for Python 3, where it doesn't need encode()
import urllib.parse
url = u'/wiki/Category:%e6%89%93%E7%A3%9A%E5%A1%8A'
url = urllib.parse.unquote(url) # convert `%e6%89%93%E7%A3%9A%E5%A1%8A` to `打磚塊`
print(url)
print(type(url))
print('打磚塊' in url)
Result:
/wiki/Category:打磚塊
<class 'str'>
True
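If the same script has to run on both versions, the two variants above could be wrapped in one helper. This is only a sketch: `unquote_text` is a hypothetical name of mine, not something from either module, and it assumes the escaped bytes are always UTF-8.

```python
# -*- coding: utf-8 -*-
import sys

try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urlparse import unquote  # Python 2.7


def unquote_text(url):
    """Hypothetical helper: percent-decode `url` and return it as text."""
    if sys.version_info[0] >= 3:
        # Python 3: unquote() already decodes the bytes as UTF-8.
        return unquote(url)
    # Python 2.7: unquote a byte string, then decode the result as UTF-8.
    if isinstance(url, unicode):  # this branch is only reached on Python 2
        url = url.encode('utf-8')
    return unquote(url).decode('utf-8')


url = unquote_text(u'/wiki/Category:%e6%89%93%E7%A3%9A%E5%A1%8A')
print(u'打磚塊' in url)
```

Note that, unlike the Python 2.7 example above, this returns `unicode` on Python 2.7 rather than `str`, so the result is text on both versions.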