Python: parsing XML from a string returns an element with no attributes

Question


import xml.etree.ElementTree as ET

xml = '''<?xml version="1.0" encoding="UTF-8"?>
<Root xmlns="http://www.nexacroplatform.com/platform/dataset">
    <Parameters>
        <Parameter id="ErrorCode" type="string">-1</Parameter>
        <Parameter id="ErrorMsg" type="string">&#32;정원을&#32;초과하였습니다..!</Parameter>
        <Parameter id="O_RESULT" type="string">1</Parameter>
        <Parameter id="O_RESULT_STR" type="string">&#32;정원을&#32;초과하였습니다..!</Parameter>
    </Parameters>
</Root>'''
tree=ET.fromstring(xml)
tree.findall('Parameter')

tree.findall('Parameter') returns an empty list.

tree has an empty attrib, and its tag is '{http://www.nexacroplatform.com/platform/dataset}Root'. Why does this XML not work?

You need to fully qualify the parameter passed to findall with an appropriate namespace. You may find BeautifulSoup easier to work with.
– user2668284, Commented Aug 3, 2021 at 8:42
What information are you trying to extract from the XML?
– balderman, Commented Aug 3, 2021 at 8:44

balderman · Accepted Answer · 2021-08-03 08:48:11Z

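The root element declares a default namespace, so every element in the document belongs to it, and findall must be given the namespace-qualified tag name. Note also that 'Parameter' on its own only searches the direct children of the root, while './/' searches all descendants. The code below uses only the standard library.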

import xml.etree.ElementTree as ET

xml = '''<?xml version="1.0" encoding="UTF-8"?>
<Root xmlns="http://www.nexacroplatform.com/platform/dataset">
    <Parameters>
        <Parameter id="ErrorCode" type="string">-1</Parameter>
        <Parameter id="ErrorMsg" type="string">&#32;정원을&#32;초과하였습니다..!</Parameter>
        <Parameter id="O_RESULT" type="string">1</Parameter>
        <Parameter id="O_RESULT_STR" type="string">&#32;정원을&#32;초과하였습니다..!</Parameter>
    </Parameters>
</Root>'''
tree = ET.fromstring(xml)
for entry in tree.findall('.//{http://www.nexacroplatform.com/platform/dataset}Parameter'):
    print(f'id={entry.attrib["id"]}, type={entry.attrib["type"]}, data={entry.text}')

output

id=ErrorCode, type=string, data=-1
id=ErrorMsg, type=string, data= 정원을 초과하였습니다..!
id=O_RESULT, type=string, data=1
id=O_RESULT_STR, type=string, data= 정원을 초과하였습니다..!
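
As a possible alternative to writing the full namespace URI inside every path, here is a minimal sketch that passes a prefix mapping to findall. The prefix "ds", the variable names, and the shortened sample document are assumptions made for illustration; only the namespace URI comes from the question. The '{*}' wildcard shown at the end needs Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns attribute of <Root> in the question.
# The prefix "ds" is an arbitrary name chosen for this sketch.
NS = {"ds": "http://www.nexacroplatform.com/platform/dataset"}

# Shortened version of the document from the question.
xml = '''<Root xmlns="http://www.nexacroplatform.com/platform/dataset">
    <Parameters>
        <Parameter id="ErrorCode" type="string">-1</Parameter>
        <Parameter id="O_RESULT" type="string">1</Parameter>
    </Parameters>
</Root>'''

root = ET.fromstring(xml)

# An unqualified name matches nothing: every element is in the default namespace.
unqualified = root.findall(".//Parameter")

# Prefix form: "ds" is resolved through the mapping given as the second argument.
params = root.findall(".//ds:Parameter", NS)

# Wildcard form (Python 3.8+): match the tag name in any namespace.
wild = root.findall(".//{*}Parameter")

# Collect id -> text into a dict (a convenience added here, not part of the answer).
values = {p.get("id"): p.text for p in params}
print(values)
```

The prefix form keeps the paths short when several lookups are needed, while the wildcard form may be handy when the namespace URI is not known in advance.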

Ram · Accepted Answer · 2021-08-03 08:40:16Z

You can use BeautifulSoup with the lxml parser to achieve what you want. The example below prints the ids of the <Parameter> tags.

Here is the code.


import bs4 as bs
xml = '''<?xml version="1.0" encoding="UTF-8"?>
<Root xmlns="http://www.nexacroplatform.com/platform/dataset">
    <Parameters>
        <Parameter id="ErrorCode" type="string">-1</Parameter>
        <Parameter id="ErrorMsg" type="string">&#32;정원을&#32;초과하였습니다..!</Parameter>
        <Parameter id="O_RESULT" type="string">1</Parameter>
        <Parameter id="O_RESULT_STR" type="string">&#32;정원을&#32;초과하였습니다..!</Parameter>
    </Parameters>
</Root>'''

# Create a soup object with lxml parser
soup = bs.BeautifulSoup(xml, 'lxml')

# Select all the parameter tags (the 'lxml' HTML parser lowercases tag names)
params = soup.find('root').find('parameters').find_all('parameter')

# Print the ids of all parameter tags
for i in params:
    print(i['id'])

output

ErrorCode
ErrorMsg
O_RESULT
O_RESULT_STR
