How do I extract value of XML attribute in Python?

Question

I need to extract the value of an attribute in an XML document using Python.

For example, if I have an XML document like this:

<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>

How would I get the text 'smallHuman' or 'largeHuman' to store in a variable?

Edit: I'm very new to Python and may require a lot of assistance.

This is what I've tried so far:

#! /usr/bin/python

import xml.etree.ElementTree as ET


def walkTree(node):
    print node.tag
    print node.keys()
    print node.attributes[]
    for cn in list(node):
        walkTree(cn)

treeOne = ET.parse('tm1.xml')
treeTwo = ET.parse('tm3.xml')

walkTree(treeOne.getroot())

Due to the way this script will be used, I cannot hard-code the XML into the .py file.

I've updated the question with the code written so far @James
– Alex Ritchie, Commented Feb 12, 2018 at 12:29

pafreire · Accepted Answer · 2018-02-12 13:14:18Z

8

To get the attribute values from the XML, iterate over the root's children and read each element's attrib dictionary:

import xml.etree.ElementTree as ET

xml_data = """<xml>
<child type = "smallHuman"/>
<adult type = "largeHuman"/>
</xml>"""

# This is like ET.parse(), but for strings
root = ET.fromstring(xml_data)

for child in root:
    # child.attrib is a dict of the element's attributes, e.g. {'type': 'smallHuman'}
    print(child.tag, child.attrib)

You can find more details and examples in the documentation: https://docs.python.org/3.5/library/xml.etree.elementtree.html
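
Since the question says the XML cannot be hard-coded, here is a minimal sketch of the same idea reading from a file with ET.parse(). The helper name collect_types and the temporary file are assumptions made only so the example runs on its own; in practice you would pass the path of your existing file (for example 'tm1.xml' from the question).

```python
import os
import tempfile
import xml.etree.ElementTree as ET

XML_TEXT = """<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>"""


def collect_types(path):
    """Hypothetical helper: map tag -> 'type' value for every element that has one."""
    root = ET.parse(path).getroot()
    # iter() visits the root and all of its descendants, so no manual recursion is needed
    return {el.tag: el.get("type") for el in root.iter() if el.get("type") is not None}


# The sample is written to a temporary file only to keep the sketch self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as tmp:
    tmp.write(XML_TEXT)

try:
    types = collect_types(tmp.name)
finally:
    os.remove(tmp.name)

print(types)
```

Note that if two elements share a tag, the dict keeps only the last one; a list of (tag, type) pairs would be a safer choice in that case.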

Rakesh · Accepted Answer · 2018-02-12 12:49:49Z

6

With ElementTree, you can use the find() method together with the attrib dictionary.

Example:

import xml.etree.ElementTree as ET

z = """<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>"""


treeOne = ET.fromstring(z)
# print() with parentheses so the example also runs on Python 3
print(treeOne.find('./child').attrib['type'])
print(treeOne.find('./adult').attrib['type'])

Output:

smallHuman
largeHuman
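
One thing to watch for with this approach: find() returns None when no element matches, and attrib['type'] raises KeyError when the attribute is absent. A small sketch of a more defensive variant follows; attr_or_default is a hypothetical helper name, not part of ElementTree.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>""")


def attr_or_default(parent, tag, name, default=None):
    """Hypothetical helper: attribute of the first matching child, or a default."""
    node = parent.find(tag)
    if node is None:
        # No such element under parent
        return default
    # Element.get() returns the default instead of raising when the attribute is missing
    return node.get(name, default)


child_type = attr_or_default(root, "child", "type")
missing = attr_or_default(root, "elder", "type", default="unknown")

print(child_type)
print(missing)
```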

Alexandra Dudkina · Accepted Answer · 2020-10-14 15:49:48Z

0

Another example, using the lxml library:

xml = '''<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>'''

from lxml import etree as et

root = et.fromstring(xml)

# find attribute using xpath
child_type = root.xpath('//xml/child/@type')[0]
print(child_type)

adult_type = root.xpath('//xml/adult/@type')[0]
print(adult_type)

# combination of find / get
child_type = root.find('child').get('type')
adult_type = root.find('adult').get('type')

print(child_type)
print(adult_type)
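
If installing lxml is not an option, the standard library's ElementTree supports a limited XPath subset. As far as I know it cannot return attribute values directly the way '/@type' does in lxml, but attribute predicates such as [@type] can select the elements, and get() then reads the value. A rough sketch:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>""")

# Select every descendant element that carries a 'type' attribute
typed = root.findall(".//*[@type]")
values = [el.get("type") for el in typed]
print(values)

# Select only elements whose 'type' attribute has a specific value
small = root.findall(".//*[@type='smallHuman']")
print([el.tag for el in small])
```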

yazz · Accepted Answer · 2020-10-24 23:49:18Z

Another example, using the SimplifiedDoc class from the simplified_scrapy library:

from simplified_scrapy import SimplifiedDoc, utils
xml = '''<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
</xml>'''
doc = SimplifiedDoc(xml).select('xml')

# first
child_type = doc.child['type']
print(child_type)

adult_type = doc.adult['type']
print(adult_type)

# second
child_type = doc.select('child').get('type')
adult_type = doc.select('adult').get('type')

print(child_type)
print(adult_type)

# third
child_type = doc.select('child>type()')
adult_type = doc.select('adult>type()')

print(child_type)
print(adult_type)

# fourth
nodes = doc.selects('child|adult>type()')
print(nodes)
# fifth
nodes = doc.children
print([node['type'] for node in nodes])
