I am having the below XML file:

<hierachy>
    <att>
        <Order>1</Order>
        <attval>Data</attval>
        <children>
            <att>
                <Order>1</Order>
                <attval>Studyval</attval>
            </att>
            <att>
                <Order>2</Order>
                <attval>Site</attval>
            </att>
        </children>
    </att>
    <att>
        <Order>2</Order>
        <attval>Info</attval>
        <children>
            <att>
                <Order>1</Order>
                <attval>age</attval>
            </att>
            <att>
                <Order>2</Order>
                <attval>gender</attval>
            </att>
        </children>
    </att>
</hierachy>

I want to convert it to a CSV file like this:

Data,Studyval
Data,Site
Info,age
Info,gender

The main problem is that the parent and child elements use the same tag names, 'att' and 'attval'. How can I tell them apart and produce the output above?

I tried the code below, but it prints the parent value twice instead of the parent and child values:

import xml.etree.ElementTree as ET

tree = ET.parse('input.xml')
rebase = tree.getroot()
list = []

for att in rebase.findall('att'):
    name = att.find('attval').text
    for each_att in att.findall('attval'):
        try:
            val = att.find('attval').text
            print(name, val)
        except AttributeError:
            print(name)

1 Answer


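The duplicate output comes from the inner loop: it calls findall('attval') and find('attval') on the same parent att element, so name and val are both the parent's attval. Called with a plain tag name, findall only matches direct children of the element it is called on, so it never reaches the nested att elements, which sit inside children. Instead, walk the tree from top to bottom and pick the relevant element at each level.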
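As a quick check of how findall scopes its search, here is a minimal sketch. The inline XML is a trimmed-down version of the file in the question (one parent only), used so the snippet runs without input.xml; the variable names are just for illustration.

```python
from xml.etree import ElementTree

# Trimmed-down copy of the question's XML (an assumed sample, one parent only).
xml_text = """<hierachy>
    <att>
        <attval>Data</attval>
        <children>
            <att><attval>Studyval</attval></att>
            <att><attval>Site</attval></att>
        </children>
    </att>
</hierachy>"""

root = ElementTree.fromstring(xml_text)

# A plain tag name matches direct children of the element only.
direct = root.findall('att')

# The './/' prefix searches all descendants at any depth.
nested = root.findall('.//att')

print(len(direct), len(nested))
```

The first call sees only the top-level att, while the './/att' form also picks up the two att elements under children.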

from xml.etree import ElementTree

tree = ElementTree.parse('input.xml')
root = tree.getroot()

# Each direct child of the root is a top-level <att> element.
for att in root:
    first = att.find('attval').text
    # The nested <att> elements sit inside <children>.
    for subatt in att.find('children'):
        second = subatt.find('attval').text
        print('{},{}'.format(first, second))

The result:

$ python process.py
Data,Studyval
Data,Site
Info,age
Info,gender
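
Since the goal is a CSV file rather than printed lines, the same traversal can feed Python's csv module. This is only a sketch under a few assumptions: the helper names parent_child_rows and xml_to_csv are made up for this example, the XML is inlined so the snippet runs on its own, and a top-level att without a children element is simply skipped.

```python
import csv
import io
from xml.etree import ElementTree

# Inline copy of the question's XML so the example is self-contained.
SAMPLE_XML = """<hierachy>
    <att><Order>1</Order><attval>Data</attval>
        <children>
            <att><Order>1</Order><attval>Studyval</attval></att>
            <att><Order>2</Order><attval>Site</attval></att>
        </children>
    </att>
    <att><Order>2</Order><attval>Info</attval>
        <children>
            <att><Order>1</Order><attval>age</attval></att>
            <att><Order>2</Order><attval>gender</attval></att>
        </children>
    </att>
</hierachy>"""


def parent_child_rows(root):
    """Yield (parent, child) attval pairs for each top-level att (hypothetical helper)."""
    for att in root.findall('att'):
        parent = att.findtext('attval')
        children = att.find('children')
        # Compare with None: an Element with no sub-elements is falsy.
        if children is None:
            continue  # assumption: parents without children produce no rows
        for child in children.findall('att'):
            yield parent, child.findtext('attval')


def xml_to_csv(xml_text, out):
    """Parse xml_text and write one CSV row per parent/child pair to out."""
    root = ElementTree.fromstring(xml_text)
    writer = csv.writer(out, lineterminator='\n')
    writer.writerows(parent_child_rows(root))


buffer = io.StringIO()
xml_to_csv(SAMPLE_XML, buffer)
print(buffer.getvalue(), end='')
```

To write a real file, pass a handle opened with open('output.csv', 'w', newline='') instead of the StringIO buffer, and use ElementTree.parse('input.xml').getroot() in place of fromstring.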
