
Read XML painlessly

October 30, 2011

I had an XML file (an RSS feed) from which I wanted to extract some data. I tried some XML libraries but I didn’t like any of them. Is there a simple, brain-friendly way to do this? After all, it’s Python, so everything should be simple.

Yes, there is a simple library for reading XML called “untangle”, developed by Chris Stefanescu. It’s in PyPI, so installation is very easy:

sudo pip install untangle

For some examples, visit the project page.

Use Case
Let’s see a simple, real-world example. From the RSS feed of Planet Python, let’s extract the post titles and their URLs.

#!/usr/bin/env python

import untangle

#XML = 'examples/planet_python.xml'     # can read a file too
XML = ''

o = untangle.parse(XML)
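for item in o.rss.channel.item:    # assumption: standard RSS 2.0 layout (rss > channel > item)
    title = item.title.cdata
    link = item.link.cdata         # assumption: read the same way as the title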
    if link:
        print title
        print '   ', link

It couldn’t be any simpler :)

According to Chris, untangle doesn’t support documents with namespaces (yet).


Alternatives (update 20111031)
Here are some alternatives (thanks reddit).

lxml and amara are heavyweight solutions built on C libraries, so you may not be able to use them everywhere. untangle is a lightweight parser that can be a perfect choice for reading a small and simple XML file.
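If you can't install anything at all, the standard library's xml.etree.ElementTree is another option for the same kind of extraction. Here is a minimal sketch, assuming a standard RSS 2.0 layout (rss > channel > item). The sample feed and the helper name extract_posts are made up for illustration; they are not part of untangle or of the Planet Python feed.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed, standing in for a real RSS 2.0 document.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample Planet</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Post without a link</title>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>
"""


def extract_posts(xml_text):
    """Return (title, link) pairs for every <item> that has a link."""
    root = ET.fromstring(xml_text)          # root is the <rss> element
    posts = []
    for item in root.findall('./channel/item'):
        # findtext returns the default when the child element is missing
        title = item.findtext('title', default='')
        link = item.findtext('link', default='')
        if link:
            posts.append((title.strip(), link.strip()))
    return posts


if __name__ == '__main__':
    for title, link in extract_posts(SAMPLE_RSS):
        print(title)
        print('   ', link)
```

It's more verbose than the untangle version (you spell out paths instead of using attribute access), but it ships with Python.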

Categories: python