Incremental Parsing Using the Consumer API

November 5, 2004 | Fredrik Lundh

The ElementTree library provides several ways to parse XML documents.

The most common way is to read the document from a file or an input stream, using the parse function:

from elementtree import ElementTree

tree = ElementTree.parse("document.xml")
root = tree.getroot()

Alternatively, you can create an empty ElementTree instance, and use the parse method to load a document into it:

from elementtree import ElementTree

tree = ElementTree.ElementTree()
tree.parse("document.xml")
root = tree.getroot()

The XML helper can be used to create an XML document from a string buffer (or a string literal):

from elementtree import ElementTree

root = ElementTree.XML("<document>body</document>")
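The three entry points above can be compared side by side. The following is a minimal sketch, assuming the standard-library xml.etree.ElementTree module (the descendant of the elementtree package) rather than the import used in this article; the in-memory stream stands in for "document.xml" and is purely illustrative:

```python
import io
from xml.etree import ElementTree  # assumption: stdlib version of the library

text = "<document>body</document>"

# 1. the parse function, reading from an input stream (here an in-memory one)
tree = ElementTree.parse(io.BytesIO(text.encode("utf-8")))
root_from_parse = tree.getroot()

# 2. an empty ElementTree instance, loaded with its parse method
empty_tree = ElementTree.ElementTree()
empty_tree.parse(io.BytesIO(text.encode("utf-8")))
root_from_method = empty_tree.getroot()

# 3. the XML helper, which builds the root element straight from a string
root_from_string = ElementTree.XML(text)

for root in (root_from_parse, root_from_method, root_from_string):
    print(root.tag, root.text)
```

Note that parse gives you an ElementTree wrapper, while XML returns the root Element directly.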

You can also use the parser and tree builder components directly, to get more control over the document build process. The core XML parser component is called XMLTreeBuilder. This class implements the standard consumer interface, which lets you feed data to the parser, piece by piece:

from elementtree import ElementTree

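parser = ElementTree.XMLTreeBuilder()

# feed the document to the parser, piece by piece (illustrative pieces)
parser.feed("<document>bo")
parser.feed("dy</document>")

root = parser.close()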

The pieces can be of any size, and tags and entities can be spread over multiple pieces.
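To illustrate that piece boundaries need not line up with the markup, here is a small sketch that splits a document in the middle of a start tag, an entity reference, and an end tag. It assumes the standard-library xml.etree.ElementTree module, where the XMLParser class offers the same feed and close methods as XMLTreeBuilder; the parse_pieces helper is a hypothetical name used only for this example:

```python
from xml.etree import ElementTree  # assumption: stdlib version of the library

def parse_pieces(pieces):
    """Feed each piece to a fresh parser and return the root Element."""
    parser = ElementTree.XMLParser()  # stands in for XMLTreeBuilder here
    for piece in pieces:
        parser.feed(piece)
    return parser.close()

# the splits fall inside a start tag, an entity reference, and an end tag
pieces = ["<docu", "ment>fish &a", "mp; chips</docum", "ent>"]
root = parse_pieces(pieces)
print(root.tag, repr(root.text))
```

The parser buffers incomplete tokens internally, so the resulting tree is the same as if the whole document had been passed in one call.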

Note that the close method returns the resulting document root (as an Element instance). If you want an ElementTree, just wrap it as usual:

from elementtree import ElementTree

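parser = ElementTree.XMLTreeBuilder()

# feed the document to the parser, piece by piece (illustrative pieces)
parser.feed("<document>bo")
parser.feed("dy</document>")

tree = ElementTree.ElementTree(parser.close())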
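In practice the pieces often come from a file or socket that is read in fixed-size blocks. The sketch below shows one way this might look, again assuming the standard-library xml.etree.ElementTree module and its XMLParser class; the parse_stream helper, the chunk size, and the sample document are all made up for illustration:

```python
import io
from xml.etree import ElementTree  # assumption: stdlib version of the library

def parse_stream(stream, chunk_size=16):
    """Read the stream in small chunks and return an ElementTree."""
    parser = ElementTree.XMLParser()  # stands in for XMLTreeBuilder here
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        parser.feed(chunk)
    # close returns the root Element; wrap it to get an ElementTree
    return ElementTree.ElementTree(parser.close())

source = io.BytesIO(
    b"<document><title>Consumer API</title><body>text</body></document>"
)
tree = parse_stream(source)
root = tree.getroot()
print(root.tag, [child.tag for child in root])
```

A small chunk size is used here only to force several feed calls; a real program would typically read much larger blocks.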
