Element Tree error on parse after creating xml via element tree

Question

Why can't the XML created by this code be read or parsed by Python?

I have a chunk of code that is writing an xml file:

idlist = list(set([d['type'] for d in List]))                   ##create list of all ID numbers
idlist.sort()
root = ET.Element("MarketData")
for i in idlist:                                                ##iterate over every ID number
    doc = ET.SubElement(root, 'Item', typeID=str(i))            ##create child for current ID number
    tList = list(filter(lambda x: x['type'] == i, List))        ##make a list of all orders for current ID
    sList = list(filter(lambda x: x['buy'] == False, tList))    ##create list of only sell orders
    bList = list(filter(lambda x: x['buy'] == True, tList))     ##create list of only buy orders
    spl = list([d['price'] for d in sList])                     ##create list of sell prices
    bpl = list([d['price'] for d in bList])                     ##create list of buy prices
    if not spl:                                                 ##null case
        spl = [-1]
    if not bpl:                                                 ##null case
        bpl = [-1]
    sp = min(spl)                                               ##find min sell price
    bp = max(bpl)                                               ##find max buy price
    ET.SubElement(doc, 'Sell Price').text = str(sp)             ##write sell price to child as string under new sub-element
    ET.SubElement(doc, 'Buy Price').text = str(bp)              ##write buy price to branch as string under new sub-element
tree = ET.ElementTree(root)
tree.write("MarketData.xml")                                    ##write xml tree to final xml file

It executes fine, and my test code with identical XML logic writes a perfectly readable file, but the file this code creates is unreadable and can't be parsed by ElementTree.

From Python I get: "xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 41".

From Firefox I get: "error on line 1 at column 42: Specification mandate value for attribute Price".

The first chunk of the XML (when opened in Notepad++) is:

<MarketData><Item typeID="18"><Sell Price>64.92</Sell Price><Buy Price>53.31</Buy Price></Item><Item typeID="19"><Sell Price>36999.99</Sell Price><Buy Price>3502.03</Buy Price></Item>

I'm at a complete loss... any suggestions?

note: I'm not a coder, I do this for fun while playing a game so please don't beat me up too hard for anything...


python | xml | parsing | xml-parsing | elementtree | 2017-01-01 04:01 | 2 Answers

Answers ( 2 )

  1. 2017-01-01 05:01

    Element names can't contain blanks. For instance, you could replace occurrences of 'Buy Price' with 'Buy_Price'. This version of your file opens successfully.

    <MarketData>
    <Item typeID="18">
    <Sell_Price>64.92</Sell_Price>
    <Buy_Price>53.31</Buy_Price>
    </Item>
    <Item typeID="19">
    <Sell_Price>36999.99</Sell_Price>
    <Buy_Price>3502.03</Buy_Price>
    </Item>
    </MarketData>
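
As a rough sketch of how the writing code could look with that rename applied (Python 3; the `orders` list is made-up sample data standing in for the question's `List`, and `build_market_data` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample orders standing in for the question's `List`.
orders = [
    {"type": 18, "buy": False, "price": 64.92},
    {"type": 18, "buy": True, "price": 53.31},
    {"type": 19, "buy": False, "price": 36999.99},
]


def build_market_data(orders):
    """Build a <MarketData> tree with one <Item> per type ID."""
    root = ET.Element("MarketData")
    for type_id in sorted({o["type"] for o in orders}):
        item = ET.SubElement(root, "Item", typeID=str(type_id))
        sells = [o["price"] for o in orders if o["type"] == type_id and not o["buy"]]
        buys = [o["price"] for o in orders if o["type"] == type_id and o["buy"]]
        # -1 marks "no orders of this kind", as in the question's code.
        # Tag names use an underscore instead of a space.
        ET.SubElement(item, "Sell_Price").text = str(min(sells, default=-1))
        ET.SubElement(item, "Buy_Price").text = str(max(buys, default=-1))
    return root


xml_text = ET.tostring(build_market_data(orders), encoding="unicode")
parsed = ET.fromstring(xml_text)  # round-trips without a ParseError
print(xml_text)
```

Parsing the serialised string straight back in is a cheap way to catch this kind of problem before the file is written to disk.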
    
  2. 2017-01-01 12:01

    You cannot have an element name with spaces in it, such as Sell Price. A start-tag such as <Sell Price> (or an empty-element tag like <Sell Price />) is not well-formed: it is interpreted as the element Sell with an attribute Price that has no value assigned, and XML does not allow that.

    Unfortunately, ElementTree allows you to create bad output exhibiting this error. Here is a small demonstration (tested with Python 2.7.13):

    import xml.etree.ElementTree as ET
    
    root = ET.Element("root")
    ET.SubElement(root, 'Sell Price')
    print ET.tostring(root)
    

    This program outputs <root><Sell Price /></root>, which is ill-formed.
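
The same behaviour can be checked under Python 3 by feeding the serialised string back to the parser. This is only a sketch; it assumes the standard-library ElementTree still does not validate tag names, which was the case in the Python 3 versions I am aware of, and `round_trip_ok` is just an illustrative variable name:

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")
ET.SubElement(root, "Sell Price")  # accepted without complaint
bad_xml = ET.tostring(root, encoding="unicode")

try:
    ET.fromstring(bad_xml)  # try to read back what was just written
    round_trip_ok = True
except ET.ParseError as exc:
    round_trip_ok = False
    print("ParseError:", exc)
```

The write succeeds and only the read fails, which matches what the question describes.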

    If you use lxml instead of ElementTree, you get the correct behaviour (an exception is thrown):

    from lxml import etree as ET
    
    root = ET.Element("root")
    ET.SubElement(root, 'Sell Price')
    print ET.tostring(root)
    

    Result:

    Traceback (most recent call last):
      File "error.py", line 6, in <module>
        ET.SubElement(root, 'Sell Price')
      File "src\lxml\lxml.etree.pyx", line 3112, in lxml.etree.SubElement (src\lxml\lxml.etree.c:75599)
      File "src\lxml\apihelpers.pxi", line 183, in lxml.etree._makeSubElement (src\lxml\lxml.etree.c:16962)
      File "src\lxml\apihelpers.pxi", line 1626, in lxml.etree._tagValidOrRaise (src\lxml\lxml.etree.c:32556)
    ValueError: Invalid tag name u'Sell Price'
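
If installing lxml is not an option, a similar check can be approximated on top of the standard library. The sketch below is an assumption-laden stand-in, not lxml's actual validation: `safe_subelement` is a hypothetical helper, and its regular expression accepts only a conservative ASCII subset of legal XML names (no colons, no non-ASCII characters).

```python
import re
import xml.etree.ElementTree as ET

# Conservative ASCII-only subset of the XML name rules (an assumption,
# stricter than the XML specification itself).
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def safe_subelement(parent, tag, **attrib):
    """Create a sub-element, rejecting tag names outside the subset above."""
    if not _NAME_RE.fullmatch(tag):
        raise ValueError(f"Invalid tag name {tag!r}")
    return ET.SubElement(parent, tag, **attrib)


root = ET.Element("root")
safe_subelement(root, "Sell_Price").text = "64.92"  # fine

try:
    safe_subelement(root, "Sell Price")  # rejected before it reaches the tree
except ValueError as exc:
    print(exc)
```

Failing at creation time points at the offending line of code, rather than at a column in the output file.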
    