Pulling Data From clojure.data.xml/parse

The last time I processed XML data, it was for hourly reads, one of three daily delimited (CSV/XML) data “reports” that come with our AMR (water) product. At the time, I was finishing up the AMR store/forward/reporting system, which is written primarily in Python and has a Django web interface as well.

Since then, I have started learning Clojure and building small, tool-sized applications in it, and I thought it would be interesting to process XML data using Clojure.

First I turned to clojure-xml, only to realize that its parse function realized the whole sequence. I wanted to be lazy, so I went looking and came across Clojure’s data.xml library. The particular report is for cuts and tampers (of the cable leading from the electronic endpoint to the water meter).
After being parsed by data.xml/parse, the resulting data looks like this:


[:tag :TamperExport]

[:attrs {}]

The rest of the XML file was all :content, made up of further nested element maps.
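To make that shape concrete, here is a minimal sketch using data.xml's parse-str on a tiny, whitespace-free document. Only TamperExport and DeviceId come from the real report; the Tamper wrapper element and the sample string are assumptions made up for illustration.

```clojure
(require '[clojure.data.xml :as xmld])

;; Hypothetical sample: TamperExport and DeviceId appear in the real report,
;; the Tamper wrapper element is invented for this sketch.
(def sample-xml
  "<TamperExport><Tamper><DeviceId>80580608</DeviceId></Tamper></TamperExport>")

(def root (xmld/parse-str sample-xml))

(:tag root)                 ;=> :TamperExport
(:attrs root)               ; attribute map of the root (no attributes here)
(map :tag (:content root))  ; tags of the root's direct children
```

Every element has the same three keys, so the same lookups work at any depth of the tree.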

It was suggested on Stack Overflow that I pull out the nested :content using the following function. The complete code is included:


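(ns xml-lib.core
   (:require [clojure.data.xml :as xmld]))

(defn ret-xml-data
   "Returns the root element of the supplied xml file, as parsed by
   data.xml/parse, or nil if the file cannot be opened."

   [xml-fnam]
   ;; Note: the stream is not closed here; parse is lazy and reads from it
   ;; as the result is consumed.
   (let [input-xml (try
                      (java.io.FileInputStream. xml-fnam)
                      (catch Exception _ nil))]

      (when input-xml
         (xmld/parse input-xml))))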
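Because data.xml/parse is lazy, the input stream has to stay open while the result is being consumed, and ret-xml-data never closes it. One possible way around that, sketched here under my own assumptions, is to do the work inside with-open. The helper name with-parsed-xml is hypothetical and not part of data.xml.

```clojure
(require '[clojure.data.xml :as xmld]
         '[clojure.java.io :as io])

(defn with-parsed-xml
  "Hypothetical helper: opens xml-fnam, parses it lazily, and calls f on the
  root element while the stream is still open. f must fully realize whatever
  it returns, because the stream is closed as soon as f is done."
  [xml-fnam f]
  (with-open [in (io/input-stream xml-fnam)]
    (f (xmld/parse in))))

;; Example use: count the top-level children of a file.
(defn count-top-level
  [xml-fnam]
  (with-parsed-xml xml-fnam (fn [root] (count (:content root)))))
```

Returning a lazy sequence from f would fail once the stream is closed, which is why the example returns a plain number.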

(defn gen-xml-content-tree
   "Returns a tree-seq with :content extracted."

   [parsed-xml]
   (map :content (first (tree-seq :content :content (:content parsed-xml)))))
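It helps to see what this function actually yields. In the sketch below the root passed to tree-seq is the sequence of child elements rather than a map, so the :content lookup on it returns nil, tree-seq never descends, and the result appears to be the same as mapping :content over the root's children. The sample document and the Tamper and Type element names are assumptions for illustration.

```clojure
(require '[clojure.data.xml :as xmld])

;; Copied from the post so the snippet runs on its own.
(defn gen-xml-content-tree
  [parsed-xml]
  (map :content (first (tree-seq :content :content (:content parsed-xml)))))

;; Hypothetical sample; Tamper and Type are made-up element names.
(def sample-root
  (xmld/parse-str
    (str "<TamperExport>"
         "<Tamper><DeviceId>80580608</DeviceId><Type>cut</Type></Tamper>"
         "<Tamper><DeviceId>80580609</DeviceId><Type>tamper</Type></Tamper>"
         "</TamperExport>")))

(def by-tree-seq  (gen-xml-content-tree sample-root))
(def by-plain-map (map :content (:content sample-root)))

(= by-tree-seq by-plain-map)     ;=> true
(map #(map :tag %) by-tree-seq)  ;=> ((:DeviceId :Type) (:DeviceId :Type))
```

Each item in the result is the list of child elements of one top-level record, which is why (first (second cl1)) at the REPL picks out a single element.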

With this in place, I can explore the rest of the data at the REPL:


xml-lib.core=> (def parsed-xml (ret-xml-data "test.xml"))
#'xml-lib.core/parsed-xml
xml-lib.core=> (def cl1 (gen-xml-content-tree parsed-xml))
#'xml-lib.core/cl1
xml-lib.core=> (def temp1 (first (second cl1)))
#'xml-lib.core/temp1
xml-lib.core=> (def test-map (zipmap (keys temp1) (vals temp1)) )
#'xml-lib.core/test-map
xml-lib.core=> test-map
{:content ("80580608"), :attrs {}, :tag :DeviceId}

parsed-xml comes directly from data.xml/parse.

cl1 comes from gen-xml-content-tree.

Now, I can start parsing the data.
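As one possible next step, each record's child elements can be collapsed into a plain map of tag to text. This is only a sketch: the helper name record->map, the Tamper and Type elements, and the sample string are all assumptions, and it only handles leaf elements whose :content is text.

```clojure
(require '[clojure.data.xml :as xmld])

(defn record->map
  "Hypothetical helper: turns the child elements of one record into a map of
  tag -> text. Bare strings between elements are skipped, and only the text
  parts of each child's :content are kept."
  [children]
  (into {}
        (map (fn [el] [(:tag el) (apply str (filter string? (:content el)))])
             (remove string? children))))

;; Hypothetical sample; Tamper and Type are made-up element names.
(def export-root
  (xmld/parse-str
    (str "<TamperExport>"
         "<Tamper><DeviceId>80580608</DeviceId><Type>cut</Type></Tamper>"
         "</TamperExport>")))

(map record->map (map :content (:content export-root)))
;=> ({:DeviceId "80580608", :Type "cut"})
```

The outer map stays lazy, so records are only converted as they are consumed.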
