Quick Links

XML is complicated, and parsing it natively is fairly hard, even with scripting languages. Luckily, there's a utility that can convert XML to JSON, which is easier to work with, both in scripts and on the command line.

Use the xq Utility

You'll want to use a custom made utility for this, rather than trying to parse it with something like regex, which is a bad idea. There's a utility called xq that is perfect for this task. It's installed alongside

        yq
    

, which works for YAML. You can install

        yq
    

 from pip:

pip install yq

Under the hood, this utility uses jq to handle working with JSON, so you'll need to download the binary, and move it to somewhere on your PATH (/usr/local/bin/ should work fine).

Now, you'll be able to parse XML input by piping it into xq:

cat xml | xq .

The . operator means you want to convert all of the XML to JSON. You can actually use full jq syntax here to select sub-elements, which you can read our guide on.

You can also output xq's response as XML with the -x flag:

xq -x

This lets you use jq's selection syntax to parse XML while keeping it in XML form. Though it doesn't seem to be able to convert the other way, as xq still wants XML syntax in.

One Problem With XML to JSON Conversion

XML to JSON isn't a perfect conversion---in XML, the order of the elements can matter, and keys can be duplicated. A document like:

<e>
    

<a>some</a>

<b>textual</b>

<a>content</a>

</e>

Would produce an error if translated outright to JSON, because the a key exists twice. So, it's converted to an array, which does break the order. xq returns the following output for that bit of XML:

{
    

"e": {

"a": [

"some",

"content"

],

"b": "textual"

}

}

Which is technically correct, just not ideal in all situations. You'll want to double-check and make sure your XML conversion has no issues.