Extract XML Elements Using xmllint
Many people got toys or socks or whatever for Christmas. I’m sad for them. I got a brilliant way to pluck the goodness from within XML tags, thanks to xmllint.
Ever been writing some sort of shell script and needed a value from a giant XML document?
Yeah, me neither. And whenever this never happens to me all the time, it often results in getting banned from the StackExchange network due to a grep/sed-flavored denial of service attack.
Well, no more. With xmllint you can extract exactly what you need from precisely the right element. So if you have something like this:
<footag>TastyGoodness</footag>
You can now just do this:
xmllint --xpath "string(//footag)" sourcefile.xml
And you’re left with nothing but the fruit.
TastyGoodness
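Since the whole point is using this inside a shell script, here's a minimal sketch of capturing that value in a variable. The temp file and the root wrapper element are made up so the example runs on its own; swap in your real file. It assumes an xmllint build that supports --xpath.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample document, created here so the sketch is self-contained.
tmpfile="$(mktemp)"
printf '<root><footag>TastyGoodness</footag></root>\n' > "$tmpfile"

# string(...) returns the text of the first matching element,
# and command substitution drops the trailing newline.
value="$(xmllint --xpath 'string(//footag)' "$tmpfile")"

echo "footag is: $value"

rm -f "$tmpfile"
```

One thing to keep in mind: string() hands back an empty string when nothing matches, so check for an empty value if the tag might be missing.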
Wut? That easy? I used to start shaking uncontrollably when looking at a giant XML doc that I just wanted a simple thing from. This’ll save me a ton on meds.
And it goes well beyond a single tag. Because --xpath takes any XPath expression, you can get entire child trees, pick elements by position, read attributes, count matches, etc.
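To make that a bit more concrete, here's a rough sketch of a few of those slices. The list/item document and its id attributes are invented for illustration, and the exact whitespace xmllint prints around serialized nodes can vary between versions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample document with a few repeated children.
tmpfile="$(mktemp)"
cat > "$tmpfile" <<'EOF'
<list>
  <item id="a">one</item>
  <item id="b">two</item>
  <item id="c">three</item>
</list>
EOF

# A node expression (no string() wrapper) prints the element itself, tags and all.
first_node="$(xmllint --xpath '/list/item[1]' "$tmpfile")"

# Pick an element by position and keep only its text.
second="$(xmllint --xpath 'string(/list/item[2])' "$tmpfile")"

# Read an attribute from the last item.
last_id="$(xmllint --xpath 'string(/list/item[last()]/@id)' "$tmpfile")"

# Count how many items there are.
count="$(xmllint --xpath 'count(/list/item)' "$tmpfile")"

echo "first node: $first_node"
echo "second:     $second"
echo "last id:    $last_id"
echo "count:      $count"

rm -f "$tmpfile"
```

Dropping string() is how you get a whole subtree back instead of just its text, which is handy when you want to feed a chunk of the document into another xmllint call.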
Go forth and extract things from inside of things.
Notes
The xmllint man page.
Yes, I’m aware of the fact that real languages don’t have this problem. I often use Ruby when I want to do heavy XML parsing. But sometimes I’m deep enough into a Bash solution that I’d rather not port it. Hence the post. Also, stop judging me. I can feel that shit.
Still feel it.