Breaking through Roadblocks
Jan 2 2013
Hello in 2013! It has been ages since I’ve blogged anything, mainly because I enjoy Google’s social site, Google+, way too much, despite, or perhaps due to, it being filled mostly with my geek friends.
I decided to post this on WordPress, but it made me think about the possibility of breaking out of the walled garden of G+ and somehow syndicating certain posts on this site. But that is perhaps material for another post.
What I wanted to share are two stumbling blocks, trivial for most of you, but very frustrating until you know the solution. A total must for your typical batch processing is the xargs utility. Typically used with find, it allows you to run a command on a list of arguments. By default it passes all arguments on one line:
find . | grep svg$ | xargs echo
Now find itself has a million switches to perform filtering, but I prefer not diving into the manpage if given the option :) The default behavior of xargs leaves a lot to be desired: it appends as many arguments as will fit to a single command (the system limits the length of a command line, so a big list gets split across several invocations), and it is very likely you will need another argument to follow the one you got passed. The magical parameter you’re looking for is -i (GNU xargs now marks it as deprecated in favor of the equivalent -I {}), which calls the provided command separately for each line of input. You can place that argument anywhere on the command line using the {} placeholder:
find . | grep mp4$ | xargs -i ffmpeg -i {} -sameq {}.webm
So while the manpage surely includes this info, I bet someone will find this through a google query and will appreciate it :)
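To make the {} substitution described above a bit more concrete, here is a rough Python sketch of the idea. The helper name build_commands and the sample file names are made up for illustration; this only mimics the replacement step and is not how xargs itself is implemented. It prints the commands instead of running them.

```python
import shlex


def build_commands(template, items, placeholder="{}"):
    """Return one argument list per item, with the placeholder substituted.

    Hypothetical helper: every occurrence of the placeholder inside each
    template argument is replaced by the current item, so each item gets
    its own command.
    """
    return [[arg.replace(placeholder, item) for arg in template] for item in items]


# Sample input, standing in for the output of find | grep.
files = ["./a.mp4", "./dir/b c.mp4"]
template = ["ffmpeg", "-i", "{}", "-sameq", "{}.webm"]

for cmd in build_commands(template, files):
    # shlex.join (Python 3.8+) quotes arguments that contain spaces.
    print(shlex.join(cmd))
```

Each input item ends up in its own command, which is the behavior that makes it possible to put further arguments after the file name.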
The other big stumbling block, which I also hit with Ruby, concerns XPath queries in Python. Big thanks to Patryk Zawadzki for the solution. Inkscape SVG documents are XML with numerous namespaced tags, so simple queries like //rect will fail. You need to prefix every element name with the SVG namespace (such as //{http://www.w3.org/2000/svg}rect). Full example here:
#!/usr/bin/env python3
import csv
import os
from xml.etree import ElementTree

TEMPLATE = 'template.svg'

# Each row of members.csv: member number, name, valid-to date.
with open('members.csv', newline='') as members_file:
    for data in csv.reader(members_file):
        print(data[0])
        # Re-read the template for every member and fill in the placeholders.
        svg = ElementTree.parse(TEMPLATE)
        svg.find(".//{http://www.w3.org/2000/svg}text[@id='memno']/{http://www.w3.org/2000/svg}tspan").text = data[0]
        svg.find(".//{http://www.w3.org/2000/svg}text[@id='name']/{http://www.w3.org/2000/svg}tspan").text = data[1]
        svg.find(".//{http://www.w3.org/2000/svg}text[@id='validto']/{http://www.w3.org/2000/svg}tspan").text = data[2]
        svg.write('./out/%s.svg' % data[0])
        # Convert to PDF with inkscape, then remove the intermediate SVG.
        os.system("inkscape -A ./out/%s.pdf ./out/%s.svg" % (data[0], data[0]))
        os.unlink('./out/%s.svg' % data[0])
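The namespace issue can also be checked in isolation. The snippet below is a minimal sketch using a tiny made-up SVG string instead of a real Inkscape file; it shows the bare query finding nothing, the braced form working, and, as an alternative I am adding here, ElementTree's optional prefix mapping, which keeps the paths shorter.

```python
from xml.etree import ElementTree

SVG_NS = "http://www.w3.org/2000/svg"

# A tiny stand-in for an Inkscape file (made up for this example).
DOC = """<svg xmlns="http://www.w3.org/2000/svg">
  <rect id="frame" width="10" height="10"/>
  <text id="name"><tspan>placeholder</tspan></text>
</svg>"""

root = ElementTree.fromstring(DOC)

# The bare tag name matches nothing: the parsed tag is "{namespace}rect".
plain = root.findall(".//rect")

# Spelling out the namespace in braces works.
braced = root.findall(".//{%s}rect" % SVG_NS)

# Alternatively, pass a prefix-to-namespace mapping and use the prefix in the path.
ns = {"svg": SVG_NS}
tspan = root.find(".//svg:text[@id='name']/svg:tspan", ns)
tspan.text = "Jane Doe"

print(len(plain), len(braced), tspan.text)  # prints: 0 1 Jane Doe
```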
Update: Turns out the `` (backtick) evaluation in the xargs example was flawed. Thanks for spotting. Additionally, find itself seems to have an iterator of its own:
find . -name '*.avi' -exec echo ffmpeg -i '{}' -sameq '{}'.webm ';'