Archive
2014 in review
The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.
Here's an excerpt:
The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 190,000 times in 2014. If it were an exhibit at the Louvre Museum, it would take about 8 days for that many people to see it.
XML to dict / XML to JSON
Problem
You have an XML file and you want to convert it to dict or JSON.
Well, if you have a dict, you can convert it to JSON with “json.dump()”, so the real question is: how to convert an XML file to a dictionary?
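For reference, here is a minimal sketch of the dict-to-JSON step using only the standard library. The sample dictionary is made up for illustration. Note that “json.dump()” writes to a file-like object, while “json.dumps()” returns the JSON text as a string.

```python
import io
import json

# Made-up sample data, just for illustration.
d = {"name": "example", "tags": ["a", "b"], "count": 3}

# json.dumps() returns the JSON text as a string.
s = json.dumps(d, indent=4, sort_keys=True)

# json.dump() writes the same text to a file-like object instead.
buf = io.StringIO()
json.dump(d, buf, indent=4, sort_keys=True)

print(s)
```

Both calls accept the same formatting options, so which one you pick only depends on whether you want a string or a file.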
Solution
There is an excellent library for this purpose called xmltodict. Its usage is very simple:
import xmltodict

# It doesn't work with Python 3! Read on for the solution!
def convert(xml_file, xml_attribs=True):
    with open(xml_file) as f:
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return d
This worked well under Python 2.7 but I got an error under Python 3. I checked the project’s documentation and it claimed to be Python 3 compatible. What the hell?
The error message was this:
Traceback (most recent call last):
  File "/home/jabba/Dropbox/python/lib/jabbapylib2/apps/xmltodict.py", line 247, in parse
    parser.ParseFile(xml_input)
TypeError: read() did not return a bytes object (type=str)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./xml2json.py", line 27, in <module>
    print(convert(sys.argv[1]))
  File "./xml2json.py", line 17, in convert
    d = xmltodict.parse(f, xml_attribs=xml_attribs)
  File "/home/jabba/Dropbox/python/lib/jabbapylib2/apps/xmltodict.py", line 249, in parse
    parser.Parse(xml_input, True)
TypeError: '_io.TextIOWrapper' does not support the buffer interface
I even filed an issue ticket :)
After some debugging I found a hint here: you need to open the XML file in binary mode!
XML to dict (Python 2 & 3)
So the correct version that works with Python 3 too is this:
import xmltodict

def convert(xml_file, xml_attribs=True):
    with open(xml_file, "rb") as f:    # notice the "rb" mode
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return d
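Why the mode matters can be sketched with the standard library alone. The traceback above shows the error coming from “parser.ParseFile()”, which is the expat parser, so this sketch calls “xml.parsers.expat” directly instead of xmltodict. The in-memory streams stand in for a file opened in text mode and in binary mode, and the XML snippet and the helper name “parse_names” are made up for the example.

```python
import io
import xml.parsers.expat

# Made-up sample document, just for illustration.
XML = "<root><item id='1'>hello</item></root>"


def parse_names(stream):
    """Parse the stream with expat and return the element names seen."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.ParseFile(stream)
    return names


# Text stream: read() returns str, which expat rejects under Python 3.
try:
    parse_names(io.StringIO(XML))
    text_mode_ok = True
except TypeError as e:
    text_mode_ok = False
    print("text mode failed:", e)

# Binary stream: read() returns bytes, which is what expat expects.
binary_names = parse_names(io.BytesIO(XML.encode("utf-8")))
print("binary mode found:", binary_names)
```

Under Python 3 a file opened without "b" behaves like the text stream here, which is why the "rb" mode fixes the problem.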
XML to JSON (Python 2 & 3)
If you want JSON output:
import json
import xmltodict

def convert(xml_file, xml_attribs=True):
    with open(xml_file, "rb") as f:    # notice the "rb" mode
        d = xmltodict.parse(f, xml_attribs=xml_attribs)
        return json.dumps(d, indent=4)
catch the output of pprint in a string
Problem
If you have a large nested data structure (e.g. a list or dictionary), the pprint module is very useful to nicely format the output.
However, “pprint.pprint” prints directly to stdout. What if you want to store the nicely formatted output in a string?
Solution
Use “pprint.pformat” instead.
Example:
>>> import pprint
>>> d = {"one": 1, "two": 2}
>>> pprint.pprint(d)
{'one': 1, 'two': 2}
>>> s = pprint.pformat(d)
>>> s
"{'one': 1, 'two': 2}"
Well, this is a small example, so the real pretty formatting isn't visible, but you get the point :)
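To make the formatting visible, here is a sketch with a structure that is too long to fit on one line. The sample data is made up. The “width” parameter of “pprint.pformat” controls where lines get wrapped (the default is 80 columns).

```python
import ast
import pprint

# Made-up sample data, long enough that it cannot fit on one 80-column line.
data = {
    "users": [
        {"name": "user%d" % i, "roles": ["admin", "editor", "viewer"]}
        for i in range(5)
    ]
}

s = pprint.pformat(data)                 # wrapped at the default width of 80
narrow = pprint.pformat(data, width=40)  # wrapped more aggressively

print(s)
print(narrow)
```

Since the result is an ordinary string, you can write it to a log file or embed it in an error message. For plain data like this it is also a valid Python literal, so “ast.literal_eval(s)” gives back the original structure.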
fancy text tables
Problem
Instead of simply printing some data on the screen, I wanted to put them in a nicely formatted ASCII table.
Solution
After some research I found a nice package for this purpose: python-tabulate. It supports both Python 2 and Python 3 (yes, from now on it’s also important for me).
Its usage is very simple. Here is a snippet that creates random usernames and passwords:
from tabulate import tabulate

# get_username_*() and get_password_*() are my own helper functions (not shown here)
table = []
headers = ["Username #1", "Username #2", "Password #1", "Password #2"]
for _ in range(10):
    name1 = get_username_1()
    name2 = get_username_2()
    pass1 = get_password_1(8)
    pass2 = get_password_2(12)
    table.append([name1, name2, pass1, pass2])
    # print("{:15}{:15}{:15}{:15}".format(name1, name2, pass1, pass2))    # this is the past :)

print(tabulate(table, headers=headers, tablefmt="psql"))
Output:
+---------------+---------------+---------------+---------------+
| Username #1   | Username #2   | Password #1   | Password #2   |
|---------------+---------------+---------------+---------------|
| Adarah        | hasana        | ygyQsF6u      | uTzPqZMDNJ6x  |
| Alary         | begahi        | YqW4aY7q      | ipZuX0sX2RFg  |
| Solita        | otomot        | Xwliu9yi      | IjeFibVFaoZq  |
| Casony        | rikari        | fw6dk5gt      | zbAXO8gd33Lh  |
| Anne          | asakou        | MXsXpz43      | aYNiJTwojULG  |
| Joby          | mgomam        | vZjiCuyT      | qc3Q9caAenJw  |
| Kallita       | aremon        | j1ZD1QU9      | AIEsykmYodfy  |
| Cara          | iumina        | 75UzkKgK      | lK92GdAxn441  |
| Fuscie        | goomio        | uof2C7ct      | HFgVlAZ9PSmv  |
| Dean          | utinon        | gycncz9f      | 61oJzUGdDVKf  |
+---------------+---------------+---------------+---------------+
The module supports various formatting styles. For more examples, check out the official page.
Update (20191029)
The project has moved to https://github.com/astanin/python-tabulate .
Static HTML file browser for Dropbox
Two of my students worked on a project that creates static HTML files for a public Dropbox folder (find it on GitHub). I use it in production; check it out here.
If you created your Dropbox account before October 4, 2012, then you are lucky and you have a Public folder. Accounts opened after this date have no Public folder :(
So, if you have a Public folder and you want to share the content of a folder recursively, then you can try this script. It produces a file and directory list that is similar to Apache's directory listing.
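The general idea can be sketched in a few lines. This is not the students' code, just a minimal illustration under my own assumptions: the function names “write_index” and “build” are made up, every folder gets a plain “index.html”, and the demo runs on a throwaway temporary folder instead of a real Dropbox folder.

```python
import html
import os
import tempfile
from urllib.parse import quote


def write_index(folder):
    """Write an index.html into folder that links to its subfolders and files."""
    entries = sorted(os.listdir(folder))
    dirs = [e for e in entries if os.path.isdir(os.path.join(folder, e))]
    files = [e for e in entries
             if os.path.isfile(os.path.join(folder, e)) and e != "index.html"]

    lines = ["<html><body>", "<ul>"]
    for name in dirs:
        # Subfolders link to their own generated index page.
        lines.append('<li><a href="{}/index.html">{}/</a></li>'.format(
            quote(name), html.escape(name)))
    for name in files:
        lines.append('<li><a href="{}">{}</a></li>'.format(
            quote(name), html.escape(name)))
    lines += ["</ul>", "</body></html>"]

    with open(os.path.join(folder, "index.html"), "w", encoding="utf-8") as out:
        out.write("\n".join(lines))


def build(root):
    """Generate an index.html for root and for every folder below it."""
    for current, _dirs, _files in os.walk(root):
        write_index(current)


# Small demo on a throwaway folder tree.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "docs"))
for path in ("readme.txt", os.path.join("docs", "notes.txt")):
    with open(os.path.join(root, path), "w") as f:
        f.write("hello\n")

build(root)
with open(os.path.join(root, "index.html"), encoding="utf-8") as f:
    top_page = f.read()
print(top_page)
```

Because the pages are static, they have to be regenerated whenever the folder content changes.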
Authors and Contributors
- Kiss Sándor Ádám (main developer)
- Iváncza Csaba (junior developer)
- Jabba Laci (project idea)