Archive for the ‘python’ Category

pytumblr with Python 3 support

October 22, 2016

Problem
I wanted to use the pytumblr library from Python 3, but it supports Python 2 only :( The repo also looks abandoned: someone asked for Python 3 support more than a year ago and nothing happened.

Solution
As I had a large Python 3 project and wanted to integrate Tumblr support into it, I decided to modify the library to support Python 3. The result is here: https://github.com/jabbalaci/pytumblr . I only needed to upload photos, so that is the only part I worked on. With this version you can upload photos to Tumblr under Python 3.
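
For reference, here is a minimal sketch of what a photo upload could look like. The helper name upload_photo, the blog name and the file path are made up for illustration. The sketch assumes that the client object behaves like pytumblr.TumblrRestClient and exposes its create_photo() method.

```python
def upload_photo(client, blog_name, image_path, caption="", tags=None):
    """Post a local image file as a published photo post.

    `client` is assumed to behave like pytumblr.TumblrRestClient;
    this helper itself is hypothetical and not part of pytumblr.
    """
    return client.create_photo(
        blog_name,
        state="published",
        caption=caption,
        tags=tags or [],
        data=image_path,  # path of the local file to upload
    )


# Assumed usage (needs real OAuth credentials, so it is left commented out):
# import pytumblr
# client = pytumblr.TumblrRestClient(consumer_key, consumer_secret,
#                                    oauth_token, oauth_secret)
# upload_photo(client, "myblog", "/path/to/photo.jpg", tags=["python"])
```

Passing the client in as a parameter keeps the helper easy to try out with a stub object before touching a real blog.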

Categories: python

[matplotlib] create figures on a remote server

October 16, 2016

Problem
On a remote server of mine I wanted to create some nice figures with matplotlib. It worked well on localhost but it failed on the server. First, tkinter was missing. Second, there was a problem with $DISPLAY.

Solution
To install tkinter, check out this earlier post.

The second problem is caused by a missing X server: a remote server usually has no graphical interface. To solve it, add these two lines at the top of your program, before pylab (or pyplot) is imported:

import matplotlib as mpl
mpl.use('Agg')

Example
Before:

#!/usr/bin/env python3
# coding: utf-8

import pylab
import numpy as np

def main():
    x = np.arange(-3.14, 3.14, 0.01)
    y = np.sin(x)
    pylab.plot(x, y, "b")
    pylab.savefig("out.png")

##############################################################################

if __name__ == "__main__":
    main()

After:

#!/usr/bin/env python3
# coding: utf-8

import matplotlib as mpl
mpl.use('Agg')

import pylab
import numpy as np

def main():
    x = np.arange(-3.14, 3.14, 0.01)
    y = np.sin(x)
    pylab.plot(x, y, "b")
    pylab.savefig("out.png")

##############################################################################

if __name__ == "__main__":
    main()

Output:
[Figure: sine curve, saved as out.png]
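
If the same script should also run on a desktop with its normal interactive backend, the switch to Agg can be made conditional. This is only a sketch: the helper names are my own, and it assumes that an empty or missing $DISPLAY means "no X server", which holds on a typical Linux server but not necessarily on other platforms.

```python
import os


def needs_headless_backend(environ=None):
    """Return True when no X display is advertised via $DISPLAY."""
    env = os.environ if environ is None else environ
    return not env.get("DISPLAY")


def configure_matplotlib():
    """Select the non-interactive Agg backend when there is no display."""
    import matplotlib as mpl  # imported lazily, only when actually needed

    if needs_headless_backend():
        mpl.use("Agg")
    return mpl.get_backend()
```

Call configure_matplotlib() before importing pylab, in the same place where the two lines above would go.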

Categories: matplotlib, python

sorted containers

September 9, 2016

“SortedContainers is an Apache2 licensed sorted collections library, written in pure-Python, and fast as C-extensions.”

Hmm, next time I need a sorted dict, I will try it.

As /u/fernly pointed out:

“‘ordered’ means ‘insertion order’. For a sorted dict see the excellent sortedcontainers module. This provides dicts, lists, and sets that return keys in sequence (including, you can supply a key() func for custom sorting), and maintain the sequence under deletions and insertions, with low overhead. This functionality is still not in the std library.” (source)
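
To make the ordered vs. sorted distinction concrete, here is a small sketch that uses only the standard library. The keys are made up. It does not use sortedcontainers itself; it only shows what you have to do by hand without it.

```python
import bisect
from collections import OrderedDict

fruits = ["pear", "apple", "fig"]

# An OrderedDict remembers the order in which the keys were inserted.
od = OrderedDict()
for key in fruits:
    od[key] = len(key)

insertion_order = list(od)   # keys in the order they were added
sorted_order = sorted(od)    # keys sorted on demand, re-sorted on every call

# Keeping a plain list sorted across insertions has to be done manually.
keys = []
for key in fruits:
    bisect.insort(keys, key)
```

Here insertion_order follows the order of the fruits list, while sorted_order and keys are alphabetical. A sorted dict from sortedcontainers keeps its keys in that sorted sequence automatically.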

Categories: python

Get the IMDb Top 250 list

August 19, 2016

Problem
From IMDb you want to get the list of the Top 250 movies.

Solution
There is a Top 250 list here: http://akas.imdb.com/chart/top. To access IMDb info, I use the excellent imdbpy package. It has a get_top250_movies() function but it returns an empty list :)

During my research I found this post on SO. It suggests that one should download the official IMDb dump from here. The Top 250 list is in the file ratings.list.gz. However, this file doesn’t contain the IMDb IDs of the movies, so it’s good for nothing :(

There was only one solution left: let’s do some scraping. Here is the Python code that did the job for me. I didn’t use BeautifulSoup, just plain ol’ regular expressions:

import re

import requests

top250_url = "http://akas.imdb.com/chart/top"

def get_top250():
    """Return the IMDb IDs (digits only, without the "tt" prefix) of the Top 250 movies."""
    r = requests.get(top250_url)
    result = []
    for line in r.text.splitlines():
        # each movie carries its ID in a data-titleid="tt..." attribute
        m = re.search(r'data-titleid="tt(\d+?)">', line)
        if m:
            result.append(m.group(1))
    return result

It returns the IMDb IDs of the Top 250 movies. Once you have a movie's ID, you can query all the information about that movie with the imdbpy package.
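
The regular expression can be tried offline, without hitting IMDb. In the sketch below the HTML fragment is made up for illustration (it is not a copy of the real page) and extract_ids is a hypothetical helper. It uses re.findall on the whole text instead of searching line by line.

```python
import re

# Made-up fragment that only mimics the data-titleid attribute.
SAMPLE_HTML = """
<div class="wlb_ribbon" data-titleid="tt0111161"></div>
<div class="wlb_ribbon" data-titleid="tt0068646"></div>
<div class="other">no id here</div>
"""


def extract_ids(html):
    """Return every numeric ID found in a data-titleid="tt..." attribute."""
    return re.findall(r'data-titleid="tt(\d+?)">', html)


ids = extract_ids(SAMPLE_HTML)
full_ids = ["tt" + movie_id for movie_id in ids]  # IDs as they appear in IMDb URLs
```

Unlike the line-by-line version, findall also picks up several IDs on the same line, so it could return duplicates if the page repeats an attribute.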

Categories: python

string distances

August 17, 2016

See the Jellyfish project: “Jellyfish is a python library for doing approximate and phonetic matching of strings”.

Jellyfish implements the following algorithms: Levenshtein Distance, Damerau-Levenshtein Distance, Jaro Distance, Jaro-Winkler Distance, Match Rating Approach Comparison, Hamming Distance.
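
To give a feel for what two of these distances measure, here is a plain-Python sketch of the textbook Levenshtein and Hamming distances. It is not Jellyfish's code, and its edge-case behaviour (for example with strings of different lengths) may differ from the library's.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def hamming(a, b):
    """Number of positions where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("the classic Hamming distance needs equal lengths")
    return sum(x != y for x, y in zip(a, b))
```

For example, levenshtein("kitten", "sitting") is 3 (two substitutions and one insertion), and hamming("karolin", "kathrin") is 3.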

See the project page for more info.

Categories: python

compile lxml on Ubuntu 16.04

August 4, 2016

Problem
lxml doesn’t want to compile on Ubuntu 16.04.

Solution

$ sudo apt install libxml2-dev libxslt1-dev python-dev zlib1g-dev

I was getting the error “/usr/bin/ld: cannot find -lz”. It turned out that the package zlib1g-dev was the cure…

Note that this is for Python 2. For Python 3 you might need to install the package python3-dev.

Categories: python, ubuntu

installing a Flask webapp on a Digital Ocean Ubuntu 16.04 box using Systemd

August 4, 2016

I’ve updated my Digital Ocean Flask notes on GitHub. They now include information about installing a Flask webapp on a Digital Ocean Ubuntu 16.04 box using Systemd.

Categories: flask, python, ubuntu