Home > python > Get the IMDB rating of a movie

Get the IMDB rating of a movie

Problem

You want to get the IMDB rating of a movie. For instance, you have a large collection of movies, and you want to figure out their ratings. An IMDB rating looks like this:
Solution

Here is a script that extracts the rating of a movie from IMDB. The script was inspired by the work of Rag Sagar.

Download link: https://github.com/jabbalaci/Movie-Ratings. Source code:

#!/usr/bin/env python

# ImdbRating

import os
import sys
import re
import urllib
import urlparse

from mechanize import Browser
from BeautifulSoup import BeautifulSoup

class MyOpener(urllib.FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15'

class ImdbRating:
    # title of the movie
    title = None
    # IMDB URL of the movie
    url = None
    # IMDB rating of the movie
    rating = None
    # Did we find a result?
    found = False

    # constant
    BASE_URL = 'http://www.imdb.com'

    def __init__(self, title):
        self.title = title
        self._process()

    def _process(self):
        movie = '+'.join(self.title.split())
        br = Browser()
        url = "%s/find?s=tt&q=%s" % (self.BASE_URL, movie)
        br.open(url)

        if re.search(r'/title/tt.*', br.geturl()):
            self.url = "%s://%s%s" % urlparse.urlparse(br.geturl())[:3]
            soup = BeautifulSoup( MyOpener().open(url).read() )
        else:
            link = br.find_link(url_regex = re.compile(r'/title/tt.*'))
            res = br.follow_link(link)
            self.url = urlparse.urljoin(self.BASE_URL, link.url)
            soup = BeautifulSoup(res.read())

        try:
            self.title = soup.find('h1').contents[0].strip()
            self.rating = soup.find('span',attrs='rating-rating').contents[0]
            self.found = True
        except:
            pass

# class ImdbRating

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print "Usage: %s 'Movie title'" % (sys.argv[0])
    else:
        imdb = ImdbRating(sys.argv[1])
        if imdb.found:
            print imdb.url
            print imdb.title
            print imdb.rating

Related links

Update (20110329):

You will find the latest version of the script at https://github.com/jabbalaci/Movie-Ratings.

[ @reddit ]

Related posts (update 20120222)

Categories: python Tags: , , , ,
  1. October 31, 2011 at 23:18 | #1

    How do I use this script?

    • November 1, 2011 at 20:03 | #2

      The imdb site has changed, I made an update of the script, you’ll find it in the GitHub repo. Usage: ./imdb2.py 'movie_title' .

  1. No trackbacks yet.
You must be logged in to post a comment.
Follow

Get every new post delivered to your Inbox.

Join 63 other followers

%d bloggers like this: