Archive for February, 2011

Python tutorials at IBM developerWorks

February 28, 2011

IBM developerWorks has great tutorials. Check out their Python tutorials.

Categories: python

Check downloaded movies on imdb.com

February 27, 2011

Recently, I downloaded a nice pack of horror movies. The pack contained more than a hundred movies :) I wanted to see their IMDb ratings to decide which ones to watch, but typing each title into the browser would be too much work. Could it be automated?

Solution

Each movie was located in a subdirectory. Here is an extract:

...
Subspecies.1991.DVDRip.XviD-NoGrp
Terror.Train.1980.DVDRIP.XVID-NoGrp
The.Changeling.1980.DVDRip-KooKoo
The.Creature.Walks.Among.Us.1956.DVDRip-KooKoo
The.Hills.Have.Eyes.1977.DVDRip-KooKoo
The.Howling.Special.Edition.1981.XviD.6ch-AC3-FTL
The.Monster.Club.1980.DVDRip.DivX-UTOPiA
...

Fortunately, the directories were named in a consistent way: title of the movie (words separated with a dot), year, extra info. Thus, extracting titles was very easy. Idea: collect the titles in a list and open them in Firefox on imdb.com, each in a new tab.

First, I redirected the directory listing into a file. It was easier to work with a text file than to do globbing:

ls >a.txt
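If you would rather skip the shell step, the same list can be produced from Python. This is only a sketch, written for Python 3 (the script in this post is Python 2); the function name write_dir_list and its parameters are my own and not part of the original script. Unlike a plain ls, it lists subdirectories only.

```python
import os

def write_dir_list(root, out_path):
    """Write the names of root's subdirectories to out_path, one per line."""
    names = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
    with open(out_path, "w") as out:
        for name in names:
            out.write(name + "\n")
    return names
```

Calling write_dir_list(".", "a.txt") in the movie directory would give roughly the same a.txt as the ls command above.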

And finally, here is the script:

#!/usr/bin/env python

import re
import urllib
import webbrowser

base = 'http://www.imdb.com/find?s=all'
firefox = webbrowser.get('firefox')

# a.txt holds one directory name per line; lines starting with '#' are skipped
with open('a.txt', 'r') as f1:
    for line in f1:
        line = line.rstrip('\n')
        if line.startswith('#'):
            continue

        # the title is everything before the 4-digit year
        result = re.search(r'(.*)\.\d{4}\..*', line)
        if result:
            title = result.group(1).replace('.', ' ')
            url = "%s&q=%s" % (base, urllib.quote(title))
            print url
            firefox.open_new_tab(url)
            #webbrowser.open_new_tab(url)    # try this if the line above doesn't work

Achtung! Don’t try it with a huge list, otherwise your system will die :) Firefox won’t handle too many open tabs… Open around ten titles at a time. In the input file (a.txt) you can comment out lines by adding a leading ‘#’ sign; the script skips those lines.
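The "ten titles at a time" advice can be built into the script. Below is a minimal Python 3 sketch of that idea, not the original script: the helper names (title_from_dirname, search_url, batches, open_in_batches) are hypothetical, and it uses the default browser instead of asking for Firefox explicitly.

```python
import re
import urllib.parse
import webbrowser

BASE = "http://www.imdb.com/find?s=all"

def title_from_dirname(name):
    """Return the title part of a 'Title.Words.YEAR.extra' name, or None."""
    match = re.match(r"(.+)\.\d{4}\.", name)
    if match is None:
        return None
    return match.group(1).replace(".", " ")

def search_url(title):
    """Build the IMDb search URL for a title."""
    return "%s&q=%s" % (BASE, urllib.parse.quote(title))

def batches(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def open_in_batches(dirnames, size=10):
    """Open search tabs `size` at a time, waiting for Enter between batches."""
    titles = [t for t in map(title_from_dirname, dirnames) if t]
    for batch in batches(titles, size):
        for title in batch:
            webbrowser.open_new_tab(search_url(title))
        input("Press Enter for the next batch...")
```

With the directory names read from a.txt into a list, open_in_batches(names) would open ten tabs, wait for Enter, then continue with the next ten.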

Categories: python

Current date and time

February 25, 2011

#!/usr/bin/env python

from datetime import datetime

now = datetime.now()
date = now.date()    # date part of the current moment
time = now.time()    # time-of-day part
print "%d-%02d-%02d @ %02dh%02d" % (date.year, date.month, date.day, time.hour, time.minute)

Sample output:

2011-02-25 @ 11h23

Update (20110523)
I wanted to use a timestamp in the name of a temporary file. Here is a slightly modified version of the code above:

...
print "{year}{month:02}{day:02}_{hour:02}{minute:02}{second:02}".format(year=date.year, month=date.month, day=date.day, hour=time.hour, minute=time.minute, second=time.second)
...

Sample output:

20110523_235828
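
Both formats can also be produced with strftime(), which avoids splitting the value into date and time parts first. This is a Python 3 sketch; the function names human_stamp and file_stamp are made up for the example.

```python
from datetime import datetime

def human_stamp(moment):
    """Format like '2011-02-25 @ 11h23'."""
    return moment.strftime("%Y-%m-%d @ %Hh%M")

def file_stamp(moment):
    """Format like '20110523_235828', suitable for a file name."""
    return moment.strftime("%Y%m%d_%H%M%S")

now = datetime.now()
print(human_stamp(now))
print(file_stamp(now))
```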
Categories: python

Python tutorials of Full Circle Magazine in a single PDF

February 21, 2011

On my other blog, I wrote a post on how to extract the Python tutorials from Full Circle Magazine and merge them into a single PDF.

For the lazy pigs, here is the PDF (6 MB). Get it while it’s hot :)

Categories: python

Create a temporary file with unique name

February 19, 2011

Problem

I wanted to download an HTML file with Python, store it in a temporary file, then convert this file to PDF by calling an external program.

Solution #1

#!/usr/bin/env python

import os
import tempfile

temp = tempfile.NamedTemporaryFile(prefix='report_', suffix='.html', dir='/tmp', delete=False)

html_file = temp.name
# same directory and base name, with the extension swapped to .pdf
pdf_file = os.path.splitext(html_file)[0] + '.pdf'

print html_file   # /tmp/report_kWKEp5.html
print pdf_file    # /tmp/report_kWKEp5.pdf
# calling of HTML to PDF converter is omitted

See the documentation of tempfile.NamedTemporaryFile here.
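
The snippet above only creates the names. Here is a Python 3 sketch that also writes the downloaded HTML into the temporary file and closes it before a converter would run. The function name save_html is hypothetical, the HTML string stands in for the downloaded page, and the file goes into the system default temp directory instead of a hard-coded /tmp.

```python
import os
import tempfile

def save_html(html_text):
    """Store html_text in a uniquely named temp file.

    Returns (html_path, pdf_path); the PDF file itself is not created.
    """
    with tempfile.NamedTemporaryFile(
        mode="w", prefix="report_", suffix=".html", delete=False
    ) as temp:
        temp.write(html_text)
    # The file is closed here but stays on disk because delete=False.
    html_path = temp.name
    pdf_path = os.path.splitext(html_path)[0] + ".pdf"
    return html_path, pdf_path

html_path, pdf_path = save_html("<html><body>report</body></html>")
print(html_path)
print(pdf_path)
# The HTML to PDF converter would be called here (omitted, as above).
os.remove(html_path)    # delete=False means cleanup is up to us
```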

Solution #2 (update 20110303)

I had a problem with the previous solution. It works well from the command line, but when I called the script from crontab, it stopped at the line “tempfile.NamedTemporaryFile”. No exception, nothing… So I had to use a different approach:

from time import time

temp = "report.%.7f.html" % time()
print temp    # report.1299188541.3830960.html

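The function time() returns the current time as a floating-point number of seconds since the epoch. Names built this way can collide if two threads or processes generate one at the same instant, so this approach is not suitable for a multithreaded environment; that was not a concern in my case. This version works fine when called from crontab.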
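
If collisions are a concern, one option is to build the name from the process ID and a random UUID instead of the clock. This is a Python 3 sketch of that alternative, not something I tested under crontab; the function name unique_report_name is made up.

```python
import os
import uuid

def unique_report_name():
    """Return a name like 'report.<pid>.<32 hex digits>.html'."""
    return "report.%d.%s.html" % (os.getpid(), uuid.uuid4().hex)

print(unique_report_name())
```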

Learn more

Update (20150712): if you need a temp. file name in the current directory:

>>> import tempfile
>>> tempfile.NamedTemporaryFile(dir='.').name
'/home/jabba/tmpKrBzoY'

Update (20150910): if you need a temp. directory:

import tempfile
import shutil

dirpath = tempfile.mkdtemp()    # the temp dir. is created
# ... do stuff with dirpath
shutil.rmtree(dirpath)

This tip is from here.
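
In Python 3 (3.2 and later) the cleanup step can be left to a context manager, tempfile.TemporaryDirectory. A short sketch; the file name scratch.txt is just an example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as dirpath:
    # The directory exists only inside this block.
    scratch = os.path.join(dirpath, "scratch.txt")
    with open(scratch, "w") as f:
        f.write("temporary data")
    print(os.path.exists(scratch))    # True

# The directory and everything in it are removed on exit.
print(os.path.exists(dirpath))        # False
```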

Categories: python

Send e-mails via Gmail

February 19, 2011

The following entry is based on Kutuma’s excellent post. Credits go to him.

I made some minor modifications: (1) conversion to module, (2) “From:” address shows the name too.

send_email_via_gmail.py module:

#!/usr/bin/env python

import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEBase import MIMEBase
from email.MIMEText import MIMEText
from email import Encoders
import os

gmail_user = "username@gmail.com"
gmail_name = "User Name <username@gmail.com>"
gmail_pwd = "userpassword"

def mail(to, subject, text, attach):
   msg = MIMEMultipart()

   msg['From'] = gmail_name
   msg['To'] = to
   msg['Subject'] = subject

   msg.attach(MIMEText(text))

   part = MIMEBase('application', 'octet-stream')
   # read the attachment and close the file afterwards
   with open(attach, 'rb') as f:
      part.set_payload(f.read())
   Encoders.encode_base64(part)
   part.add_header('Content-Disposition',
           'attachment; filename="%s"' % os.path.basename(attach))
   msg.attach(part)

   mailServer = smtplib.SMTP("smtp.gmail.com", 587)
   #mailServer = smtplib.SMTP_SSL("smtp.gmail.com", 465)   # didn't work for me
   mailServer.ehlo()
   mailServer.starttls()
   mailServer.ehlo()
   mailServer.login(gmail_user, gmail_pwd)
   #mailServer.sendmail(gmail_user, to, msg.as_string())   # just e-mail address in the From: field
   mailServer.sendmail(gmail_name, to, msg.as_string())   # name + e-mail address in the From: field
   # Should be mailServer.quit(), but that crashes...
   mailServer.close()

if __name__ == "__main__":
    mail("send-mail-to-this-person@address.com",
       "Hello from Python!",
       "This is an e-mail sent with Python.",
       "/tmp/some-image.jpg")

How to use it in a script:

#!/usr/bin/env python

import send_email_via_gmail as gmail

gmail.mail( "send-mail-to-this-person@address.com",
                    "Subject of the mail",
                    "Body of the mail.",
                    "/tmp/report.pdf" )
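
The module above is Python 2 (the email.MIMEMultipart style imports no longer exist in Python 3). For reference, here is a rough Python 3 sketch of the same idea using email.message.EmailMessage. It is not the module from this post: the split into build_message and mail is my own, the credentials are the same placeholders, and I have not verified it against Gmail's current login requirements.

```python
import os
import smtplib
from email.message import EmailMessage

GMAIL_USER = "username@gmail.com"                  # placeholder
GMAIL_NAME = "User Name <username@gmail.com>"      # placeholder
GMAIL_PWD = "userpassword"                         # placeholder

def build_message(to, subject, text, attach):
    """Build a message with a plain-text body and one file attachment."""
    msg = EmailMessage()
    msg["From"] = GMAIL_NAME
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(text)
    with open(attach, "rb") as f:
        msg.add_attachment(
            f.read(),
            maintype="application",
            subtype="octet-stream",
            filename=os.path.basename(attach),
        )
    return msg

def mail(to, subject, text, attach):
    """Send the message through Gmail's SMTP server using STARTTLS."""
    msg = build_message(to, subject, text, attach)
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        server.login(GMAIL_USER, GMAIL_PWD)
        server.send_message(msg)
```

Keeping build_message separate makes it possible to inspect the message without touching the network; mail() is called exactly like the function in the module above.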

Update (20130113): pushed to GitHub.

Alternatives (20130810)
Recently I came across some alternatives that are easier to use:

  • envelopes
  • gmail (I tried it and it works well. I used it for reading Gmail messages.)
Categories: python

Should I use Python 2 or Python 3?

February 4, 2011

This is a very common question when someone wants to learn Python. Here is a nice article about this topic: http://wiki.python.org/moin/Python2orPython3.

(Thanks Jaume for the link.)

Update (20110404)

If you are ready to dive into Python 3, here are some tutorials:

  • The official Python 3 tutorial (HTML, PDF)
  • Further Python 3 docs (c-api.pdf, distutils.pdf, documenting.pdf, extending.pdf, faq.pdf, howto-advocacy.pdf, howto-cporting.pdf, howto-curses.pdf, howto-descriptor.pdf, howto-doanddont.pdf, howto-functional.pdf, howto-logging-cookbook.pdf, howto-logging.pdf, howto-pyporting.pdf, howto-regex.pdf, howto-sockets.pdf, howto-sorting.pdf, howto-unicode.pdf, howto-urllib2.pdf, howto-webservers.pdf, install.pdf, library.pdf, reference.pdf, tutorial.pdf, using.pdf, whatsnew.pdf)
  • Dive Into Python 3 (HTML and PDF)

Update (20110526)

I follow a simple guideline: I use the version of Python that ships with Ubuntu by default. In Ubuntu 10.10 it was Python 2.6, in Ubuntu 11.04 it’s Python 2.7. When they switch to Python 3.x, I will switch too.