Archive for April, 2011

What’s New in Python 2.7

April 29, 2011

Ubuntu 11.04 comes with Python 2.7 (the previous release contained 2.6). Learn more about Python 2.7 here.

Categories: python

Calling external commands

April 28, 2011

How to call external commands in Python:
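One common approach is the standard library's subprocess module. The following is a minimal sketch, not necessarily the method the post originally pointed to. It runs the current Python interpreter as the external command so that it works on any platform, and the helper name run_command is made up for this illustration.

```python
import subprocess
import sys


def run_command(args):
    """Run an external command and return (exit_code, output_text)."""
    process = subprocess.Popen(args, stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT)
    output, _ = process.communicate()  # waits for the command to finish
    return process.returncode, output.decode("utf-8", "replace")


# The running interpreter is used as a portable stand-in for any external program.
code, text = run_command([sys.executable, "-c",
                          "print('hello from a child process')"])
print(code, text.strip())
```

Passing the command as a list avoids shell quoting problems; replace the list with something like ["ls", "-l"] to call a real program.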

Categories: python

Debugging in Python

April 24, 2011

If you are still debugging with “print”, it’s time to try one of the following methods.

Debugging with pdb/ipdb

If you want a clear and gentle introduction to the Python debugger “pdb”, read Steve Ferg’s excellent tutorial Debugging in Python.

Here is a short summary for reference:

  • “import pdb” or “import ipdb as pdb”, then “pdb.set_trace()”
  • n (next)
  • ENTER (repeat previous)
  • q (quit)
  • p <variable> (print value)
  • c (continue)
  • l (list where you are)
  • s (step into subroutine)
  • r (continue till the end of the subroutine)
  • ! <python command>

ipdb is like pdb, but it adds syntax highlighting and completion. You can install it with “sudo pip install ipdb”. If you used pdb with “import pdb”, just change this line to “import ipdb as pdb”. This way the line “pdb.set_trace()” can be left unchanged.
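As a small illustration of the import swap, here is a sketch that uses ipdb when it is installed and falls back to pdb otherwise. The function average and the DEBUG flag are invented for this example; the flag is off by default so the script runs without stopping.

```python
try:
    import ipdb as pdb  # nicer interface, if installed
except ImportError:
    import pdb          # standard library fallback

DEBUG = False  # set to True to pause inside average()


def average(values):
    total = sum(values)
    if DEBUG:
        pdb.set_trace()  # at the prompt try: p total, n, s, c, q
    return total / float(len(values))


print(average([1, 2, 3, 4]))
```

With DEBUG set to True, execution stops at the set_trace() call and the commands from the summary above can be tried on a live frame.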

Debugging with Winpdb

Another interesting debugger is Winpdb, a platform-independent GUI debugger for Python with support for multiple threads, namespace modification, embedded debugging, encrypted communication… It can be installed from the Ubuntu repos (sudo apt-get install winpdb). Tutorial here.

Short summary again:

  • restart (restart debugging session)
  • exit
  • n (next)
  • go (continue)
  • x <python command> (exec, changes state)
  • v <python command> (eval, print to console, no changes in state)
  • j <line> (jump)
  • s (step in subroutine)
  • r (return from subroutine)
  • bp <line> (breakpoint @ line)
  • bp <line>, <expression> (conditional breakpoint @ line)
  • bl (breakpoint list)
  • bc <id> | * (breakpoint clear)

Debugging with the Eric IDE

Here is a similar summary for debugging with the Eric IDE:

  • F5 (start debugging; untick “Don’t stop at first line” or set a breakpoint)
  • F10 (stop)
  • F7 (next, step in subroutines)
  • F8 (next, step over subroutines)
  • F9 (return, step out of subroutine)
  • F6 (continue, go, run)
  • Shift+F6 (continue till cursor)
  • conditional breakpoints are supported (set a breakpoint, right click on it, edit)

I find Eric’s debugger much faster than Winpdb.

Categories: python

Understanding imports and PYTHONPATH

April 23, 2011

If you have problems with imports or you want to know how to write your own library and make it globally available, read Dan Fairs’ excellent article entitled Understanding imports and PYTHONPATH.
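The core mechanism can be tried in a few lines. This is only a sketch of standard Python behaviour, not code taken from the article: the module name mylib_demo and the temporary directory are made up for the demonstration.

```python
import os
import sys
import tempfile

# Create a throwaway directory holding a tiny module (hypothetical name).
lib_dir = tempfile.mkdtemp()
with open(os.path.join(lib_dir, "mylib_demo.py"), "w") as handle:
    handle.write("def greet(name):\n    return 'Hello, ' + name\n")

# Python searches the directories in sys.path, in order, when importing.
# Entries from the PYTHONPATH environment variable end up in this list too.
sys.path.insert(0, lib_dir)

import mylib_demo  # found because lib_dir is now on sys.path

print(mylib_demo.greet("world"))
```

Setting PYTHONPATH to the library directory before starting Python has the same effect as the sys.path.insert() call, without touching the source.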

Categories: python

Pythex: a real-time regexp editor

April 20, 2011

Pythex is a real-time regular expressions editor for Python. Just paste in a test string and start writing your regular expression. Pythex will mark in green the part of the test string that is covered by your regexp. Useful stuff!
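The same idea can be approximated offline with the re module. The sketch below marks the matched parts with square brackets instead of colour; the helper name highlight is invented for this example and has nothing to do with how Pythex itself is implemented.

```python
import re


def highlight(pattern, text):
    """Wrap every match of pattern in square brackets."""
    return re.sub(pattern, lambda match: "[" + match.group(0) + "]", text)


print(highlight(r"\d+", "abc 123 def 45"))
```

This prints "abc [123] def [45]", which makes it easy to see what a pattern covers before using it in real code.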

Categories: python

Determine the image type (JPG, GIF, PNG, etc.)

April 19, 2011

Problem

You want to process an image, but first you want to verify that the user-specified input file really is an image.

Solution #1

There is a standard module for this called imghdr. Its usage is very simple:

>>> import imghdr
>>> imghdr.what('/tmp/bass.gif')
'gif'

The function examines the content of the file, not its name or extension.
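To show what a content-based check means, here is a minimal sketch that compares the first bytes of a file against a few well-known signatures. It is not imghdr's actual implementation; the function name sniff_image_type and the short signature table are assumptions made for this illustration, and only three formats are covered.

```python
import os
import tempfile

# Leading bytes ("magic numbers") of three common image formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\x89PNG\r\n\x1a\n", "png"),
]


def sniff_image_type(path):
    """Return 'jpeg', 'gif', 'png', or None based on the first bytes of the file."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    for magic_bytes, name in SIGNATURES:
        if header.startswith(magic_bytes):
            return name
    return None


# Demo: a file with a misleading extension but a GIF header.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.write(fd, b"GIF89a" + b"\x00" * 10)
os.close(fd)
print(sniff_image_type(demo_path))
os.remove(demo_path)
```

The demo prints "gif" even though the file name ends in .txt, because only the header bytes are consulted.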

Solution #2

If you want a more general solution, i.e. you want to figure out the type of an arbitrary file, use the Python binding to the command “file”.

Command-line example:

$ file lolcat.jpg 
lolcat.jpg: JPEG image data, JFIF standard 1.01

If you want to use it from Python, install the package “python-magic” (it’s in the Ubuntu repos). It comes with the following example:

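import magic

# Open a libmagic handle and load the default magic database.
ms = magic.open(magic.MAGIC_NONE)
ms.load()

# Identify a file directly by its path.
file_type = ms.file("/path/to/some/file")
print(file_type)

# Identify the type from the first bytes of the content instead.
# Binary mode is used because the data is not necessarily text.
with open("/path/to/some/file", "rb") as f:
    data = f.read(4096)

file_type = ms.buffer(data)
print(file_type)

ms.close()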

Update (20131218)
Here is how to convert the return value of ms.file to a file extension:

FTYPES = {
    'JPEG' : 'jpg',
    'GIF' : 'gif',
    'PNG' : 'png',
}

def get_file_type(fname):
    ftype = ms.file(fname).split()[0]
    return FTYPES.get(ftype)
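The mapping relies on the first word of the description string. The sketch below isolates that parsing step so it can be tried without libmagic; the function name extension_from_description is invented here, the first sample string is the `file` output shown earlier, and "ASCII text" is just an illustrative non-image description.

```python
FTYPES = {'JPEG': 'jpg', 'GIF': 'gif', 'PNG': 'png'}


def extension_from_description(description):
    """Map a description such as 'JPEG image data, ...' to 'jpg', or None."""
    words = description.split()
    if not words:
        return None  # empty description: nothing to look up
    return FTYPES.get(words[0])


print(extension_from_description("JPEG image data, JFIF standard 1.01"))
print(extension_from_description("ASCII text"))
```

Unknown types yield None, so the caller can treat that value as "not a supported image".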

Solution #3
You can also use the module PIL to verify if the given file is an image. Refer to this thread for some examples.

Categories: python

Python Module of the Week by Doug Hellmann

April 16, 2011

Update (20130225): PyMOTW has moved to http://pymotw.com. Update your bookmarks.

PyMOTW is a series of blog posts written by Doug Hellmann. It was started as a way to build the habit of writing something on a regular basis. The focus of the series is building a set of example code for the modules in the Python standard library.

Doug guides you through the Python standard library with lots of examples. Definitely a must read!

Example:

json – JavaScript Object Notation Serializer

PyCon Italia

April 12, 2011

Download genomes from Genbank

April 12, 2011

Problem

For a project, I had to download a bunch of records from the NCBI (National Center for Biotechnology Information) website. A record looks like this: CP002059.1 (almost 5 MB):

LOCUS       CP002059             5354700 bp    DNA ...
DEFINITION  'Nostoc azollae' 0708, complete genome.
ACCESSION   CP002059 ACIR01000000 ACIR01000001-ACIR01000216
VERSION     CP002059.1  GI:298231532
DBLINK      Project: 30807
...
ORIGIN
//

I needed this data in text format.

Solution #1
My first idea was to download the page with wget. However, I was surprised to see that the downloaded file was less than 100 KB instead of 5 MB! The source turned out to be full of AJAX calls: the browser downloads this short HTML and then expands it with further requests. If you save the page with File -> Save as…, you get the complete HTML, but how do you automate the download? How do you get the post-AJAX version of a web page?

I will write about this problem and its general solution in another post.

Solution #2
Fortunately, there is a CGI program at NCBI that returns the required data. For instance, the data of CP002059.1 can be retrieved via the following URL:


http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=CP002059.1&rettype=gb

A (very) short overview of the EFetch CGI is here.
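Without any extra library, the URL can be assembled and fetched with the standard library alone. This is a sketch under assumptions: the helper names build_efetch_url and download_record are made up, the base URL is the one quoted above (the server may nowadays expect https instead), and the download function is only defined, not called, because it needs network access.

```python
try:
    from urllib.parse import urlencode   # Python 3
    from urllib.request import urlopen
except ImportError:
    from urllib import urlencode         # Python 2
    from urllib2 import urlopen

# Base URL as quoted in the post; switch to https if the server requires it.
EFETCH_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nucleotide", rettype="gb"):
    """Return the EFetch URL for one accession id."""
    query = urlencode([("db", db), ("id", accession), ("rettype", rettype)])
    return EFETCH_URL + "?" + query


def download_record(accession, filename):
    """Fetch one record and save it to filename (requires network access)."""
    response = urlopen(build_efetch_url(accession))
    try:
        with open(filename, "wb") as out:
            out.write(response.read())
    finally:
        response.close()


print(build_efetch_url("CP002059.1"))
```

Calling download_record("CP002059.1", "CP002059.1.gb") would then store the record locally, and looping over a list of accession ids covers the "bunch of records" case.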

If you use Biopython, you can download this record like this:

from Bio import Entrez

# ref.: http://wilke.openwetware.org/Parsing_Genbank_files_with_Biopython.html

# replace with your real email (optional):
Entrez.email = 'whatever@mail.com'
# accession id works, returns genbank format, looks in the 'nucleotide' database:
handle = Entrez.efetch(db='nucleotide', id='CP002059.1', rettype='gb')
# store locally:
with open('CP002059.1.gb', 'w') as local_file:
    local_file.write(handle.read())
handle.close()

Solution #3 (in Perl)
Let’s see the same thing in Perl too, using the BioPerl package. Thanks to Alix for the Perl code.

#!/usr/bin/perl

use Bio::Perl;
#use Bio::Seq;
#use Bio::Tools::Run::RemoteBlast;
use Bio::DB::GenBank;
#use Data::Dumper;

use strict;

my $gb = new Bio::DB::GenBank;

my $id = 'CP002059.1';

my $seq = $gb->get_Stream_by_acc($id);
while( my $seq_elt =  $seq->next_seq ) {
    write_sequence(">$id.gb", 'genbank', $seq_elt);
}

Update (20110706)
I forgot to mention how to install Biopython:

sudo pip install biopython

GUI for the output of PyLint

April 7, 2011

Problem

I discovered PyLint yesterday and, after some tests, I find it very useful. However, one thing bothered me in the workflow. PyLint tells you which lines of your code need improvement, but as soon as you add or remove lines in the source, those line numbers become invalid. Thus, you need to relaunch pylint many times until you have resolved all the problems.

Solution

Idea: make a simple GUI that shows the output of PyLint. If necessary, refresh the content of this window.

Download:

Visit https://github.com/jabbalaci/PyLint-Output-Visualizer. Source code is here.

Usage:

pylov.py <source_to_be_analyzed.py>

You can refresh the content by pressing ‘r’, ‘u’, or F5.

Update (20110408)
This morning I was notified that PyLint ships with a simple GUI called pylint-gui :)) Great! Why is it not mentioned anywhere on the project’s home page? I’ve read several reviews too, and none of them says it has a GUI… I then searched the manual for the string “gui” and yes, it is mentioned in two lines, but without a screenshot! Unless you read the manual word by word, you miss it. To fill the gap, here is my screenshot of the mysterious pylint-gui:

Well, if you prefer minimal design, you can try Pylov :) Otherwise use the official GUI.

Update (20110426)
I made Pylov because the PyLint plugin of the Eric IDE didn’t have the refresh option. I contacted the author of Eric and he added this feature :) So if you use Eric, it is recommended to use the PyLint plugin.

Categories: python