string distances
See the Jellyfish project: “Jellyfish is a python library for doing approximate and phonetic matching of strings”.
Jellyfish implements the following algorithms: Levenshtein Distance, Damerau-Levenshtein Distance, Jaro Distance, Jaro-Winkler Distance, Match Rating Approach Comparison, Hamming Distance.
See the project page for more info.
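To give a feel for what two of these measures compute, here is a minimal pure-Python sketch of the Levenshtein and Hamming distances. This is not Jellyfish's code; the function names `levenshtein` and `hamming` are my own, and the Hamming version assumes the classic definition that requires strings of equal length.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits that turn a into b."""
    # previous[j] = distance between the prefix of a processed so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(ca != cb for ca, cb in zip(a, b))
```

For real work the library is the better choice; the sketch is only meant to show the idea behind the names.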
New string formatting syntax
I’m still using Python 2.6 but I think it’d be a good idea to start using the new string formatting syntax that was introduced in Python 3. Since it was backported to the 2.6 version, we can start using it right away.
Learn more:
- Common string operations on docs.python.org
- formatting strings in Dive Into Python 3
This post is rather a reminder for me that I should read more about this topic. Later, I’ll add some examples too.
Update (20110704)
I asked a question about string formatting on python-list and got lots of useful answers. Here is a short summary.
Old style, but still supported:
"the %s is %s" % ('sky', 'blue')
New style #1:
"the {0} is {1}".format('sky', 'blue')
New style #2, from Python 2.7+:
"the {} is {}".format('sky', 'blue')
New style #3, very useful for long format strings:
"the {what} is {color}".format(what='sky', color='blue')
In new code, I have stopped using the old style. I use new styles #1 and #3.
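As a quick sanity check, the snippet below shows that the styles produce the same string, and adds two things the summary does not cover: repeating an indexed field and using a format spec after the colon. The variable names are only for illustration, and I left out the auto-numbered `{}` form so that the sketch also runs on Python 2.6.

```python
# Old style and the two new styles I use give the same result
old = "the %s is %s" % ('sky', 'blue')
indexed = "the {0} is {1}".format('sky', 'blue')
named = "the {what} is {color}".format(what='sky', color='blue')

# Indexed fields can be repeated or reordered
echo = "{0}, {1}, {0}".format('sky', 'blue')

# A format spec follows the colon: right-align in 8 columns, 2 decimals
price = "{0:>8.2f}".format(3.14159)
```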
StringBuilder functionality in Python
Problem
You need to concatenate lots of string elements. In Java we would use a StringBuilder for this, but how do we do it in Python?
Solution #1
Use a list, and join its elements at the end. This is much more efficient than repeated concatenation: strings are immutable objects, so concatenating one string with another creates a NEW string object every time (Java strings have the same problem).
Example:
def g():
    sb = []
    for i in range(30):
        sb.append("abcdefg"[i%7])
    return ''.join(sb)

print g()    # abcdefgabcdefgabcdefgabcdefgab
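When the pieces can be computed on the fly, the explicit list is optional. A possible shorter variant of the same function, passing a generator expression straight to `join()`:

```python
def g():
    # join() consumes the generator and builds the final string from the pieces
    return ''.join("abcdefg"[i % 7] for i in range(30))

result = g()
```

The list version is still handy when the elements are collected in several places before the final join.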
Solution #2 (update 20120110)
Use a StringIO object and print to it. In short:
from cStringIO import StringIO

def build():
    out = StringIO()
    print >>out, 'arbitrary text'    # 'out' behaves like a file
    return out.getvalue()
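The snippet above is Python 2 only: the `cStringIO` module no longer exists in Python 3, where `io.StringIO` takes its place. A rough Python 3 equivalent could look like this (the helper name `build_text` is made up for the example):

```python
from io import StringIO

def build_text(lines):
    out = StringIO()
    for line in lines:
        print(line, file=out)    # Python 3 spelling of "print >>out, line"
    return out.getvalue()

text = build_text(['arbitrary text', 'more text'])
```

Note that `print` appends a newline after each element; use `out.write(line)` instead if you want the pieces glued together as they are.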
Reverse a string
Exercise #1: Take a string and reverse its characters. For instance “ab12” => “21ba”.
Solution:
#!/usr/bin/env python

s = 'Python adventures'
print s          # Python adventures
print s[::-1]    # serutnevda nohtyP
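Slice notation has the form [start:stop:step]. By default, start is at the beginning of a sequence, stop is at the end, and step is 1. When step is negative, the defaults for start and stop are swapped: the slice begins at the last element and runs back to the first. So the slice [::-1] returns the full sequence in reverse order.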
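A few more slices of the same string, to show how start, stop and step interact. The variable names are only there for illustration:

```python
s = 'Python adventures'

first_word = s[:6]       # start omitted: begin at index 0 -> 'Python'
second_word = s[7:]      # stop omitted: run to the end -> 'adventures'
every_other = s[::2]     # step 2: every second character -> 'Pto detrs'
backwards = s[::-1]      # negative step: walk from the end to the beginning
first_word_reversed = s[5::-1]   # start at index 5 and walk back -> 'nohtyP'
```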
Exercise #2: Decide if a word is a palindrome.
Solution:
#!/usr/bin/env python

def is_palindrome(word):
    # named 'word' rather than 'str' to avoid shadowing the built-in type
    return word == word[::-1]

print is_palindrome('1367631')    # True
print is_palindrome('Python')     # False