(20131215) This post is out of date: BeautifulSoup 4 has built-in support for CSS selectors. Check out this post.
A few days ago I started exploring lxml (it’s been on my list for a long time) and I really like its CSS selectors. Since I’ve used BeautifulSoup a lot in the past, I wondered whether it was possible to add this functionality to BS. A quick Google search turned up this: https://code.google.com/p/soupselect/.
“A single function, select(soup, selector), that can be used to select items from a BeautifulSoup instance using CSS selector syntax. Currently supports type selectors, class selectors, id selectors, attribute selectors and the descendant combinator.”
Just what I needed :) You can also patch BS and integrate this new functionality:
>>> from BeautifulSoup import BeautifulSoup as Soup
>>> import soupselect; soupselect.monkeypatch()
>>> import urllib
>>> soup = Soup(urllib.urlopen('http://slashdot.org/'))
>>> soup.findSelect('div.title h3')
[<h3>...
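If you are on BeautifulSoup 4, something like the following should cover the same selector types without soupselect. This is only a minimal sketch: it assumes the bs4 package is installed, and the HTML snippet and its contents are made up for illustration, so nothing is fetched from the network.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched page (not from the original post).
HTML = """
<div class="title"><h3 id="first"><a href="/story/1">First story</a></h3></div>
<div class="title"><h3><a href="/story/2">Second story</a></h3></div>
<div class="other"><h3>Unrelated heading</h3></div>
"""

# "html.parser" is the parser from the standard library, so no extra install is needed.
soup = BeautifulSoup(HTML, "html.parser")

# Class selector plus descendant combinator, same selector as in the session above.
titles = [h3.get_text() for h3 in soup.select("div.title h3")]

# Type selector: every <h3> in the document.
all_h3 = soup.select("h3")

# Id selector.
first = soup.select("#first")

# Attribute selector.
second_link = soup.select('a[href="/story/2"]')

print(titles)
print(len(all_h3), first[0].get_text(), second_link[0].get_text())
```

Here select() always returns a list of matching tags, so an empty result is just an empty list rather than None.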