Reading HTML with Python

Today I read some HTML data from a table using Python. The tools I ended up using to get the data were:

import urllib
f = urllib.urlopen(<URL_string>)
# Read from the object, storing the page's contents in 's'.
s =

Then to parse tables in the HTML I used lxml which can read HTML and provide an ElementTree like API.

from lxml import etree
html = etree.HTML(s)

I was then able to do everything I needed using list() on the element to get the children and using the attributes .text and .tag to get the text out of tables and links as desired.