Python and HTML Processing
Bookmark History
Saved by 25 people (-4 private), first by an anonymous user on 2006-07-27
- Alanbtn on 2009-10-13 - Tags Python, Google_bookmarks
- Imrchen on 2009-06-25 - Tags python, programming
- Fingerstyle on 2009-06-22 - Tags Python
- Agilenature on 2009-02-01 - Tags python, html, parsing
- Gourdoux on 2009-01-28 - Tags python
Public Sticky notes
Fetching standard Web pages over HTTP is very easy with Python:
import urllib
# Get a file-like object for the Python Web site's home page.
f = urllib.urlopen("http://www.python.org")
# Read from the object, storing the page's contents in 's'.
s = f.read()
f.close()
Highlighted by imouthesmp
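The highlighted snippet uses the Python 2 `urllib.urlopen` call. As a rough sketch of the same idea on Python 3, where the function lives in `urllib.request`, something like the following should work. The helper name `fetch_text` and the UTF-8 fallback are assumptions made for illustration, not part of the original text.

```python
import urllib.request


def fetch_text(url, default_charset="utf-8"):
    """Fetch `url` and return its body as text.

    `fetch_text` is a hypothetical helper; the UTF-8 fallback is an
    assumption for servers that do not declare a charset.
    """
    # The response is a file-like object; `with` closes it for us.
    with urllib.request.urlopen(url) as response:
        raw = response.read()
        charset = response.headers.get_content_charset() or default_charset
    return raw.decode(charset)
```

Calling `fetch_text("http://www.python.org")` would then return the page's contents as a string, much as `s` does in the snippet above.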
Supplying Data
Sometimes, it is necessary to pass information to the Web server, such as
information which would come from an HTML form. Of course, you need to know
which fields are available in a form, but assuming that you already know
this, you can supply such data in the urlopen function call:
# Search the Vaults of Parnassus for "XMLForms".
# First, encode the data.
data = urllib.urlencode({"find" : "XMLForms", "findtype" : "t"})
# Now get that file-like object again, remembering to mention the data.
f = urllib.urlopen("http://www.vex.net/parnassus/apyllo.py", data)
# Read the results back.
s = f.read()
f.close()
Highlighted by imouthesmp
Highlighted by reckoner
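To make the encoding step more concrete, here is a minimal Python 3 sketch that builds the same form submission without sending it. It assumes the Python 3 locations of these calls (`urllib.parse.urlencode` and `urllib.request.Request`); the helper name `build_form_request` is hypothetical.

```python
import urllib.parse
import urllib.request


def build_form_request(url, fields):
    """Build (but do not send) a request carrying URL-encoded form fields.

    `build_form_request` is a hypothetical helper used for illustration.
    """
    # urlencode produces "name=value&name=value"; the body must be bytes.
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=body)


request = build_form_request(
    "http://www.vex.net/parnassus/apyllo.py",
    {"find": "XMLForms", "findtype": "t"},
)
print(request.get_method())  # POST
print(request.data)          # b'find=XMLForms&findtype=t'
```

Passing `request` to `urllib.request.urlopen` would send it and return the file-like response object, as in the highlighted snippet.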
Defining a Parser Class
First of all, let us define a new class inheriting from
SGMLParser, with a convenience method that I find very useful
indeed:
import sgmllib
class M
Highlighted by gialloporpora
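The highlighted class definition is cut off, and `sgmllib` was removed in Python 3, so the following is only a sketch of the general pattern using the standard library's `html.parser.HTMLParser` as a stand-in. The class name `LinkParser`, the `parse` convenience method, and the choice to collect `href` values are all assumptions for illustration, not the original author's code.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Hypothetical parser that collects the targets of <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.hyperlinks = []

    def parse(self, text):
        """Convenience method (an assumption): feed the text, then finish."""
        self.feed(text)
        self.close()
        return self.hyperlinks

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased from HTMLParser.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hyperlinks.append(value)
```

A call such as `LinkParser().parse(s)`, with `s` holding a fetched page, would then return the list of link targets found in it.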

