Feature Idea

On a couple of the WordPress plugins I use for extracting content, the parameters for what gets extracted are set visually with a feature generally called a visual inspector. You enter the URL you want to scrape and then use the visual inspector to tell the software which sections of the page you want scraped. The visual inspector opens the link you provide in a separate window, and as you move around the page, different CSS and HTML elements are highlighted. As you click the sections you want scraped, the inspector translates those visual selections into their respective XPath expressions.

Here are some sections I pulled from an Alltop page:

headline - /html/body/div[4]/div[7]/div[1]/h3
page section - /html/body/div[4]/div[7]/div[2]
whole page - /html/body/div[4]
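
For anyone wondering how a click turns into a path like the ones above, here is a rough sketch in Python. It is not how any particular plugin does it. It assumes the page is available as well-formed markup that the standard library's xml.etree.ElementTree can parse, and the sample page and the function name positional_xpath are made up for illustration.

```python
import xml.etree.ElementTree as ET


def positional_xpath(root, target):
    """Build an absolute, index-based XPath for `target` inside `root`.

    Hypothetical helper: walks up from the clicked element and records
    each tag, adding a 1-based index when siblings share the same tag.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        same_tag = [c for c in parent if c.tag == node.tag]
        if len(same_tag) > 1:
            # XPath positions start at 1, not 0.
            index = next(i for i, c in enumerate(same_tag, 1) if c is node)
            steps.append(f"{node.tag}[{index}]")
        else:
            steps.append(node.tag)
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))


# Made-up stand-in page (not the Alltop markup).
SAMPLE_PAGE = """
<html><body>
  <div>nav</div>
  <div>
    <div><h3>Example headline</h3></div>
    <div>Example section text</div>
  </div>
</body></html>
"""

root = ET.fromstring(SAMPLE_PAGE.strip())
clicked = root.find("body/div[2]/div[1]/h3")  # pretend the user clicked this
print(positional_xpath(root, clicked))  # /html/body/div[2]/div[1]/h3
```

Real pages are rarely well-formed XML, so an actual inspector would more likely do this walk in the browser's DOM or with a forgiving HTML parser.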

You can select whatever you want and create as many or as few sections as you need. It’s really quite effective for pinpointing exactly what you want to scrape. Maybe some variation of this idea could be used for your script?
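
To give a rough idea of what the script could do with saved selections, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the function name extract_sections, the sample page, and the paths are made up, and the input must be well-formed markup because xml.etree.ElementTree supports only a small XPath subset and does not tolerate messy HTML.

```python
import xml.etree.ElementTree as ET


def extract_sections(page_source, selections):
    """Return {section name: text} for each saved absolute XPath.

    Hypothetical helper. ElementTree rejects absolute paths on an
    element, so the leading "/html" is rewritten to "." first.
    """
    root = ET.fromstring(page_source.strip())
    prefix = f"/{root.tag}/"
    results = {}
    for name, xpath in selections.items():
        if not xpath.startswith(prefix):
            raise ValueError(f"path does not start at the root: {xpath}")
        relative = "./" + xpath[len(prefix):]
        node = root.find(relative)
        if node is None:
            results[name] = None  # selection no longer matches the page
        else:
            # Join all nested text and collapse runs of whitespace.
            results[name] = " ".join("".join(node.itertext()).split())
    return results


# Made-up stand-in page and selections (not the Alltop markup).
SAMPLE_PAGE = """
<html><body>
  <div>nav</div>
  <div>
    <div><h3>Example headline</h3></div>
    <div>Example section text</div>
  </div>
</body></html>
"""

selections = {
    "headline": "/html/body/div[2]/div[1]/h3",
    "page section": "/html/body/div[2]/div[2]",
    "whole block": "/html/body/div[2]",
}
sections = extract_sections(SAMPLE_PAGE, selections)
print(sections["headline"])  # Example headline
```

One thing to keep in mind with index-based paths like these: they break as soon as the site adds or removes a div, so storing a class or id alongside the path may be worth considering.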

Please delete… the same could be accomplished with the Chrome plugins you mention on your blog.

Thanks for the suggestion. We would actually like to offer something like this rather than asking users to rely on plugins, so we do appreciate the feature request.