I’m trying to extract the content from an RSS feed, but after extraction I find some unwanted lines of text. I checked the original HTML file to try to skip this part of the article, but I noticed that there is no particular div, just links and br tags. My question is: is it possible to skip this unwanted text in some way? Is it possible to tell the program to start only from the second paragraph, or something like that?
The url is: http://tg24.sky.it/tg24/politica/2011/12/07/pensioni_monti_1400_euro_indicizzazioni_sindacati_pd_pdl_polillo.html
Hi Mark, can you post the specific bit of HTML that you’re trying to exclude?