Articles with an intro paragraph

Hi, lots of news articles feature an intro paragraph in a different style or block from the main article. FiveFilters seems to miss this fairly often. Is there anything I can do to improve the results?


FiveFilters misses the text above the image (“Argentinianul Lionel Messi …”) but gets the main body OK. Are there any global options to make FF look ‘further’ than the main block?


As you mention, the structure of the page really does influence what’s considered the main content. I notice for this article, Firefox’s Reader view also misses the intro paragraph.

For our tools, we address this by writing custom extraction rules for individual sites that explicitly target the elements we want to preserve. We’ll try to add one for this site too. I’ll update here once we do.
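
To illustrate the general idea of a rule that explicitly targets both the intro and the main block, here is a minimal Python sketch. It is not how FiveFilters implements its rules; the markup, the class names (intro, article-body) and the extract helper are all hypothetical, and it uses only the limited XPath support in Python's standard library on well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page markup: the intro paragraph sits outside
# the main article block, which is why automatic detection can miss it.
PAGE = """
<html><body>
  <p class="intro">Intro paragraph shown above the image.</p>
  <img src="photo.jpg" />
  <div class="article-body"><p>First body paragraph.</p><p>Second body paragraph.</p></div>
  <div class="related">Unrelated links.</div>
</body></html>
"""

# Hypothetical per-site rule: one XPath expression per element to preserve,
# listed in the order the pieces should appear in the output.
RULES = [".//p[@class='intro']", ".//div[@class='article-body']"]


def extract(page, rules):
    """Return the normalized text of every element matched by the rules."""
    root = ET.fromstring(page.strip())
    parts = []
    for rule in rules:
        for element in root.findall(rule):
            # Join all text nodes, then collapse runs of whitespace.
            text = " ".join(" ".join(element.itertext()).split())
            if text:
                parts.append(text)
    return parts


content = extract(PAGE, RULES)
print(content)
```

Because the rule names the intro element explicitly, it is kept even though it lies outside the block that would otherwise be treated as the main content, while the unrelated block is still dropped.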

Thanks. That’s ok, we are writing custom patterns, but we see this so much we were hoping there was something we could tweak.

Full-Text RSS 4, planned for later this year, will include changes that should improve automatic detection and extraction. There’ll still be plenty of cases where custom patterns will be needed, but hopefully not as many.