FTR getting the wrong content

I tried using Feed Creator on ThreadReader and then feeding the result to FTR. The Feed Creator feed works well, but FTR returns the wrong stories, or sometimes the same story repeated. I’ve got it set up on my own copy of the tools, but I’ve reproduced it on the public one here:
http://ftr.fivefilters.net/makefulltextfeed.php?summary=1&content=0&use_extracted_title=1&url=https%3A%2F%2Fcreatefeed.fivefilters.org%2Fextract.php%3Furl%3Dhttps%3A%2F%2Fthreadreaderapp.com%2Fuser%2FAdamWagner1%26in_id_or_class%3Dtweet-url%26unique_title%3D0%26max%3D5%26order%3Ddocument%26guid%3D0
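
For anyone reproducing this, the nested URL above can be assembled with a short script rather than by hand. This is only a sketch: the helper names `feed_creator_url` and `full_text_rss_url` are made up for illustration, and the parameter values are simply the ones that appear in the URL above.

```python
from urllib.parse import urlencode

FEED_CREATOR = "https://createfeed.fivefilters.org/extract.php"
FULL_TEXT_RSS = "http://ftr.fivefilters.net/makefulltextfeed.php"


def feed_creator_url(page_url, in_id_or_class, max_items=5):
    """Build the Feed Creator URL (hypothetical helper, parameters copied from the post)."""
    params = {
        "url": page_url,
        "in_id_or_class": in_id_or_class,
        "unique_title": 0,
        "max": max_items,
        "order": "document",
        "guid": 0,
    }
    # The page URL is left unencoded here, as in the post; it gets
    # percent-encoded once when the whole feed URL is passed to FTR.
    return FEED_CREATOR + "?" + urlencode(params, safe=":/")


def full_text_rss_url(feed_url):
    """Wrap a feed URL in a Full-Text RSS request (hypothetical helper)."""
    params = {
        "summary": 1,
        "content": 0,
        "use_extracted_title": 1,
        "url": feed_url,  # percent-encoded by urlencode
    }
    return FULL_TEXT_RSS + "?" + urlencode(params)


if __name__ == "__main__":
    inner = feed_creator_url("https://threadreaderapp.com/user/AdamWagner1", "tweet-url")
    print(full_text_rss_url(inner))
```

Building it in two steps makes it easy to open the inner Feed Creator URL on its own and compare its items with what FTR returns.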

If you click on the URLs you’ll see the posts don’t match. For the time being I ended up just using Feed Creator and manually visiting the ThreadReader pages, but this is unhelpful.

This is the first time I’ve had FTR pull the wrong stories. On my instance I tried changing IP address and clearing the browser cache in case something was being cached, but that didn’t help, and the same behaviour occurred when I subscribed to these feeds in Newsblur.

Any idea what is causing this issue and how to fix it?

Thanks for reporting this. We did actually have custom extraction rules for Full-Text RSS for this site, but it appears they didn’t match on all pages. This should now be fixed: see ftr-site-config/threadreaderapp.com.txt on the master branch of the fivefilters/ftr-site-config repository on GitHub.

Please update your site config files for Full-Text RSS and try again.

When an extraction rule fails to match, Full-Text RSS will attempt to detect the content block. In this case, it picks out a tweet from further down on the page, rather than the main thread. (This can happen when the HTML structure doesn’t conform to what a typical text article looks like - Mozilla’s Reader mode also fails to extract the content correctly on this site for similar reasons.)
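
To illustrate the fallback described above, here is a toy sketch. It is not how Full-Text RSS is actually implemented: the sample page, the `extract` function, and the "pick the div with the most text" heuristic are all assumptions, chosen only to show how a rule that fails to match can leave automatic detection to choose a different block.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page: a short main thread followed by a
# longer unrelated tweet further down.
PAGE = """
<html><body>
  <div id="thread"><p>First tweet of the thread.</p><p>Second tweet.</p></div>
  <div class="related"><p>A much longer unrelated tweet further down the page,
  with plenty of text to attract a simple text-length heuristic.</p></div>
</body></html>
"""


def text_length(element):
    """Number of characters of text inside an element."""
    return len("".join(element.itertext()).strip())


def extract(root, rule_xpath):
    """Try the site rule first; if it misses, fall back to a naive guess."""
    match = root.find(rule_xpath)
    if match is not None:
        return match, "rule"
    # Assumed fallback heuristic: choose the div containing the most text.
    candidates = root.findall(".//div")
    return max(candidates, key=text_length), "auto-detect"


if __name__ == "__main__":
    root = ET.fromstring(PAGE)
    for rule in (".//div[@id='thread']", ".//div[@id='content']"):
        element, method = extract(root, rule)
        print(rule, "->", method, dict(element.attrib))
```

With the matching rule the thread block is returned; with the non-matching rule the fallback settles on the longer block further down, which mirrors the symptom in the original report.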

Thank you for the explanation and the fix. I can confirm it now gets the correct content.

It doesn’t get all the content, though. For example, the graphs are missing from most of the posts in http://ftr.fivefilters.net/makefulltextfeed.php?use_extracted_title=1&url=https%3A%2F%2Fcreatefeed.fivefilters.org%2Fextract.php%3Furl%3Dhttps%3A%2F%2Fthreadreaderapp.com%2Fuser%2Fchrischirp%26in_id_or_class%3Dtweet-url%26unique_title%3D0%26max%3D5%26order%3Ddocument%26guid%3D0
but that’s a different issue that can probably be fixed easily by looking at the content.

This should also be fixed now. Please try updating the site config file again to pull in the latest changes.

It is indeed. It looks great. Thank you very much.

FTR and Feed Creator are tools I rely on every day. Thank you again for these great apps.

Thank you! Appreciate the kind words, glad you’re finding them useful.