Try any nytimes.com link. Just says [unable to retrieve full-text content]
Hi there, thanks for letting us know. We’re not seeing a problem with Full-Text RSS itself, but it does seem the New York Times is rate limiting requests, so some of our busier Full-Text RSS servers may be affected. Are you self-hosting or using our hosted Full-Text RSS service?
I am self-hosted but I see the same issue at ftr https://ftr.fivefilters.net/ as well. For example when I try the feed
I see this
If you make frequent requests to the nytimes.com domain, they will stop responding to your IP. So depending on how many requests are going through to the domain from your server (or ours) you may not get results back.
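If you're self-hosting and run your own fetching scripts, one thing that may help is spacing out requests to the same domain. Here is a minimal Python sketch of that idea. The `DomainThrottle` class and the 30-second interval are hypothetical and chosen only for illustration; we don't know what limits nytimes.com actually applies, and this is not part of Full-Text RSS.

```python
import time
from urllib.parse import urlparse


class DomainThrottle:
    """Tracks the last request time per host and reports how long to wait."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between requests to one host
        self.clock = clock                # injectable so the logic can be tested
        self.last_request = {}            # host -> time of the last request

    def wait_time(self, url):
        """Seconds to wait before it is polite to request this URL's host again."""
        host = urlparse(url).hostname or ""
        last = self.last_request.get(host)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (self.clock() - last))

    def record(self, url):
        """Note that a request to this URL's host was just made."""
        host = urlparse(url).hostname or ""
        self.last_request[host] = self.clock()
```

Before each fetch you would call `time.sleep(throttle.wait_time(url))` and afterwards `throttle.record(url)`. This only slows your own requests down; it can't help if the site has already stopped responding to your IP.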
So when you see “[unable to retrieve full-text content]”, that can either mean that Full-Text RSS got the HTML from the server but couldn’t extract the article content, or that the server refused to respond with the content. Some sites have rules preventing automated access or limiting how often their servers respond to requests coming from the same IP. That’s what’s happening here. Full-Text RSS can extract the article content from the site, but only if that content is actually received from their servers.
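To make the two cases concrete, here is a rough Python sketch of how you might tell them apart when debugging your own setup. The `diagnose` function and its labels are hypothetical, not how Full-Text RSS works internally, and treating any 4xx/5xx status or an empty body as "blocked" is a simplifying assumption.

```python
from typing import Optional


def diagnose(status: Optional[int], body: str, extracted: Optional[str]) -> str:
    """Return a rough guess at why full-text retrieval failed, or 'ok'."""
    # No response at all, or an HTTP error such as 403 or 429:
    # the site refused to give us the page.
    if status is None or status >= 400:
        return "blocked"
    # A response with an empty body is also treated as a refusal here.
    if not body.strip():
        return "blocked"
    # We received HTML, but no article text could be pulled out of it.
    if not extracted or not extracted.strip():
        return "extraction_failed"
    return "ok"
```

For example, `diagnose(429, "", None)` gives `"blocked"`, while `diagnose(200, "<html>...</html>", None)` gives `"extraction_failed"`. Both would show up in the feed as the same “[unable to retrieve full-text content]” message.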
We have a solution for this in our Feed Control application. It currently requires a few steps:
- Add the feed as it is to your Feed Control account, without enabling any full-text fetching
- Edit the feed in Feed Control to enable full-text fetching and the rotating proxy (see screenshot)
- Use the action dropdown for the feed to delete existing items and then to refresh the feed.
- Wait a few seconds for the feed to update.
- From now on the feed should update fine, and you’ll be able to find the feed URL in the Generate RSS/JSON tab.
More information on our blog: Using proxy servers for content retrieval
If you’d like to do this for your self-hosted instance, we have another blog post:
It retrieved the NYT article for me.
Is Feed Control a separate product? Is it available for self hosting? If I purchase Feed Control, will I have to buy Full Text RSS (to convert partial feeds to full feeds) and Feed Creator (to create feeds from web pages) as well?
Feed Control is a hosted service we provide, and there’s currently no plan to offer a self-hosted version. So the only way to use it is to subscribe.
We’ve written about some of the differences between Feed Control, Feed Creator and Full-Text RSS here: