Time between requests.


tweakers.net extraction fails most of the time. As far as I can tell they have a rate limit, so quick subsequent calls are denied. Is there any way to make the crawler slow down for this site?




Hi Toma, we’re going to look into this to see if there’s a pattern we can detect. But we don’t use a crawler with Full-Text RSS. Requests are processed as they come in, and results are cached for a while in case further requests come in for the same URL. For sites that rate limit too aggressively, and for which we have many users making requests, this can lead to problems.

We do have plans to offer a service with more control over how URLs are accessed and processed, so for difficult feeds/sites, that will offer more flexibility.

You can exclude items where extraction fails. Since FTR caches the results, the failed items will eventually succeed: on later passes the already-cached items are skipped, so only the failures are retried, which means far fewer requests hit the rate-limited site each time.
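The retry-via-cache behaviour described above can be sketched roughly like this. This is not FTR's actual code (FTR is PHP and the `extract` function here is a hypothetical stand-in for the extraction call); it's just a minimal illustration of why cached successes make retries cheaper, with a delay between live requests to stay under a rate limit:

```python
import time

def fetch_items(urls, extract, cache, delay=2.0):
    """One pass over the feed items, skipping anything already cached.

    `extract` is a hypothetical stand-in for the extraction call; it
    returns the extracted content on success, or None on failure.
    Failures are not cached, so a later pass retries only those URLs.
    """
    for url in urls:
        if url in cache:
            # Cached from an earlier pass: no request made at all.
            continue
        result = extract(url)
        if result is not None:
            cache[url] = result
        # Space out live requests so a rate-limited site isn't hammered.
        time.sleep(delay)
    return cache
```

Each pass sends requests only for the URLs that previously failed, so a site denying quick subsequent calls sees fewer and fewer requests until everything is cached.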