[unable to retrieve full-text content] for many sites

How do I use these? Most of the sites I want show this error. For example:
https://www.kens5.com/article/news/local/police-standoff/273-7b2ee18d-0bfc-412d-a0bb-5240b8c9714c
https://www.kark.com/news/local-news/hot-springs-man-arrested-in-connection-to-sunday-morning-apartment-shooting/
https://www.easttexasmatters.com/crime/houston-man-arrested-in-nacogdoches-with-10-pounds-of-cocaine-two-stolen-guns/
https://13wham.com/news/local/sunday-night-shooting-in-rochester


Do I need to change some config to fix this? I am using the API calls for now. Do I need to host my own copy to make it work? Thanks!

Some of these sites appear to be blocking access to EU users and servers.

For developers we do offer a hosted Full-Text RSS service running on servers outside the EU: https://rapidapi.com/fivefilters/api/full-text-rss-us

If you’re running it yourself or are thinking about hosting it yourself, you should install Full-Text RSS on servers in the US or Canada and some of those access denied messages will go away.

Thank you very much. I am in the US and will try your new endpoint. Quick question: what usually causes parsing to fail? What percentage of sites block this server? If we host our own, is there some config we can use to enable more sites manually?
Thanks again!

There can be any number of reasons why we cannot extract content from a given web page. It might have to do with not being able to access the server (either because it’s down or because it blocks our request). It might be content that’s behind a paywall or a GDPR/cookie wall. It might be that we get a response but the way the content is presented results in bad article extraction.

In some cases we can get around these with site config files, but in others we simply can’t.

We don’t have any numbers on which sites block the servers we use. In some of the examples above, where the site has chosen to block access to EU IPs, you should find that running the software on a US server will get you the content you need. But we encourage you to test for yourself with sites that matter to you.
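
One rough way to run such a test is a small script that requests each URL through your own install and looks for the failure placeholder. This is only a sketch under assumptions: the install location is passed on the command line, the helper names `extraction_failed` and `build_request_url` are made up for illustration, the request form `makefulltextfeed.php?url=...` is assumed to match your setup, and the check simply looks for the "[unable to retrieve full-text content]" text quoted in this thread's title.

```php
<?php
// Placeholder text that appears when extraction fails (as quoted in this thread).
const FAILURE_MARKER = '[unable to retrieve full-text content]';

// Hypothetical helper: true if the returned body contains the failure marker.
function extraction_failed(string $body): bool {
    return strpos($body, FAILURE_MARKER) !== false;
}

// Hypothetical helper: build the request URL for a self-hosted install.
// $base is an assumption; point it at the folder holding makefulltextfeed.php.
function build_request_url(string $base, string $articleUrl): string {
    return rtrim($base, '/') . '/makefulltextfeed.php?url=' . urlencode($articleUrl);
}

// Sites to check; replace with the ones that matter to you.
$urls = [
    'https://13wham.com/news/local/sunday-night-shooting-in-rochester',
];

// Only run the checks when an install URL is given as the first argument.
$base = $argv[1] ?? null;
if ($base !== null) {
    foreach ($urls as $url) {
        // Requires allow_url_fopen to be enabled in php.ini.
        $body = @file_get_contents(build_request_url($base, $url));
        if ($body === false) {
            echo "$url: request failed\n";
        } else {
            echo $url . ': ' . (extraction_failed($body) ? 'no full text' : 'ok') . "\n";
        }
    }
}
```

Saved as, say, check.php, it could be run with `php check.php https://example.com/full-text-rss`, where the address is a stand-in for your own server.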

Thank you very much for the tips. I just purchased the self-hosted version and installed it on my site successfully. I have one more question: when it gets data for the description and some other fields, it truncates at a certain length and shows "..." at the end. The raw data is clearly there and longer. Can I configure it to allow a longer length instead of truncating?
Thanks!

We don’t currently let you set that in the config file, but if you open makefulltextfeed.php you’ll be able to search for the get_excerpt function. It’s set to use 55 words, which you can change by editing the function definition:

function get_excerpt($text, $num_words=55, $more=null) {

We hope to move this number to the config file in a future release.
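
To illustrate what the word limit controls, here is a minimal sketch of word-based truncation. It is not the code from makefulltextfeed.php: `truncate_words`, its parameters, and the "..." suffix are hypothetical and only show why a longer text ends in an ellipsis once it passes the limit.

```php
<?php
// Hypothetical illustration only; not the get_excerpt implementation.
// Keeps at most $numWords whitespace-separated words and appends $more
// only when something was actually cut off.
function truncate_words(string $text, int $numWords = 55, string $more = '...'): string {
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) <= $numWords) {
        return implode(' ', $words);
    }
    return implode(' ', array_slice($words, 0, $numWords)) . $more;
}

echo truncate_words('one two three four five', 3), "\n"; // one two three...
echo truncate_words('one two three', 55), "\n";          // one two three
```

Raising the default in the real function should have the same effect as passing a larger limit here: more words survive before the text is cut.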