Hi,
I’m using 3.2 and trying to use next_page_link for a specific blog, but I’m getting a “This article appears to continue on subsequent pages which we could not extract” message that I’m not sure how to address. (Blog and sample multi-page post here http://www.thefashionisto.com/salieu-jalloh-ronald-epps-liam-vandiar-get-sporty-vman/) I tried
next_page_link: //div[@class='page-nav page-nav-post']/a
and
next_page_link: //a[contains(@href, '/2/')]
and get the same message.
Would you be able to please point me in the right direction? Any help would be greatly appreciated.
Thank you.
Hi there,
My guess is that your next_page_link XPath expression returns the URL of the current page, or a URL which has already been processed. For example, the last page of a multi-page article could still contain a link (either to itself or to one of the preceding pages, which have already been processed) that your XPath expression treats as a next page link. In these cases Full-Text RSS will not retrieve it and will show you that message. If that’s the case, you will have to be more explicit with your XPath expression to make sure it does not return a URL on the last page.
Hope that’s some help. If you still have trouble, please enable debug mode (there’s a checkbox on the form which you can tick) and send us the output. That should contain more information about the cause.
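To illustrate the point about already-processed URLs, here is a rough Python sketch of the behaviour described above. It is only a model for reasoning about the problem, not Full-Text RSS's actual code: the function name, the URL paths, and the dictionaries standing in for "what the XPath matched on each page" are all made up for the example.

```python
# Shortened stand-in for the message shown by Full-Text RSS.
WARNING = "article appears to continue on subsequent pages which we could not extract"


def collect_pages(start_url, next_link_of):
    """Follow next-page links until none is found or one repeats.

    next_link_of is a hypothetical mapping from a page URL to the URL that
    the next_page_link XPath matched on that page (None if nothing matched).
    """
    seen = [start_url]
    warning = None
    url = start_url
    while True:
        nxt = next_link_of.get(url)
        if nxt is None:
            # No match on this page: the article ends cleanly here.
            break
        if nxt in seen:
            # The match points at the current page or an earlier one,
            # so it is not retrieved and the warning is raised instead.
            warning = WARNING
            break
        seen.append(nxt)
        url = nxt
    return seen, warning


# Assumed three-page article. With a loose expression, the last page still
# matches a link back to page 2; with a stricter one, it matches nothing.
loose = {"/post/": "/post/2/", "/post/2/": "/post/3/", "/post/3/": "/post/2/"}
strict = {"/post/": "/post/2/", "/post/2/": "/post/3/", "/post/3/": None}

print(collect_pages("/post/", loose))
print(collect_pages("/post/", strict))
```

In both cases the same three pages are collected, but only the loose mapping ends with the warning. That is why tightening the expression so it matches nothing on the last page should make the message go away.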