We use Full-Text RSS to extract article content for an RSS aggregator website. A bug was reported to us today:
It seems that anarchistfoderation is only grabbing the main part of Indymedia Germany articles, the element with the class “field-name-body”, and not the introduction/abstract with the class “field-name-field-abstract”. People often write a separate abstract that is NOT included in the main article part. So when only the main part is grabbed, the first part of the article is often missing, because people usually cut it out of the body and put it in the abstract/summary/introduction…
Here is one example:
the indymedia germany article WITH abstract/summary/introduction: https://de.indymedia.org/node/315930
The grabbed article at anarchistischefoderation.de WITHOUT the abstract, only the main part: (B) Demo am 25.11. Gegen Tom Schwarz: Frauenschläger aus der City jagen! – 🏴 Anarchistische Föderation
We tried to fix the issue by creating a custom site config file, de.indymedia.org.txt, that directly targets the elements containing the full article text, including the introduction, like this:
`body: //div[@id='#block-system-main']`
`body: //div[@class='content']`
`body: //div[@class='node']`
`body: //div[@class='node-artikel.content']`
But the result is the same and the introduction is still missing. What are we doing wrong?
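
For reference, here is a sketch of how the expressions above could be rewritten so that they are more likely to match. It rests on assumptions that have not been checked against the live page. First, the `#` and the `.` in our expressions look like CSS-selector leftovers: XPath compares `@id` and `@class` against the literal attribute value, so `'#block-system-main'` only matches an id that really begins with `#`. Second, `@class='content'` must equal the whole attribute string, so it fails on any element that carries several classes. Third, the sketch assumes that Full-Text RSS combines all elements matched by a single `body` expression. The two class names come from the bug report. The element type (`div`) and everything else are hypothetical.

```
# Hypothetical de.indymedia.org.txt; class names taken from the bug report above.
# Assumption: both fields are <div> elements carrying several classes, so a
# token-wise class test is used instead of an exact @class comparison.
# Assumption: a single body expression with a union (|) returns both the
# abstract and the main text.
body: //div[contains(concat(' ', normalize-space(@class), ' '), ' field-name-field-abstract ')] | //div[contains(concat(' ', normalize-space(@class), ' '), ' field-name-body ')]

# Fallback if the expression above matches nothing: the wrapper id without the CSS '#'.
body: //div[@id='block-system-main']

test_url: https://de.indymedia.org/node/315930
```

If the result still lacks the introduction after a change like this, that would suggest the custom file is not being picked up at all, for example because of caching or because the file is in the wrong folder, rather than an XPath problem.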