Strip all content after -ends-

I have a feed with a lot of text.

Any text that comes after the string “-ends” is not needed, and I want to remove it from the feed, including the word “-ends” itself.

Any help with the syntax?

Thanks
Andy

Hi Andy, depending on the structure of the page, you might be able to do this with XPath. Otherwise, one hacky way to do it is to use the find_string and replace_string commands to end the document early: the marker is replaced with the opening of an HTML comment that never gets closed, so everything after it is commented out. E.g. in the site config for this site, you would put:

find_string: -ends
replace_string: <!--
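
To see why this ends the document early, here is a minimal Python sketch of the idea. It is not the Full-Text RSS code: the two helper functions and the sample page are made up for illustration, and it assumes that find_string/replace_string do a literal, case-sensitive substitution and that the HTML parser treats an unterminated comment as running to the end of the page.

```python
def apply_find_replace(html: str, find: str, replace: str) -> str:
    """Literal, case-sensitive substitution (assumed behaviour of the two commands)."""
    return html.replace(find, replace)


def drop_unclosed_comment(html: str) -> str:
    """Mimic a parser that treats an unterminated <!-- as a comment up to the end."""
    start = html.find("<!--")
    if start != -1 and html.find("-->", start) == -1:
        return html[:start]
    return html


# Hypothetical page fragment, not taken from the real site.
page = "<div class='content'><p>News text.</p><p>-ends</p><p>Notes to editors</p></div>"

patched = apply_find_replace(page, "-ends", "<!--")
visible = drop_unclosed_comment(patched)
print(visible)
```

Everything from the marker onwards, including the marker itself, is gone from `visible`; the tags left open before it would normally be closed again by the HTML parser.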

Hope that’s some help.

OK, thanks for this…

I have named my file as:

.capellahotels.com.txt

and I have put this in the custom folder. This is the content of the file:

title: //h1[@class='title']
body: //div[@class='content']
strip: //div[@class='lastUpdated']

find_string: -ENDS-
replace_string: <!--

It doesn’t seem to do anything… am I missing a step?

Andy

Hi Andy, can you give us the URL you’re trying to extract from? You can reply here or email it to us at help@fivefilters.org

Sure that will be:

http://bit.ly/U8f8qA

Andy

Hi Andy, the wildcard naming (beginning with a dot) is intended for non-www subdomains. It will not be used for www. hostnames. So in this case, you should name it simply capellahotels.com.txt

Also, bear in mind that the find_string command is case-sensitive. One of the articles I looked at used -Ends- instead of -ENDS-.
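
As a rough illustration of the case-sensitivity point, the Python sketch below uses a made-up sample string. It only demonstrates the matching behaviour in general and is not how Full-Text RSS works internally; the case-insensitive split is a Python feature, not a site config option.

```python
import re

# Hypothetical article text using the mixed-case marker.
article = "Body text. -Ends- Press contacts follow."

# A literal, case-sensitive search for the upper-case marker finds nothing,
# so a rule written for -ENDS- leaves this article untouched.
matches_upper = "-ENDS-" in article

# For comparison, a case-insensitive match catches every spelling of the marker.
truncated = re.split(r"-ends-", article, maxsplit=1, flags=re.IGNORECASE)[0].rstrip()

print(matches_upper)
print(truncated)
```

In a site config, the equivalent would presumably be one find_string/replace_string pair per spelling of the marker, assuming the site uses only a handful of variants.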

Hope that’s some help.

OK, I have changed that but it still doesn’t work… I am looking at the first RSS item, which has capital -ENDS-

It’s not being truncated… Is there something wrong with my code? Did it work for you?

Andy

Hi Andy, sorry, forgot to reply. If you’re still having trouble with this, can you please email us the URL where you’ve installed Full-Text RSS and the URL of the article which you’re trying to write a custom site config file for. Thanks.