Extracting text from URLs that contain an '&' (ampersand) character

I have been experimenting with the latest version of Full-Text RSS (version 3.2).

So, usually, when I want to extract text from a webpage, I use the following URL :-

http://myfulltextrssinstallationurl/makefulltextfeed.php?key=MYKEY&max=1&format=xml&html=1&url=urltothepageiwanttoextracttextfrom

So, say, if I want to extract text from the URL: http://www.leggo.it/index.php?p=articolo&id=283088

I would use the following URL :-

http://myfulltextrssinstallationurl/makefulltextfeed.php?key=MYKEY&max=1&format=xml&html=1&url=http://www.leggo.it/index.php?p=articolo&id=283088

But it does not appear to work. Even adding a custom site config file for that particular site did not help. I think it is because the URL already has an ampersand in it, which Full-Text RSS does not appear to like for some reason. My assumption is further supported by the fact that when Full-Text RSS finishes loading the styled XML in the browser, it is pointing to the following URL :-

http://myfulltextrssinstallationurl/makefulltextfeed.php?url=www.leggo.it%2Findex.php%3Fp%3Darticolo&key=0&hash=MYKEYHASH&html=1&max=1&format=xml

Please notice that the whole “&id=283088” part of the URL string is missing from the above URL.
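The truncation is consistent with how query strings are normally split on '&'. Here is a minimal sketch in Python (not the language Full-Text RSS is written in, and only an assumption about how the server reads its parameters) showing that a standard query-string parser treats the unencoded "&id=283088" as a separate parameter instead of as part of "url":

```python
from urllib.parse import urlsplit, parse_qs

# The request URL exactly as typed, with the target URL left unencoded.
raw = (
    "http://myfulltextrssinstallationurl/makefulltextfeed.php"
    "?key=MYKEY&max=1&format=xml&html=1"
    "&url=http://www.leggo.it/index.php?p=articolo&id=283088"
)

# Split the query string into parameters the way a typical server would.
params = parse_qs(urlsplit(raw).query)

# The 'url' parameter stops at the first unencoded '&' ...
print(params["url"])
# ... and 'id' arrives as its own top-level parameter.
print(params["id"])
```

With this parsing, the "url" value ends at "p=articolo", which matches what shows up in the address bar above.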

I have tried substituting the ‘&’ symbol in that URL with ‘&amp;’ but that didn’t seem to help either.

So, my question is: Is there any way to get around this in the current latest version?

Thanks for your time!

So, I found the workaround after a little more testing. Apparently, if you percent-encode the URL string before giving it to Full-Text RSS, it works. In my case, the following works, with a site config file for leggo.it :-

http://myfulltextrssinstallationurl/makefulltextfeed.php?key=MYKEY&max=1&format=xml&html=1&url=http://www.leggo.it%2Findex.php%3Fp%3Darticolo%26id%3D283088

Joesph S.

Hi Joseph, yes, the URL should be encoded before it is passed to Full-Text RSS. When you input the URL in our web form and click ‘Create Feed’, your browser will automatically encode it and produce the correct URL. You should then be able to copy the URL from your browser’s address bar and use it wherever you like.

If you’re not using the form, you will have to encode it yourself. If you’re generating the URL in code, you can use a URL encoding function. In PHP this is called urlencode(); see http://php.net/urlencode. If you’d like to produce the URL manually, you can use the encoder on this site: http://meyerweb.com/eric/tools/dencoder/ (simply paste the URL and click ‘Encode’).
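For anyone generating the URL in code outside PHP, the same idea can be sketched in Python with the standard library's urlencode(). The helper name build_feed_url is hypothetical, and the parameter names (key, max, format, html, url) are simply the ones used earlier in this thread:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_feed_url(base, key, page_url):
    """Build a makefulltextfeed.php request with every value percent-encoded.

    build_feed_url is a hypothetical helper, not part of Full-Text RSS.
    """
    query = urlencode({
        "key": key,
        "max": 1,
        "format": "xml",
        "html": 1,
        "url": page_url,  # '&', '?', '=', ':' and '/' in here get encoded
    })
    return f"{base}/makefulltextfeed.php?{query}"

feed_url = build_feed_url(
    "http://myfulltextrssinstallationurl",
    "MYKEY",
    "http://www.leggo.it/index.php?p=articolo&id=283088",
)
print(feed_url)

# Round trip: parsing the result gives back the full target URL, id included.
print(parse_qs(urlsplit(feed_url).query)["url"][0])
```

Encoding only the value of the url parameter, and not the whole request, keeps the '&' separators between Full-Text RSS's own parameters intact.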

Hope that’s some help.

Hi Keyvan,
Thanks for the heads up.
I’m generating the URL in code. More specifically in Objective-C code.
And, this is what I have found to work :-

[myUrlString stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]

Thanks again for the help, and the awesome product.
Also, eagerly waiting for the self-hosted version of Push to Kindle to become available.

Best,

Joe

Joesph S.

Thanks for the update, Joe. Glad you got it working.

As for Push to Kindle, we haven’t forgotten about the self-hosted version. We’re working on something which I hope will be ready soon. :)