Something odd with an RSS feed, and a related question

http://help.fivefilters.org/customer/portal/questions/263879-differences-between-real-article-s-address-and-rss-s-address

First of all, I was late in buying your best product. Sorry about that.

I have a question about getting the full text from a partial RSS feed.

There is a popular blog site in Korea called “NAVER Blog”.
I requested general support for NAVER Blog.

After that, I could get full-text RSS from any NAVER Blog site.

But then I ran into another problem with NAVER Blog.

NAVER Blog supports “blog.naver.com/OOO”, and recently they added the domain “OOO.blog.me”.

So I tried to get the RSS from “OOO.blog.me” using the same extraction rule, but it failed.

This is because “OOO.blog.me” has a very strange structure for the address of each RSS item.

For example, the site I want the full text from is “http://santa_croce.blog.me/”.
It provides the feed “http://santa_croce.blog.me/rss”, and the first link in the RSS points to “http://santa_croce.blog.me/220418678330”.

Then, in the HTML source of that page, there is:

“”

And then, at the address in that “src”, there is:

“”

It is almost the same as the custom extraction rule for “blog.naver.com”.
But compared to “blog.naver.com”, it has one more frame, so one more step is needed to get the text.

I know that you can’t create custom rules for one person’s use.

But I think this needs a new way of getting the text from RSS, and it could let Full-Text RSS handle more feeds with better quality.

Thank you for your continued updates and your work on Full-Text RSS.

Can I get an answer to this question later? I know that this is quite difficult to answer, but I think I should get some answer (even “sorry”, or some further compromise like that) because I bought Full-Text RSS.

I’m sorry to post this follow-up on a public page, but I want to get an answer to this question. It was written more than 15 days ago.

KIHYEN

Hi Kihyen, sorry it’s taken a while to answer. I did actually look into this nearer the time of your first post and thought I had posted a reply.

The site is, like you describe, unusual in that it hides the content away in iframes that then load other iframes. If it was just the one iframe when you loaded the item URL in the RSS, we could create an XPath expression to follow the iframe URL using the single_page_link directive. But in this case we’d just end up with another page that hides the content away in an iframe. The way Full-Text RSS works at the moment, we assume that when we follow single_page_link, we’ll reach the content - that’s not the case here.
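To illustrate the nested-frame problem described above, here is a minimal Python sketch of following frame links until a page with no frame is reached. It is not how Full-Text RSS works; the names `FrameSrcParser`, `resolve_frames`, and `fetch`, along with the example URLs and markup, are all assumptions made up for illustration, since the actual frame markup from the blog is not shown in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameSrcParser(HTMLParser):
    """Records the src of the first <iframe> or <frame> tag in a page."""

    def __init__(self):
        super().__init__()
        self.first_src = None

    def handle_starttag(self, tag, attrs):
        if tag in ("iframe", "frame") and self.first_src is None:
            src = dict(attrs).get("src")
            if src:
                self.first_src = src


def resolve_frames(url, fetch, max_hops=3):
    """Follow frame src links from url until a page contains no frame.

    fetch is any callable that maps a URL to its HTML text (hypothetical;
    a real one would make an HTTP request). Returns (final_url, html).
    """
    for _ in range(max_hops + 1):
        html = fetch(url)
        parser = FrameSrcParser()
        parser.feed(html)
        if parser.first_src is None:
            return url, html  # no frame left: treat this page as the content
        url = urljoin(url, parser.first_src)  # resolve relative src values
    raise RuntimeError("too many nested frames")


# Made-up pages standing in for a two-level frame chain.
pages = {
    "http://blog.example/1": '<frameset><frame src="/outer?id=1"></frameset>',
    "http://blog.example/outer?id=1": '<body><iframe src="/post?id=1"></iframe></body>',
    "http://blog.example/post?id=1": "<body><p>Full article text</p></body>",
}
final_url, html = resolve_frames("http://blog.example/1", pages.__getitem__)
```

The sketch simply takes the first frame on each page. A real page may also contain unrelated iframes (ads, widgets), so a real rule would still need an XPath or similar selector to pick the right frame at each step.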

On top of that, the example you give - http://santa_croce.blog.me/220418678330 - will not work due to the underscore in the hostname. Underscores are not common in hostnames, and generally not recommended from what I can find. We rely on PHP’s URL validation which seems to think the underscore makes the URL invalid (I haven’t looked into the RFCs, but brief reading suggests they’re not invalid, just not recommended and not supported by all systems).
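As a rough illustration of why a strict validator rejects this URL, here is a small Python sketch that checks each hostname label against the traditional hostname rule (letters, digits, and hyphens only, as in RFC 952 and RFC 1123). This is an assumption about the kind of check involved, not PHP's actual implementation, and `strict_hostname_ok` is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit

# One label under the traditional hostname rule: letters, digits and
# hyphens, 1 to 63 characters, not starting or ending with a hyphen.
LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def strict_hostname_ok(url):
    """Return True if every label of the URL's hostname passes the strict rule."""
    host = urlsplit(url).hostname
    if not host:
        return False
    labels = host.rstrip(".").split(".")
    return all(LABEL.match(label) is not None for label in labels)


naver_ok = strict_hostname_ok("http://blog.naver.com/OOO")
blogme_ok = strict_hostname_ok("http://santa_croce.blog.me/220418678330")
```

Here `naver_ok` is True and `blogme_ok` is False, because the label “santa_croce” contains an underscore. A more lenient check could follow the generic URI syntax in RFC 3986, which does allow underscores in the host part.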

If either of these issues turns out to be more common and not just a quirk of naver.com, we’ll try to find a solution. But so far I haven’t come across many reports like this.

Thank you for your reply.
Naver wants to give a subdomain (XXX.blog.me) to every Naver blog user.
So many people have switched to that kind of domain, and some already use their own domain (XXX.com).

I know Naver has a very weird structure, such as using two frames to show content.
That’s a quirk of Korean blog sites (blog.daum.net also has an inner frame for the blog).

I hope someday Full-Text RSS will be able to handle it (before my 1 year of free updates runs out :p).

Kihyen