<p class="spot_post_date">2021.06.17. <span class="lt_bar"><span>55,060</span>read</span></p>
I want to extract the date from that code.
Setting the Item date selector to p.spot_post_date doesn’t extract the date.
How can I delete the <span> tags?
When you select the date with p.spot_post_date you’ll get the following string:
2021.06.17. 55,060read
You can delete span elements using Feed Creator’s cleanup by entering p span in the “Source HTML: Remove elements (CSS)” field.
However, doing so will leave you with the following string:
2021.06.17.
PHP’s strtotime doesn’t recognise the Y.M.D format; it recognises Y-M-D (with dashes).
So you’ll have to use Feed Creator’s “Item date format” field to guide it. You can enter something like this:
Y.m.d+
The plus (+) sign at the end tells it to ignore the rest of the string, so you can use it even without removing the span elements.
So you enter two pieces of information for the date:
Item date selector: p.spot_post_date
Item date format: Y.m.d+
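If you want to check the idea outside Feed Creator, here is a minimal Python sketch of the same behaviour: read a leading Y.m.d date and ignore whatever follows it. This is only an analogue, not Feed Creator’s own code. Python’s strptime has no equivalent of the + specifier, so the helper parse_leading_date (a hypothetical name) uses a regular expression to cut off the trailing text before parsing.

```python
import re
from datetime import date, datetime


def parse_leading_date(text: str) -> date:
    """Parse a YYYY.MM.DD date at the start of text, ignoring anything after it."""
    # Assumption: the date is always the first thing in the selected text.
    match = re.match(r"\s*(\d{4}\.\d{2}\.\d{2})", text)
    if match is None:
        raise ValueError(f"no leading Y.m.d date in {text!r}")
    return datetime.strptime(match.group(1), "%Y.%m.%d").date()


# Text of p.spot_post_date, with and without the span elements removed
print(parse_leading_date("2021.06.17. 55,060read"))  # 2021-06-17
print(parse_leading_date("2021.06.17."))             # 2021-06-17
```

Both inputs give the same date, which is why removing the span elements is optional once the format ignores trailing data.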
Thank you so much! Please solve this one too!
https://forum.fivefilters.org/t/failed-to-fetch-url-why/1226/3
I’m having a similar issue extracting the date from the following HTML:
<div class="author-date">By <span class="author">John Doe</span> on 05/26/2023</div>
I’m using the following options in Feed Creator:
I’m adding the + sign to ignore anything coming before the date, but it’s not working.
Any other options to extract the date?
Thanks.
It doesn’t look like you can isolate the date using CSS selectors for the HTML you’ve provided. So I don’t think you can get the date to be parsed by Feed Creator.
As for using the + sign, that’s part of PHP’s date format handling. It can be used for trailing data only, not leading data. From the docs:
+
If this format specifier is present, trailing data in the string will not cause an error, but a warning instead
You could try something like the following format, but I’m not sure it will match all entries if the author names have more than two or three parts:
***** m/d/Y
You might get more consistent results if you use Feed Creator’s cleanup to remove span.author and then use a format like:
**** m/d/Y
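To see why removing span.author helps, here is a rough Python sketch of the cleanup-then-parse idea. It is an analogue only, not what Feed Creator runs. The helper name date_without_author is hypothetical, and the regex-based tag handling is an assumption that only suits simple snippets like the one in the question.

```python
import re
from datetime import date, datetime


def date_without_author(html: str) -> date:
    """Drop the author span, strip the remaining tags, then parse an m/d/Y date."""
    # Roughly what a span.author cleanup rule would remove
    html = re.sub(r'<span class="author">.*?</span>', "", html, flags=re.S)
    # Strip the remaining tags to get plain text, e.g. "By  on 05/26/2023"
    text = re.sub(r"<[^>]+>", "", html)
    match = re.search(r"\d{2}/\d{2}/\d{4}", text)
    if match is None:
        raise ValueError(f"no m/d/Y date in {text!r}")
    return datetime.strptime(match.group(0), "%m/%d/%Y").date()


snippet = (
    '<div class="author-date">By <span class="author">John Doe</span>'
    " on 05/26/2023</div>"
)
print(date_without_author(snippet))  # 2023-05-26
```

Once the author name is gone, the text before the date no longer varies with the number of name parts, so a fixed pattern can match every entry.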
Both of your suggestions work perfectly. Thanks so much @fivefilters!