HTML tags in article content garble the parsed result

Thanks for your great tool! When I use Push to Kindle to send a page from Safari Books Online, it does not handle HTML tags that appear in the article content well. It seems to treat them as real HTML tags, so they disappear from the text and break the layout.
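A minimal sketch of the symptom, in case it helps narrow things down. This is only a guess at the kind of thing that might be going on, not Push to Kindle's actual code: the `source` string and the `extract_text` helper are made up for illustration, and it uses Python's standard `html.parser`. Tags that an article merely talks about are escaped in the page source, so they should survive as visible text; if the extracted text is parsed as HTML a second time, they are swallowed as markup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes only; real tags are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(markup):
    # Hypothetical helper: return the visible text of an HTML fragment.
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


# Hypothetical page source: the article mentions tags, so they are escaped.
source = "<p>Simple cases like &lt;B&gt; and &lt;HR&gt; are easy.</p>"

# Parsed once: the escaped tags come out as literal text.
once = extract_text(source)

# Parsed a second time: the literal tags are now read as real markup and vanish.
twice = extract_text(once)

print(repr(once))
print(repr(twice))
```

The second result has empty gaps where the tags used to be, which looks a lot like the excerpt below.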

With a tool like egrep, it doesn’t seem particularly common or useful to simply match lines with HTML tags. But, exploring a regular expression that matches HTML tags exactly can be quite fruitful, especially when we delve into more advanced tools in the next chapter.

Looking at simple cases like ‘’ and ‘


’, we might think to try <.>. This simplistic approach is a frequent first thought, but it’s certainly incorrect. Converting <.> into English reads “match a ‘<

Hi there, could you please give us the URL of the page you’re trying to send so we can take a look?