Parsing IETF documentation

Push to Kindle is inconsistent in its ability to parse IETF documents in the “htmlized” format. For example, draft-ietf-oauth-sd-jwt-vc-01 does not parse correctly. The problem does not affect every document available through IETF.

Environment: Push to Kindle on Chrome 119, macOS Sonoma

What’s the best way to resolve this without using the paste functionality?

Thanks,
-dhs

@dhs, I just wrote a site config for this page, but I need a bit more information before setting it live.

Because we can match our configs only by hostname (domain/subdomain), we can’t write a config that applies only to /doc/html/, so I need to make sure my config doesn’t make things worse for other parts of the IETF site.

  • Are there pages in other formats under datatracker.ietf.org? Please post example links.
  • Please post other links to htmlized pages where you saw different P2K behavior.
  • Are there subdomains other than datatracker worth looking at? Example links, please.
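
To illustrate the hostname-only matching described above, here is a minimal Python sketch. It is not Push to Kindle's actual code: the `CONFIGS` table, the `find_config` function, and the wildcard fallback order are all assumptions made for illustration. It shows why a rule cannot be limited to /doc/html/: the URL path never takes part in the lookup.

```python
from urllib.parse import urlparse

# Hypothetical config table keyed by hostname; a leading "." marks a
# wildcard entry that covers every subdomain (e.g. *.ietf.org).
CONFIGS = {
    "datatracker.ietf.org": "config for datatracker",
    ".ietf.org": "config for all ietf.org subdomains",
}

def find_config(url):
    """Return the config for a URL, looking only at its hostname."""
    host = (urlparse(url).hostname or "").lower()
    if host in CONFIGS:
        return CONFIGS[host]
    # Fall back to wildcard entries, most specific parent domain first.
    parts = host.split(".")
    for i in range(1, len(parts)):
        key = "." + ".".join(parts[i:])
        if key in CONFIGS:
            return CONFIGS[key]
    return None
```

In this sketch, any two URLs on datatracker.ietf.org get the same config regardless of path, which is why a change aimed at the htmlized pages could also affect other page types on that host.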

Configs for *.ietf.org are now live. Please test and report any problems with example URLs.

I tested a handful of URLs that were previously broken, and they all appear to work. URLs that worked previously continue to work without any issues. Thank you for the quick support!

Thanks @HolgerAusB and @dhs! Great to hear extraction has improved! :+1:

I found another doc that doesn’t parse correctly.

After the passage "An OAuth 2.0 application using this specification MUST specify what well-known URI string it will use for this purpose. The same protected resource MAY choose to publish its metadata at multiple well-known locations relative to its resource identifier, for example, publishing metadata at both /.well-known/example-resource-configuration and /.well-known/oauth-protected-resource." the parser skips the remainder of Section 2 and moves on to Section 3.

Yes, the automatic cleanup was too aggressive. I switched it off and also removed the pilcrows. Unfortunately, I don’t know how long it takes for the cache to empty in P2K. Please try again in an hour or two.
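
For readers curious what removing the pilcrows amounts to, here is a rough Python sketch. It is not the site config itself and not how P2K does it. It assumes the paragraph markers appear as links such as `<a href="#section-2-1" class="pilcrow">¶</a>`; that markup and the `strip_pilcrows` helper are assumptions for illustration.

```python
import re

# Assumption: pilcrow links carry class="pilcrow", e.g.
# <a href="#section-2-1" class="pilcrow">¶</a>
PILCROW_LINK = re.compile(
    r'<a\b[^>]*\bclass="[^"]*\bpilcrow\b[^"]*"[^>]*>.*?</a>',
    re.DOTALL,
)

def strip_pilcrows(html):
    """Remove pilcrow anchor elements, then any stray pilcrow characters."""
    without_links = PILCROW_LINK.sub("", html)
    return without_links.replace("\u00b6", "")
```

A regular expression is enough for a quick check like this; a real extractor would more likely select the elements with an HTML parser.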

Thanks. Looks like the filter is updated and working correctly. Thanks for the quick response!