A bug in PDFTextStream v1.2 may cause some Icelandic characters to be output incorrectly. The issue appears only if:
- PDFTextStream is configured with strictEncoding set to true (via PDFTextStreamOptions.setUseStrictEncoding(boolean))
- PDFTextStream is used to extract text and metadata from a PDF containing certain Icelandic characters, including Ð (Eth), ð (eth), Þ (Thorn), and þ (thorn)
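If you want to know whether a document you process is likely to be affected, one rough approach is to check text you already know it contains (a title or a reference transcript, for example) for the four characters named above. The following is only a minimal sketch: the class and method names are hypothetical, it does not use the PDFTextStream API at all, and since the list above is introduced with "including", other characters may be affected as well.

```java
// Hypothetical helper, not part of PDFTextStream.
final class IcelandicCharCheck {
    // The four characters named in this notice, as Unicode escapes:
    // U+00D0 (Eth), U+00F0 (eth), U+00DE (Thorn), U+00FE (thorn).
    private static final String NAMED_CHARACTERS = "\u00D0\u00F0\u00DE\u00FE";

    // Returns true if the text contains at least one of the named characters.
    static boolean containsAffectedCharacters(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (NAMED_CHARACTERS.indexOf(text.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "Þingvellir" starts with a capital Thorn; "Reykjavik" has none of the four.
        System.out.println(containsAffectedCharacters("\u00DEingvellir"));
        System.out.println(containsAffectedCharacters("Reykjavik"));
    }
}
```

A document whose known text passes this check and which is extracted with strictEncoding enabled meets both conditions listed above.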
We have found the root of the problem and are developing a fix. A bugfix release containing it will be available by the end of this week.
Update: This issue has been resolved, but not by a bugfix release. It originally arose because of a malformed PDF document. See this post for the gory details…