podlators 2.2.0

While incorporating some of the podlators fixes into the Perl that will be released with lenny, Niko Tyni found more problems with the current implementation. I'd written the test suite to force the output encoding on the file handle used by Pod::Man and Pod::Text, but pod2man and pod2text weren't doing that, and hence the output wasn't properly encoded.

I'm finding Perl's Unicode support very tricky, particularly if I want to support the PERL_UNICODE option that adds input and output encodings on all file handles. I can't just use encode on the text before output, because any encoding layer that PERL_UNICODE has put on the file handle will then encode it a second time. The approach taken in this release is to force the appropriate output encoding on the output file handle (possibly destroying caller state). This works, but it's ugly.
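To make the double-encoding problem concrete, here is a minimal sketch. It is not code from podlators: the written_bytes helper is hypothetical, and it assumes a Perl built with PerlIO so that in-memory file handles are available. It writes the same string three ways and compares the resulting bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: print $text to an in-memory file handle that has the
# given PerlIO layer, optionally encoding it by hand first, and return the
# bytes that ended up in the "file".
sub written_bytes {
    my ($text, $layer, $pre_encode) = @_;
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "cannot open in-memory file: $!";
    binmode($fh, $layer) or die "cannot set layer $layer: $!";
    my $data = $pre_encode ? encode('UTF-8', $text) : $text;
    print {$fh} $data;
    close($fh) or die "cannot close in-memory file: $!";
    return $buffer;
}

my $text = "na\x{ef}ve";    # five characters, one of them U+00EF

# Let the file handle layer do the encoding.
my $layer_only = written_bytes($text, ':encoding(UTF-8)', 0);

# Encode by hand and write to a handle with no encoding layer.
my $encode_only = written_bytes($text, ':raw', 1);

# Encode by hand and then write through an encoding layer as well.
my $both = written_bytes($text, ':encoding(UTF-8)', 1);

printf "layer only:  %d bytes\n", length $layer_only;
printf "encode only: %d bytes\n", length $encode_only;
printf "both:        %d bytes\n", length $both;
```

The first two cases produce identical output. In the third, the layer treats each already-encoded byte as a character and encodes it again, so the output is longer and no longer valid as the intended text.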

Pod::Text had another problem. Its historic behavior is to copy the input encoding to the output, but doing that with properly-tagged UTF-8 POD input means printing out wide characters in the internal encoding, resulting in Perl warnings. In the absence of the utf8 option, this version forces the output encoding to match the input encoding if possible. This seems like the best compromise and matches its historic behavior. pod2text also gains a --utf8 option, since it's now useful to be able to force a different output encoding.

I wanted to get something out to stop the bleeding, but there are a few problems with the current approach. As mentioned above, I'm forcing file handle encoding layers, which may destroy encoding layers that the caller set up intentionally. This implementation also depends on PerlIO support, which is an optional Perl feature and not always enabled.

I think the right long-term approach is to probe for PerlIO support, and if present, use it to check the encoding layer on the output file handle. If PerlIO is not available, or if no encoding layer is present, I can safely use encode before output, which doesn't mess with the file handle state and is much cleaner. If PerlIO is present and an output encoding is set, I can just trust that output encoding (although if the utf8 option is set and the output encoding is something else, it may be a good idea to issue a warning).
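A rough sketch of what that long-term approach could look like follows. This is only an illustration under assumptions, not the eventual podlators implementation: has_encoding_layer and write_text are hypothetical names, the layer check is a simple match on the layer names that PerlIO::get_layers reports, and the warning for a mismatched utf8 option is left out.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: true if $fh already has a UTF-8 or :encoding layer on
# its output side.  The eval guards against a Perl without PerlIO support,
# in which case we report that no layer is present.
sub has_encoding_layer {
    my ($fh) = @_;
    my @layers = eval { PerlIO::get_layers($fh, output => 1) };
    return scalar grep { defined($_) && /^(?:utf8\z|encoding\()/ } @layers;
}

# Hypothetical helper: write $text to $fh, encoding it by hand only when the
# file handle will not do the encoding itself.  The handle's layers are left
# untouched either way.
sub write_text {
    my ($fh, $text, $encoding) = @_;
    my $data = has_encoding_layer($fh) ? $text : encode($encoding, $text);
    print {$fh} $data;
    return;
}

my $text = "na\x{ef}ve";    # five characters, one of them U+00EF
my ($plain, $layered) = ('', '');

# One handle with no encoding layer, one where the caller chose UTF-8.
open(my $plain_fh, '>:raw', \$plain)
    or die "cannot open in-memory file: $!";
open(my $layered_fh, '>:encoding(UTF-8)', \$layered)
    or die "cannot open in-memory file: $!";

write_text($plain_fh,   $text, 'UTF-8');
write_text($layered_fh, $text, 'UTF-8');

close($plain_fh)   or die "cannot close in-memory file: $!";
close($layered_fh) or die "cannot close in-memory file: $!";

print $plain eq $layered ? "same bytes\n" : "different bytes\n";
```

Both handles end up with the same correctly encoded bytes, and neither has its layers changed, which is the property the current release gives up by forcing a layer onto the handle.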

I'm not sure when I'll have time to implement this, though, since it will require another reworking of the code and more testing, and I'm behind on a number of other projects.

In any event, the current version will work for most people in most cases. You can get it from the podlators distribution page.

Posted: 2008-10-05 12:52
