Fix `TypeError` by ignoring null characters in PSBaseParser (#768)

* Ignore null characters in PSBaseParser

  Previously, null characters were tokenized as PSKeyword tokens. This caused issue #617: pdfdevice.py would try to decode the null-character PSKeyword where it expects a byte string, causing pdfminer.six to crash. As null characters are superfluous within PSBaseParser, ignore them.

* Update CHANGELOG.md

Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
2022-06-26 15:46:39 +00:00
parent f63e9fbee9
commit ebf92acf0c
2 changed files with 3 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,6 +11,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 - `ValueError` when trying to decrypt empty metadata values ([#766](https://github.com/pdfminer/pdfminer.six/issues/766))
 - Sphinx errors during building of documentation ([#760](https://github.com/pdfminer/pdfminer.six/pull/760))
 - `TypeError` when getting default width of font ([#720](https://github.com/pdfminer/pdfminer.six/issues/720))
+- `TypeError` in cmapdb.py when parsing null characters ([#768](https://github.com/pdfminer/pdfminer.six/pull/768))
 ### Deprecated
--- a/pdfminer/psparser.py
+++ b/pdfminer/psparser.py
@@ -334,6 +334,8 @@ class PSBaseParser:
            self._curtoken = b""
            self._parse1 = self._parse_wclose
            return j + 1
+        elif c == b"\x00":
+            return j + 1
        else:
            self._add_token(KWD(c))
            return j + 1
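
The effect of the new branch can be illustrated with a toy tokenizer. This is a minimal sketch, not pdfminer.six's actual `_parse_main`: the `Keyword` class and the `tokenize` function are hypothetical stand-ins, and the only behavior modeled is the fall-through that wraps an unrecognized byte in a keyword token versus the added branch that drops `b"\x00"`.

```python
from typing import List


class Keyword:
    """Hypothetical stand-in for a keyword token (not pdfminer's PSKeyword)."""

    def __init__(self, name: bytes) -> None:
        self.name = name

    def __repr__(self) -> str:
        return f"Keyword({self.name!r})"


def tokenize(data: bytes, skip_nul: bool) -> List[Keyword]:
    """Toy dispatch loop: whitespace is skipped, every other byte falls
    through to a one-byte keyword, unless skip_nul drops NUL bytes first."""
    tokens: List[Keyword] = []
    for i in range(len(data)):
        c = data[i : i + 1]  # one-byte slice, so c is bytes rather than int
        if c.isspace():
            continue
        if skip_nul and c == b"\x00":
            continue  # mirrors the added `elif c == b"\x00"` branch
        tokens.append(Keyword(c))  # mirrors the `else` fall-through
    return tokens


before = tokenize(b"a \x00b", skip_nul=False)
after = tokenize(b"a \x00b", skip_nul=True)
print(before)
print(after)
```

With `skip_nul=False` the null byte surfaces as its own keyword token, which is the kind of token a consumer expecting a byte string could then choke on. With `skip_nul=True` the null byte produces no token at all.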