Fix for when 'trailer' is indented (#513)

* Fix for when 'trailer' is indented

Closes #214

* Address CR comments - strip line after parsing

* Update CHANGELOG.md

Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
Jake Stockwin 2020-10-24 17:55:07 +01:00 committed by GitHub
parent 61300eef70
commit ec223d1f1d
2 changed files with 5 additions and 5 deletions

@@ -31,6 +31,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 ### Fixed
 - Rename PDFTextExtractionNotAllowedError to PDFTextExtractionNotAllowed to revert breaking change ([#461](https://github.com/pdfminer/pdfminer.six/pull/461))
 - Always try to get CMap, not only for identity encodings ([#438](https://github.com/pdfminer/pdfminer.six/pull/438))
+- Recognizing 'trailer' keyword with spaces as prefix or suffix ([#513](https://github.com/pdfminer/pdfminer.six/pull/513))
 
 ## [20200720]

@@ -93,16 +93,15 @@ class PDFXRef(PDFBaseXRef):
         while True:
             try:
                 (pos, line) = parser.nextline()
-                if not line.strip():
+                line = line.strip()
+                if not line:
                     continue
             except PSEOF:
                 raise PDFNoValidXRef('Unexpected EOF - file corrupted?')
-            if not line:
-                raise PDFNoValidXRef('Premature eof: %r' % parser)
             if line.startswith(b'trailer'):
                 parser.seek(pos)
                 break
-            f = line.strip().split(b' ')
+            f = line.split(b' ')
             if len(f) != 2:
                 error_msg = 'Trailer not found: {!r}: line={!r}'\
                     .format(parser, line)
@@ -118,7 +117,7 @@ class PDFXRef(PDFBaseXRef):
                     (_, line) = parser.nextline()
                 except PSEOF:
                     raise PDFNoValidXRef('Unexpected EOF - file corrupted?')
-                f = line.strip().split(b' ')
+                f = line.split(b' ')
                 if len(f) != 3:
                     error_msg = 'Invalid XRef format: {!r}, line={!r}'\
                         .format(parser, line)
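
For reviewers, here is a minimal standalone sketch of the behaviour the first hunk targets. It does not import pdfminer. The helper names `is_trailer_line` and `parse_subsection_header` are hypothetical and exist only for illustration, and the byte strings are made-up stand-ins for what `parser.nextline()` might return. It shows why checking `startswith(b'trailer')` on the raw line misses an indented keyword, and why splitting an unstripped header line can yield the wrong number of fields.

```python
from typing import Tuple


def is_trailer_line(raw_line: bytes, strip_first: bool) -> bool:
    """Report whether a line is recognized as the 'trailer' keyword.

    Hypothetical helper: strip_first=False mimics checking the raw line,
    strip_first=True mimics stripping the line before the check.
    """
    line = raw_line.strip() if strip_first else raw_line
    return line.startswith(b'trailer')


def parse_subsection_header(raw_line: bytes) -> Tuple[int, int]:
    """Parse an xref subsection header such as b'0 6' into (start, nobjs).

    Hypothetical helper: the line is stripped once, then split on single
    spaces, so surrounding whitespace does not add extra fields.
    """
    fields = raw_line.strip().split(b' ')
    if len(fields) != 2:
        raise ValueError('Expected two fields, got {!r}'.format(fields))
    start, nobjs = map(int, fields)
    return start, nobjs


if __name__ == '__main__':
    indented = b'   trailer \n'
    # Without stripping, the leading spaces hide the keyword.
    print(is_trailer_line(indented, strip_first=False))
    # With stripping, the keyword is found.
    print(is_trailer_line(indented, strip_first=True))
    # Trailing whitespace no longer produces a third field.
    print(parse_subsection_header(b'0 6 \n'))
```

Stripping once, immediately after reading the line, keeps the blank-line check, the keyword check, and the field split all operating on the same normalized value.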