Fix for when 'trailer' is indented (#535)

* Fix for when trailer is indented

* Store stripped line

* This commit breaks things...

* Or maybe this one breaks things?

* Remove commented code because no longer used.

* Add CHANGELOG.md

* Add poetry venv management files to gitignore since I started using poetry to manage the python envs for this project

Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
pull/663/head
Jake Stockwin 2021-08-15 16:49:56 +01:00 committed by GitHub
parent 016239c146
commit 19c1372984
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 11 additions and 6 deletions

.gitignore

@@ -17,5 +17,9 @@ tests/*.xml
 tests/*.txt
 .idea/
 .tox/
+# python venv management tools
 Pipfile
 Pipfile.lock
+pyproject.toml
+poetry.lock

CHANGELOG.md

@@ -8,6 +8,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 ### Fixed
 - Fix issue of TypeError: cannot unpack non-iterable PDFObjRef object, when unpacking the value of 'DW2' ([#529](https://github.com/pdfminer/pdfminer.six/pull/529))
 - `PermissionError` when creating temporary filepaths on windows when running tests ([#469](https://github.com/pdfminer/pdfminer.six/issues/469))
+- Detecting trailer correctly when surrounded with needless whitespace ([#535](https://github.com/pdfminer/pdfminer.six/pull/535))
 - Fix `.paint_path` logic for handling single line segments and extracting point-on-curve positions of Beziér path commands ([#530](https://github.com/pdfminer/pdfminer.six/pull/530))
 ## Removed

pdfminer/pdfparser.py

@@ -93,16 +93,15 @@ class PDFXRef(PDFBaseXRef):
         while True:
             try:
                 (pos, line) = parser.nextline()
-                if not line.strip():
+                line = line.strip()
+                if not line:
                     continue
             except PSEOF:
                 raise PDFNoValidXRef('Unexpected EOF - file corrupted?')
-            if not line:
-                raise PDFNoValidXRef('Premature eof: %r' % parser)
             if line.startswith(b'trailer'):
                 parser.seek(pos)
                 break
-            f = line.strip().split(b' ')
+            f = line.split(b' ')
             if len(f) != 2:
                 error_msg = 'Trailer not found: {!r}: line={!r}'\
                     .format(parser, line)
@@ -116,9 +115,10 @@ class PDFXRef(PDFBaseXRef):
             for objid in range(start, start+nobjs):
                 try:
                     (_, line) = parser.nextline()
+                    line = line.strip()
                 except PSEOF:
                     raise PDFNoValidXRef('Unexpected EOF - file corrupted?')
-                f = line.strip().split(b' ')
+                f = line.split(b' ')
                 if len(f) != 3:
                     error_msg = 'Invalid XRef format: {!r}, line={!r}'\
                         .format(parser, line)
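
To illustrate why the stripped line has to be stored before the `startswith(b'trailer')` check, here is a minimal standalone sketch. It does not use the pdfminer.six parser: `find_trailer` and its `strip_first` flag are hypothetical names introduced only for this example, and the input is a plain list of byte strings rather than a `parser.nextline()` stream.

```python
from typing import Iterable, Optional


def find_trailer(lines: Iterable[bytes], strip_first: bool = True) -> Optional[int]:
    """Return the index of the first line starting with b'trailer', or None.

    Hypothetical helper for illustration only; not part of pdfminer.six.
    With strip_first=False the raw line is tested, which mimics the
    behaviour before this change.
    """
    for index, line in enumerate(lines):
        # Blank lines are skipped in both modes, as in the original loop.
        if not line.strip():
            continue
        candidate = line.strip() if strip_first else line
        if candidate.startswith(b'trailer'):
            return index
    return None


# An xref section whose 'trailer' keyword is indented.
xref_lines = [
    b'xref\n',
    b'0 1\n',
    b'0000000000 65535 f \n',
    b'\n',
    b'   trailer\n',
]

print(find_trailer(xref_lines, strip_first=False))
print(find_trailer(xref_lines, strip_first=True))
```

With the raw line, the leading spaces make `startswith(b'trailer')` fail, so the indented keyword is missed. Once the stripped line is stored, the same check succeeds, and the later `split(b' ')` calls no longer need their own `strip()`.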