835 Commits (d04c38fb8d55f52413609b8d19dfe2aed4d3f65e)

Author SHA1 Message Date
SUZUKI Masaya d4118cf5e8 Enabled PDFDevice in the with statement (#88) 2017-08-18 08:15:04 +02:00
Peter Bittner e39800f14c Move package description into package docstring (#87)
Convert Windows/DOS line endings CR/LF to Unix LF (again!)

Add Python 3.6 to classifiers, update project URL
2017-08-18 08:13:15 +02:00
Venelin Stoykov 5ef5484bbe Add tox configuration for easy local testing (#85) 2017-08-18 08:11:32 +02:00
Venelin Stoykov 171cdcc69d Microoptimization for singlebyte fonts (#84)
Instead of a list comprehension that calls a function to get the integer value of each byte, convert the data directly to a bytearray, which is a more efficient structure for storing a list of bytes.
2017-08-18 08:10:27 +02:00
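Note on 171cdcc69d: the change described above can be illustrated with a minimal sketch. The function names are hypothetical and are not taken from pdfminer.six; the sketch assumes Python 3 and only shows that both approaches yield the same integer values.

```python
def codes_via_list(data: bytes) -> list:
    # Hypothetical "before": one ord() call per byte inside a list comprehension.
    return [ord(data[i:i + 1]) for i in range(len(data))]


def codes_via_bytearray(data: bytes) -> bytearray:
    # Hypothetical "after": a single conversion, no per-byte function call.
    return bytearray(data)


sample = b"AB"
same_values = list(codes_via_bytearray(sample)) == codes_via_list(sample)
```

Iterating over or indexing the bytearray gives integers, so callers that only read the values should not need to change.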
Venelin Stoykov 14de393d5e Cleanup psparser (#83)
- Do not use bytesindex function. Use native slices instead
- Fix import ordering
2017-08-18 08:10:06 +02:00
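Note on 14de393d5e: a likely reason for preferring slices, sketched here under the assumption of Python 3, is that indexing a bytes object returns an int while slicing returns bytes. The variable names are illustrative only.

```python
data = b"%PDF"

# Indexing a bytes object on Python 3 yields an integer code.
first_by_index = data[0]

# Slicing yields a bytes object of length 1, which compares cleanly with b"%".
first_by_slice = data[0:1]

index_matches_literal = first_by_index == b"%"   # int compared with bytes
slice_matches_literal = first_by_slice == b"%"   # bytes compared with bytes
```

The same distinction is relevant to the later "Compare byte with byte instead of int" entry in this log.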
Venelin Stoykov 496bfd0778 Remove leftover from removing shebangs (#81) 2017-08-18 08:09:00 +02:00
Venelin Stoykov c2432c32f1 Fix assert message for PDFLayoutAnalyzer.end_page (#80)
stack is undefined
2017-08-18 08:08:08 +02:00
Philippe Guglielmetti 4c604828e8 v. 20170720 2017-07-20 21:35:49 +02:00
Philippe Guglielmetti b010db6049 solves https://github.com/pdfminer/pdfminer.six/issues/65 2017-07-20 21:17:06 +02:00
Sergei Maertens 67bf5ab124 Compare byte with byte instead of int (#78) 2017-07-20 20:47:14 +02:00
Sergei Maertens 3e364354da Fixes #64 -- be less strict when inspecting a tree type (#76)
In the PDFStream it's possible that the /Type element is not
present, but /type is. According to the spec, these are different
elements, but in the case in point they had the same meaning.

If PDFMiner is not running in STRICT mode and /Type doesn't resolve,
a fallback to /type is used to determine the tree type.
2017-07-20 20:46:35 +02:00
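Note on 3e364354da: the fallback described in the message can be sketched roughly as follows. This is not the project's code; `get_tree_type`, the plain dict, and the `strict` parameter are assumptions made for illustration.

```python
def get_tree_type(attrs: dict, strict: bool = False):
    """Return the tree type, falling back to lowercase 'type' when not strict."""
    if "Type" in attrs:
        return attrs["Type"]
    # Outside strict mode, tolerate the non-standard lowercase key.
    if not strict and "type" in attrs:
        return attrs["type"]
    return None


lenient = get_tree_type({"type": "Pages"})
strict_result = get_tree_type({"type": "Pages"}, strict=True)
```

In strict mode the lowercase key is ignored, which matches the message's point that the spec treats /Type and /type as different elements.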
Attila Szász 938419c476 Align dumppdf tool to modified data structures. (#73)
* Align dumppdf tool to modified data structures.
TOC page numbers should also work now, counting from 1.

* Update version number.
2017-07-20 20:46:11 +02:00
Sergei Maertens d79612c455 Resolve unresolved PDFObjectRefs (#70)
Thank you !
2017-06-02 13:35:12 +02:00
Michał Pasternak 87726d8b4f No, thank you. (#69)
* Enable 3.6 and replace pycrypto with cryptodome

* Upgrade version number

* PyPI badge; whitespace cleanup
2017-05-30 10:02:24 +02:00
Hugh Secker-Walker 35a58ee5b5 Add tools/pdfstats.py which counts all LT* types in a PDF (#68) 2017-05-29 09:11:58 +02:00
Hugh Secker-Walker 488545ddc7 Add string expressions to asserts showing local data (#67) 2017-05-29 09:06:09 +02:00
Michał Pasternak fe21725f07 Please replace pycrypto with pycryptodome (#63)
* Enable 3.6 and replace pycrypto with cryptodome

* Upgrade version number
2017-05-29 09:04:38 +02:00
Anton Oleynick 4bc0a0c105 Update pdftypes.py (#61)
Fix errors with:
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 850, in process_page
    self.render_contents(page.resources, page.contents, ctm=ctm)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 860, in render_contents
    self.init_resources(resources)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 360, in init_resources
    self.fontmap[fontid] = self.rsrcmgr.get_font(objid, spec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 210, in get_font
    font = self.get_font(None, subspec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 201, in get_font
    font = PDFCIDFont(self, spec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdffont.py", line 667, in __init__
    BytesIO(self.fontfile.get_data()))
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdftypes.py", line 297, in get_data
    self.decode()
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdftypes.py", line 278, in decode
    if 'Predictor' in params:
TypeError: argument of type 'NoneType' is not iterable
2017-05-29 08:55:02 +02:00
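Note on 4bc0a0c105: the traceback ends in a membership test against a value that is None. The log does not show the actual patch, so the following is only one possible guard; `has_predictor` is a hypothetical helper.

```python
def has_predictor(params) -> bool:
    # `'Predictor' in None` raises TypeError, so check for None first.
    return params is not None and "Predictor" in params


with_key = has_predictor({"Predictor": 12})
without_params = has_predictor(None)
```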
Peter Bittner 4e59fb66de Convert Windows/DOS line endings CR/LF to Unix LF (#58) 2017-05-29 08:53:26 +02:00
Philippe Guglielmetti baddb25df6 v 20170419 (patches a stupid bug from yesterday...) 2017-04-19 14:24:13 +02:00
Philippe Guglielmetti 82af7f0aac issue #56 reproduced, solution attempt unsuccessful 2017-04-19 14:19:14 +02:00
Philippe Guglielmetti cd92883925 logging (stupid bug) 2017-04-19 13:48:45 +02:00
Philippe Guglielmetti f28ce1ebed Merge branch 'master' of https://github.com/pdfminer/pdfminer.six.git 2017-04-19 12:28:03 +02:00
Philippe Guglielmetti 11a4c8b6c1 version 20170418 2017-04-18 19:13:20 +02:00
Philippe Guglielmetti 5ef8333c5f new test fails on Linux & Travis CI. TODO: find why 2017-04-18 18:28:48 +02:00
Philippe Guglielmetti 7055862eaf solves https://github.com/pdfminer/pdfminer.six/issues/50 2017-04-18 18:20:31 +02:00
Sergei Maertens f2b0650ad5 Fixes #54 -- don't pass bytestrings through ord() (#55) 2017-04-18 16:57:53 +02:00
Philippe Guglielmetti 3427dcaf20 Merge branch 'master' of https://github.com/pdfminer/pdfminer.six.git 2017-02-06 15:56:54 +01:00
Andrew Baumann 9439a3a31a Miscellaneous bug fixes (#47)
* utils.decode_text: fix "TypeError: ord() expected string of length 1, but int found"

fixes https://github.com/goulu/pdfminer/issues/24

* pdfinterp.execute: don't assume that every keyword name can be decoded as utf-8

fixes "'str' does not support the buffer interface", https://github.com/goulu/pdfminer/issues/23

* default settings.STRICT to False, for compatibility with the original pdfminer

* PDFCIDFont: handle font registry/orderings that may be PDFObjRefs

* utils.nunpack: handle 8-byte integers
2017-02-06 14:57:01 +01:00
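Note on 9439a3a31a: the last item mentions 8-byte integers. As a rough sketch, and not the project's `utils.nunpack` itself, a big-endian unpacker that accepts 8-byte input could look like this; `unpack_big_endian` and its `default` parameter are hypothetical.

```python
import struct


def unpack_big_endian(data: bytes, default: int = 0) -> int:
    """Interpret `data` as an unsigned big-endian integer."""
    if not data:
        return default
    if len(data) == 8:
        # '>Q' is the struct format for an unsigned 8-byte big-endian integer.
        return struct.unpack(">Q", data)[0]
    # Other lengths: int.from_bytes handles any width.
    return int.from_bytes(data, "big")


eight_byte_value = unpack_big_endian(b"\x00" * 7 + b"\x01")
```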
Philippe Guglielmetti 1e5db2b02d some keywords can't be decoded 2017-01-20 10:55:50 +01:00
Philippe Guglielmetti fd63dbf62e no more skipped tests 2017-01-20 10:12:00 +01:00
Philippe Guglielmetti 9b9d69aee9 image export works again with Py3 (issue #15)
https://github.com/pdfminer/pdfminer.six/issues/15
2017-01-20 10:11:19 +01:00
Philippe Guglielmetti f094f0b380 v. 20170119 RC 2017-01-19 08:42:20 +01:00
Philippe Guglielmetti 7c96fe29ed links updated to new https://github.com/pdfminer owner 2017-01-19 08:37:53 +01:00
Philippe Guglielmetti 52feb22eeb Merge remote-tracking branch 'origin/master'
Conflicts:
	MANIFEST.in
	README.md
	pdfminer/latin_enc.py
	pdfminer/pdfdocument.py
	pdfminer/pdfinterp.py
	pdfminer/pdfpage.py
	pdfminer/pdftypes.py
	pdfminer/psparser.py
	pdfminer/utils.py
	samples/Makefile
	setup.py
2017-01-19 08:03:16 +01:00
Jin-tae Hwang 61d423d81c bugfix: if fontname is bytes then skip (#43) 2016-12-14 17:34:16 +01:00
Gabriel Augendre 6cc4abbaa8 Fix import of Django settings (#41)
Settings in Django are imported as such, see https://docs.djangoproject.com/en/1.10/topics/settings/#using-settings-in-python-code
2016-11-26 20:26:23 +01:00
Humberto Pereira e6ad15af79 Added painting information (#37)
* added color support to stroking and non-stroking color spaces

* extended LTCurve, LTLine and LTRect to save painting information

* modified PDFLayoutAnalyzer to populate the shapes with painting information
2016-11-08 20:01:58 +01:00
Antonio Ercole De Luca 0fdebc6739 Removing all the "#!/usr/bin/env python" lines, they do not need for … (#34)
* Removing all the "#!/usr/bin/env python" lines, as they are not needed for Python 3; resolves issue #19.

* Restored all the shebangs in the tools and tests folders (because they are real executables) but used "#!/usr/bin/env python" instead of "#!/usr/bin/python" as this blog points out: https://www.peterbe.com/plog/importance-of-env
Also removed the shebang from the pdfminer/psparser.py file.
2016-11-08 20:01:11 +01:00
Goulu bc78fd2bea Merge pull request #33 from pudo/backports
Backport changes in pdfminer since this summer
2016-10-31 07:45:25 +01:00
Yusuke Shinyama 8150458718 Added: a simpler ordering mode when 1<F. 2016-09-26 18:06:34 +09:00
Friedrich Lindenberg 447adcf02f fix STRICT reference 2016-09-24 12:03:22 +02:00
Friedrich Lindenberg 70918095cc Return an empty list when no `Differences` are found. 2016-09-24 11:57:11 +02:00
Goulu a7f9623b47 Merge pull request #25 from Daniel-KM/fix_tests
Fixed tests.
2016-09-23 15:52:11 +02:00
Friedrich Lindenberg 865246bd0c fix print, upstream: 0112112458 2016-09-23 15:04:07 +02:00
Friedrich Lindenberg 0cb13983f7 Backport LICENSE. 2016-09-23 14:57:28 +02:00
Friedrich Lindenberg 1820f96481 backport changes for upstream: #145, #95, #111, #117, #129, #132. 2016-09-23 14:31:31 +02:00
Friedrich Lindenberg 19155d35c6 remove lf rule 2016-09-23 14:18:26 +02:00
Yusuke Shinyama 44977b6726 Merge pull request #149 from jwilk/spelling
Fix typos
2016-09-14 13:44:11 +09:00
Jakub Wilk 5ddbecb551 Fix typos 2016-09-13 16:25:09 +02:00