345 Commits (488545ddc7942137b9a9bd359299a0382d575846)

Author SHA1 Message Date
Hugh Secker-Walker 488545ddc7 Add string expressions to asserts showing local data (#67) 2017-05-29 09:06:09 +02:00
Michał Pasternak fe21725f07 Please replace pycrypto with pycryptodome (#63)
* Enable 3.6 and replace pycrypto with cryptodome

* Upgrade version number
2017-05-29 09:04:38 +02:00
Anton Oleynick 4bc0a0c105 Update pdftypes.py (#61)
Fix errors with:
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 850, in process_page
    self.render_contents(page.resources, page.contents, ctm=ctm)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 860, in render_contents
    self.init_resources(resources)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 360, in init_resources
    self.fontmap[fontid] = self.rsrcmgr.get_font(objid, spec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 210, in get_font
    font = self.get_font(None, subspec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 201, in get_font
    font = PDFCIDFont(self, spec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdffont.py", line 667, in __init__
    BytesIO(self.fontfile.get_data()))
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdftypes.py", line 297, in get_data
    self.decode()
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdftypes.py", line 278, in decode
    if 'Predictor' in params:
TypeError: argument of type 'NoneType' is not iterable
2017-05-29 08:55:02 +02:00
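Note on the traceback above: it ends in a membership test against `None`. The following is a minimal sketch of a guard that avoids that error, assuming the decode parameters arrive as either a dict or `None`. `has_predictor` is a hypothetical helper for illustration, not the code changed in this commit.

```python
def has_predictor(params):
    """Return True if a stream's decode parameters name a predictor.

    `params` is assumed to be a dict or None. The explicit None check
    matters because `'Predictor' in None` raises
    "TypeError: argument of type 'NoneType' is not iterable".
    """
    return params is not None and 'Predictor' in params


if __name__ == "__main__":
    print(has_predictor(None))               # no parameters at all
    print(has_predictor({'Predictor': 12}))  # parameters with a predictor
```

A stream without parameters is then treated like a stream whose parameters lack the key.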
Philippe Guglielmetti baddb25df6 v 20170419 (patches a stupid bug from yesterday...) 2017-04-19 14:24:13 +02:00
Philippe Guglielmetti 82af7f0aac issue #56 reproduced, solution attempt unsuccessful 2017-04-19 14:19:14 +02:00
Philippe Guglielmetti cd92883925 logging (stupid bug) 2017-04-19 13:48:45 +02:00
Philippe Guglielmetti 11a4c8b6c1 version 20170418 2017-04-18 19:13:20 +02:00
Philippe Guglielmetti 7055862eaf solves https://github.com/pdfminer/pdfminer.six/issues/50 2017-04-18 18:20:31 +02:00
Sergei Maertens f2b0650ad5 Fixes #54 -- don't pass bytestrings through ord() (#55) 2017-04-18 16:57:53 +02:00
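Note on the entry above: in Python 3, indexing or iterating a `bytes` object yields `int` values, and `ord()` rejects an `int`. The following is a small sketch of a helper that tolerates both cases. `byte_value` is a hypothetical name, and the actual change in this commit may look different.

```python
def byte_value(c):
    """Return the integer value of one element of a str or bytes sequence."""
    # Elements taken from a Python 3 bytes object are already ints;
    # passing them to ord() would raise TypeError.
    if isinstance(c, int):
        return c
    # Length-1 str or bytes values still go through ord().
    return ord(c)


if __name__ == "__main__":
    print([byte_value(c) for c in b'AB'])
    print(byte_value('A'))
```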
Andrew Baumann 9439a3a31a Miscellaneous bug fixes (#47)
* utils.decode_text: fix "TypeError: ord() expected string of length 1, but int found"

fixes https://github.com/goulu/pdfminer/issues/24

* pdfinterp.execute: don't assume that every keyword name can be decoded as utf-8

fixes "'str' does not support the buffer interface", https://github.com/goulu/pdfminer/issues/23

* default settings.STRICT to False, for compatibility with the original pdfminer

* PDFCIDFont: handle font registry/orderings that may be PDFObjRefs

* utils.nunpack: handle 8-byte integers
2017-02-06 14:57:01 +01:00
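Note on the last item above: the following sketches one way to unpack byte strings of any length, including 8 bytes, into an integer. It assumes big-endian, unsigned values and uses `int.from_bytes`, which is not necessarily how `utils.nunpack` does it. `nunpack_sketch` is a hypothetical name.

```python
def nunpack_sketch(data, default=0):
    """Interpret `data` (bytes) as a big-endian unsigned integer.

    An empty input returns `default`. Byte order and signedness are
    assumptions of this sketch.
    """
    if not data:
        return default
    return int.from_bytes(data, byteorder='big', signed=False)


if __name__ == "__main__":
    print(nunpack_sketch(b'\x01\x00'))             # 2-byte value
    print(nunpack_sketch(b'\x00' * 7 + b'\x01'))   # 8-byte value
```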
Philippe Guglielmetti 9b9d69aee9 image export works again with Py3 (issue #15)
https://github.com/pdfminer/pdfminer.six/issues/15
2017-01-20 10:11:19 +01:00
Philippe Guglielmetti f094f0b380 v. 20170119 RC 2017-01-19 08:42:20 +01:00
Philippe Guglielmetti 52feb22eeb Merge remote-tracking branch 'origin/master'
Conflicts:
	MANIFEST.in
	README.md
	pdfminer/latin_enc.py
	pdfminer/pdfdocument.py
	pdfminer/pdfinterp.py
	pdfminer/pdfpage.py
	pdfminer/pdftypes.py
	pdfminer/psparser.py
	pdfminer/utils.py
	samples/Makefile
	setup.py
2017-01-19 08:03:16 +01:00
Jin-tae Hwang 61d423d81c bugfix: if fontname is bytes then skip (#43) 2016-12-14 17:34:16 +01:00
Gabriel Augendre 6cc4abbaa8 Fix import of Django settings (#41)
Settings in Django are imported as such, see https://docs.djangoproject.com/en/1.10/topics/settings/#using-settings-in-python-code
2016-11-26 20:26:23 +01:00
Humberto Pereira e6ad15af79 Added painting information (#37)
* added color support to stroking and non stroking color spaces

* extended LTCurve, LTLine and LTRect to save painting information

* modified PDFLayoutAnalyzer to populate the shapes with painting information
2016-11-08 20:01:58 +01:00
Antonio Ercole De Luca 0fdebc6739 Removing all the "#!/usr/bin/env python" lines, they do not need for … (#34)
* Removing all the "#!/usr/bin/env python" lines, as they are not needed for Python 3, solving issue number: #19.

* Restored all the shebangs in the tools and tests folders (because they are real executables) but used "#!/usr/bin/env python" instead of "#!/usr/bin/python", as this blog post points out: https://www.peterbe.com/plog/importance-of-env
Also removed the shebang from the pdfminer/psparser.py file.
2016-11-08 20:01:11 +01:00
Yusuke Shinyama 8150458718 Added: a simpler ordering mode when 1<F. 2016-09-26 18:06:34 +09:00
Friedrich Lindenberg 447adcf02f fix STRICT reference 2016-09-24 12:03:22 +02:00
Friedrich Lindenberg 70918095cc Return an empty list when no `Differences` are found. 2016-09-24 11:57:11 +02:00
Friedrich Lindenberg 865246bd0c fix print, upstream: 0112112458 2016-09-23 15:04:07 +02:00
Friedrich Lindenberg 0cb13983f7 Backport LICENSE. 2016-09-23 14:57:28 +02:00
Friedrich Lindenberg 1820f96481 backport changes for upstream: #145, #95, #111, #117, #129, #132. 2016-09-23 14:31:31 +02:00
Jakub Wilk 5ddbecb551 Fix typos 2016-09-13 16:25:09 +02:00
Yusuke Shinyama 3068dcdb4a Merge pull request #145 from vinayak-mehta/glyphlist_link
Replace old Adobe glyphlist link
2016-09-12 00:18:24 +09:00
Yusuke Shinyama c753dbac4c Merge pull request #117 from native-api/png_pred_errors
make ValueError's descriptive
2016-09-11 23:55:34 +09:00
Yusuke Shinyama f1dd9ea6d2 Merge pull request #129 from lucanaso/lucanaso-patch-1
Fixed for rendering non breaking spaces (cid:160)
2016-09-11 23:53:03 +09:00
Yusuke Shinyama 177a4ab937 Fixed: #132 (PDFStream.get_filters: support multiple parameterless filters) 2016-09-11 23:52:13 +09:00
Yusuke Shinyama e95a483790 Merge pull request #134 from speedplane/feature/Fix-Get-Filters
Fix Bug with PDF Stream Decoder
2016-09-11 23:48:42 +09:00
Yusuke Shinyama 64fe538b24 Fixed: #114 (UnicodeEncodeError in PSLiteral) 2016-09-11 23:43:22 +09:00
Vinayak Mehta 2926002017 Replace old Adobe glyphlist link 2016-09-08 16:34:53 +05:30
Philippe Guglielmetti 881ea17553 v 20160614 2016-06-14 19:02:07 +02:00
speedplane 2049462f6f Revert changes unrelated to this branch. 2016-06-13 23:42:21 -04:00
speedplane b0b8818a41 Fix a bug with pdfminer which occurs when two or more filters are applied to a stream, even though no parameters are specified. The code would previously drop all of the streams after the first due to misapplication of the zip function. 2016-06-13 23:35:11 -04:00
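Note on the entry above: `zip` stops at its shortest input, so pairing several filters with a shorter parameter list silently drops the extra filters. The following sketch pads the parameters with `None` first. It assumes filters and parameters are plain lists, and `pair_filters` is a hypothetical helper, not the code in `PDFStream.get_filters`.

```python
def pair_filters(filters, params=None):
    """Pair each filter with its parameters, or None when it has none."""
    params = list(params or [])
    # Pad so every filter keeps a slot. Without the padding, zip()
    # would truncate the result to the length of `params`.
    params += [None] * (len(filters) - len(params))
    return list(zip(filters, params))


if __name__ == "__main__":
    names = ['ASCII85Decode', 'FlateDecode']
    print(list(zip(names, [None])))  # truncated: only the first filter survives
    print(pair_filters(names))       # both filters kept
```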
Friedrich Lindenberg 1d54ecd31c Make the logger run in a namespace. 2016-05-20 21:12:05 +02:00
Philippe Guglielmetti 21fd2bbd23 v 20160202 with Py 2.6 & Py 3.5 support 2016-02-02 15:38:51 +01:00
Goulu 5a23fad6fd Merge pull request #14 from orangain/close-device
Close device to write footer of xml/html files
2016-01-18 11:22:35 +01:00
Goulu 2103e5875e Merge pull request #13 from orangain/include-cmap
Include compiled cmap resources to simplify installation for CJK languages
2016-01-18 11:22:08 +01:00
Steve Hair 92c71436b9 Improved settings management 2016-01-10 12:17:38 -05:00
orangain f8a051adbd Close device to write footer of xml/html files 2015-12-27 20:57:00 +09:00
orangain f1d5d681b6 Include compiled cmap resources to simplify installation for CJK languages
* Run `make cmap` and `git add pdfminer/cmap`.
* Modify MANIFEST.in not to include cmaprsrc dir in the sdist package.
* Add pdfminer/cmap/README.txt to include license in the sdist package.
* Remove installation guide specific to CJK languages from README.md.
2015-12-27 13:32:29 +09:00
lucanaso 63bb3caec2 Fixed for rendering non breaking spaces (cid:160)
As stated in the PDF specification ISO 32000-1, in the table in Annex D.2 "Latin Character Set and Encodings", pages 653 to 656 (available here: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf):
"The SPACE character shall also be encoded as 312 in MacRomanEncoding and as 240 in WinAnsiEncoding. This duplicate code shall signify a nonbreaking space; it shall be typographically the same as (U+003A) SPACE."
The duplicate key was missing, so PDFMiner was returning the string "(cid:160)".

This fix adds the duplicate key in latin_enc.py
glyphlist.py does not need to be modified as it already contains a key for non breaking space https://github.com/lucanaso/pdfminer/blob/master/pdfminer/glyphlist.py#L2755.
2015-12-09 16:47:32 +01:00
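Note on the entry above: the codes 312 and 240 in the quoted passage are octal, and read that way 240 is decimal 160, which matches the "(cid:160)" in the message. The following is a short arithmetic check. The constant names are made up for illustration and do not come from latin_enc.py.

```python
# Octal codes as quoted from the specification in the commit message.
MAC_ROMAN_NBSP = int('312', 8)  # octal 312 -> decimal 202 (0xCA)
WIN_ANSI_NBSP = int('240', 8)   # octal 240 -> decimal 160 (0xA0)

if __name__ == "__main__":
    print(MAC_ROMAN_NBSP, WIN_ANSI_NBSP)
    # Code point 160 is U+00A0, NO-BREAK SPACE.
    print(repr(chr(WIN_ANSI_NBSP)))
```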
Chris Hager 8149be1669 bugfixes 2015-12-06 00:17:58 +01:00
Chris Hager 2e1be5721f removed settings.ENFORCE_CHECK_EXTRACTABLE 2015-11-01 22:34:18 +01:00
Chris Hager b686dd0139 pdfminer/settings.py for STRICT and added ENFORCE_CHECK_EXTRACTABLE 2015-11-01 22:28:08 +01:00
Ivan Pozdeev 63c9378b8b make ValueError's descriptive 2015-08-10 03:14:51 +03:00
Alex Zagorodniuk 131cb1ea92 change STRICT to be a settings attribute 2015-06-22 19:08:35 -04:00
Goulu 623bd98452 Update __init__.py
version 20150601
2015-06-01 10:21:51 +02:00
Cathal Garvey 403711ed13 Whoops, forgot to version-gate chardet in the actual code. Thanks Travis! 2015-05-30 19:33:35 +01:00
Cathal Garvey a2ad7a6d03 Fixed some bugs preventing all tests from passing in Py2. 2015-05-30 18:02:29 +01:00