pdfminer.six

Commit Graph

Author	SHA1	Message	Date
Philippe Guglielmetti	82af7f0aac	issue #56 reproduced, solution attempt unsuccessful	2017-04-19 14:19:14 +02:00
Philippe Guglielmetti	cd92883925	logging (stupid bug)	2017-04-19 13:48:45 +02:00
Philippe Guglielmetti	11a4c8b6c1	version 20170418	2017-04-18 19:13:20 +02:00
Philippe Guglielmetti	7055862eaf	solves https://github.com/pdfminer/pdfminer.six/issues/50	2017-04-18 18:20:31 +02:00
Sergei Maertens	f2b0650ad5	Fixes #54 -- don't pass bytestrings through ord() (#55)	2017-04-18 16:57:53 +02:00
Andrew Baumann	9439a3a31a	Miscellaneous bug fixes (#47) * utils.decode_text: fix "TypeError: ord() expected string of length 1, but int found" fixes https://github.com/goulu/pdfminer/issues/24 * pdfinterp.execute: don't assume that every keyword name can be decoded as utf-8 fixes "'str' does not support the buffer interface", https://github.com/goulu/pdfminer/issues/23 * default settings.STRICT to False, for compatibility with the original pdfminer * PDFCIDFont: handle font registry/orderings that may be PDFObjRefs * utils.nunpack: handle 8-byte integers	2017-02-06 14:57:01 +01:00
Philippe Guglielmetti	9b9d69aee9	image export works again with Py3 (issue #15) https://github.com/pdfminer/pdfminer.six/issues/15	2017-01-20 10:11:19 +01:00
Philippe Guglielmetti	f094f0b380	v. 20170119 RC	2017-01-19 08:42:20 +01:00
Philippe Guglielmetti	52feb22eeb	Merge remote-tracking branch 'origin/master' Conflicts: MANIFEST.in README.md pdfminer/latin_enc.py pdfminer/pdfdocument.py pdfminer/pdfinterp.py pdfminer/pdfpage.py pdfminer/pdftypes.py pdfminer/psparser.py pdfminer/utils.py samples/Makefile setup.py	2017-01-19 08:03:16 +01:00
Jin-tae Hwang	61d423d81c	bugfix: if fontname is bytes then skip (#43)	2016-12-14 17:34:16 +01:00
Gabriel Augendre	6cc4abbaa8	Fix import of Django settings (#41) Settings in Django are imported as such, see https://docs.djangoproject.com/en/1.10/topics/settings/#using-settings-in-python-code	2016-11-26 20:26:23 +01:00
Humberto Pereira	e6ad15af79	Added painting information (#37) * added color support to stroking and non stroking color spaces * extended LTCurve, LTLine and LTRect to save painting information * modified PDFLayoutAnalyzer to populate the shapes with painting information	2016-11-08 20:01:58 +01:00
Antonio Ercole De Luca	0fdebc6739	Removing all the "#!/usr/bin/env python" lines, they do not need for … (#34) * Removing all the "#!/usr/bin/env python" lines, they do not need for python3, solving issue number: #19. * Restored all the shebangs in the tools and tests folders (because they are real executables) but used "#!/usr/bin/env python" instead of "#!/usr/bin/python" as this blog points out: https://www.peterbe.com/plog/importance-of-env Removed also the shebang from pdfminer/psparser.py file.	2016-11-08 20:01:11 +01:00
Yusuke Shinyama	8150458718	Added: a simpler ordering mode when 1<F.	2016-09-26 18:06:34 +09:00
Friedrich Lindenberg	447adcf02f	fix STRICT reference	2016-09-24 12:03:22 +02:00
Friedrich Lindenberg	70918095cc	Return an empty list when no `Differences` are found.	2016-09-24 11:57:11 +02:00
Friedrich Lindenberg	865246bd0c	fix print, upstream: `0112112458`	2016-09-23 15:04:07 +02:00
Friedrich Lindenberg	0cb13983f7	Backport LICENSE.	2016-09-23 14:57:28 +02:00
Friedrich Lindenberg	1820f96481	backport changes for upstream: #145, #95, #111, #117, #129, #132.	2016-09-23 14:31:31 +02:00
Jakub Wilk	5ddbecb551	Fix typos	2016-09-13 16:25:09 +02:00
Yusuke Shinyama	3068dcdb4a	Merge pull request #145 from vinayak-mehta/glyphlist_link Replace old Adobe glyphlist link	2016-09-12 00:18:24 +09:00
Yusuke Shinyama	c753dbac4c	Merge pull request #117 from native-api/png_pred_errors make ValueError's descriptive	2016-09-11 23:55:34 +09:00
Yusuke Shinyama	f1dd9ea6d2	Merge pull request #129 from lucanaso/lucanaso-patch-1 Fixed for rendering non breaking spaces (cid:160)	2016-09-11 23:53:03 +09:00
Yusuke Shinyama	177a4ab937	Fixed: #132 (PDFStream.get_filters: support multiple parameterless filters)	2016-09-11 23:52:13 +09:00
Yusuke Shinyama	e95a483790	Merge pull request #134 from speedplane/feature/Fix-Get-Filters Fix Bug with PDF Stream Decoder	2016-09-11 23:48:42 +09:00
Yusuke Shinyama	64fe538b24	Fixed: #114 (UnicodeEncodeError in PSLiteral)	2016-09-11 23:43:22 +09:00
Vinayak Mehta	2926002017	Replace old Adobe glyphlist link	2016-09-08 16:34:53 +05:30
Philippe Guglielmetti	881ea17553	v 20160614	2016-06-14 19:02:07 +02:00
speedplane	2049462f6f	Revert changes unrelated to this branch.	2016-06-13 23:42:21 -04:00
speedplane	b0b8818a41	Fix a bug with pdfminer which occurs when two or more filters are applied to a stream, even though no parameters are specified. The code would previously drop all of the streams after the first due to misapplication of the zip function.	2016-06-13 23:35:11 -04:00
Friedrich Lindenberg	1d54ecd31c	Make the logger run in a namespace.	2016-05-20 21:12:05 +02:00
Philippe Guglielmetti	21fd2bbd23	v 20160202 with Py 2.6 & Py 3.5 support	2016-02-02 15:38:51 +01:00
Goulu	5a23fad6fd	Merge pull request #14 from orangain/close-device Close device to write footer of xml/html files	2016-01-18 11:22:35 +01:00
Goulu	2103e5875e	Merge pull request #13 from orangain/include-cmap Include compiled cmap resources to simplify installation for CJK languages	2016-01-18 11:22:08 +01:00
Steve Hair	92c71436b9	Improved settings management	2016-01-10 12:17:38 -05:00
orangain	f8a051adbd	Close device to write footer of xml/html files	2015-12-27 20:57:00 +09:00
orangain	f1d5d681b6	Include compiled cmap resources to simplify installation for CJK languages * Run `make cmap` and `git add pdfminer/cmap`. * Modify MANIFEST.in not to include cmaprsrc dir in the sdist package. * Add pdfminer/cmap/README.txt to include license in the sdist package. * Remove installation guide specific to CJK languages from README.md.	2015-12-27 13:32:29 +09:00
lucanaso	63bb3caec2	Fixed for rendering non breaking spaces (cid:160) As stated in the PDF specification ISO 32000-1, table in Annex D.2 Latin Character Set and Encodings page 653 to 656 (available here: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf): "The SPACE character shall also be encoded as 312 in MacRomanEncoding and as 240 in WinAnsiEncoding. This duplicate code shall signify a nonbreaking space; it shall be typographically the same as (U+003A) SPACE." The duplicate key was missing, therefore PDFMiner was returning the string "(cid:160)". This fix adds the duplicate key in latin_enc.py glyphlist.py does not need to be modified as it already contains a key for non breaking space https://github.com/lucanaso/pdfminer/blob/master/pdfminer/glyphlist.py#L2755.	2015-12-09 16:47:32 +01:00
Chris Hager	8149be1669	bugfixes	2015-12-06 00:17:58 +01:00
Chris Hager	2e1be5721f	removed settings.ENFORCE_CHECK_EXTRACTABLE	2015-11-01 22:34:18 +01:00
Chris Hager	b686dd0139	pdfminer/settings.py for STRICT and added ENFORCE_CHECK_EXTRACTABLE	2015-11-01 22:28:08 +01:00
Ivan Pozdeev	63c9378b8b	make ValueError's descriptive	2015-08-10 03:14:51 +03:00
Alex Zagorodniuk	131cb1ea92	change STRICT to be a settings attribute	2015-06-22 19:08:35 -04:00
Goulu	623bd98452	Update __init__.py version 20150601	2015-06-01 10:21:51 +02:00
Cathal Garvey	403711ed13	Whoops, forgot to version-gate chardet in the actual code. Thanks Travis!	2015-05-30 19:33:35 +01:00
Cathal Garvey	a2ad7a6d03	Fixed some bugs preventing all tests from passing in Py2.	2015-05-30 18:02:29 +01:00
Cathal Garvey	79c97ac221	Docstrings.	2015-05-30 17:16:06 +01:00
Cathal Garvey	3b7edba48c	Forgot to add the actual compartmentalised function..	2015-05-30 17:04:28 +01:00
Cathal Garvey	08cb217983	Progress, progress.. not nearly atomic enough, sorry.	2015-05-30 16:14:24 +01:00
Cathal Garvey	1b47bed306	Many changes to make pdf2txt.py work better in Py3, some in that script, others in module! Sorry, changes should have been more atomic. In pdf2txt.py: * Re-wrote main function to use argparse instead of optparse. * Manually tested in Py2/Py3 to get partial consistency. * Errors abound including Tags mode, but most modes weren't working at all in Py3 anyway. * Py2 mode probably unchanged, cannot find any bugs yet... * Kept old main function for posterity, for now. In utils: * Added a few compatibility functions (some string hax required chardet, new dependency): - make_compat_bytes(in_str)-> (py3->bytes | py2->str) - make_compat_str(in_str)-> (str) - compatible_encode_method(bytesorstring, encoding, erraction)-> (str) In pdfdevice: * To handle different output filetypes in Py3, injected lots of calls to new utils methods, as well as some six.PYX checks and logic. These changes are largely responsible for enhanced Py2/Py3 consistency. In converter: * To handle output filetypes in Py2, injected a few checks and fixes particularly around the py2 `str.encode` method and its assumed usual use-analogies in Py3.	2015-05-17 21:08:57 +01:00


341 Commits (82af7f0aaca9091ab4891439e0638626e9b76af2)