pdfminer.six

Commit Graph

Author	SHA1	Message	Date
Pieter Marsman	9fd7172f7b	Cleanup utils.py	2019-10-17 12:14:02 +02:00
jet457	7e40fde320	Removing assertion in drange to allow equal inputs (#246) and mimic behaviour of built-in method range. Fixes #66, since it now allows the bbox to have 0 width or 0 height. Added tests for Plane since it is the API that uses drange.	2019-10-17 12:04:25 +02:00
Tata Ganesh	e03ecab856	Merge pull request #141 from timb07/speedup_layout: Speed up layout of text boxes	2018-11-08 20:28:40 +05:30
Tim Bell	8f8a78bb88	Remove now-unused csort()	2018-04-11 09:37:32 +10:00
Gregory Mori	335c25c045	only check for bytes input to enc() in python3. In python2, isinstance("", bytes) is true, causing enc() to suppress any string input. This results in fontnames being lost when running pdf2txt.py in python2. As this check was not present in the original python2 version of pdfminer, restrict it to only check when running in python3.	2018-04-09 12:21:59 -07:00
KOLANICH	3bf3c97bbb	Added a vector between 2 boxes which may be useful for users of the library	2018-02-16 14:49:12 +00:00
Hugh Secker-Walker	488545ddc7	Add string expressions to asserts showing local data (#67)	2017-05-29 09:06:09 +02:00
Andrew Baumann	9439a3a31a	Miscellaneous bug fixes (#47) * utils.decode_text: fix "TypeError: ord() expected string of length 1, but int found" fixes https://github.com/goulu/pdfminer/issues/24 * pdfinterp.execute: don't assume that every keyword name can be decoded as utf-8 fixes "'str' does not support the buffer interface", https://github.com/goulu/pdfminer/issues/23 * default settings.STRICT to False, for compatibility with the original pdfminer * PDFCIDFont: handle font registry/orderings that may be PDFObjRefs * utils.nunpack: handle 8-byte integers	2017-02-06 14:57:01 +01:00
Jin-tae Hwang	61d423d81c	bugfix: if fontname is bytes then skip (#43)	2016-12-14 17:34:16 +01:00
Antonio Ercole De Luca	0fdebc6739	Removing all the "#!/usr/bin/env python" lines, they do not need for … (#34) * Removing all the "#!/usr/bin/env python" lines, they do not need for python3, solving issue number: #19. * Restored all the shebangs in the tools and tests folders (because they are real executables) but used "#!/usr/bin/env python" instead of "#!/usr/bin/python" as this blog points out: https://www.peterbe.com/plog/importance-of-env Removed also the shebang from pdfminer/psparser.py file.	2016-11-08 20:01:11 +01:00
Friedrich Lindenberg	1820f96481	backport changes for upstream: #145, #95, #111, #117, #129, #132.	2016-09-23 14:31:31 +02:00
Cathal Garvey	403711ed13	Whoops, forgot to version-gate chardet in the actual code. Thanks Travis!	2015-05-30 19:33:35 +01:00
Cathal Garvey	a2ad7a6d03	Fixed some bugs preventing all tests from passing in Py2.	2015-05-30 18:02:29 +01:00
Cathal Garvey	1b47bed306	Many changes to make pdf2txt.py work better in Py3, some in that script, others in module! Sorry, changes should have been more atomic. In pdf2txt.py: * Re-wrote main function to use argparse instead of optparse. * Manually tested in Py2/Py3 to get partial consistency. * Errors abound including Tags mode, but most modes weren't working at all in Py3 anyway. * Py2 mode probably unchanged, cannot find any bugs yet... * Kept old main function for posterity, for now. In utils: * Added a few compatibility functions (some string hax required chardet, new dependency): - make_compat_bytes(in_str)-> (py3->bytes | py2->str) - make_compat_str(in_str)-> (str) - compatible_encode_method(bytesorstring, encoding, erraction)-> (str) In pdfdevice: * To handle different output filetypes in Py3, injected lots of calls to new utils methods, as well as some six.PYX checks and logic. These changes are largely responsible for enhanced Py2/Py3 consistency. In converter: * To handle output filetypes in Py2, injected a few checks and fixes particularly around the py2 `str.encode` method and its assumed usual use-analogies in Py3.	2015-05-17 21:08:57 +01:00
enkore	d0379a2c44	Fix utils.decode_text	2014-12-04 17:09:52 +01:00
cybjit	9b2e29396b	apply_png_predictor py3	2014-09-16 22:59:29 +02:00
cybjit	51a361c145	clean up HTMLConverter and XMLConverter encoding	2014-09-16 22:57:00 +02:00
unknown	29c07ea770	Python 3.4 support and tests	2014-09-03 15:26:08 +02:00
unknown	a6475b61b4	Python 3.4 support added and tested	2014-09-03 13:17:41 +02:00
unknown	faea7291a8	tests pass under Py 2.7 and 3.4	2014-09-01 14:16:49 +02:00
Yusuke Shinyama	1ccfaff411	String-Bytes distinction (first attempt).	2014-06-30 19:05:56 +09:00
Yusuke Shinyama	2e900e5d10	Fixed for consistent test results. (hopefully...)	2014-06-26 17:41:31 +09:00
Yusuke Shinyama	0387a6c260	Removed: tuple-unpacking args.	2014-06-15 12:12:13 +09:00
Yusuke Shinyama	d9680fca7e	Plane: preserve the object order so that the test result is always consistent.	2014-06-14 14:44:53 +09:00
Yusuke Shinyama	340387bfc6	Cleanup: isinstance	2014-03-28 17:50:59 +09:00
Yusuke Shinyama	636d4caeb3	Fixed the PNG predictor bug. Thanks to Gabor Molnar.	2014-03-24 19:57:05 +09:00
Yusuke Shinyama	c97ec3048e	Changed / to // for clarity.	2013-11-26 21:35:16 +09:00
Yusuke Shinyama	c8b6d4112a	Fixed: crash with negative layout bbox.	2013-11-09 15:10:14 +09:00
Matthew Duggan	2caa5edc25	PEP8: Whitespace changes to match pep8	2013-11-07 17:35:04 +09:00
Matthew Duggan	c1da8b835c	PEP8: Remove trailing whitespace	2013-11-07 16:14:53 +09:00
Yusuke Shinyama	e927bd307e	fixed: https://github.com/euske/pdfminer/issues/8	2013-10-22 18:24:39 +09:00
Yusuke Shinyama	0ea08890d4	renamed: python2 -> python.	2013-10-17 23:05:27 +09:00
Yusuke Shinyama	557c2c72e6	Removed ObjIdRange for terseness.	2013-10-10 18:34:43 +09:00
jcushman	da3f023b2d	Use set instead of list for Plane's internal collection of objects.	2012-06-22 16:36:33 -03:00
Yusuke Shinyama	46bb0107aa	fixed: crash due to small layout elements (thanks to hsoft)	2011-07-31 17:44:09 +10:00
Yusuke Shinyama	dc8fde0e47	added CCITTFaxFilter support and a very crude image extraction.	2011-07-18 21:07:00 +10:00
Yusuke Shinyama	0278076ea8	PNG predictor added	2011-06-07 00:46:33 +09:00
Yusuke Shinyama	18a5058af6	separated predictor functions.	2011-06-07 00:31:03 +09:00
Yusuke Shinyama	c134596e2f	code cleanup and testcase stabilization	2011-05-15 01:22:19 +09:00
Yusuke Shinyama	b8d516fc52	extended Plane class.	2011-05-14 14:16:40 +09:00
Yusuke Shinyama	8f9684f6a6	code cleanup: layout analysis	2011-04-21 22:07:04 +09:00
Yusuke Shinyama	18e782f330	canonicalize package names	2011-03-02 23:43:03 +09:00
Yusuke Shinyama	bb26cf9180	eliminate empty textboxes	2011-03-01 20:47:20 +09:00
Yusuke Shinyama	a8bf9b159e	docstring fix	2011-02-27 13:09:12 +09:00
Yusuke Shinyama	cabaa10e4f	layout analysis improvement	2011-02-27 12:56:28 +09:00
Yusuke Shinyama	b2d13db29a	code cleanup	2011-02-14 22:51:20 +09:00
yusuke.shinyama.dummy	9584845358	layout analysis improved git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@268 1aa58f4a-7d42-0410-adbc-911cccaed67c	2010-11-09 10:40:05 +00:00
yusuke.shinyama.dummy	1904b61355	documentation git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@266 1aa58f4a-7d42-0410-adbc-911cccaed67c	2010-11-09 10:39:40 +00:00
yusuke.shinyama.dummy	509ab66319	stay with python2 git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@264 1aa58f4a-7d42-0410-adbc-911cccaed67c	2010-10-19 09:57:01 +00:00
yusuke.shinyama.dummy	69d9d85685	nunpack TypeError fix git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@246 1aa58f4a-7d42-0410-adbc-911cccaed67c	2010-10-17 05:13:52 +00:00


64 Commits (40aa2533c98fb9c6b700891356e638bd6821ad13)