* Fix font name by removing subset tag
* Added line to CHANGELOG.md
* Add documentation and clear variable name
* Use `html.escape()` to encode strings for html and always return `str` instead of `bytes`
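
The `html.escape()` change in the last bullet can be illustrated with a minimal sketch. The helper name `to_html_text` and the UTF-8 handling of byte input are assumptions for illustration, not necessarily the project's code.

```python
import html


def to_html_text(x):
    """Escape text for HTML output, always returning str (hypothetical helper)."""
    if isinstance(x, bytes):
        # Assumption: byte input is UTF-8; undecodable bytes are replaced.
        x = x.decode("utf-8", "replace")
    # html.escape() replaces &, <, > and (by default) both quote characters.
    return html.escape(x)
```

Whatever the input type, the caller gets a `str` back, so the output code never has to mix `str` and `bytes`.
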
Fixes #186
* Treat the permissions (the /P entry) as unsigned long, fix #186
* Handle negative values for p
* Extract function for resolving a two's complement
* Add test for issue #352
* Add line to CHANGELOG.md
* Only ints can be converted to a uint using two's-complement method
* Standardize import style; multiple imports from same module on one line
Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
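
The /P handling described above amounts to reinterpreting a signed integer as an unsigned one. A minimal sketch, assuming a 32-bit width; the function name `to_unsigned` and the `bits` parameter are hypothetical, not the names used in the commit:

```python
def to_unsigned(value, bits=32):
    """Reinterpret a signed integer as unsigned via two's complement."""
    if not isinstance(value, int):
        # Only ints can be converted with the two's-complement method.
        raise TypeError("expected int, got %s" % type(value).__name__)
    if value < 0:
        # A negative value maps to 2**bits + value.
        return value + (1 << bits)
    return value
```

For example, a permissions value stored as `-4` becomes `4294967292` (`0xFFFFFFFC`), which keeps all the high permission bits set.
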
* Drop support for legacy Python 2
* Add python_requires to help pip
* Upgrade Python syntax with pyupgrade
* Upgrade Python syntax with pyupgrade --py3-plus
* Python 3 imports
* Replace six
* Update CONTRIBUTING.md
* Added line to changelog
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
* Code Refactor: Use code-style enforcement #312
* Add flake8 to travis-ci
* Remove python 2 3 comment on six library. 891 errors > 870 errors.
* Remove class and functions comments that consist of just the name. 870 errors > 855 errors.
* Fix flake8 errors in pdftypes.py. 855 errors > 833 errors.
* Moving flake8 testing from .travis.yml to tox.ini to ensure local testing before committing
* Cleanup pdfinterp.py and add documentation from PDF Reference
* Cleanup pdfpage.py
* Cleanup pdffont.py
* Clean psparser.py
* Cleanup high_level.py
* Cleanup layout.py
* Cleanup pdfparser.py
* Cleanup pdfcolor.py
* Cleanup rijndael.py
* Cleanup converter.py
* Rename klass to cls if it is the class variable, to be more consistent with standard practice
* Cleanup cmap.py
* Cleanup pdfdevice.py
* flake8 ignore fontmetrics.py
* Cleanup test_pdfminer_psparser.py
* Fix flake8 in pdfdocument.py; 339 errors to go
* Fix flake8 utils.py; 326 errors to go
* pep8 correction for few files in /tools/ 328 > 160 to go (#342)
* pep8 correction: 160 > 5 to go
* Fix ascii85.py errors
* Fix error in getting index from target that does not exist
* Remove commented print lines
* Fix flake8 error in pdfinterp.py
* Fix python2 specific error by removing argument from print statement
* Ignore invalid python2 syntax
* Update contributing.md
* Added changelog
* Remove unused import
Co-authored-by: Fakabbir Amin <f4amin@gmail.com>
In python2, isinstance("", bytes) is true, causing enc() to
suppress any string input. This results in fontnames being lost
when running pdf2txt.py in python2.
As this check was not present in the original python2 version of
pdfminer, restrict it to only check when running in python3.
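
A minimal sketch of the guard described above, assuming `enc()` drops `bytes` input and escapes everything else. The `PY3` flag stands in for whatever version check the project uses, and the escaping shown is illustrative:

```python
import sys

PY3 = sys.version_info[0] >= 3


def enc(x):
    """Escape a string for markup output (sketch of the version-guarded check)."""
    # On Python 2, str is bytes, so an unguarded isinstance(x, bytes) check
    # would discard every ordinary string. Only apply it on Python 3.
    if PY3 and isinstance(x, bytes):
        return ""
    return x.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
```
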
* utils.decode_text: fix "TypeError: ord() expected string of length 1, but int found"
fixes https://github.com/goulu/pdfminer/issues/24
* pdfinterp.execute: don't assume that every keyword name can be decoded as utf-8
fixes "'str' does not support the buffer interface", https://github.com/goulu/pdfminer/issues/23
* default settings.STRICT to False, for compatibility with the original pdfminer
* PDFCIDFont: handle font registry/orderings that may be PDFObjRefs
* utils.nunpack: handle 8-byte integers
* Removed all the "#!/usr/bin/env python" lines, as they are not needed for Python 3; resolves issue #19.
* Restored all the shebangs in the tools and tests folders (because they are real executables) but used "#!/usr/bin/env python" instead of "#!/usr/bin/python", as this blog post points out: https://www.peterbe.com/plog/importance-of-env
Also removed the shebang from the pdfminer/psparser.py file.
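
The 8-byte case added to `utils.nunpack` in the list above could look roughly like the following. This is a sketch assuming big-endian unsigned integers and Python 3 `bytes` input; the error raised for unsupported lengths is an assumption:

```python
import struct


def nunpack(s, default=0):
    """Unpack 1, 2, 3, 4 or 8 big-endian bytes into an unsigned int."""
    n = len(s)
    if n == 0:
        return default
    if n == 1:
        return struct.unpack(">B", s)[0]
    if n == 2:
        return struct.unpack(">H", s)[0]
    if n == 3:
        # struct has no 3-byte format, so pad to 4 bytes first.
        return struct.unpack(">L", b"\x00" + s)[0]
    if n == 4:
        return struct.unpack(">L", s)[0]
    if n == 8:
        return struct.unpack(">Q", s)[0]
    raise ValueError("unsupported length: %d" % n)
```
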
Sorry, changes should have been more atomic.
*In pdf2txt.py:*
* Re-wrote main function to use argparse instead of optparse.
* Manually tested in Py2/Py3 to get partial consistency.
* Errors abound including Tags mode, but most modes weren't working at all in Py3 anyway.
* Py2 mode *probably* unchanged, cannot find any bugs yet...
* Kept old main function for posterity, for now.
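
As a rough illustration of the optparse-to-argparse rewrite, a parser can be built as below. The function name `build_parser` and the option names are illustrative only and are not a record of the options pdf2txt.py actually defines:

```python
import argparse


def build_parser():
    """Build a command-line parser (hypothetical options, for illustration)."""
    parser = argparse.ArgumentParser(description="Extract text from PDF files.")
    # Positional arguments replace optparse's leftover-args list.
    parser.add_argument("files", nargs="+", help="input PDF files")
    parser.add_argument("-o", "--outfile", default="-",
                        help="output path, '-' for stdout")
    parser.add_argument("-t", "--output-type", default="text",
                        choices=["text", "html", "xml", "tag"],
                        help="output format")
    return parser
```

Unlike optparse, argparse validates positional arguments and `choices` itself, so the main function needs less manual checking.
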
*In utils:*
* Added a few compatibility functions (some string hax required chardet, new dependency):
- make_compat_bytes(in_str)-> (py3->bytes | py2->str)
- make_compat_str(in_str)-> (str)
- compatible_encode_method(bytesorstring, encoding, erraction)-> (str)
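
The three helpers listed above might be sketched as follows. The commit mentions chardet for encoding detection; this sketch assumes UTF-8 instead, and the default arguments are assumptions, so treat it as an outline of the intended behaviour rather than the committed code:

```python
import sys

PY3 = sys.version_info[0] >= 3


def make_compat_bytes(in_str):
    """Return bytes on Python 3 and a plain str on Python 2."""
    if PY3 and isinstance(in_str, str):
        return in_str.encode("utf-8")  # assumption: UTF-8, no detection
    return in_str


def make_compat_str(in_str):
    """Return the native str type on either Python version."""
    if PY3 and isinstance(in_str, bytes):
        return in_str.decode("utf-8", "replace")
    return in_str


def compatible_encode_method(bytesorstring, encoding="utf-8", erraction="ignore"):
    """Mimic Python 2 str.encode() usage, returning a native str."""
    if not PY3:
        return bytesorstring.encode(encoding, erraction)
    if isinstance(bytesorstring, str):
        return bytesorstring
    return bytesorstring.decode(encoding, erraction)
```
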
*In pdfdevice:*
* To handle different output filetypes in Py3, injected lots of calls to new utils methods,
as well as some six.PYX checks and logic. These changes are largely responsible for
enhanced Py2/Py3 consistency.
*In converter:*
* To handle output filetypes in Py2, injected a few checks and fixes particularly around the
py2 `str.encode` method and its *assumed* usual use-analogies in Py3.