* swap pycryptodome to the faster, smaller, and industry standard crytography io
* update changelog
* fixlint
* Update CHANGELOG.md
* from MR, unneeded ex and naming
* add samples to nosetests
* fix lint
* show mismatch
* fix lint
* typo and newline
* Revert "add samples to nosetests"
This reverts commit a49ca302
* Add tests for encrypted documents to nose test suite
* Optimize imports of pdfdocument.py
Co-authored-by: Oren Tysor <oren@atakama.com>
Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
* Fix converting path to multiple rectangles
For path that consists of a series of rectangles
(shape is 'mlllhmlllh...'), call paint_path again with each group of
5 points. The result is multiple rects instead of a single curve.
fixes#369
* Reduce pdf size by removing font
* Add unittest for PDFLayoutAnalyzer.paint_path()
* Add line to CHANGELOG.md
* Add reference to pdf reference manual
* Cleanup function paint_path a bit
* Reduce line length of tests
* Reduce line length of tests
Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
* Changed error to warning for 'Text extraction is not allowed'
* updated changelog
* fix lint
* made changes suggested in review
* Update CHANGELOG.md
* Add regression test for failing pdf
* Reduce line length to <80
Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
* Remove scaling font height/width with size of font bounding box
* Refactor LTChar bounding box computation
* Change expected outcome of `python tools/pdf2txt.py samples/simple3.pdf`, because it looks like an improvement. However, when I view `samples/simple3.pdf` I don't see any text at all. The change in expected outcome is explained by the fact that the bounding boxes of characters can be different, depending on the `/FontBBox` parameter of the font.
* Add test for font sizes, and for this a high-level function that returns an iterator of LTPage objects
* Add line to CHANGELOG
* Fix getting filename when extracting embedded files
* Add test for pdf that contains embedded pdf, and fix additional errors in looping over multiple xrefs
* Add line to CHANGELOG
Fixes#186
* Tread the permissions (the /P entry) as unsigned long, fix#186
* handle negative values for p
* Extract function for resolving an twos-complement
* Add test for issue #352
* Add line to CHANGELOG.md
* Only ints can be converted to a uint using two's-complement method
* Standardize import style; multiple imports from same module on one line
Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
* Drop support for legacy Python 2
* Add python_requires to help pip
* Upgrade Python syntax with pyupgrade
* Upgrade Python syntax with pyupgrade --py3-plus
* Python 3 imports
* Replace six
* Update CONTRIBUTING.md
* Added line to changelog
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>