Jake Stockwin
7254530d27
Fix ordering of textlines within a textbox when boxes_flow is disabled ( #412 )
...
* Fix ordering of textlines within a textbox when boxes_flow is disabled
* Add new test PDF sample
2020-05-09 15:37:49 +02:00
Pieter Marsman
1c3047b68b
Remove samples/ directory from source distribution to prevent downloading all pdf's when installing pdfminer.six ( #364 )
...
Fixes #363
* Remove samples/ and docs/ from source distribution. The samples/ directory contains PDFs for testing purposes, and the docs/ directory contains the Read the Docs documentation, which is published online.
* Remove issue-00152-embedded-pdf.pdf because it contains a possible exploit.
See https://www.microsoft.com/en-us/wdsi/threats/malware-encyclopedia-description?Name=Exploit%3AJS%2FShellCode.gen
And https://github.com/pdfminer/pdfminer.six/issues/363
* Added line to CHANGELOG.md
* Remove unused imports
2020-01-24 12:36:02 +01:00
Pieter Marsman
fff3ac2ba6
Fix bug in computing character bounding box ( #348 )
...
* Remove scaling font height/width with size of font bounding box
* Refactor LTChar bounding box computation
* Change expected outcome of `python tools/pdf2txt.py samples/simple3.pdf`, because it looks like an improvement. However, when I view `samples/simple3.pdf` I don't see any text at all. The change in expected outcome is explained by the fact that the bounding boxes of characters can be different, depending on the `/FontBBox` parameter of the font.
* Add test for font sizes, and for this a high-level function that returns an iterator of LTPage objects
* Add line to CHANGELOG
2020-01-16 22:15:50 +01:00
Pieter Marsman
2f7f5d2667
Fall back on the backwards-compatible key (F) for embedded file URLs when the Unicode URL (UF) does not exist ( #338 )
...
* Fix getting filename when extracting embedded files
* Add test for pdf that contains embedded pdf, and fix additional errors in looping over multiple xrefs
* Add line to CHANGELOG
2020-01-16 22:11:42 +01:00
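The fallback described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `embedded_filename` is hypothetical, and the file specification is assumed to be a plain dict keyed by the strings "UF" and "F".

```python
from typing import Optional


def embedded_filename(filespec: dict) -> Optional[str]:
    """Return the name of an embedded file, preferring the Unicode key.

    Hypothetical helper: `filespec` is assumed to be an already-resolved
    file specification dictionary with optional "UF" and "F" entries.
    """
    name = filespec.get("UF")
    if name is None:
        # Fall back on the older, backwards-compatible key.
        name = filespec.get("F")
    return name


print(embedded_filename({"F": "report.pdf"}))
```

If neither key is present the sketch returns None, which leaves the decision about a default name to the caller.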
Recursing
0b1741b9bf
Pack the /P (permissions) entry from the /Encrypt dictionary in the file trailer as an unsigned long ( #352 )
...
Fixes #186
* Treat the permissions (the /P entry) as unsigned long, fix #186
* handle negative values for p
* Extract function for resolving a two's complement
* Add test for issue #352
* Add line to CHANGELOG.md
* Only ints can be converted to a uint using two's-complement method
* Standardize import style; multiple imports from same module on one line
Co-authored-by: Pieter Marsman <pietermarsman@gmail.com>
2020-01-07 21:59:13 +01:00
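The two's-complement step in the commit above can be illustrated with a short sketch. The function name `to_unsigned`, the 32-bit default width, and the sample value -44 are assumptions chosen for illustration, not taken from the commit.

```python
import struct


def to_unsigned(value: int, n_bits: int = 32) -> int:
    """Reinterpret a signed integer as unsigned using two's complement.

    Hypothetical helper: only ints are accepted, mirroring the note in the
    commit that only ints can be converted this way.
    """
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("only ints can be converted to an unsigned value")
    # Masking maps negative values to value + 2**n_bits and leaves
    # in-range non-negative values unchanged.
    return value & ((1 << n_bits) - 1)


permissions = -44  # example of a negative /P value (assumed)
packed = struct.pack("<L", to_unsigned(permissions))
print(packed.hex())
```

Without the conversion, struct.pack("<L", -44) raises struct.error, because the "L" format only accepts non-negative values.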
Pieter Marsman
3502dc9f3b
Drop support for legacy Python 2 ( #346 )
...
* Drop support for legacy Python 2
* Add python_requires to help pip
* Upgrade Python syntax with pyupgrade
* Upgrade Python syntax with pyupgrade --py3-plus
* Python 3 imports
* Replace six
* Update CONTRIBUTING.md
* Added line to changelog
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
2020-01-04 16:47:07 +01:00
Pieter Marsman
1c4a4167ed
Fix failing test on develop & cleaning up test files ( #319 )
2019-10-26 18:42:33 +02:00
jbarlow83
733ddf7e57
Added: tests for extracting text from pdfs with Type3 fonts ( #205 )
2019-10-22 18:15:59 +02:00
Pieter Marsman
373c6e7b97
Added: extraction of JBIG2 encoded images ( #311 )
...
And added test for pdf with JBIG2 image.
Fixes #26
Closes #46
2019-10-22 17:37:06 +02:00
Fakabbir Amin
5b210981c9
Adds Test Case
2019-08-10 10:19:20 +05:30
Sebastian Schuberth
ec8530f6cf
Add a test for the previous fix
2017-10-16 12:35:16 +02:00
Philippe Guglielmetti
b010db6049
solves https://github.com/pdfminer/pdfminer.six/issues/65
2017-07-20 21:17:06 +02:00
Philippe Guglielmetti
82af7f0aac
issue #56 reproduced, solution attempt unsuccessful
2017-04-19 14:19:14 +02:00
Philippe Guglielmetti
7055862eaf
solves https://github.com/pdfminer/pdfminer.six/issues/50
2017-04-18 18:20:31 +02:00
Daniel Berthereau
10815bff7b
Fixed tests.
2016-06-27 00:00:00 +02:00
cybjit
2ee7153f6e
add python3 in sample Makefile
2014-09-16 22:56:13 +02:00
Yusuke Shinyama
2e900e5d10
Fixed for consistent test results. (hopefully...)
2014-06-26 17:41:31 +09:00
Yusuke Shinyama
a3ab6c253b
Fixed: loose autotesting.
2014-06-25 19:50:20 +09:00
Yusuke Shinyama
8f9c4dedff
Test rig cleanup.
2014-06-15 11:41:30 +09:00
Yusuke Shinyama
a8ec99a848
More autotest tweaks.
2014-06-15 10:52:59 +09:00
Yusuke Shinyama
fb3f2d9629
Further test tweaks.
2014-06-14 12:00:31 +09:00
Yusuke Shinyama
a7489aaabe
Fixed: autotests
2014-06-14 10:54:40 +09:00
numion
a4997d6f10
Implement revision 4 and 5 encryption handler.
2014-05-19 16:27:43 +02:00
Yusuke Shinyama
c8b6d4112a
Fixed: crash with negative layout bbox.
2013-11-09 15:10:14 +09:00
Matthew Duggan
f02cb11945
Update test references based on recent layout analysis improvements
2013-11-07 17:44:09 +09:00
Yusuke Shinyama
56917a213c
testcase updated
2011-05-15 01:22:51 +09:00
Yusuke Shinyama
e8cd880409
testdata changed
2011-02-27 19:48:22 +09:00
yusuke.shinyama.dummy
5d98a27d9c
test cases updated
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@282 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-12-25 08:41:11 +00:00
yusuke.shinyama.dummy
509ab66319
stay with python2
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@264 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-10-19 09:57:01 +00:00
yusuke.shinyama.dummy
607d4734db
update test cases
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@255 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-10-17 05:15:28 +00:00
yusuke.shinyama.dummy
3305c07ba2
layout analysis improved
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@245 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-10-17 05:13:39 +00:00
yusuke.shinyama.dummy
0944cfaded
test file simple3.pdf added.
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@240 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-08-29 06:39:41 +00:00
yusuke.shinyama.dummy
83d2086f19
fix minor layout issue
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@239 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-08-29 06:39:31 +00:00
yusuke.shinyama.dummy
f5aff374fc
some wordings and documentations
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@229 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-06-19 03:56:50 +00:00
yusuke.shinyama.dummy
f2005bee55
non-free sample files moved into a separate directory
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@227 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-06-13 04:35:18 +00:00
yusuke.shinyama.dummy
aa7e7d3e35
add a README file to show credits of the sample files.
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@223 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-06-06 05:16:37 +00:00
yusuke.shinyama.dummy
836eb37b47
test reference results changed
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@204 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-04-10 11:29:40 +00:00
yusuke.shinyama.dummy
5f822f6dcb
improved layout analysis.
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@197 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-03-26 11:11:35 +00:00
yusuke.shinyama.dummy
2e5b92c18a
writing mode detection
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@196 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-03-25 11:38:47 +00:00
yusuke.shinyama.dummy
40b36a7c42
consistent test results
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@191 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-03-22 06:04:54 +00:00
yusuke.shinyama.dummy
fa13122f09
add regression tests.
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@189 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-03-22 04:34:52 +00:00
yusuke.shinyama.dummy
cd39642abe
code cleanup
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@188 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-03-22 04:00:18 +00:00
yusuke.shinyama.dummy
2dee2efad9
apply more patches
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@181 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-02-13 15:00:43 +00:00
yusuke.shinyama.dummy
0f8fe3f19e
Page rotation bug fixed.
...
Various minor fixes.
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@176 1aa58f4a-7d42-0410-adbc-911cccaed67c
2010-01-31 02:09:28 +00:00
yusuke.shinyama.dummy
77986b8273
fix CMapDB initialization stuff. more code cleanup.
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@148 1aa58f4a-7d42-0410-adbc-911cccaed67c
2009-11-03 13:39:34 +00:00
yusuke.shinyama.dummy
78f7866554
sgml to xml
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@146 1aa58f4a-7d42-0410-adbc-911cccaed67c
2009-10-31 03:04:56 +00:00
yusuke.shinyama.dummy
e8b1309e76
testcase added
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@140 1aa58f4a-7d42-0410-adbc-911cccaed67c
2009-10-24 02:50:07 +00:00
yusuke.shinyama.dummy
57025ee632
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@122 1aa58f4a-7d42-0410-adbc-911cccaed67c
2009-07-21 16:06:50 +00:00
yusuke.shinyama.dummy
8a5bec5065
layout analysis improved.
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@120 1aa58f4a-7d42-0410-adbc-911cccaed67c
2009-07-21 07:55:19 +00:00
yusuke.shinyama.dummy
fee33266aa
test added
...
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@111 1aa58f4a-7d42-0410-adbc-911cccaed67c
2009-05-17 14:05:41 +00:00