Commit Graph

649 Commits (0911703eba93b18c3063ce47e4afa65cd2885ec3)

Author SHA1 Message Date
Quentin Pradet 0911703eba
pdfcolor: Fix Python 2.6 compatibility 2018-03-06 14:53:11 +04:00
Quentin Pradet 94f3d61bb2
converter: Fix XML syntax 2018-03-06 14:41:52 +04:00
Quentin Pradet 2231f0892e
Send non-stroke color to XML conversion
Inspired by https://github.com/euske/pdfminer/pull/158 from @andruo11
and https://github.com/euske/pdfminer/pull/197 from @staccatosound.
2018-03-06 14:11:48 +04:00
Quentin Pradet b6c63bedc6
Make DeviceGray the default color as it should be 2018-03-06 11:24:07 +04:00
Quentin Pradet 0ce9a29f83
Fix colorspace determinism with OrderedDict 2018-03-06 11:23:32 +04:00
Tata Ganesh 3e6cc20cb2
Merge pull request #96 from sschuberth/patch-1
TrueTypeFont: Check for enough data to unpack
2018-01-31 18:26:54 +05:30
Tata Ganesh 27abd17711
Merge pull request #106 from oculushut/master
Minor change to README file
2017-12-20 11:19:34 +05:30
oculushut 46d6e571eb
Update README.md
Adds specific location for HTML file containing more information on dumppdf.py command line tool.
2017-12-20 00:23:00 +00:00
oculushut 125bae23cc
Update README.md
Adds the specific location of the HTML file containing more information on the pdf2txt.py command line tool.
2017-12-20 00:20:46 +00:00
Guglielmetti Philippe 6d3210d206 pdfdiff tool (and .spec files for compilation with pyinstaller) 2017-11-21 10:48:45 +01:00
ganeshtata 1b88575e79 FIX: Null character replaced by blank
- The presence of the character '\0' was causing an error with some PDFs.
- It has been fixed by replacing all occurrences of '\0' with ''.
2017-11-08 12:50:50 +05:30
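A minimal sketch of the replacement this commit message describes. The function name `strip_null_chars` is hypothetical and is not taken from the pdfminer.six source; only the `'\0'` to `''` substitution comes from the message itself.

```python
def strip_null_chars(text):
    """Return text with every NUL ('\0') character removed."""
    # str.replace substitutes all occurrences, not only the first one.
    return text.replace('\0', '')


cleaned = strip_null_chars('PDF\0 text\0')
```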
Sebastian Schuberth fcd3e6ce00 Catch an error unpack might throw instead of checking the length before 2017-10-30 19:31:58 +01:00
Sebastian Schuberth ec8530f6cf Add a test for the previous fix 2017-10-16 12:35:16 +02:00
Sebastian Schuberth 39428fb4f0 TrueTypeFont: Check for enough data to unpack
Fixes https://github.com/euske/pdfminer/issues/96
and https://github.com/euske/pdfminer/issues/144.
2017-10-16 12:35:04 +02:00
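A sketch of the general technique behind this fix and the follow-up commit fcd3e6ce00, which catches the error `unpack` raises instead of checking the length first. The function name `read_table_entry` and the `'>4sLLL'` format string are assumptions chosen for illustration, not the code from these commits.

```python
import struct


def read_table_entry(data):
    """Unpack a 4-byte tag and three unsigned 32-bit integers.

    Returns None when data holds fewer than the 16 bytes required.
    """
    try:
        # struct.unpack raises struct.error if the buffer size does not
        # match the size implied by the format string.
        return struct.unpack('>4sLLL', data[:16])
    except struct.error:
        return None
```

Catching `struct.error` keeps the size requirement in one place (the format string) rather than duplicating it in a separate length check.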
SUZUKI Masaya d4118cf5e8 Enabled PDFDevice in the with statement (#88) 2017-08-18 08:15:04 +02:00
Peter Bittner e39800f14c Move package description into package docstring (#87)
Convert Windows/DOS line endings CR/LF to Unix LF (again!)

Add Python 3.6 to classifiers, update project URL
2017-08-18 08:13:15 +02:00
Venelin Stoykov 5ef5484bbe Add tox configuration for easy local testing (#85) 2017-08-18 08:11:32 +02:00
Venelin Stoykov 171cdcc69d Microoptimization for singlebyte fonts (#84)
Instead of a list comprehension that calls a function to get the integer value of each byte, convert the data directly to a bytearray, which is a more efficient structure for storing a sequence of bytes.
2017-08-18 08:10:27 +02:00
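A small illustration of the idea in this message, assuming Python 3. The variable names are hypothetical and the snippet is not the code changed in the commit; it only shows that `bytearray` yields the same integer values as a per-byte `ord()` loop.

```python
data = b'\x41\x42\x43'

# Per-byte approach: one function call for every byte.
codes_slow = [ord(data[i:i + 1]) for i in range(len(data))]

# Direct conversion: a bytearray is already a sequence of integers.
codes_fast = bytearray(data)
```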
Venelin Stoykov 14de393d5e Cleanup psparser (#83)
- Do not use bytesindex function. Use native slices instead
- Fix import ordering
2017-08-18 08:10:06 +02:00
Venelin Stoykov 496bfd0778 Remove leftover from removing shebangs (#81) 2017-08-18 08:09:00 +02:00
Venelin Stoykov c2432c32f1 Fix assert message for PDFLayoutAnalyzer.end_page (#80)
stack is undefined
2017-08-18 08:08:08 +02:00
Philippe Guglielmetti 4c604828e8 v. 20170720 2017-07-20 21:35:49 +02:00
Philippe Guglielmetti b010db6049 solves https://github.com/pdfminer/pdfminer.six/issues/65 2017-07-20 21:17:06 +02:00
Sergei Maertens 67bf5ab124 Compare byte with byte instead of int (#78) 2017-07-20 20:47:14 +02:00
Sergei Maertens 3e364354da Fixes #64 -- be less strict when inspecting a tree type (#76)
In the PDFStream it's possible that the /Type element is not
present, but /type is. According to the spec, these are different
elements, but in the case in point they had the same meaning.

If PDFMiner is not running in STRICT mode and /Type doesn't resolve,
a fallback to /type is used to determine the tree type.
2017-07-20 20:46:35 +02:00
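The fallback described above can be sketched as follows. This is an illustration under assumptions: a plain `dict` stands in for the stream's dictionary, and `get_tree_type` and its `strict` parameter are hypothetical names rather than the PDFMiner API.

```python
def get_tree_type(attrs, strict=False):
    """Look up the tree type, falling back to 'type' when not strict."""
    tree_type = attrs.get('Type')
    if tree_type is None and not strict:
        # Per the spec /Type and /type are different keys, so the
        # fallback is only used outside strict mode.
        tree_type = attrs.get('type')
    return tree_type
```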
Attila Szász 938419c476 Align dumppdf tool to modified data structures. (#73)
* Align dumppdf tool to modified data structures.
TOC page numbers should also work now, counting from 1.

* Update version number.
2017-07-20 20:46:11 +02:00
Sergei Maertens d79612c455 Resolve unresolved PDFObjectRefs (#70)
Thank you !
2017-06-02 13:35:12 +02:00
Michał Pasternak 87726d8b4f No, thank you. (#69)
* Enable 3.6 and replace pycrypto with cryptodome

* Upgrade version number

* PyPI badge; whitespace cleanup
2017-05-30 10:02:24 +02:00
Hugh Secker-Walker 35a58ee5b5 Add tools/pdfstats.py which counts all LT* types in a PDF (#68) 2017-05-29 09:11:58 +02:00
Hugh Secker-Walker 488545ddc7 Add string expressions to asserts showing local data (#67) 2017-05-29 09:06:09 +02:00
Michał Pasternak fe21725f07 Please replace pycrypto with pycryptodome (#63)
* Enable 3.6 and replace pycrypto with cryptodome

* Upgrade version number
2017-05-29 09:04:38 +02:00
Anton Oleynick 4bc0a0c105 Update pdftypes.py (#61)
Fix errors with:
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 850, in process_page
    self.render_contents(page.resources, page.contents, ctm=ctm)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 860, in render_contents
    self.init_resources(resources)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 360, in init_resources
    self.fontmap[fontid] = self.rsrcmgr.get_font(objid, spec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 210, in get_font
    font = self.get_font(None, subspec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdfinterp.py", line 201, in get_font
    font = PDFCIDFont(self, spec)
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdffont.py", line 667, in __init__
    BytesIO(self.fontfile.get_data()))
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdftypes.py", line 297, in get_data
    self.decode()
  File "/app/python/lib/python3.5/site-packages/pdfminer/pdftypes.py", line 278, in decode
    if 'Predictor' in params:
TypeError: argument of type 'NoneType' is not iterable
2017-05-29 08:55:02 +02:00
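The traceback ends at a membership test on `params` while `params` is `None`. One way to guard such a test is sketched below; the helper name `has_predictor` is hypothetical, and this is not necessarily the change made in the commit.

```python
def has_predictor(params):
    """Return True only if params is a mapping containing 'Predictor'."""
    # 'x in None' raises TypeError, so rule out None before testing.
    return params is not None and 'Predictor' in params
```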
Peter Bittner 4e59fb66de Convert Windows/DOS line endings CR/LF to Unix LF (#58) 2017-05-29 08:53:26 +02:00
Philippe Guglielmetti baddb25df6 v 20170419 (patches a stupid bug from yesterday...) 2017-04-19 14:24:13 +02:00
Philippe Guglielmetti 82af7f0aac issue #56 reproduced, solution attempt unsuccessful 2017-04-19 14:19:14 +02:00
Philippe Guglielmetti cd92883925 logging (stupid bug) 2017-04-19 13:48:45 +02:00
Philippe Guglielmetti f28ce1ebed Merge branch 'master' of https://github.com/pdfminer/pdfminer.six.git 2017-04-19 12:28:03 +02:00
Philippe Guglielmetti 11a4c8b6c1 version 20170418 2017-04-18 19:13:20 +02:00
Philippe Guglielmetti 5ef8333c5f new test fails on Linux & Travis CI. TODO: find why 2017-04-18 18:28:48 +02:00
Philippe Guglielmetti 7055862eaf solves https://github.com/pdfminer/pdfminer.six/issues/50 2017-04-18 18:20:31 +02:00
Sergei Maertens f2b0650ad5 Fixes #54 -- don't pass bytestrings through ord() (#55) 2017-04-18 16:57:53 +02:00
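Background for this class of fix, as a hedged sketch: in Python 3, indexing or iterating over `bytes` already yields integers, so calling `ord()` on those values raises `TypeError`. The helper `byte_value` below is hypothetical and is not the code from the commit.

```python
def byte_value(item):
    """Return the integer value of item, whether it is an int or a 1-char string."""
    # Iterating over bytes in Python 3 gives ints, which ord() rejects.
    if isinstance(item, int):
        return item
    return ord(item)


values = [byte_value(b) for b in b'AB']
```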
Philippe Guglielmetti 3427dcaf20 Merge branch 'master' of https://github.com/pdfminer/pdfminer.six.git 2017-02-06 15:56:54 +01:00
Andrew Baumann 9439a3a31a Miscellaneous bug fixes (#47)
* utils.decode_text: fix "TypeError: ord() expected string of length 1, but int found"

fixes https://github.com/goulu/pdfminer/issues/24

* pdfinterp.execute: don't assume that every keyword name can be decoded as utf-8

fixes "'str' does not support the buffer interface", https://github.com/goulu/pdfminer/issues/23

* default settings.STRICT to False, for compatibility with the original pdfminer

* PDFCIDFont: handle font registry/orderings that may be PDFObjRefs

* utils.nunpack: handle 8-byte integers
2017-02-06 14:57:01 +01:00
Philippe Guglielmetti 1e5db2b02d some keywords can't be decoded 2017-01-20 10:55:50 +01:00
Philippe Guglielmetti fd63dbf62e no more skipped tests 2017-01-20 10:12:00 +01:00
Philippe Guglielmetti 9b9d69aee9 image export works again with Py3 (issue #15)
https://github.com/pdfminer/pdfminer.six/issues/15
2017-01-20 10:11:19 +01:00
Philippe Guglielmetti f094f0b380 v. 20170119 RC 2017-01-19 08:42:20 +01:00
Philippe Guglielmetti 7c96fe29ed links updated to new https://github.com/pdfminer owner 2017-01-19 08:37:53 +01:00
Philippe Guglielmetti 52feb22eeb Merge remote-tracking branch 'origin/master'
Conflicts:
	MANIFEST.in
	README.md
	pdfminer/latin_enc.py
	pdfminer/pdfdocument.py
	pdfminer/pdfinterp.py
	pdfminer/pdfpage.py
	pdfminer/pdftypes.py
	pdfminer/psparser.py
	pdfminer/utils.py
	samples/Makefile
	setup.py
2017-01-19 08:03:16 +01:00
Jin-tae Hwang 61d423d81c bugfix: if fontname is bytes then skip (#43) 2016-12-14 17:34:16 +01:00