534 Commits (f1d5d681b6d2ab0ddeaea925ba784ebb94f6d509)

Author SHA1 Message Date
orangain f1d5d681b6 Include compiled cmap resources to simplify installation for CJK languages
* Run `make cmap` and `git add pdfminer/cmap`.
* Modify MANIFEST.in not to include cmaprsrc dir in the sdist package.
* Add pdfminer/cmap/README.txt to include license in the sdist package.
* Remove installation guide specific to CJK languages from README.md.
2015-12-27 13:32:29 +09:00
Goulu 72b2bc3197 Merge pull request #11 from metachris/pdfminerX
Pdfminer Updates
2015-12-06 18:56:53 +01:00
Chris Hager 8149be1669 bugfixes 2015-12-06 00:17:58 +01:00
Chris Hager a9a026b796 Merge remote-tracking branch 'origin/patch-1'
* origin/patch-1:
  Updated setup.py to work with Python 2.6
2015-12-06 00:13:31 +01:00
Chris Hager 146abb459f Updated setup.py to work with Python 2.6
Simple fix. Mind to add and push to PyPi?
2015-11-08 02:32:23 +01:00
Chris Hager 2e1be5721f removed settings.ENFORCE_CHECK_EXTRACTABLE 2015-11-01 22:34:18 +01:00
Chris Hager b686dd0139 pdfminer/settings.py for STRICT and added ENFORCE_CHECK_EXTRACTABLE 2015-11-01 22:28:08 +01:00
Goulu a46ea52e20 Merge pull request #7 from orangain/install_requires
Ensure to install required libraries on installation
2015-08-11 12:38:15 +02:00
orangain e143ad7ba8 Ensure to install required libraries on installation 2015-08-06 20:55:57 +09:00
Goulu bc8d631a7c Merge pull request #6 from GreenLightGo/hotfix/strict-setting
change STRICT to be a settings attribute
2015-07-21 10:43:39 +02:00
Alex Zagorodniuk 131cb1ea92 change STRICT to be a settings attribute 2015-06-22 19:08:35 -04:00
Goulu 623bd98452 Update __init__.py
version 20150601
2015-06-01 10:21:51 +02:00
Goulu 30e14ddf65 Merge pull request #5 from cathalgarvey/master
Lots of changes to improve compatibility and modularity
2015-06-01 10:18:49 +02:00
Cathal Garvey e2d3adc8c1 Adding chardet to Travis 2015-05-30 19:35:05 +01:00
Cathal Garvey 403711ed13 Whoops, forgot to version-gate chardet in the actual code. Thanks Travis! 2015-05-30 19:33:35 +01:00
Cathal Garvey a2ad7a6d03 Fixed some bugs preventing all tests from passing in Py2. 2015-05-30 18:02:29 +01:00
Cathal Garvey 79c97ac221 Docstrings. 2015-05-30 17:16:06 +01:00
Cathal Garvey 268e9fb2bd Removed typechecking, nothing's exploded yet and argparse does lots of heavy lifting already. 2015-05-30 17:05:28 +01:00
Cathal Garvey 3b7edba48c Forgot to add the actual compartmentalised function.. 2015-05-30 17:04:28 +01:00
Cathal Garvey b3553cef10 Cleaning up pdf2txt.py after the partition/move. 2015-05-30 17:03:55 +01:00
Cathal Garvey cbe270a4bf Killed the old main function for pdf2txt.py 2015-05-30 16:37:22 +01:00
Cathal Garvey ead8e778a6 Successfully compartmentalised code, getting closer to moving pdf->text as a module function. 2015-05-30 16:27:58 +01:00
Cathal Garvey 08cb217983 Progress, progress.. not nearly atomic enough, sorry. 2015-05-30 16:14:24 +01:00
Cathal Garvey 1b47bed306 Many changes to make pdf2txt.py work better in Py3, some in that script, others in module!
Sorry, changes should have been more atomic.

*In pdf2txt.py:*

* Re-wrote main function to use argparse instead of optparse.
* Manually tested in Py2/Py3 to get partial consistency.
* Errors abound including Tags mode, but most modes weren't working at all in Py3 anyway.
* Py2 mode *probably* unchanged, cannot find any bugs yet...
* Kept old main function for posterity, for now.

*In utils:*

* Added a few compatibility functions (some string hax required chardet, new dependency):
    - make_compat_bytes(in_str)-> (py3->bytes | py2->str)
    - make_compat_str(in_str)-> (str)
    - compatible_encode_method(bytesorstring, encoding, erraction)-> (str)

*In pdfdevice:*

* To handle different output filetypes in Py3, injected lots of calls to new utils methods,
  as well as some six.PYX checks and logic. These changes are largely responsible for
  enhanced Py2/Py3 consistency.

*In converter:*

* To handle output filetypes in Py2, injected a few checks and fixes particularly around the
  py2 `str.encode` method and its *assumed* usual use-analogies in Py3.
2015-05-17 21:08:57 +01:00
Philippe Guglielmetti 448aa08bc4 Merge pull request #4 from enkore/master
Fix utils.decode_text
2014-12-05 09:58:58 +01:00
enkore d0379a2c44 Fix utils.decode_text 2014-12-04 17:09:52 +01:00
Philippe Guglielmetti 0e40264071 Merge pull request #3 from Cybjit/master
Samples and latin1 passwords
2014-09-17 07:22:52 +02:00
cybjit 515687e1bb more xrange to range 2014-09-16 23:17:31 +02:00
cybjit 2639b15ef4 guess argv encoding in py2 using sys.stdin.encoding 2014-09-16 23:17:26 +02:00
cybjit 9b2e29396b apply_png_predictor py3 2014-09-16 22:59:29 +02:00
cybjit ad05121c69 password py3 2014-09-16 22:59:00 +02:00
cybjit 14585987c3 keep password api unicode, latin1 or utf-8 is encoded in handler 2014-09-16 22:58:25 +02:00
cybjit 2260f77b19 fix dict_value usage in strict mode 2014-09-16 22:57:29 +02:00
cybjit 51a361c145 clean up HTMLConverter and XMLConverter encoding 2014-09-16 22:57:00 +02:00
cybjit 2ee7153f6e add python3 in sample Makefile 2014-09-16 22:56:13 +02:00
Goulu f577f76c52 renamed as pdfminer.six in PyPi 2014-09-15 11:10:00 +02:00
Goulu 03de0f4db8 forgot 'six' requirement ... 2014-09-15 10:42:08 +02:00
Goulu 8861d7e0ed version 20140915 pushed to PyPi as pdfminer_six 2014-09-15 10:33:04 +02:00
Philippe Guglielmetti 4f8aa9ff5b Merge pull request #2 from Cybjit/master
CMap fixes and speed improvements
2014-09-12 07:33:06 +02:00
cybjit 714423883c setup logging for pdf2txt and fix dumppdf 2014-09-12 00:29:31 +02:00
cybjit 39942b6642 avoid string formatting when not logging 2014-09-12 00:29:31 +02:00
cybjit 01821c7d1e rename bytes to avoid built-in collision 2014-09-12 00:29:31 +02:00
cybjit 31e6afc7cf faster and simpler bytes implementation 2014-09-12 00:29:30 +02:00
cybjit ed13f7c47d conv_cmap py3 compat 2014-09-12 00:29:30 +02:00
cybjit cba5a42ba8 decipher_all bytes 2014-09-12 00:29:30 +02:00
cybjit 6357e2da80 code2cid uses int, not byte 2014-09-12 00:29:27 +02:00
cybjit 9b0a3ee53e decode cmap font name 2014-09-11 23:30:02 +02:00
Philippe Guglielmetti 7b620b3146 Merge pull request #1 from Cybjit/master
Python 3 text conversion issues
2014-09-09 20:42:37 +02:00
cybjit a6f31a713d cmap bytes and decode 2014-09-07 18:41:04 +02:00
cybjit cc733c8217 fixes for ARC4 2014-09-07 18:38:22 +02:00