Chris Hager
b686dd0139
pdfminer/settings.py for STRICT and added ENFORCE_CHECK_EXTRACTABLE
2015-11-01 22:28:08 +01:00
Goulu
a46ea52e20
Merge pull request #7 from orangain/install_requires
Ensure to install required libraries on installation
2015-08-11 12:38:15 +02:00
orangain
e143ad7ba8
Ensure to install required libraries on installation
2015-08-06 20:55:57 +09:00
Goulu
bc8d631a7c
Merge pull request #6 from GreenLightGo/hotfix/strict-setting
change STRICT to be a settings attribute
2015-07-21 10:43:39 +02:00
Alex Zagorodniuk
131cb1ea92
change STRICT to be a settings attribute
2015-06-22 19:08:35 -04:00
Goulu
623bd98452
Update __init__.py
version 20150601
2015-06-01 10:21:51 +02:00
Goulu
30e14ddf65
Merge pull request #5 from cathalgarvey/master
Lots of changes to improve compatibility and modularity
2015-06-01 10:18:49 +02:00
Cathal Garvey
e2d3adc8c1
Adding chardet to Travis
2015-05-30 19:35:05 +01:00
Cathal Garvey
403711ed13
Whoops, forgot to version-gate chardet in the actual code. Thanks Travis!
2015-05-30 19:33:35 +01:00
Cathal Garvey
a2ad7a6d03
Fixed some bugs preventing all tests from passing in Py2.
2015-05-30 18:02:29 +01:00
Cathal Garvey
79c97ac221
Docstrings.
2015-05-30 17:16:06 +01:00
Cathal Garvey
268e9fb2bd
Removed typechecking, nothing's exploded yet and argparse does lots of heavy lifting already.
2015-05-30 17:05:28 +01:00
Cathal Garvey
3b7edba48c
Forgot to add the actual compartmentalised function..
2015-05-30 17:04:28 +01:00
Cathal Garvey
b3553cef10
Cleaning up pdf2txt.py after the partition/move.
2015-05-30 17:03:55 +01:00
Cathal Garvey
cbe270a4bf
Killed the old main function for pdf2txt.py
2015-05-30 16:37:22 +01:00
Cathal Garvey
ead8e778a6
Successfully compartmentalised code, getting closer to moving pdf->text as a module function.
2015-05-30 16:27:58 +01:00
Cathal Garvey
08cb217983
Progress, progress.. not nearly atomic enough, sorry.
2015-05-30 16:14:24 +01:00
Cathal Garvey
1b47bed306
Many changes to make pdf2txt.py work better in Py3, some in that script, others in module!
Sorry, changes should have been more atomic.
*In pdf2txt.py:*
* Rewrote the main function to use argparse instead of optparse.
* Manually tested in Py2/Py3 to get partial consistency.
* Errors abound, including in Tags mode, but most modes weren't working at all in Py3 anyway.
* Py2 mode *probably* unchanged; cannot find any bugs yet...
* Kept the old main function for posterity, for now.
*In utils:*
* Added a few compatibility functions (some string hacks required chardet, a new dependency):
  - make_compat_bytes(in_str) -> (py3->bytes | py2->str)
  - make_compat_str(in_str) -> (str)
  - compatible_encode_method(bytesorstring, encoding, erraction) -> (str)
*In pdfdevice:*
* To handle different output filetypes in Py3, injected lots of calls to the new utils functions,
  as well as some six.PYX checks and logic. These changes are largely responsible for
  enhanced Py2/Py3 consistency.
*In converter:*
* To handle output filetypes in Py2, injected a few checks and fixes, particularly around the
  py2 `str.encode` method and its *assumed* analogous uses in Py3.
2015-05-17 21:08:57 +01:00
Philippe Guglielmetti
448aa08bc4
Merge pull request #4 from enkore/master
Fix utils.decode_text
2014-12-05 09:58:58 +01:00
enkore
d0379a2c44
Fix utils.decode_text
2014-12-04 17:09:52 +01:00
Philippe Guglielmetti
0e40264071
Merge pull request #3 from Cybjit/master
Samples and latin1 passwords
2014-09-17 07:22:52 +02:00
cybjit
515687e1bb
more xrange to range
2014-09-16 23:17:31 +02:00
cybjit
2639b15ef4
guess argv encoding in py2 using sys.stdin.encoding
2014-09-16 23:17:26 +02:00
cybjit
9b2e29396b
apply_png_predictor py3
2014-09-16 22:59:29 +02:00
cybjit
ad05121c69
password py3
2014-09-16 22:59:00 +02:00
cybjit
14585987c3
keep password api unicode, latin1 or utf-8 is encoded in handler
2014-09-16 22:58:25 +02:00
cybjit
2260f77b19
fix dict_value usage in strict mode
2014-09-16 22:57:29 +02:00
cybjit
51a361c145
clean up HTMLConverter and XMLConverter encoding
2014-09-16 22:57:00 +02:00
cybjit
2ee7153f6e
add python3 in sample Makefile
2014-09-16 22:56:13 +02:00
Goulu
f577f76c52
renamed to pdfminer.six on PyPI
2014-09-15 11:10:00 +02:00
Goulu
03de0f4db8
forgot 'six' requirement ...
2014-09-15 10:42:08 +02:00
Goulu
8861d7e0ed
version 20140915 pushed to PyPI as pdfminer_six
2014-09-15 10:33:04 +02:00
Philippe Guglielmetti
4f8aa9ff5b
Merge pull request #2 from Cybjit/master
CMap fixes and speed improvements
2014-09-12 07:33:06 +02:00
cybjit
714423883c
setup logging for pdf2txt and fix dumppdf
2014-09-12 00:29:31 +02:00
cybjit
39942b6642
avoid string formatting when not logging
2014-09-12 00:29:31 +02:00
cybjit
01821c7d1e
rename bytes to avoid built-in collision
2014-09-12 00:29:31 +02:00
cybjit
31e6afc7cf
faster and simpler bytes implementation
2014-09-12 00:29:30 +02:00
cybjit
ed13f7c47d
conv_cmap py3 compat
2014-09-12 00:29:30 +02:00
cybjit
cba5a42ba8
decipher_all bytes
2014-09-12 00:29:30 +02:00
cybjit
6357e2da80
code2cid uses int, not byte
2014-09-12 00:29:27 +02:00
cybjit
9b0a3ee53e
decode cmap font name
2014-09-11 23:30:02 +02:00
Philippe Guglielmetti
7b620b3146
Merge pull request #1 from Cybjit/master
Python 3 text conversion issues
2014-09-09 20:42:37 +02:00
cybjit
a6f31a713d
cmap bytes and decode
2014-09-07 18:41:04 +02:00
cybjit
cc733c8217
fixes for ARC4
2014-09-07 18:38:22 +02:00
cybjit
f9a67db89b
change xrange to range
2014-09-07 18:36:12 +02:00
cybjit
0a2d90c051
pdf2txt: do not double encode stdout
2014-09-07 18:34:11 +02:00
unknown
28c2a4e6ad
2.7/3.4 encoding corrected
2014-09-04 10:31:33 +02:00
unknown
58b8492783
no logging in Travis CI
2014-09-04 10:19:50 +02:00
unknown
1c93468c7e
faster, less verbose tests
2014-09-04 10:02:29 +02:00
unknown
7b610b34be
tools must be a module to enable scripts tests
2014-09-04 09:47:33 +02:00