Pieter Marsman
bc034c8e59
Create sphinx documentation for Read the Docs ( #329 )
...
Fixes #171
Fixes #199
Fixes #118
Fixes #178
Added: tests for building documentation and example code in documentation
Added: docstrings for commonly used functions and classes
Removed: old documentation
2019-11-07 21:12:34 +01:00
Igor Moura
40aa2533c9
Added: simple wrapper to extract text from pdf ( #330 )
...
Fixes #327
2019-11-07 07:54:10 +01:00
Martin Hasoň
ed1b09c6f2
Fix debug logging for pdf2txt.py and dumppdf.py ( #325 )
...
Fixes #313
2019-11-06 21:47:19 +01:00
Pieter Marsman
33b16b3f07
Deprecate the use of _py2_no_more_posargs ( #328 )
...
Fixes #324
2019-11-02 10:29:39 +01:00
Jianfeng
44b223cf0a
Speedup grouping of textboxes ( #315 )
...
Changed: using a heap instead of a SortedList and avoid rebuilding the heap in each iteration
Changed: avoid potentially huge number of variable assignments in list comprehension.
Changed: avoid repeatedly evaluating `obj is obj` in list comprehension by storing id(obj).
2019-10-31 09:22:58 +01:00
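The heap and `id(obj)` changes described in this commit can be illustrated with a minimal sketch. It is not the code from the commit: `group_closest`, its `dist` helper, and the use of number tuples in place of text boxes are all hypothetical stand-ins. It only shows the general pattern of building a heap once, pushing new candidate pairs as groups merge, and skipping stale entries via an `id()` lookup instead of rebuilding a sorted structure each iteration.

```python
import heapq
from itertools import combinations


def group_closest(values):
    """Repeatedly merge the two closest groups until one remains.

    Hypothetical illustration: groups are tuples of numbers, not text boxes.
    Returns the merges in the order they happened.
    """
    groups = [(v,) for v in values]

    def dist(a, b):
        # Span covered by both groups; a stand-in for a layout distance.
        return max(max(a), max(b)) - min(min(a), min(b))

    # The running counter breaks ties so the heap never compares groups.
    counter = 0
    heap = []
    for a, b in combinations(groups, 2):
        heap.append((dist(a, b), counter, a, b))
        counter += 1
    heapq.heapify(heap)  # built once, never rebuilt

    # Track live groups by id() so membership is a plain integer lookup.
    alive = {id(g): g for g in groups}
    merges = []
    while heap:
        _, _, a, b = heapq.heappop(heap)
        if id(a) not in alive or id(b) not in alive:
            continue  # stale entry: one side was merged earlier
        del alive[id(a)], alive[id(b)]
        merged = a + b
        merges.append((a, b))
        # Push only the pairs that involve the new group.
        for other in alive.values():
            heapq.heappush(heap, (dist(merged, other), counter, merged, other))
            counter += 1
        alive[id(merged)] = merged
    return merges
```

Stale heap entries are discarded lazily when popped, which is what makes it possible to avoid rebuilding the heap after every merge.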
Pieter Marsman
d88d6020a2
Remove webapp and other (un)helpful application references: django, cgi, and pyinstaller. ( #320 )
...
Fixes #314
Fixes #105
2019-10-26 19:16:37 +02:00
Pieter Marsman
a238a19999
Fix assertionerror when dumping pdf with reference to objid 0 ( #318 )
...
Fixes #94
Added: test to check if `PDFObjectNotFound` error is raised if objid 0 is requested.
2019-10-25 22:49:58 +02:00
Serj Sintsov
cb9cd8ea46
Use named logger instead of root logger ( #236 )
2019-10-22 20:52:43 +02:00
Pieter Marsman
373c6e7b97
Added: extraction of JBIG2 encoded images ( #311 )
...
And added test for pdf with JBIG2 image.
Fixes #26
Closes #46
2019-10-22 17:37:06 +02:00
Pieter Marsman
694aa508c3
Release 20191020
2019-10-20 14:21:48 +02:00
Pieter Marsman
adc4726e06
Add warning about dropping python2 support ( #307 )
...
Fix #303
2019-10-20 13:59:29 +02:00
Pieter Marsman
9fd7172f7b
Cleanup utils.py
2019-10-17 12:14:02 +02:00
jet457
7e40fde320
Removing assertion in drange to allow equal inputs ( #246 ) and mimic behaviour of built-in method range
...
Fixes #66 , since it now allows the bbox to have 0 width or 0 height
Added tests for Plane since it is the API that uses drange
2019-10-17 12:04:25 +02:00
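To illustrate why dropping the assertion helps, here is a small sketch of a `drange`-style helper. The function name comes from the commit message, but the body is an assumption about what such a helper might look like, not a copy of the library's code. With no `v0 < v1` assertion, equal inputs (a bounding box with zero width or height) still map to a grid cell instead of raising.

```python
def drange(v0, v1, d):
    """Return the indices of grid cells of width d that cover [v0, v1].

    Sketch under stated assumptions; the real helper may differ.
    """
    # No assertion on v0 < v1: equal inputs yield the single cell that
    # contains the coordinate, the way range() tolerates equal bounds.
    return range(int(v0) // d, int(v1 + d) // d)
```

For example, `list(drange(5, 5, 10))` yields one cell index, whereas a version that asserted `v0 < v1` would fail on the same input.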
D.A.Bashkirtsev
4df6d4e5ca
Changed: comparisons for image colorspace literals ( #132 )
...
Fixes #131
Changed: comparisons for image colorspace literals
Added: test for extracting images from pdfs
2019-10-15 16:11:54 +02:00
Pieter Marsman
63b2e09ac3
Merge pull request #203 from jbarlow83/negative-descent
...
Interpret font Descent as a negative number even if specified as positive
2019-10-13 20:06:52 +02:00
Tony Tong
106a09c5bb
Fix stroke color and non-stroke color in PDFGraphicState
2019-10-12 17:35:46 -04:00
Tata Ganesh
f218996fe9
Merge pull request #273 from igormp/develop
...
Use resolve_all on PdfFont widths and bbox
2019-10-12 21:24:29 +05:30
Fakabbir Amin
7c03d96d25
Corrects Comment
2019-08-20 17:16:10 +05:30
Fakabbir Amin
abd685fdc6
Corrects Code Comment
2019-08-20 17:13:27 +05:30
Fakabbir Amin
3d549ea48c
Removes code comments
2019-08-20 16:48:40 +05:30
Igor Moura
cf4641d877
Merge branch 'develop' into develop
2019-08-15 08:11:28 -03:00
Fakabbir Amin
fe38695739
Merge branch 'develop' into pdfstream-as-cmap
2019-08-10 10:44:31 +05:30
Fakabbir Amin
5a0d8db052
Adds decoder for OnebyteIdentityH/V instead of using default CMap
2019-08-10 10:07:23 +05:30
Tata Ganesh
42e2c8143b
Merge pull request #263 from pietermarsman/261-glyph-list-specification
...
name2unicode() should follow the Adobe Glyph List Specification
2019-07-26 22:13:34 +05:30
Igor Moura
2f4518231f
Use resolve_all on PdfFont widths and bbox
...
Fixes #268
2019-07-24 15:10:13 -03:00
Igor Moura
540df9f676
Replaced .iteritems() with six.iteritems() for Python 3 compat
...
This is a squashed commit, the previous messages can be seen below
This is the 1st commit message:
Replaced .iteritems() usage for .items()
Fixed some python 2 leftovers, as discussed in #267 . Also formatted code according to Black. This possibly breaks some python 2 compatibility
This is the commit message #2 :
Reverted formatting and more spread six usage
2019-07-24 14:08:30 -03:00
Fakabbir Amin
f1a4dcea88
Adds Test Cases, Neater Code For CMap Assignment
2019-07-24 11:56:06 +05:30
Fakabbir Amin
fa400431f5
Adds Test, Removes Unnecessary Assumptions
2019-07-17 11:38:00 +05:30
Pieter Marsman
6f362f53fe
Raise a `KeyError` with a useful message if `unicode2name()` does not match any glyph name. Use this message to log debug statements.
2019-07-16 08:52:24 +02:00
Pieter Marsman
0fb83366b6
Remove intermediate variable `full_stop` because it is just a dot
2019-07-16 08:49:57 +02:00
Fakabbir Amin
cc40af3d2b
Removes @property, Adds docstring
2019-07-15 14:21:21 +05:30
Pieter Marsman
c597e95a9f
Use KeyError to signal that the name does not resemble any unicode, this pattern is also used in the rest of pdfminer.six
2019-07-14 15:37:15 +02:00
Pieter Marsman
33cc9861ae
Add docstring to Type1FontHeaderParser.get_encoding() that describes that the custom CharStrings of the font are mapped to ''
2019-07-14 15:19:17 +02:00
Pieter Marsman
f0392f8049
Change implementation of name2unicode such that it follows the Adobe Glyph specs (with allowing lowercase)
2019-07-14 15:16:42 +02:00
Fakabbir Amin
8e4a82ad8b
Corrects Indentation
2019-07-13 05:00:25 +05:30
Fakabbir Amin
c022358c8d
Encapsulates character map name
2019-07-13 04:52:24 +05:30
John Kesegich
8ab2e287be
Handle PDFStream as character map name in PDFCIDFont
2019-02-25 11:42:30 -06:00
ganeshtata
b6a5848208
FEAT: Release 20181108
2018-11-08 22:37:11 +05:30
Tata Ganesh
e03ecab856
Merge pull request #141 from timb07/speedup_layout
...
Speed up layout of text boxes
2018-11-08 20:28:40 +05:30
James R. Barlow
2ede124142
Interpret font Descent as a negative number even if specified as positive
...
The PDF RM specifies that Descent should be negative. Fonts that claim
to have a positive Descent (not that it would make sense) always seem
to be wrong about this claim.
2018-11-03 23:17:48 -07:00
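The rule described in this commit message can be sketched in one line. The helper name `normalize_descent` is hypothetical and is not taken from the commit; it only shows the idea of forcing the sign regardless of what the font claims.

```python
def normalize_descent(descent):
    """Return the font descent as a non-positive number.

    Hypothetical helper: a positive value is assumed to be a
    sign error in the font and is flipped.
    """
    return -abs(descent)
```

A font reporting `200` and a font reporting `-200` would both end up with a descent of `-200` under this rule.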
Tata Ganesh
259b29299e
Merge pull request #133 from timb07/speedup
...
Speed up handling of PDFs with large images
2018-07-15 11:27:35 +05:30
Martin Wolf
edaf2c9e3f
move unittest to main()
2018-06-26 00:51:51 +02:00
Martin Wolf
eff3f19886
Merge remote-tracking branch 'upstream/master'
2018-06-25 23:32:52 +02:00
Tata Ganesh
9c7bdcc716
Merge pull request #157 from h2ri/master
...
decode cid: 160 and 173 to spaces
2018-06-25 11:19:27 +05:30
Charles Reid
7b08cdbff9
apply dos2unix to files in pdfminer/ and tools/ to remove \r\n windows line endings
2018-06-21 12:19:48 -07:00
Goulu
1db260609e
render_string must have 5 params in all PDFDevice classes ( #158 )
2018-06-21 10:21:26 +02:00
Guglielmetti Philippe
70624a64dd
render_string() now takes 3 parameters, not 5 (reverted from commit 95b65536af)
2018-06-21 09:49:45 +02:00
Guglielmetti Philippe
95b65536af
render_string() now takes 3 parameters, not 5
2018-06-21 09:28:55 +02:00
Healthi
65eb0cef82
decode cid: 160 and 170 to spaces
2018-06-20 17:17:03 +05:30
Martin Wolf
26f80715ed
Merge remote-tracking branch 'upstream/master'
2018-06-20 13:27:18 +02:00