Commit Graph

423 Commits (733ddf7e570d9d20c28de7f965c07aec464a936d)

Author SHA1 Message Date
Pieter Marsman 373c6e7b97
Added: extraction of JBIG2 encoded images (#311)
And added test for pdf with JBIG2 image.

Fixes #26 
Closes #46
2019-10-22 17:37:06 +02:00
Pieter Marsman 694aa508c3 Release 20191020 2019-10-20 14:21:48 +02:00
Pieter Marsman adc4726e06
Add warning about dropping python2 support (#307)
Fix #303
2019-10-20 13:59:29 +02:00
Pieter Marsman 9fd7172f7b Cleanup utils.py 2019-10-17 12:14:02 +02:00
jet457 7e40fde320 Removing assertion in drange to allow equal inputs (#246) and mimic behaviour of built-in method range
Fixes #66, since it now allows the bbox to have 0 width or 0 height
Added tests for Plane since it is the API that uses drange
2019-10-17 12:04:25 +02:00
D.A.Bashkirtsev 4df6d4e5ca Changed: comparations for image colorspace literals (#132)
Fixes #131 

Changed: comparations for image colorspace literals
Added: test for extracting images from pdfs
2019-10-15 16:11:54 +02:00
Pieter Marsman 63b2e09ac3
Merge pull request #203 from jbarlow83/negative-descent
Interpret font Descent as a negative number even if specified as positive
2019-10-13 20:06:52 +02:00
Tony Tong 106a09c5bb fix stoke color and non-stroke color in PDFGraphicState 2019-10-12 17:35:46 -04:00
Tata Ganesh f218996fe9
Merge pull request #273 from igormp/develop
Use resolve_all on PdfFont widths and bbox
2019-10-12 21:24:29 +05:30
Fakabbir Amin 7c03d96d25 Corrects Comment 2019-08-20 17:16:10 +05:30
Fakabbir Amin abd685fdc6 Corrects Code Comment 2019-08-20 17:13:27 +05:30
Fakabbir Amin 3d549ea48c Removes code comments 2019-08-20 16:48:40 +05:30
Igor Moura cf4641d877
Merge branch 'develop' into develop 2019-08-15 08:11:28 -03:00
Fakabbir Amin fe38695739
Merge branch 'develop' into pdfstream-as-cmap 2019-08-10 10:44:31 +05:30
Fakabbir Amin 5a0d8db052 Adds decoder for OnebyteIdentityH/V instead of using default CMap 2019-08-10 10:07:23 +05:30
Tata Ganesh 42e2c8143b
Merge pull request #263 from pietermarsman/261-glyph-list-specification
name2unicode() should follow the Adobe Glyph List Specification
2019-07-26 22:13:34 +05:30
Igor Moura 2f4518231f Use resolve_all on PdfFont widths and bbox
Fixes #268
2019-07-24 15:10:13 -03:00
Igor Moura 540df9f676 Replaced .iteritems() and with six.iteritems() for Python 3 compat
This is a squashed commit, the previous messages can be seen bellow

This is the 1st commit message:

Replaced .iteritems() usage for .items()

Fixed some python 2 leftovers, as discussed in #267. Also formatted code according to Black.\nThis possibly breaks some python 2 compatibility

This is the commit message #2:

Reverted formatting and more spread six usage
2019-07-24 14:08:30 -03:00
Fakabbir Amin f1a4dcea88 Adds Test Cases, Neater Code For CMap Assignment 2019-07-24 11:56:06 +05:30
Fakabbir Amin fa400431f5 Adds Test, Removes Unnecessary Assumptions 2019-07-17 11:38:00 +05:30
Pieter Marsman 6f362f53fe Raise a `KeyError` with a useful message if `unicode2name()` does not match any glyph name. Use this message to log debug statements. 2019-07-16 08:52:24 +02:00
Pieter Marsman 0fb83366b6 Remove intermediate variable `full_stop` because it is just a dot 2019-07-16 08:49:57 +02:00
Fakabbir Amin cc40af3d2b Removes @property, Adds docstring 2019-07-15 14:21:21 +05:30
Pieter Marsman c597e95a9f Use KeyError to signal that the name does not resemble any unicode, this pattern is also used in the rest of pdfminer.six 2019-07-14 15:37:15 +02:00
Pieter Marsman 33cc9861ae Add docstring to Type1FontHeaderParser.get_encoding() that describes that the custom CharStrings of the font are mapped to '' 2019-07-14 15:19:17 +02:00
Pieter Marsman f0392f8049 Change implementation of name2unicode such that it follows the Adobe Glyph specs (with allowing lowercase) 2019-07-14 15:16:42 +02:00
Fakabbir Amin 8e4a82ad8b Corrects Indentation 2019-07-13 05:00:25 +05:30
Fakabbir Amin c022358c8d Encapsulates character map name 2019-07-13 04:52:24 +05:30
John Kesegich 8ab2e287be Handle PDFStream as character map name in PDFCIDFont 2019-02-25 11:42:30 -06:00
ganeshtata b6a5848208 FEAT: Release 20181108 2018-11-08 22:37:11 +05:30
Tata Ganesh e03ecab856
Merge pull request #141 from timb07/speedup_layout
Speed up layout of text boxes
2018-11-08 20:28:40 +05:30
James R. Barlow 2ede124142 Interpet font Descent as a negative number even if specified as positive
The PDF RM specifies that Descent should be negative. Fonts that claim
to have a positive Descent (not that it would make sense) always seem
to be wrong about this claim.
2018-11-03 23:17:48 -07:00
Tata Ganesh 259b29299e
Merge pull request #133 from timb07/speedup
Speed up handling of PDFs with large images
2018-07-15 11:27:35 +05:30
Martin Wolf edaf2c9e3f move unittest to main() 2018-06-26 00:51:51 +02:00
Martin Wolf eff3f19886 Merge remote-tracking branch 'upstream/master' 2018-06-25 23:32:52 +02:00
Tata Ganesh 9c7bdcc716
Merge pull request #157 from h2ri/master
decode cid: 160 and 173 to spaces
2018-06-25 11:19:27 +05:30
Charles Reid 7b08cdbff9 apply dos2unix to files in pdfminer/ and tools/ to remove \r\n windows line endings 2018-06-21 12:19:48 -07:00
Goulu 1db260609e
render_string must have 5 params in all PDFDevice classes (#158) 2018-06-21 10:21:26 +02:00
Guglielmetti Philippe 70624a64dd render_string() now takes 3 parameters, not 5 (reverted from commit 95b65536af) 2018-06-21 09:49:45 +02:00
Guglielmetti Philippe 95b65536af render_string() now takes 3 parameters, not 5 2018-06-21 09:28:55 +02:00
Healthi 65eb0cef82 decode cid: 160 and 170 to spaces 2018-06-20 17:17:03 +05:30
Martin Wolf 26f80715ed Merge remote-tracking branch 'upstream/master' 2018-06-20 13:27:18 +02:00
Tata Ganesh 67bc581bd3
Merge pull request #134 from timb07/issue_90
FIX: TypeError caused by bug in _parse_comment; #90 #89 #109
2018-06-14 09:27:34 +05:30
Tata Ganesh 7084d81bd1
Merge pull request #129 from clustree/xml-color
FEAT: Send color to XML conversion
2018-06-10 21:02:34 +05:30
Martin Wolf 4bdb3ba8cc Fixes needed to be able to compile pdfminer.six with Cython 2018-04-12 00:05:38 +02:00
Tim Bell 1cbeaebfce Fix Python 2.6 incompatibility 2018-04-11 10:34:15 +10:00
Tim Bell 0c8cf748fe Fix copy-paste error 2018-04-11 10:15:32 +10:00
Tim Bell 8f8a78bb88 Remove now-unused csort() 2018-04-11 09:37:32 +10:00
Tim Bell 2dda2b12b4 Speedup layout with .sort() and sortedcontainers.SortedListWithKey() 2018-04-11 09:03:32 +10:00
Gregory Mori 335c25c045 only check for bytes input to enc() in python3
In python2, isinstance("", bytes) is true, causing enc() to
suppress any string input. This results in fontnames being lost
when running pdf2txt.py in python2.

As this check was not present in the original python2 version of
pdfminer, restrict it to only check when running in python3.
2018-04-09 12:21:59 -07:00