Matthew Duggan | 10a68c83bd | Remove unused imports identified by pyflakes | 2013-11-07 16:09:44 +09:00
Yusuke Shinyama | 4ef81ae9d8 | Improved word spacing. | 2013-11-05 18:25:19 +09:00
Yusuke Shinyama | 02ad086f6a | fixed: HTMLConverter. | 2013-10-25 18:10:40 +09:00
Yusuke Shinyama | 87842233b3 | Version bump! | 2013-10-22 22:19:38 +09:00
Yusuke Shinyama | d3730a29ec | API change: process_pdf -> PDFPage.get_pages | 2013-10-22 18:59:16 +09:00
Yusuke Shinyama | e927bd307e | fixed: https://github.com/euske/pdfminer/issues/8 | 2013-10-22 18:24:39 +09:00
Yusuke Shinyama | 2aa757978b | Reverted to Python2.x syntax. Fixed LZW decoding. | 2013-10-19 08:19:40 +09:00
Yusuke Shinyama | bfd9e93c12 | Merge branch 'master' of https://github.com/JordanReiter/pdfminer into JordanReiter-master | 2013-10-19 07:46:45 +09:00
Yusuke Shinyama | 8e4c0c88e3 | fixed: https://github.com/euske/pdfminer/issues/26 | 2013-10-17 23:20:08 +09:00
Yusuke Shinyama | 0ea08890d4 | renamed: python2 -> python. | 2013-10-17 23:05:27 +09:00
Yusuke Shinyama | 8d42eec94d | in_cmap is on by default. | 2013-10-17 21:40:43 +09:00
Yusuke Shinyama | de9f9715e3 | Added: Adobe-UCS | 2013-10-17 21:35:25 +09:00
Yusuke Shinyama | 1455f134c6 | Fixed: missing ObjStm due to invalid seek. | 2013-10-10 20:10:57 +09:00
Yusuke Shinyama | f85c374cae | Separated PDFPage to pdfpage.py. | 2013-10-10 19:54:55 +09:00
Yusuke Shinyama | 2df67d85ae | Expand ObjStm in XRefFallback. | 2013-10-10 19:40:43 +09:00
Yusuke Shinyama | e4bc4e43b1 | Code cleanup. | 2013-10-10 19:17:58 +09:00
Yusuke Shinyama | cfd60eafbf | Removed PDFDocument.read_xref(). | 2013-10-10 18:57:08 +09:00
Yusuke Shinyama | 658be970b8 | Separated PDFXRefFallback. | 2013-10-10 18:44:12 +09:00
Yusuke Shinyama | c926874d20 | API Change: the PDFDocument cstr now takes PDFParser. set_parser() is removed. | 2013-10-10 18:40:06 +09:00
Yusuke Shinyama | 557c2c72e6 | Removed ObjIdRange for terseness. | 2013-10-10 18:34:43 +09:00
Yusuke Shinyama | 2221163b94 | Split pdfparser.py and pdfdocument.py. | 2013-10-10 18:29:30 +09:00
Yusuke Shinyama | 1467fc674c | Added fallback for broken PDFs. | 2013-10-09 22:45:54 +09:00
Yusuke Shinyama | eabe72ee63 | Prevent crash with empty layout box. | 2013-10-09 22:13:22 +09:00
Yusuke Shinyama | 87143cb36f | Fallback when /Pages does not exist. | 2013-10-09 22:08:16 +09:00
Yusuke Shinyama | 06425bba00 | Introducing PDFObjectNotFound | 2013-10-09 21:39:23 +09:00
Yusuke Shinyama | 3c3cba2ecc | Moved: import PIL. | 2013-04-09 18:42:32 +09:00
Yusuke Shinyama | 19e7d70ac1 | Merge pull request #15 from jcushman/patch-1: 2x faster layout analysis: Use set instead of list for Plane's internal collection of objects. | 2013-04-09 02:39:46 -07:00
Yusuke Shinyama | 4faccff9c9 | Merge pull request #16 from jcushman/master: 2x faster group_textboxes function. | 2013-04-09 01:58:56 -07:00
Yusuke Shinyama | d8bc13b3af | Merge pull request #13 from gendoc/master: PDFDocument.lookup_name.lookup isn't searching for 'Names' key. | 2013-04-09 01:55:54 -07:00
Jordan Reiter | e28b75a462 | StringIO | 2013-03-27 13:14:58 -04:00
Jordan Reiter | 44653071c3 | Fixes for LZW error (see https://bitbucket.org/hsoft/pdfminer3k/commits/ae9a4ca0691a/) | 2013-03-27 13:05:29 -04:00
jcushman | f77f196cd3 | 2x faster group_textboxes function. | 2012-06-22 18:11:45 -03:00
jcushman | da3f023b2d | Use set instead of list for Plane's internal collection of objects. | 2012-06-22 16:36:33 -03:00
Humberto Pereira | 89c81db295 | PDFDocument.lookup_names.lookup didn't find 'Names' in some files | 2012-03-19 16:42:58 -03:00
Jim Morrison | 6413eb7de4 | Deal with CMYK images by converting them to RGB. PIL does not invert CMYK images as of PIL 1.1.7, so the invert happens in ImageWriter. | 2012-01-24 16:18:36 -08:00
Yusuke Shinyama | c7709045e9 | fixed: invalid bmp file output | 2011-11-08 00:29:24 +10:00
Yusuke Shinyama | 82ff98c7b3 | imagewriter now works with text output | 2011-11-07 01:15:10 +10:00
Yusuke Shinyama | 91174b5665 | avoid crash when colorspace is null. | 2011-11-06 20:10:48 +10:00
Yusuke Shinyama | 3d1652963a | Merge github.com:euske/pdfminer | 2011-10-30 15:44:49 +10:00
dwilson | 60dbf6bb69 | avoids crash in pdf syntax error for missing ids: when an object id is out of range, rather than crashing, only raise a pdf syntax error if STRICT is enabled and return None otherwise | 2011-08-31 17:03:10 -04:00
Yusuke Shinyama | f638784e1d | experimental layout analysis improvements | 2011-08-14 09:44:21 +09:00
Yusuke Shinyama | cbb8d869c7 | removed initial cmap/ directory | 2011-07-31 18:05:07 +10:00
Yusuke Shinyama | cdef0d7883 | Merge github.com:euske/pdfminer | 2011-07-31 17:47:20 +10:00
Yusuke Shinyama | 46bb0107aa | fixed: crash due to small layout elements (thanks to hsoft) | 2011-07-31 17:44:09 +10:00
Yusuke Shinyama | eec317ae10 | Merge pull request #6 from rsennrich/master: cleaner widths for Adobe core 14 fonts. (thanks to rsennrich) | 2011-07-31 00:39:36 -07:00
Yusuke Shinyama | 24cd161fb7 | CCITTFaxFilter.reversed fix | 2011-07-31 17:36:02 +10:00
Rico | 6e4f36d9a1 | get width based on utf-8 char. fills some gaps and fixes inconsistencies between standard encodings | 2011-07-23 16:34:11 +02:00
Yusuke Shinyama | dc8fde0e47 | added CCITTFaxFilter support and a very crude image extraction. | 2011-07-18 21:07:00 +10:00
Yusuke Shinyama | 2707ba75df | added CCITTFaxFilter support and a very crude image extraction. | 2011-07-18 21:06:50 +10:00
Yusuke Shinyama | fda6f7ba5d | ccitt.py added. | 2011-07-18 17:36:37 +10:00