Commit Graph

504 Commits (ad05121c69c3fdc9720e93b8a028b101ca80d322)

Author SHA1 Message Date
Matthew Duggan f02cb11945 Update test references based on recent layout analysis improvements 2013-11-07 17:44:09 +09:00
Matthew Duggan 2caa5edc25 PEP8: Whitespace changes to match pep8 2013-11-07 17:35:04 +09:00
Matthew Duggan c1da8b835c PEP8: Remove trailing whitespace 2013-11-07 16:14:53 +09:00
Matthew Duggan 024b821056 Make pyflakes happy by defining variable 2013-11-07 16:10:14 +09:00
Matthew Duggan 10a68c83bd Remove unused imports identified by pyflakes 2013-11-07 16:09:44 +09:00
Yusuke Shinyama 9ff6aa0463 Updated the document. 2013-11-05 18:25:37 +09:00
Yusuke Shinyama 4ef81ae9d8 Improved word spacing. 2013-11-05 18:25:19 +09:00
Yusuke Shinyama 96667d286f Updated documentation. 2013-10-27 00:05:26 +09:00
Yusuke Shinyama 02ad086f6a fixed: HTMLConverter. 2013-10-25 18:10:40 +09:00
Yusuke Shinyama a1cae26a74 Documentation updated. 2013-10-23 00:21:03 +09:00
Yusuke Shinyama 86348eba2f Documentation updated. 2013-10-23 00:17:12 +09:00
Yusuke Shinyama 87842233b3 Version bump! 2013-10-22 22:19:38 +09:00
Yusuke Shinyama ead3137121 updated documents. 2013-10-22 19:09:14 +09:00
Yusuke Shinyama 4f677b6bcf fixed: wrong dates in index.html 2013-10-22 19:00:26 +09:00
Yusuke Shinyama d3730a29ec API change: process_pdf -> PDFPage.get_pages 2013-10-22 18:59:16 +09:00
Yusuke Shinyama 8a70a9f657 fixed: encoding problem with vertical characters. 2013-10-22 18:44:40 +09:00
Yusuke Shinyama e927bd307e fixed: https://github.com/euske/pdfminer/issues/8 2013-10-22 18:24:39 +09:00
Yusuke Shinyama 32844507ea Fixed some style issues. 2013-10-19 08:41:01 +09:00
Yusuke Shinyama 28cb424f8f Merge pull request #21 from eug48/master
dumppdf: support for extracting embedded files using the -E option
2013-10-18 16:23:09 -07:00
Yusuke Shinyama 2aa757978b Reverted to Python2.x syntax. Fixed LZW decoding. 2013-10-19 08:19:40 +09:00
Yusuke Shinyama bfd9e93c12 Merge branch 'master' of https://github.com/JordanReiter/pdfminer into JordanReiter-master 2013-10-19 07:46:45 +09:00
Yusuke Shinyama 8e4c0c88e3 fixed: https://github.com/euske/pdfminer/issues/26 2013-10-17 23:20:08 +09:00
Yusuke Shinyama 6ca9ac5434 chmod fix. 2013-10-17 23:06:07 +09:00
Yusuke Shinyama 0ea08890d4 renamed: python2 -> python. 2013-10-17 23:05:27 +09:00
Yusuke Shinyama 6ad82e355c Beating the codepage dragon. 2013-10-17 22:57:48 +09:00
Yusuke Shinyama 8d42eec94d in_cmap is on by default. 2013-10-17 21:40:43 +09:00
Yusuke Shinyama de9f9715e3 Added: Adobe-UCS 2013-10-17 21:35:25 +09:00
Yusuke Shinyama 774827b4ce Code cleanup: conv_cmap.py 2013-10-12 13:20:40 +09:00
Yusuke Shinyama 1455f134c6 Fixed: missing ObjStm due to invalid seek. 2013-10-10 20:10:57 +09:00
Yusuke Shinyama f85c374cae Separated PDFPage to pdfpage.py. 2013-10-10 19:54:55 +09:00
Yusuke Shinyama 2df67d85ae Expand ObjStm in XRefFallback. 2013-10-10 19:40:43 +09:00
Yusuke Shinyama e4bc4e43b1 Code cleanup. 2013-10-10 19:17:58 +09:00
Yusuke Shinyama cfd60eafbf Removed PDFDocument.read_xref(). 2013-10-10 18:57:08 +09:00
Yusuke Shinyama 658be970b8 Separated PDFXRefFallback. 2013-10-10 18:44:12 +09:00
Yusuke Shinyama c926874d20 API Change: the PDFDocument cstr now takes PDFParser. set_parser() is removed. 2013-10-10 18:40:06 +09:00
Yusuke Shinyama 557c2c72e6 Removed ObjIdRange for terseness. 2013-10-10 18:34:43 +09:00
Yusuke Shinyama 2221163b94 Split pdfparser.py and pdfdocument.py. 2013-10-10 18:29:30 +09:00
Yusuke Shinyama 1467fc674c Added fallback for broken PDFs. 2013-10-09 22:45:54 +09:00
Yusuke Shinyama eabe72ee63 Prevent crash with empty layout box. 2013-10-09 22:13:22 +09:00
Yusuke Shinyama 87143cb36f Fallback when /Pages does not exist. 2013-10-09 22:08:16 +09:00
Yusuke Shinyama 06425bba00 Introducing PDFObjectNotFound 2013-10-09 21:39:23 +09:00
Yusuke Shinyama 3c3cba2ecc Moved: import PIL. 2013-04-09 18:42:32 +09:00
Yusuke Shinyama 19e7d70ac1 Merge pull request #15 from jcushman/patch-1
2x faster layout analysis: Use set instead of list for Plane's internal collection of objects.
2013-04-09 02:39:46 -07:00
Yusuke Shinyama d932bf675e Merge branch 'master' of github.com:euske/pdfminer 2013-04-09 18:28:24 +09:00
Yusuke Shinyama 9df9df58c1 Merged: dangra/785080b4eae259015d4fa63991356a4034751683 2013-04-09 18:21:45 +09:00
Yusuke Shinyama 4faccff9c9 Merge pull request #16 from jcushman/master
2x faster group_textboxes function.
2013-04-09 01:58:56 -07:00
Yusuke Shinyama d8bc13b3af Merge pull request #13 from gendoc/master
PDFDocument.lookup_name.lookup isn't searching for 'Names' key.
2013-04-09 01:55:54 -07:00
Yusuke Shinyama 93a53059dd Merge pull request #12 from sorced-jim/master
Deal with CMYK images by converting them to RGB
2013-04-09 01:55:04 -07:00
Jordan Reiter e28b75a462 StringIO 2013-03-27 13:14:58 -04:00
Jordan Reiter 44653071c3 Fixes for LZW error (see https://bitbucket.org/hsoft/pdfminer3k/commits/ae9a4ca0691a/) 2013-03-27 13:05:29 -04:00