pdfminer.six/pdfminer
speedplane 806ee603ff More fixes to layout. The compute neighbors function for horizontal lines is only intended to find neighbors on differing lines. However, it's entirely possible that horizontal neighbors could appear.
This commit finds horizontal neighbors in a horizontal line and merges them together into a single horizontal line if necessary. This leads to much better text extraction when the PDF was created in an unusual way.

For example (test case coming), I have seen PDFs which are written almost like vertical columns, but the text is entirely horizontal.
2014-12-12 00:36:59 -05:00
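
The merging idea described in the commit message can be illustrated with a minimal sketch. This is not the code from layout.py: the `Box` class, the `same_line` and `merge_horizontal_neighbors` helpers, and the `tol` and `max_gap` thresholds are all hypothetical names and values chosen for illustration. It assumes PDF-style coordinates where y grows upward.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class Box:
    """Hypothetical text box: bounding coordinates plus its text."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def same_line(a: Box, b: Box, tol: float = 0.5) -> bool:
    """True if the vertical overlap is at least `tol` of the smaller height."""
    overlap = min(a.y1, b.y1) - max(a.y0, b.y0)
    return overlap >= tol * min(a.y1 - a.y0, b.y1 - b.y0)


def merge_horizontal_neighbors(boxes: List[Box], max_gap: float = 5.0) -> List[Box]:
    """Merge boxes that sit on the same line and are horizontally close.

    Boxes are visited left to right; each one is appended to the first
    already-built line it overlaps vertically and follows within `max_gap`,
    otherwise it starts a new line.
    """
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: b.x0):
        for i, line in enumerate(merged):
            if same_line(line, box) and box.x0 - line.x1 <= max_gap:
                # Grow the existing line so it also covers the new box.
                merged[i] = replace(
                    line,
                    x1=max(line.x1, box.x1),
                    y0=min(line.y0, box.y0),
                    y1=max(line.y1, box.y1),
                    text=line.text + " " + box.text,
                )
                break
        else:
            # No suitable neighbor found: copy the box as a new line.
            merged.append(replace(box))
    return merged


if __name__ == "__main__":
    sample = [
        Box(0, 100, 50, 110, "Hello"),
        Box(52, 100, 90, 110, "world"),
        Box(0, 80, 50, 90, "next"),
    ]
    for line in merge_horizontal_neighbors(sample):
        print(line.text)
```

In a column-like PDF of the kind the commit message mentions, fragments that share a baseline but were emitted separately would end up in one line under this scheme, while fragments on other baselines or far apart horizontally stay separate.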
Makefile apply more patches 2010-02-13 15:00:43 +00:00
__init__.py Use print as a function. 2014-06-15 12:14:33 +09:00
arcfour.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
ascii85.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
ccitt.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
cmapdb.py Use KWD instead of token.name. 2014-06-30 19:15:21 +09:00
converter.py Cleanup imports. Use relative imports. 2014-06-26 18:12:39 +09:00
encodingdb.py Cleanup imports. Use relative imports. 2014-06-26 18:12:39 +09:00
fontmetrics.py PEP8: Remove trailing whitespace 2013-11-07 16:14:53 +09:00
glyphlist.py renamed: python2 -> python. 2013-10-17 23:05:27 +09:00
image.py Fixed: DecodeParms array support. 2014-07-09 19:07:27 +09:00
latin_enc.py renamed: python2 -> python. 2013-10-17 23:05:27 +09:00
layout.py More fixes to layout. The compute neighbors function for horizontal lines is only intended to find neighbors on differing lines. However, it's entirely possible that horizontal neighbors could appear. 2014-12-12 00:36:59 -05:00
lzw.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
pdfcolor.py Cleanup imports. Use relative imports. 2014-06-26 18:12:39 +09:00
pdfdevice.py Cleanup imports. Use relative imports. 2014-06-26 18:12:39 +09:00
pdfdocument.py Add debug flags for much of the debug output. 2014-11-11 23:36:58 -05:00
pdffont.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
pdfinterp.py Add debug flags for much of the debug output. 2014-11-11 23:36:58 -05:00
pdfpage.py Add debug flags for much of the debug output. 2014-11-11 23:36:58 -05:00
pdfparser.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
pdftypes.py Fixed: DecodeParms array support. 2014-07-09 19:07:27 +09:00
psparser.py Fixed: crash on invalid chr number. 2014-12-09 22:55:47 +09:00
rijndael.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
runlength.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00
utils.py String-Bytes distinction (first attempt). 2014-06-30 19:05:56 +09:00