git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@199 1aa58f4a-7d42-0410-adbc-911cccaed67c

yusuke.shinyama.dummy 2010-04-04 12:18:57 +00:00
parent 71defb2272
commit 434720f767
2 changed files with 8 additions and 3 deletions

@@ -19,7 +19,7 @@ Python PDF parser and analyzer
 <div align=right class=lastmod>
 <!-- hhmts start -->
-Last Modified: Fri Mar 26 11:14:17 UTC 2010
+Last Modified: Sun Mar 28 07:21:28 UTC 2010
 <!-- hhmts end -->
 </div>
@@ -63,6 +63,9 @@ PDF parser that can be used for other purposes instead of text analysis.
 <li> Tagged contents extraction.
 <li> Reconstruct the original layout by grouping text chunks.
 </ul>
+<p>
+On the performance, PDFMiner is about 20 times slower than
+other C/C++-based software such as XPdf.
 <a name="source"></a>
 <p>

@@ -7,11 +7,12 @@ setup(
 version=__version__,
 description='PDF parser and analyzer',
 long_description='''PDFMiner is a suite of programs that help
-extracting and analyzing text data of PDF documents.
+extracting and analyzing text data from PDF documents.
 Unlike other PDF-related tools, it allows to obtain
 the exact location of texts in a page, as well as
 other extra information such as font information or ruled lines.
-It includes a PDF converter that can transform PDF files
+It can also infer its text flow and reconstruct the original layout.
+PDFMiner includes a PDF converter that can transform PDF files
 into other text formats (such as HTML). It has an extensible
 PDF parser that can be used for other purposes instead of text analysis.''',
 license='MIT/X',
@@ -33,5 +34,6 @@ PDF parser that can be used for other purposes instead of text analysis.''',
 'Intended Audience :: Developers',
 'Intended Audience :: Science/Research',
 'License :: OSI Approved :: MIT License',
+'Topic :: Text Processing',
 ],
 )