git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@199 1aa58f4a-7d42-0410-adbc-911cccaed67c
parent 71defb2272
commit 434720f767
@@ -19,7 +19,7 @@ Python PDF parser and analyzer
 
 <div align=right class=lastmod>
 <!-- hhmts start -->
-Last Modified: Fri Mar 26 11:14:17 UTC 2010
+Last Modified: Sun Mar 28 07:21:28 UTC 2010
 <!-- hhmts end -->
 </div>
 
@@ -63,6 +63,9 @@ PDF parser that can be used for other purposes instead of text analysis.
 <li> Tagged contents extraction.
 <li> Reconstruct the original layout by grouping text chunks.
 </ul>
+<p>
+On the performance, PDFMiner is about 20 times slower than
+other C/C++-based software such as XPdf.
 
 <a name="source"></a>
 <p>
6
setup.py
@@ -7,11 +7,12 @@ setup(
 version=__version__,
 description='PDF parser and analyzer',
 long_description='''PDFMiner is a suite of programs that help
-extracting and analyzing text data of PDF documents.
+extracting and analyzing text data from PDF documents.
 Unlike other PDF-related tools, it allows to obtain
 the exact location of texts in a page, as well as
 other extra information such as font information or ruled lines.
-It includes a PDF converter that can transform PDF files
+It can also infer its text flow and reconstruct the original layout.
+PDFMiner includes a PDF converter that can transform PDF files
 into other text formats (such as HTML). It has an extensible
 PDF parser that can be used for other purposes instead of text analysis.''',
 license='MIT/X',
@@ -33,5 +34,6 @@ PDF parser that can be used for other purposes instead of text analysis.''',
 'Intended Audience :: Developers',
 'Intended Audience :: Science/Research',
 'License :: OSI Approved :: MIT License',
+'Topic :: Text Processing',
 ],
 )