+documentation.
git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@175 1aa58f4a-7d42-0410-adbc-911cccaed67c
pull/1/head
parent
dc6e5c366d
commit
7969feeae1
4
TODO
@@ -3,5 +3,5 @@ TODOs:
  - Better text extraction / layout analysis.
  - Better API Documentation.
  - Robust error handling.
- - Any special handling for linearized PDFs?
- - Handle crypt filter. (More sample documents are needed!)
+ - Crypt stream filter support. (More sample documents are needed!)
+ - CCITTFax stream filter support.
@@ -19,7 +19,7 @@ Python PDF parser and analyzer
 
 <div align=right class=lastmod>
 <!-- hhmts start -->
-Last Modified: Mon Jan 4 23:23:00 JST 2010
+Last Modified: Sat Jan 30 16:32:50 JST 2010
 <!-- hhmts end -->
 </div>
 
@@ -204,6 +204,10 @@ HTML-like tags. pdf2txt tries to extract its content streams rather than inferri
 Tags used here are defined in the PDF specification (See §10.7 "<em>Tagged PDF</em>").
 </ul>
 <p>
+<dt> <code>-I <em>image_directory</em></code>
+<dd> Specifies the output directory for image extraction.
+Currently only JPEG images are supported.
+<p>
 <dt> <code>-D <em>direction</em></code>
 <dt> <code>-M <em>char_margin</em></code>
 <dt> <code>-L <em>line_margin</em></code>
@@ -334,6 +338,8 @@ no stream header is displayed for the ease of saving it to a file.
 <a href="http://www.python.org/dev/peps/pep-0257/">PEP-257</a> conformance.
 <li> Better text extraction / layout analysis.
 <li> Better API Documentation.
+<li> Crypt stream filter support. (More sample documents are needed!)
+<li> CCITTFax stream filter support.
 <li> Robust error handling.
 </ul>
 
@@ -341,6 +347,7 @@ no stream header is displayed for the ease of saving it to a file.
 <hr noshade>
 <h2>Changes</h2>
 <ul>
+<li> 2010/01/30: JPEG image extraction supported.
 <li> 2010/01/04: Python 2.6 warning removal. More doctest conversion.
 <li> 2010/01/01: CMap bug fix. Thanks to Winfried Plappert.
 <li> 2009/12/24: RunLengthDecode filter added. Thanks to Troy Bollinger.