+documentation.

git-svn-id: https://pdfminerr.googlecode.com/svn/trunk/pdfminer@175 1aa58f4a-7d42-0410-adbc-911cccaed67c
yusuke.shinyama.dummy 2010-01-30 07:33:18 +00:00
parent dc6e5c366d
commit 7969feeae1
2 changed files with 10 additions and 3 deletions

TODO

@@ -3,5 +3,5 @@ TODOs:
  - Better text extraction / layout analysis.
  - Better API Documentation.
  - Robust error handling.
- - Any special handling for linearized PDFs?
- - Handle crypt filter. (More sample documents are needed!)
+ - Crypt stream filter support. (More sample documents are needed!)
+ - CCITTFax stream filter support.


@@ -19,7 +19,7 @@ Python PDF parser and analyzer
 <div align=right class=lastmod>
 <!-- hhmts start -->
-Last Modified: Mon Jan 4 23:23:00 JST 2010
+Last Modified: Sat Jan 30 16:32:50 JST 2010
 <!-- hhmts end -->
 </div>
@@ -204,6 +204,10 @@ HTML-like tags. pdf2txt tries to extract its content streams rather than inferri
 Tags used here are defined in the PDF specification (See &sect;10.7 "<em>Tagged PDF</em>").
 </ul>
 <p>
+<dt> <code>-I <em>image_directory</em></code>
+<dd> Specifies the output directory for image extraction.
+Currently only JPEG images are supported.
+<p>
 <dt> <code>-D <em>direction</em></code>
 <dt> <code>-M <em>char_margin</em></code>
 <dt> <code>-L <em>line_margin</em></code>
@@ -334,6 +338,8 @@ no stream header is displayed for the ease of saving it to a file.
 <a href="http://www.python.org/dev/peps/pep-0257/">PEP-257</a> conformance.
 <li> Better text extraction / layout analysis.
 <li> Better API Documentation.
+<li> Crypt stream filter support. (More sample documents are needed!)
+<li> CCITTFax stream filter support.
 <li> Robust error handling.
 </ul>
@@ -341,6 +347,7 @@ no stream header is displayed for the ease of saving it to a file.
 <hr noshade>
 <h2>Changes</h2>
 <ul>
+<li> 2010/01/30: JPEG image extraction supported.
 <li> 2010/01/04: Python 2.6 warning removal. More doctest conversion.
 <li> 2010/01/01: CMap bug fix. Thanks to Winfried Plappert.
 <li> 2009/12/24: RunLengthDecode filter added. Thanks to Troy Bollinger.