Documentation updates.
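
For callers migrating old code, the API changes documented in this commit (`PDFDocument` now in `pdfdocument.py` and constructed from a `PDFParser`, `PDFPage` now in `pdfpage.py`, `process_pdf` replaced by `PDFPage.get_pages`) could be used roughly as below. This is a minimal sketch, not taken from the commit: the helper names `open_document` and `extract_text` are hypothetical, and it assumes a PDFMiner build that exposes these modules (it was written with the pdfminer.six fork on Python 3 in mind).

```python
import io


def open_document(fp, password=""):
    """Build a PDFDocument from an open binary file (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without PDFMiner installed.
    from pdfminer.pdfdocument import PDFDocument  # moved to pdfdocument.py
    from pdfminer.pdfparser import PDFParser

    parser = PDFParser(fp)
    # The parser is passed to the constructor; the removed
    # PDFDocument.set_parser() / PDFParser.set_document() pair is not needed.
    return PDFDocument(parser, password)


def extract_text(path, password=""):
    """Return the text of every page of the PDF at `path` (hypothetical helper)."""
    from pdfminer.converter import TextConverter
    from pdfminer.layout import LAParams
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage  # moved to pdfpage.py

    rsrcmgr = PDFResourceManager()
    out = io.StringIO()  # assumption: a text buffer, as accepted by pdfminer.six
    device = TextConverter(rsrcmgr, out, laparams=LAParams())
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    with open(path, "rb") as fp:
        # PDFPage.get_pages takes over the page loop that process_pdf used to run.
        for page in PDFPage.get_pages(fp, password=password):
            interpreter.process_page(page)
    text = out.getvalue()
    device.close()
    return text
```

With PDFMiner installed, `extract_text("samples/simple1.pdf")` would return the text of the sample file mentioned in the README.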

2013-11-17 15:32:57 +09:00 · 2013-11-17 15:32:57 +09:00 · e39e39fa12
parent cf1e3c9973
commit e39e39fa12
2 changed files with 34 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -10,6 +10,7 @@ It includes a PDF converter that can transform PDF files
 into other text formats (such as HTML). It has an extensible
 PDF parser that can be used for other purposes than text analysis.

+
 Features
 --------

@@ -23,6 +24,7 @@ Features
 * Tagged contents extraction.
 * Automatic layout analysis.

+
 How to Install
 --------------

@@ -37,6 +39,7 @@ How to Install

    $ pdf2txt.py samples/simple1.pdf

+
 For CJK Languages
 -----------------

@@ -60,6 +63,7 @@ paste the following commands on a command line prompt:
    python tools\conv_cmap.py -c KSC-EUC=euc-kr -c KSC-Johab=johab -c KSCms-UHC=cp949 -c UniKS-UTF8=utf-8 pdfminer\cmap Adobe-Korea1 cmaprsrc\cid2code_Adobe_Korea1.txt
    python setup.py install

+
 Command Line Tools
 ------------------

@@ -87,6 +91,21 @@ but it's also possible to extract some meaningful contents (e.g. images).

 (For details, refer to the html document.)

+
+API Changes
+-----------
+
+As of November 2013, a few changes have been made to the PDFMiner API
+as it stood before October 2013. They result from code restructuring.
+Here is a list of the changes:
+
+ * PDFDocument class has moved to pdfdocument.py.
+ * PDFDocument class now takes a PDFParser object as an argument.
+   PDFDocument.set_parser() and PDFParser.set_document() have been removed.
+ * PDFPage class has moved to pdfpage.py.
+ * process_pdf function is now implemented as a class method, PDFPage.get_pages.
+
+
 TODO
 ----

@@ -97,6 +116,7 @@ TODO
 * Better documentation.
 * Crypt stream filter support.

+
 Related Projects
 ----------------

@@ -105,6 +125,7 @@ Related Projects
 * <a href="http://www.pdfbox.org/">pdfbox</a>
 * <a href="http://mupdf.com/">mupdf</a>

+
 Terms and Conditions
 --------------------

--- a/docs/index.html
+++ b/docs/index.html
@@ -9,7 +9,7 @@

 <div align=right class=lastmod>
 <!-- hhmts start -->
-Last Modified: Sat Oct 26 15:03:35 UTC 2013
+Last Modified: Sun Nov 17 06:32:44 UTC 2013
 <!-- hhmts end -->
 </div>

@@ -368,7 +368,18 @@ no stream header is displayed for the ease of saving it to a file.

 <h2><a name="changes">Changes</a></h2>
 <ul>
-<li> 2013/10/22: Sudden resurge of interests. 
+<li> 2013/11/13: Bugfixes and minor improvements.<br>
+As of November 2013, a few changes have been made to the PDFMiner API
+as it stood before October 2013. They result from code restructuring.
+Here is a list of the changes:
+ <ul>
+ <li> <code>PDFDocument</code> class has moved to <code>pdfdocument.py</code>.
+ <li> <code>PDFDocument</code> class now takes a <code>PDFParser</code> object as an argument.
+ <li> <code>PDFDocument.set_parser()</code> and <code>PDFParser.set_document()</code> have been removed.
+ <li> <code>PDFPage</code> class has moved to <code>pdfpage.py</code>.
+ <li> <code>process_pdf</code> function is now implemented as a class method, <code>PDFPage.get_pages</code>.
+</ul>
+<li> 2013/10/22: Sudden resurgence of interest. API changes.
 Incorporated a lot of patches and robust handling of broken PDFs.
 <li> 2011/05/15: Speed improvements for layout analysis.
 <li> 2011/05/15: API changes. <code>LTText.get_text()</code> is added.