git-svn-id: https://pdfminer.googlecode.com/svn/trunk/pdfminer@200 1aa58f4a-7d42-0410-adbc-911cccaed67c
yusuke.shinyama.dummy 2010-04-06 10:51:16 +00:00
parent 434720f767
commit e2e9adfaf3
1 changed file with 4 additions and 7 deletions

@@ -19,7 +19,7 @@ Python PDF parser and analyzer
<div align=right class=lastmod>
<!-- hhmts start -->
-Last Modified: Sun Mar 28 07:21:28 UTC 2010
+Last Modified: Mon Apr 5 23:15:31 UTC 2010
<!-- hhmts end -->
</div>
@@ -46,7 +46,7 @@ extracting some meaningful information out of PDF documents.
Unlike other PDF-related tools, it focuses entirely on getting
and analyzing text data from PDFs. PDFMiner allows to obtain
the exact location of texts in a page, as well as
-other extra information such as font information or ruled lines.
+other information such as fonts or ruled lines.
It includes a PDF converter that can transform PDF files
into other text formats (such as HTML). It has an extensible
PDF parser that can be used for other purposes instead of text analysis.
@@ -131,11 +131,8 @@ W o r l d
<p>
<a name="cmap"></a>
<h3>For CJK languages</h3>
-In order to handle CJK languages,
-an additional data called <code>CMap</code> is required.
-CMap files are not installed by default.
-<p>
-Here is the additional step you need to take:
+In order to process CJK languages, you need an additional step to take
+during installation:
<blockquote><pre>
# <strong>make cmap</strong>
python tools/conv_cmap.py pdfminer/cmap Adobe-CNS1 cmaprsrc/cid2code_Adobe_CNS1.txt cp950 big5