README.txt for cmap

This directory contains *.pickle.gz files converted from
Adobe CMap resources. CMaps are required to decode text data
written in CJK (Chinese, Japanese, Korean) languages.
CMap resources are now freely available from the Adobe web site:

  http://opensource.adobe.com/wiki/display/cmap/CMap+Resources

The following files were extracted from the downloadable tarballs:

  cid2code_Adobe_CNS1.txt:
    http://download.macromedia.com/pub/opensource/cmap/cmapresources_cns1-6.tar.z

  cid2code_Adobe_GB1.txt:
    http://download.macromedia.com/pub/opensource/cmap/cmapresources_gb1-5.tar.z

  cid2code_Adobe_Japan1.txt:
    http://download.macromedia.com/pub/opensource/cmap/cmapresources_japan1-6.tar.z

  cid2code_Adobe_Korea1.txt:
    http://download.macromedia.com/pub/opensource/cmap/cmapresources_korean1-2.tar.z

These *.pickle.gz files can be generated by running the following
command in the top directory:

  $ make cmap
  python tools/conv_cmap.py pdfminer/cmap Adobe-CNS1 cmaprsrc/cid2code_Adobe_CNS1.txt
  reading 'cmaprsrc/cid2code_Adobe_CNS1.txt'...
  writing 'CNS1_H.py'...
  ...

On Windows machines that don't have the `make` command,
paste the following commands at a command prompt:

  mkdir pdfminer\cmap
  python tools\conv_cmap.py -c B5=cp950 -c UniCNS-UTF8=utf-8 pdfminer\cmap Adobe-CNS1 cmaprsrc\cid2code_Adobe_CNS1.txt
  python tools\conv_cmap.py -c GBK-EUC=cp936 -c UniGB-UTF8=utf-8 pdfminer\cmap Adobe-GB1 cmaprsrc\cid2code_Adobe_GB1.txt
  python tools\conv_cmap.py -c RKSJ=cp932 -c EUC=euc-jp -c UniJIS-UTF8=utf-8 pdfminer\cmap Adobe-Japan1 cmaprsrc\cid2code_Adobe_Japan1.txt
  python tools\conv_cmap.py -c KSC-EUC=euc-kr -c KSC-Johab=johab -c KSCms-UHC=cp949 -c UniKS-UTF8=utf-8 pdfminer\cmap Adobe-Korea1 cmaprsrc\cid2code_Adobe_Korea1.txt

Here is the license information in the original files:

%%Copyright: -----------------------------------------------------------
%%Copyright: Copyright 1990-20xx Adobe Systems Incorporated.
%%Copyright: All rights reserved.
%%Copyright:
%%Copyright: Redistribution and use in source and binary forms, with or
%%Copyright: without modification, are permitted provided that the
%%Copyright: following conditions are met:
%%Copyright:
%%Copyright: Redistributions of source code must retain the above
%%Copyright: copyright notice, this list of conditions and the following
%%Copyright: disclaimer.
%%Copyright:
%%Copyright: Redistributions in binary form must reproduce the above
%%Copyright: copyright notice, this list of conditions and the following
%%Copyright: disclaimer in the documentation and/or other materials
%%Copyright: provided with the distribution.
%%Copyright:
%%Copyright: Neither the name of Adobe Systems Incorporated nor the names
%%Copyright: of its contributors may be used to endorse or promote
%%Copyright: products derived from this software without specific prior
%%Copyright: written permission.
%%Copyright:
%%Copyright: THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
%%Copyright: CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
%%Copyright: INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
%%Copyright: MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
%%Copyright: DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
%%Copyright: CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
%%Copyright: SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
%%Copyright: NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
%%Copyright: LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
%%Copyright: HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
%%Copyright: CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
%%Copyright: OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
%%Copyright: SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
%%Copyright: -----------------------------------------------------------
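
Inspecting a *.pickle.gz file:

The generated files are gzip-compressed pickles, so they can be opened
with the Python standard library alone. The following is a minimal
sketch, not the loader pdfminer itself uses. The helper name
load_pickle_gz and the sample data are hypothetical; the layout of the
real files is determined by tools/conv_cmap.py and is not assumed here.
The example writes a small stand-in file first so that it runs without
the generated CMap files.

```python
import gzip
import os
import pickle
import tempfile


def load_pickle_gz(path):
    """Decompress a gzip file and unpickle its contents (hypothetical helper)."""
    # Only unpickle files you generated yourself or otherwise trust:
    # pickle can execute arbitrary code while loading.
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Stand-in data for illustration only; real CMap pickles have their own layout.
sample = {"example_key": {1: 2}}
tmpdir = tempfile.mkdtemp()
sample_path = os.path.join(tmpdir, "sample.pickle.gz")
with gzip.open(sample_path, "wb") as f:
    pickle.dump(sample, f)

# Read the file back and show the type and top-level keys of the object.
loaded = load_pickle_gz(sample_path)
print(type(loaded).__name__, sorted(loaded))
```

To look at a real file, pass its path under pdfminer/cmap to
load_pickle_gz in place of sample_path.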